As infection rates waned following the first wave of the COVID-19 pandemic in mid-July 2020, researchers at the Perelman School of Medicine set out to sequence the genome of SARS-CoV-2 in Philadelphia to better understand its spatial and temporal dynamics within the community. Sequencing the genome of a virus can reveal its variants, and in so doing, where and when these variants appeared.
In the US and elsewhere, genome sequencing has allowed researchers and epidemiologists to track COVID-19 with great accuracy over time. During the first wave of the pandemic, genetic sequencing identified the D614G variant as the principal strain worldwide (it has since seen competition from other variants originating in the UK and South Africa). D614G emerged in Europe in early 2020 and is not identical to the Wuhan strain (Wuhan-Hu-1) that first appeared in China.
The researchers at Penn, who included Drs. Susan R. Weiss and Frederick Bushman and members of their laboratories, collected 52 samples from 27 hospitalized patients in Philadelphia over a two-month period, and obtained two samples of the reference isolate (USA-WA1-2020) from the first US COVID-19 patient in Seattle, WA. They then performed full-genome sequencing by reverse transcription of the viral RNA on the local samples. Patients whose genome sequences did not have at least 95% coverage of the USA-WA1-2020 reference were excluded from the analysis. (“Coverage” is the number of sample nucleotide base sequences aligned to a specific locus in a reference genome.)
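The coverage cutoff can be illustrated with a minimal sketch. It assumes the 95% figure refers to the fraction of reference positions covered by at least one aligned read, which is one common reading of such a threshold; the study's actual pipeline is not described here. The function names, sample names, and depth values below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Threshold taken from the text: at least 95% of the reference must be covered.
MIN_BREADTH = 0.95


def breadth_of_coverage(depths: List[int], min_depth: int = 1) -> float:
    """Return the fraction of reference positions with >= min_depth aligned reads.

    `depths` holds one read count per reference position (per-locus coverage).
    """
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def passing_samples(depth_by_sample: Dict[str, List[int]]) -> List[str]:
    """Keep only samples whose breadth of coverage meets the threshold."""
    return [
        name
        for name, depths in depth_by_sample.items()
        if breadth_of_coverage(depths) >= MIN_BREADTH
    ]


# Toy data (hypothetical): a 100-position "reference" and two samples.
toy_depths = {
    "sample_A": [12] * 98 + [0] * 2,   # 98 of 100 positions covered
    "sample_B": [30] * 90 + [0] * 10,  # 90 of 100 positions covered
}

kept = passing_samples(toy_depths)
```

In this toy example only `sample_A` clears the cutoff. Note that a sample can have very deep coverage where reads align, as `sample_B` does, and still fail because too much of the reference is left uncovered.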
After a series of analyses, all genomes from Philadelphia were found to encode the D614G spike polymorphism and a series of other sequence variations, and all contained further polymorphisms distinguishing them from the USA-WA1-2020 reference isolate.
When SARS-CoV-2 genomes from Philadelphia were compared to global sequences at several time points in the epidemic, the closest matches occurred between the Philadelphia polymorphisms and variants circulating in New York City before the disease appeared locally (22 of 27 subjects). The remaining sequences aligned most closely with sequences from Massachusetts (1 subject), Sweden (1 subject), California (2 subjects), and New Jersey (1 subject).
None of the subjects with sequences linked to Sweden, Massachusetts, or California had known direct contact with these locations prior to illness, suggesting community spread as the source of infection. There was also evidence that closely related lineages were circulating within the Philadelphia community.
The study, which also assessed viral polymorphisms in Philadelphia in the context of patient outcomes, longitudinal variation of viral sequences within subjects between time points, and whether polymorphisms were accumulating in the binding site (suggestive of evolution to drug resistance), is available in mBio, the online journal of the American Society for Microbiology.