Blog

MOTIVATION: Chromatin conformation capture (CCC) experiments, such as Hi-C and Capture Hi-C (CHiC), elucidate the three-dimensional organization of the genome and the epigenetic regulatory structures within it. CCC experiments produce large amounts of FASTQ sequencing data with substantial technical noise and require sophisticated computational pipelines to extract meaningful results. Large-scale CCC data repositories like 4D Nucleome and ENCODE mostly provide raw contact information but lack annotated, statistically significant interaction data suitable for downstream genetic and genomic analyses.
RESULTS: Here, we present CHARMER, an end-to-end pipeline integrated across multiple CCC assay types (Hi-C, CHiC) which generates statistically significant, harmonized, queryable chromatin interactions in a consistent BED-like format across cell/tissue types and CCC assays.
AVAILABILITY: CHARMER is freely available at https://bitbucket.org/wanglab-upenn/CHARMER and harmonized chromatin interaction data will be available in the upcoming version of the FILER database (https://lisanwanglab.org/FILER).
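
To illustrate what querying harmonized interactions in a BED-like format might look like, here is a minimal sketch. The column layout (two anchors plus a p-value), the sample records, and every function name are hypothetical; they are not taken from CHARMER or FILER.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    """One chromatin interaction between two anchors (hypothetical layout)."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int
    p_value: float


def parse_line(line: str) -> Interaction:
    """Parse one tab-separated record; coordinates are assumed 0-based, half-open."""
    f = line.rstrip("\n").split("\t")
    return Interaction(f[0], int(f[1]), int(f[2]),
                       f[3], int(f[4]), int(f[5]), float(f[6]))


def overlaps(chrom: str, start: int, end: int,
             q_chrom: str, q_start: int, q_end: int) -> bool:
    """True when two half-open intervals on the same chromosome intersect."""
    return chrom == q_chrom and start < q_end and q_start < end


def query(interactions: List[Interaction], chrom: str, start: int, end: int,
          max_p: float = 0.05) -> List[Interaction]:
    """Return interactions with either anchor in the query region and p <= max_p."""
    return [
        i for i in interactions
        if i.p_value <= max_p and (
            overlaps(i.chrom_a, i.start_a, i.end_a, chrom, start, end)
            or overlaps(i.chrom_b, i.start_b, i.end_b, chrom, start, end)
        )
    ]


# Made-up records for illustration only.
RECORDS = [parse_line(s) for s in (
    "chr1\t1000\t2000\tchr1\t50000\t51000\t0.001",
    "chr1\t3000\t4000\tchr1\t90000\t91000\t0.2",
    "chr2\t1000\t2000\tchr2\t70000\t71000\t0.01",
)]

hits = query(RECORDS, "chr1", 1500, 3500)
```

A linear scan is enough for a sketch; a real query layer over many cell types would more likely rely on an indexed store.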

Copy number variants (CNVs) are DNA gains or losses involving >50 base pairs. Assessing CNV effects on disease risk requires consideration of several factors. First, there are no natural definitions for CNV loci. Second, CNV effects can depend on dosage and length. Third, CNV effects can be more accurately estimated when all CNV events in a genomic region are analyzed together to assess their joint effects. We propose a new framework for association analysis that directly models an individual’s entire CNV profile within a genomic region. This framework represents an individual’s CNVs using a CNV profile curve to capture variations in CNV length and dosage and to bypass the need to predefine CNV loci. CNV effects are estimated at each genome position, making the results comparable across different studies. To jointly estimate the effects of all CNVs, we use a Lasso penalty to select CNVs associated with the trait and integrate a weighted L2-fusion penalty to encourage similar effects of adjacent CNVs when supported by the data. Simulations show that the proposed model can more effectively identify causal CNVs while maintaining false positive rates comparable to baseline methods, and that it yields more precise effect-size estimates across different settings. When applied to CNVs derived from whole genome sequencing data of the Alzheimer’s Disease Sequencing Project, the proposed methods identify additional CNVs associated with Alzheimer’s Disease (AD). These identified CNVs overlap with several known AD-risk genes and are significantly enriched for biological processes related to neuron structures and functions crucial in AD development.
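
The two ideas in this abstract, a per-position CNV profile curve and a combined Lasso plus weighted fusion penalty, can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the dosage encoding, the squared-difference form of the fusion term, and the names `cnv_profile` and `penalty` are all assumptions.

```python
from typing import List, Sequence, Tuple


def cnv_profile(events: Sequence[Tuple[int, int, int]],
                region_start: int, region_end: int) -> List[int]:
    """Build a piecewise-constant dosage curve over [region_start, region_end).

    Each event is (start, end, dosage) on a half-open interval, where dosage
    is the assumed copy-number change (e.g. -1 for a deletion, +1 for a
    duplication). Overlapping events add up.
    """
    curve = [0] * (region_end - region_start)
    for start, end, dosage in events:
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            curve[pos - region_start] += dosage
    return curve


def penalty(beta: Sequence[float], weights: Sequence[float],
            lam1: float, lam2: float) -> float:
    """Lasso term plus a weighted fusion term on adjacent effects.

    Assumed form: lam1 * sum |beta_j| + lam2 * sum w_j * (beta_{j+1} - beta_j)^2.
    weights[j] couples beta[j] and beta[j + 1], so len(weights) == len(beta) - 1.
    """
    lasso = sum(abs(b) for b in beta)
    fusion = sum(w * (b1 - b0) ** 2
                 for w, b0, b1 in zip(weights, beta, beta[1:]))
    return lam1 * lasso + lam2 * fusion


# A deletion spanning positions 2-4 and a duplication spanning 4-6 (toy data).
profile = cnv_profile([(2, 5, -1), (4, 7, 1)], 0, 8)
```

The Lasso term drives effects of unassociated segments to zero, while a larger fusion weight pulls two neighbouring effects toward each other; how the weights are chosen is not described in the abstract.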

BACKGROUND: Blood-derived mitochondrial DNA copy number (mtDNA-CN) is a proxy measurement of mitochondrial function in the peripheral and central systems. Abnormal mtDNA-CN not only indicates impaired mtDNA replication and transcription machinery but also dysregulated biological processes such as energy and lipid metabolism. However, the relationship between mtDNA-CN and Alzheimer disease (AD) is unclear.
METHODS: We performed two-sample Mendelian randomization (MR) using publicly available summary statistics from GWAS for mtDNA-CN and AD to investigate the causal relationship between mtDNA-CN and AD. We estimated mtDNA-CN using whole-genome sequence data from blood and brain samples of 13,799 individuals from the Alzheimer’s Disease Sequencing Project. Linear and Cox proportional hazards models adjusting for age, sex, and study phase were used to assess the association of mtDNA-CN with AD. The association of AD biomarkers and serum metabolites with mtDNA-CN in blood was evaluated in the Alzheimer’s Disease Neuroimaging Initiative using linear regression. We conducted a causal mediation analysis to test the natural indirect effects of mtDNA-CN change on AD risk through the significantly associated biomarkers and metabolites.
RESULTS: MR analysis suggested a causal relationship between decreased blood-derived mtDNA-CN and increased risk of AD (OR = 0.68; P = 0.013). Survival analysis showed that decreased mtDNA-CN was significantly associated with higher risk of conversion from mild cognitive impairment to AD (HR = 0.80; P = 0.002). We also identified significant associations of mtDNA-CN with brain FDG-PET (β = 0.103; P = 0.022), amyloid-PET (β = 0.117; P = 0.034), CSF amyloid-β (Aβ) 42/40 (β = -0.124; P = 0.017), CSF t-Tau (β = 0.128; P = 0.015), p-Tau (β = 0.140; P = 0.008), and plasma NFL (β = -0.124; P = 0.004) in females. Several lipid species, amino acids, and biogenic amines in serum were also significantly associated with mtDNA-CN. Causal mediation analyses showed that about a third of the effect of mtDNA-CN on AD risk was mediated by plasma NFL (P = 0.009), and this effect was more significant in females (P < 0.005).
CONCLUSIONS: Our study indicates that mtDNA-CN measured in blood is predictive of AD and is associated with AD biomarkers, including plasma NFL, particularly in females. Further, we illustrate that decreased mtDNA-CN possibly increases AD risk through dysregulation of mitochondrial lipid metabolism and inflammation.
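
For readers unfamiliar with two-sample MR from summary statistics, the sketch below shows one common estimator, the fixed-effect inverse-variance weighted (IVW) estimate. The abstract does not say which estimator the authors used, so IVW is an assumption here, and the per-variant numbers are invented purely to make the example run.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate and its standard error.

    beta_exposure: per-variant effects on the exposure (e.g. mtDNA-CN).
    beta_outcome:  per-variant effects on the outcome (log odds of disease).
    se_outcome:    standard errors of the outcome effects.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for three instruments (not from the study).
bx = [0.1, 0.2, 0.3]
by = [-0.04, -0.08, -0.12]
se = [0.01, 0.01, 0.01]

log_or, log_or_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(log_or)  # an OR below 1 means higher exposure, lower risk
```

Read this way, an odds ratio below 1 for higher mtDNA-CN corresponds to increased risk when mtDNA-CN is lower, which is how the RESULTS phrase the finding.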

INTRODUCTION: Alzheimer’s disease (AD) is a common disorder of the elderly that is both highly heritable and genetically heterogeneous.
METHODS: We investigated the association of AD with both common variants and aggregates of rare coding and non-coding variants in 13,371 individuals of diverse ancestry with whole genome sequencing (WGS) data.
RESULTS: Pooled-population analyses of all individuals identified genetic variants at apolipoprotein E (APOE) and BIN1 associated with AD (p < 5 × 10⁻⁸). Subgroup-specific analyses identified a haplotype on chromosome 14 including PSEN1 associated with AD in Hispanics, further supported by aggregate testing of rare coding and non-coding variants in the region. Common variants in LINC00320 were associated with AD in Black individuals (p = 1.9 × 10⁻⁹). Finally, we observed rare non-coding variants in the promoter of TOMM40, distinct from APOE, in pooled-population analyses (p = 7.2 × 10⁻⁸).
DISCUSSION: We observed that complementary pooled-population and subgroup-specific analyses offered unique insights into the genetic architecture of AD.
HIGHLIGHTS: We determine the association of genetic variants with Alzheimer's disease (AD) using 13,371 individuals of diverse ancestry with whole genome sequencing (WGS) data. We identified genetic variants at apolipoprotein E (APOE), BIN1, PSEN1, and LINC00320 associated with AD. We observed rare non-coding variants in the promoter of TOMM40 distinct from APOE.

Progressive supranuclear palsy (PSP), a rare Parkinsonian disorder, is characterized by problems with movement, balance, and cognition. PSP differs from Alzheimer’s disease (AD) and other diseases, displaying abnormal microtubule-associated protein tau by both neuronal and glial cell pathologies. Genetic contributors may mediate these differences; however, the genetics of PSP remain underexplored. Here we conduct the largest genome-wide association study (GWAS) of PSP which includes 2779 cases (2595 neuropathologically-confirmed) and 5584 controls and identify six independent PSP susceptibility loci with genome-wide significant (P < 5 × 10⁻⁸) associations, including five known (MAPT, MOBP, STX6, RUNX2, SLCO1A2) and one novel locus (C4A). Integration with cell type-specific epigenomic annotations reveals an oligodendrocytic signature that might distinguish PSP from AD and Parkinson's disease in subsequent studies. Candidate PSP risk gene prioritization using expression quantitative trait loci (eQTLs) identifies oligodendrocyte-specific effects on gene expression in half of the genome-wide significant loci, and an association with C4A expression in brain tissue, which may be driven by increased C4A copy number. Finally, histological studies demonstrate tau aggregates in oligodendrocytes that colocalize with C4 (complement) deposition. Integrating GWAS with functional studies, epigenomic and eQTL analyses, we identify potential causal roles for variation in MOBP, STX6, RUNX2, SLCO1A2, and C4A in PSP pathogenesis.

INTRODUCTION: The objective of this pilot study was to establish the feasibility of recruiting older Vietnamese Americans for research addressing genetic and nongenetic risk factors for Alzheimer disease (AD).
METHODS: Twenty-six Vietnamese Americans were recruited from communities in San Diego. A Community Advisory Board provided cultural and linguistic advice. Bilingual/bicultural staff measured neuropsychological, neuropsychiatric, lifestyle, and medical/neurological functioning remotely. Saliva samples allowed DNA extraction. A consensus team reviewed clinical data to determine a diagnosis of normal control (NC), mild cognitive impairment (MCI), or dementia. Exploratory analyses addressed AD risk by measuring subjective cognitive complaints (SCC), depression, and vascular risk factors (VRFs).
RESULTS: Twenty-five participants completed the study (mean age=73.8 y). Eighty percent chose to communicate in Vietnamese. Referrals came primarily from word of mouth within Vietnamese communities. Diagnoses included 18 NC, 3 MCI, and 4 dementia. Participants reporting SCC acknowledged more depressive symptoms and had greater objective cognitive difficulty than those without SCC. Eighty-eight percent of participants reported at least 1 VRF.
DISCUSSION: This pilot study supports the feasibility of conducting community-based research in older Vietnamese Americans. Challenges included developing linguistically and culturally appropriate cognitive and neuropsychiatric assessment tools. Exploratory analyses addressing nongenetic AD risk factors suggest topics for future study.

BACKGROUND: Progressive supranuclear palsy (PSP) is a rare neurodegenerative disease characterized by the accumulation of aggregated tau proteins in astrocytes, neurons, and oligodendrocytes. Previous genome-wide association studies for PSP were based on genotype arrays and were therefore inadequate for the analysis of rare variants and larger mutations, such as small insertions/deletions (indels) and structural variants (SVs).
METHOD: In this study, we performed whole genome sequencing (WGS) and conducted association analysis for single nucleotide variants (SNVs), indels, and SVs, in a cohort of 1,718 cases and 2,944 controls of European ancestry. Of the 1,718 PSP individuals, 1,441 were autopsy-confirmed and 277 were clinically diagnosed.
RESULTS: Our analysis of common SNVs and indels confirmed known genetic loci at MAPT, MOBP, STX6, SLCO1A2, DUSP10, and SP1, and further uncovered novel signals in APOE, FCHO1/MAP1S, KIF13A, TRIM24, TNXB, and ELOVL1. Notably, in contrast to Alzheimer’s disease (AD), we observed the APOE ε2 allele to be the risk allele in PSP. Analysis of rare SNVs and indels identified significant association in ZNF592 and further gene network analysis identified a module of neuronal genes dysregulated in PSP. Moreover, seven common SVs associated with PSP were observed in the H1/H2 haplotype region (17q21.31) and other loci, including IGH, PCMT1, CYP2A13, and SMCP. In the H1/H2 haplotype region, there is a burden of rare deletions and duplications (P = 6.73 × 10⁻³) in PSP.
CONCLUSIONS: Through WGS, we significantly enhanced our understanding of the genetic basis of PSP, providing new targets for exploring disease mechanisms and therapeutic interventions.

INTRODUCTION: Despite a two-fold risk, individuals of African ancestry have been underrepresented in Alzheimer’s disease (AD) genomics efforts.
METHODS: We performed genome-wide association studies (GWAS) of 2,903 AD cases and 6,265 controls of African ancestry. Within-dataset results were meta-analyzed, followed by functional genomics analyses.
RESULTS: A novel AD-risk locus was identified in MPDZ on chromosome (chr) 9p23 (rs141610415, MAF = 0.002, p = 3.68×10⁻⁹). Two additional novel common and nine rare loci were identified with suggestive associations (P < 9×10⁻⁷). Comparison of association and linkage disequilibrium (LD) patterns between datasets with higher and lower degrees of African ancestry showed differential association patterns at chr12q23.2 (ASCL1), suggesting that this association is modulated by regional origin of local African ancestry.
DISCUSSION: These analyses identified novel AD-associated loci in individuals of African ancestry and suggest that degree of African ancestry modulates some associations. Increased sample sets covering as much African genetic diversity as possible will be critical to identify additional loci and deconvolute local genetic ancestry effects.
HIGHLIGHTS: Genetic ancestry significantly impacts risk of Alzheimer's Disease (AD). Although individuals of African ancestry are twice as likely to develop AD, they are vastly underrepresented in AD genomics studies. The Alzheimer's Disease Genetics Consortium has previously identified 16 common and rare genetic loci associated with AD in African American individuals. The current analyses significantly expand this effort by increasing the sample size and extending ancestral diversity by including populations from continental Africa. Single variant meta-analysis identified a novel genome-wide significant AD-risk locus in individuals of African ancestry at the MPDZ gene, and 11 additional novel loci with suggestive genome-wide significance at p < 9×10⁻⁷. Comparison of African American datasets with samples of higher degree of African ancestry demonstrated differing patterns of association and linkage disequilibrium at one of these loci, suggesting that degree and/or geographic origin of African ancestry modulates the effect at this locus. These findings illustrate the importance of increasing number and ancestral diversity of African ancestry samples in AD genomics studies to fully disentangle the genetic architecture underlying AD, and yield more effective ancestry-informed genetic screening tools and therapeutic interventions.

Detecting structural variants (SVs) in whole-genome sequencing poses significant challenges. We present a protocol for variant calling, merging, genotyping, sensitivity analysis, and laboratory validation for generating a high-quality SV call set in whole-genome sequencing from the Alzheimer’s Disease Sequencing Project comprising 578 individuals from 111 families. Employing two complementary pipelines, Scalpel and Parliament, for SV/indel calling, we assessed sensitivity through sample replicates (N = 9) with in silico variant spike-ins. We developed a novel metric, D-score, to evaluate caller specificity for deletions. The accuracy of deletions was evaluated by Sanger sequencing. We generated a high-quality call set of 152,301 deletions of diverse sizes. Sanger sequencing validated 114 of 146 detected deletions (78.1%). Scalpel excelled in accuracy for deletions ≤100 bp, whereas Parliament was optimal for deletions >900 bp. Overall, 83.0% and 72.5% of calls by Scalpel and Parliament were validated, respectively, including all 11 deletions called by both Parliament and Scalpel between 101 and 900 bp. Our flexible protocol successfully generated a high-quality deletion call set and a truth set of Sanger sequencing-validated deletions with precise breakpoints spanning 1-17,000 bp.
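
The validation figures quoted above are simple proportions, and the size ranges suggest a natural way to bin deletions when comparing callers. The sketch below recomputes the overall Sanger validation rate from the stated counts and bins deletion lengths at the 100 bp and 900 bp boundaries mentioned in the abstract; the function names are hypothetical and the binning is only an illustrative reading, not the published protocol.

```python
def validation_rate(validated: int, tested: int) -> float:
    """Percentage of tested calls confirmed by an orthogonal assay."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * validated / tested


def size_bin(length_bp: int) -> str:
    """Assign a deletion to one of the three size ranges named in the abstract."""
    if length_bp <= 100:
        return "<=100 bp"
    if length_bp <= 900:
        return "101-900 bp"
    return ">900 bp"


# Counts taken from the abstract: 114 of 146 tested deletions were validated.
overall = validation_rate(114, 146)
print(f"{overall:.1f}%")  # prints 78.1%
```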

OBJECTIVE: This study examines whether individualism weakens the effectiveness of the COVID-19 vaccine eligibility expansions in the United States in 2021, and assesses the social benefits or costs associated with individualism.
METHODS: We construct a county-level composite individualism index as a proxy for culture and the fraction of the vaccine-eligible population as a proxy for the vaccination campaign (mean: 41.34%). We estimate whether the COVID-19 vaccine eligibility policy is less effective in more individualistic counties at promoting vaccine coverage and reducing COVID-19-related hospitalization and death, using a linear two-way fixed effects model in a sample of 2866 counties for the period between early December 2020 and July 1, 2021. We also test whether individualism shapes people’s attitudes toward vaccines using a linear probability model in a sample of 625,308 individuals aged 18-65 (mean age: 43.3; 49% male; 59.1% non-Hispanic white, 19.1% Hispanic, 12% African American; 5.9% Asian) from the Household Pulse Survey.
RESULTS: The effects of expanded vaccine eligibility are diminished in counties with greater individualism, as evidenced by lower effectiveness in increasing vaccination rates and reducing outpatient doctor visits primarily for COVID-related symptoms and COVID deaths. Moreover, our results show that this cultural influence on attitudes toward vaccines is more pronounced among the less educated, but unrelated to race.
CONCLUSION: Assuming an average level of vaccine eligibility policies and an average intensity of individualism across the nation, we calculate that the average social cost associated with an individualistic culture amid the pandemic is approximately $50.044 billion, equivalent to 1.32% of the total U.S. health care spending in 2019. Our paper suggests that strategies to promote public policy compliance should be tailored to accommodate cultural and social contexts.
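
As a quick consistency check on the two figures in the conclusion, the stated cost and the stated share together imply a total for 2019 U.S. health care spending. The snippet below only rearranges the numbers given in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract.
social_cost_billion = 50.044   # estimated social cost, in billions of USD
share_of_spending = 0.0132     # 1.32% of total 2019 U.S. health care spending

# Implied total spending = cost / share.
implied_total_billion = social_cost_billion / share_of_spending
print(f"{implied_total_billion:.0f}")  # prints 3791
```

The implied total of roughly $3.8 trillion shows that the two quoted figures are mutually consistent.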