We identified rare coding variants associated with Alzheimer’s disease in a three-stage case-control study of 85,133 subjects. In stage 1, we genotyped 34,174 samples using a whole-exome microarray. In stage 2, we tested associated variants (P < 1 × 10⁻⁴) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, we used an additional 14,997 samples to test the most significant stage 2 associations (P < 5 × 10⁻⁸) using imputed genotypes. We observed three new genome-wide significant nonsynonymous variants associated with Alzheimer’s disease: a protective variant in PLCG2 (rs72824905: p.Pro522Arg, P = 5.38 × 10⁻¹⁰, odds ratio (OR) = 0.68, minor allele frequency (MAF) in cases = 0.0059, MAF in controls = 0.0093), a risk variant in ABI3 (rs616338: p.Ser209Phe, P = 4.56 × 10⁻¹⁰, OR = 1.43, MAF in cases = 0.011, MAF in controls = 0.008), and a new genome-wide significant variant in TREM2 (rs143332484: p.Arg62His, P = 1.55 × 10⁻¹⁴, OR = 1.67, MAF in cases = 0.0143, MAF in controls = 0.0089), a known susceptibility gene for Alzheimer’s disease. These protein-altering changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified risk genes in Alzheimer’s disease. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to the development of Alzheimer’s disease.
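As a rough sanity check on the direction and size of these effects, a crude allelic odds ratio can be computed directly from the case and control minor allele frequencies quoted above. This is only a sketch: the function name is illustrative, and the crude ratio ignores the covariate adjustment and multi-stage meta-analysis behind the reported ORs, so it is expected to differ somewhat from the published values.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Crude (unadjusted) allelic odds ratio from two minor allele frequencies."""
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# (MAF in cases, MAF in controls) as quoted in the abstract.
variants = {
    "PLCG2 p.Pro522Arg": (0.0059, 0.0093),
    "ABI3 p.Ser209Phe": (0.011, 0.008),
    "TREM2 p.Arg62His": (0.0143, 0.0089),
}

crude_or = {name: allelic_odds_ratio(*mafs) for name, mafs in variants.items()}

for name, value in crude_or.items():
    direction = "protective" if value < 1 else "risk"
    print(f"{name}: crude allelic OR = {value:.2f} ({direction})")
```

The crude values fall on the same side of 1 as the reported ORs (below 1 for PLCG2, above 1 for ABI3 and TREM2), which is all this check is meant to show.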
INTRODUCTION: Genetic loci for Alzheimer’s disease (AD) have been identified in whites of European ancestry, but the genetic architecture of AD among other populations is less understood.
METHODS: We conducted a transethnic genome-wide association study (GWAS) for late-onset AD in a Stage 1 sample including whites of European ancestry, African-Americans, Japanese, and Israeli-Arabs assembled by the Alzheimer’s Disease Genetics Consortium. Suggestive results from Stage 1 from novel loci were followed up using summarized results in the International Genomics Alzheimer’s Project GWAS dataset.
RESULTS: Genome-wide significant (GWS) associations in single-nucleotide polymorphism (SNP)-based tests (P < 5 × 10⁻⁸) were identified for SNPs in PFDN1/HBEGF, USP6NL/ECHDC3, and BZRAP1-AS1 and for the interaction of the apolipoprotein E (APOE) ε4 allele with an NFIC SNP. We also obtained GWS evidence (P < 2.7 × 10⁻⁶) for gene-based association in the total sample with a novel locus, TPBG (P = 1.8 × 10⁻⁶).
DISCUSSION: Our findings highlight the value of transethnic studies for identifying novel AD susceptibility loci.
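The two significance thresholds in the results can be read as multiple-testing corrections. The sketch below assumes a Bonferroni correction at a family-wise α of 0.05, which the abstract does not state, and back-calculates the number of tests each threshold would then imply. The names are illustrative.

```python
ALPHA = 0.05  # assumed family-wise error rate; not stated in the abstract


def implied_number_of_tests(threshold: float, alpha: float = ALPHA) -> float:
    """Number of tests for which `threshold` is the Bonferroni cutoff at `alpha`."""
    return alpha / threshold


snp_threshold = 5e-8     # SNP-based genome-wide significance threshold
gene_threshold = 2.7e-6  # gene-based threshold quoted in the results
tpbg_p = 1.8e-6          # gene-based P value reported for TPBG

snp_tests = implied_number_of_tests(snp_threshold)
gene_tests = implied_number_of_tests(gene_threshold)
tpbg_passes = tpbg_p < gene_threshold

print(f"SNP-based threshold implies about {snp_tests:,.0f} tests")
print(f"Gene-based threshold implies about {gene_tests:,.0f} tests")
print(f"TPBG passes the gene-based threshold: {tpbg_passes}")
```

Under that assumption, the gene-based threshold corresponds to a count of tests on the order of the number of protein-coding genes, and the TPBG result clears it by a modest margin.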
RNA molecules are often altered post-transcriptionally by the covalent modification of their nucleotides. These modifications are known to modulate the structure, function, and activity of RNAs. When reverse transcribed into cDNA during RNA sequencing library preparation, atypical (modified) ribonucleotides that affect Watson-Crick base pairing will interfere with reverse transcriptase (RT), resulting in cDNA products with mis-incorporated bases or prematurely terminated cDNA products. These interactions with RT can therefore be inferred from mismatch patterns in the sequencing reads, and are distinguishable from simple base-calling errors, single-nucleotide polymorphisms (SNPs), or RNA editing sites. Here, we describe a computational protocol for the in silico identification of modified ribonucleotides from RT-based RNA-seq read-out using the High-throughput Analysis of Modified Ribonucleotides (HAMR) software. HAMR can identify these modifications transcriptome-wide with single-nucleotide resolution, and can also differentiate between types of modifications to predict modification identity. Researchers can use HAMR to identify and characterize RNA modifications using RNA-seq data from a variety of common RT-based sequencing protocols such as poly(A), total RNA-seq, and small RNA-seq.
We performed an exome-wide association analysis in 1393 late-onset Alzheimer’s disease (LOAD) cases and 8141 controls from the CHARGE consortium. We found that a rare variant (P155L) in TM2D3 was enriched in Icelanders (~0.5% versus <0.05% in other European populations). In 433 LOAD cases and 3903 controls from the Icelandic AGES sub-study, P155L was associated with increased risk and earlier onset of LOAD [odds ratio (95% CI) = 7.5 (3.5-15.9), p = 6.6 × 10⁻⁹]. Mutation in the Drosophila TM2D3 homolog, almondex, causes a phenotype similar to loss of Notch/Presenilin signaling. Human TM2D3 is capable of rescuing these phenotypes, but this activity is abolished by P155L, establishing it as a functionally damaging allele. Our results establish a rare TM2D3 variant in association with LOAD susceptibility and, together with prior work, suggest possible links to the β-amyloid cascade.
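The idea of separating modification-induced mismatches from base-calling errors can be illustrated with a one-sided binomial test on the mismatch count at a single position. This is a generic sketch, not HAMR's implementation: the function names, the 1% error rate, and the example base counts are all assumptions made for illustration, and a real pipeline would also have to rule out SNPs and editing sites.

```python
from math import comb


def binomial_sf(k: int, n: int, p: float) -> float:
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def mismatch_excess_pvalue(ref_base: str, base_counts: dict, error_rate: float = 0.01) -> float:
    """P value for seeing at least this many non-reference reads by sequencing error alone.

    `error_rate` is an assumed per-base sequencing error rate (hypothetical value).
    """
    coverage = sum(base_counts.values())
    mismatches = coverage - base_counts.get(ref_base, 0)
    return binomial_sf(mismatches, coverage, error_rate)


# Hypothetical pileups at two positions whose reference base is A.
candidate_site = {"A": 70, "G": 20, "T": 10}  # many mismatches, several alternative bases
ordinary_site = {"A": 99, "G": 1}             # one mismatch in 100 reads

p_candidate = mismatch_excess_pvalue("A", candidate_site)
p_ordinary = mismatch_excess_pvalue("A", ordinary_site)

print(f"candidate site P = {p_candidate:.3g}")
print(f"ordinary site  P = {p_ordinary:.3g}")
```

A site with mismatches spread over several alternative bases, as in the first pileup, is harder to explain as a SNP or editing event than one with a single alternative base, which is the kind of pattern the abstract says can be used to tell these cases apart.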
BACKGROUND: Infant acute lymphoblastic leukemia (ALL) has never occurred in families except for the ∼100% concordant cases in monozygous twins attributed to twin-to-twin metastases. We report the first kindred with infant ALL in non-twin siblings. The siblings were diagnosed with MLL-rearranged (MLL-R) ALL 26 months apart. The second affected sibling had an unaffected dichorionic monozygous co-twin. Both had fatal outcomes.
PROCEDURES: Translocations were characterized by karyotype, FISH, multiplex FISH, and MLL breakpoint cluster region (bcr) Southern blot analysis. Breakpoint junctions and fusion transcripts were cloned by PCR. TP53 mutation and NADPH quinone oxidoreductase 1 (NQO1) C609T analyses were performed, and pedigree history and parental occupations were ascertained. The likelihood of chance occurrence of infant ALL in non-twin siblings was computed based on a binomial distribution. Zygosity was determined by single nucleotide polymorphism (SNP) array.
RESULTS: The translocations were not related or vertically transmitted. The complex karyotype of the proband’s ALL had chromosome 2, 3, 4, and 11 abnormalities causing a 5′-MLL-AFF1-3′ fusion and a non-productive rearrangement of 3′-MLL with a chromosome 3q intergenic region. The affected twin’s ALL exhibited a simple t(4;11). The complex karyotype of the proband’s ALL suggested a genotoxic insult, but no exposure was identified. There was no germline TP53 mutation. The NQO1 C609T risk allele was absent. The likelihood of infant ALL occurring in non-twin siblings by chance alone is one in 1.198 × 10⁹ families.
CONCLUSIONS: Whether because of a deleterious transplacental exposure, novel predisposition syndrome, or exceedingly rare chance occurrence, MLL-R infant ALL can occur in non-twin siblings. The discordant occurrence of infant ALL in the monozygous twins was likely because they were dichorionic.
OBJECTIVE: To identify a causative variant(s) that may contribute to Alzheimer disease (AD) in African Americans (AA) in the ATP-binding cassette, subfamily A (ABC1), member 7 (ABCA7) gene, a known risk factor for late-onset AD.
METHODS: Custom capture sequencing was performed on ∼150 kb encompassing ABCA7 in 40 AA cases and 37 AA controls carrying the AA risk allele (rs115550680). Association testing was performed for an ABCA7 deletion identified in large AA data sets (discovery n = 1,068; replication n = 1,749) and whole exome sequencing of Caribbean Hispanic (CH) AD families.
RESULTS: A 44-base pair deletion (rs142076058) was identified in all 77 risk genotype carriers, which shows that the deletion is in high linkage disequilibrium with the risk allele. The deletion was assessed in a large data set (531 cases and 527 controls) and, after adjustments for age, sex, and APOE status, was significantly associated with disease (p = 0.0002, odds ratio [OR] = 2.13 [95% confidence interval (CI): 1.42-3.20]). An independent data set replicated the association (447 cases and 880 controls, p = 0.0117, OR = 1.65 [95% CI: 1.12-2.44]), and joint analysis increased the significance (p = 1.414 × 10⁻⁵, OR = 1.81 [95% CI: 1.38-2.37]). The deletion is common in AA cases (15.2%) and AA controls (9.74%), but in only 0.12% of our non-Hispanic white cohort. Whole exome sequencing of multiplex, CH families identified the deletion cosegregating with disease in a large sibship. The deleted allele produces a stable, detectable RNA strand and is predicted to result in a frameshift mutation (p.Arg578Alafs) that could interfere with protein function.
CONCLUSIONS: This common ABCA7 deletion could represent an ethnic-specific pathogenic alteration in AD.
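To see how the discovery and replication estimates for the ABCA7 deletion relate to the joint estimate, the two odds ratios can be combined with a fixed-effect, inverse-variance meta-analysis on the log-OR scale. This is a sketch under assumptions: the standard errors are back-calculated from the reported 95% CIs, the function names are illustrative, and the abstract's joint analysis was presumably run on the combined data rather than by this method, so only approximate agreement is expected.

```python
from math import exp, log, sqrt

Z95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float) -> tuple:
    """Log odds ratio and its standard error, recovered from a 95% CI."""
    return log(odds_ratio), (log(ci_high) - log(ci_low)) / (2 * Z95)


def fixed_effect_pool(estimates: list) -> tuple:
    """Inverse-variance weighted mean of (log OR, SE) pairs and its standard error."""
    weights = [1.0 / se**2 for _, se in estimates]
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    return pooled, sqrt(1.0 / sum(weights))


discovery = log_or_and_se(2.13, 1.42, 3.20)    # discovery data set
replication = log_or_and_se(1.65, 1.12, 2.44)  # replication data set

beta, se = fixed_effect_pool([discovery, replication])
pooled_or = exp(beta)
pooled_ci = (exp(beta - Z95 * se), exp(beta + Z95 * se))

print(f"pooled OR = {pooled_or:.2f} (95% CI {pooled_ci[0]:.2f}-{pooled_ci[1]:.2f})")
```

The pooled estimate should land close to, but not exactly on, the reported joint OR of 1.81, because a meta-analysis of summary statistics and a covariate-adjusted analysis of pooled individuals are different procedures.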
BACKGROUND: RNA molecules fold into complex three-dimensional shapes, guided by the pattern of hydrogen bonding between nucleotides. This pattern of base pairing, known as RNA secondary structure, is critical to their cellular function. Recently, several diverse methods have been developed to assay RNA secondary structure on a transcriptome-wide scale using high-throughput sequencing. Each approach has its own strengths and caveats; however, there is no widely available tool for visualizing and comparing the results from these varied methods.
METHODS: To address this, we have developed Structure Surfer, a database and visualization tool for inspecting RNA secondary structure in six transcriptome-wide data sets from human and mouse ( http://tesla.pcbi.upenn.edu/strucuturesurfer/ ). The data sets were generated using four different high-throughput sequencing based methods. Each one was analyzed with a scoring pipeline specific to its experimental design. Users of Structure Surfer have the ability to query individual loci as well as detect trends across multiple sites.
RESULTS: Here, we describe the included data sets and their differences. We illustrate the database’s function by examining known structural elements and we explore example use cases in which combined data is used to detect structural trends.
CONCLUSIONS: In total, Structure Surfer provides an easy-to-use database and visualization interface for allowing users to interrogate the currently available transcriptome-wide RNA secondary structure information for mammals.
Alzheimer’s disease (AD) is a complex genetic disorder with no effective treatments. More than 20 common markers associated with AD have been identified. Recently, several rare variants that affect risk for AD have been identified in Amyloid Precursor Protein (APP), Triggering Receptor Expressed On Myeloid Cells 2 (TREM2) and Unc-5 Netrin Receptor C (UNC5C). Despite the many successes, the genetic architecture of AD remains unsolved. We used Genome-wide Complex Trait Analysis to (1) estimate phenotypic variance explained by genetics; (2) calculate genetic variance explained by known AD single nucleotide polymorphisms (SNPs); and (3) identify the genomic locations of variation that explain the remaining unexplained genetic variance. In total, 53.24% of phenotypic variance is explained by genetics, but known AD SNPs only explain 30.62% of the genetic variance. Of the unexplained genetic variance, approximately 41% is explained by unknown SNPs in regions adjacent to known AD SNPs, and the remaining unexplained genetic variance lies outside these regions.
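The percentages in this abstract are nested (a share of a share), so it can help to put them on a common scale. The sketch below does that arithmetic, treating "approximately 41%" as exactly 0.41 for illustration; the variable names are not from the paper.

```python
genetic_share_of_phenotypic = 0.5324    # phenotypic variance explained by genetics
known_share_of_genetic = 0.3062         # genetic variance explained by known AD SNPs
adjacent_share_of_unexplained = 0.41    # "approximately 41%", taken as exact here

unexplained_share_of_genetic = 1.0 - known_share_of_genetic

# Each component as a fraction of the genetic variance.
partition_of_genetic = {
    "known AD SNPs": known_share_of_genetic,
    "unknown SNPs near known AD SNPs": unexplained_share_of_genetic * adjacent_share_of_unexplained,
    "elsewhere in the genome": unexplained_share_of_genetic * (1.0 - adjacent_share_of_unexplained),
}

# The same components as a fraction of total phenotypic variance.
partition_of_phenotypic = {
    name: genetic_share_of_phenotypic * share for name, share in partition_of_genetic.items()
}

for name in partition_of_genetic:
    print(
        f"{name}: {partition_of_genetic[name]:.1%} of genetic variance, "
        f"{partition_of_phenotypic[name]:.1%} of phenotypic variance"
    )
```

On this reading, known AD SNPs account for roughly 16% of total phenotypic variance, and the largest single component of the genetic variance sits outside the regions adjacent to known AD SNPs.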
INTRODUCTION: African-American (AA) individuals have a higher risk for late-onset Alzheimer’s disease (LOAD) than Americans of primarily European ancestry (EA). Recently, the largest genome-wide association study in AAs to date confirmed that six of the Alzheimer’s disease (AD)-related genetic variants originally discovered in EA cohorts are also risk variants in AA; however, the risk attributable to many of the loci (e.g., APOE, ABCA7) differed substantially from previous studies in EA. There likely are risk variants of higher frequency in AAs that have not been discovered.
METHODS: We performed a comprehensive analysis of genetically determined local and global ancestry in AAs with regard to LOAD status.
RESULTS: Compared to controls, LOAD cases showed higher levels of African ancestry, both globally and at several LOAD-relevant loci, which explained risk for AD beyond global differences.
DISCUSSION: Exploratory post hoc analyses highlight regions with greatest differences in ancestry as potential candidate regions for future genetic analyses.
Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nucleotides long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single database provides searchable, unified annotation and expression information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. Here, we present the Database of small human noncoding RNAs (DASHR). DASHR contains the most comprehensive information to date on human sncRNA genes and mature sncRNA products. DASHR provides a simple user interface for researchers to view sequence and secondary structure, compare expression levels, and examine evidence of specific processing across all sncRNA genes and mature sncRNA products in various human tissues. DASHR annotation and expression data cover all major classes of sncRNAs including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear, nucleolar, cytoplasmic (sn-, sno-, scRNAs, respectively), transfer (tRNAs), and ribosomal RNAs (rRNAs). Currently, DASHR (v1.0) integrates 187 smRNA high-throughput sequencing (smRNA-seq) datasets with over 2.5 billion reads and annotation data from multiple public sources. DASHR contains annotations for ∼48,000 human sncRNA genes and mature sncRNA products, 82% of which are expressed in one or more of the curated tissues. DASHR is available at http://lisanwanglab.org/DASHR.