Blog

The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs beyond their astounding diversity. Traditional RNA function prediction methods rely on sequence or alignment information, which is limited in its ability to classify the diverse collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ∼80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.

Cellular senescence is accompanied by dramatic changes in chromatin structure and gene expression. Using Saccharomyces cerevisiae mutants lacking telomerase (tlc1Δ) to model senescence, we found that with critical telomere shortening, the telomere-binding protein Rap1 (repressor activator protein 1) relocalizes to the upstream promoter regions of hundreds of new target genes. The set of new Rap1 targets at senescence (NRTS) is preferentially activated at senescence, and experimental manipulations of Rap1 levels indicate that it contributes directly to NRTS activation. A notable subset of NRTS includes the core histone-encoding genes; we found that Rap1 contributes to their repression and that histone protein levels decline at senescence. Rap1 and histones also display a target site-specific antagonism that leads to diminished nucleosome occupancy at the promoters of up-regulated NRTS. This antagonism apparently impacts the rate of senescence because underexpression of Rap1 or overexpression of the core histones delays senescence. Rap1 relocalization is not a simple consequence of lost telomere-binding sites, but rather depends on the Mec1 checkpoint kinase. Rap1 relocalization is thus a novel mechanism connecting DNA damage responses (DDRs) at telomeres to global changes in chromatin and gene expression while driving the pace of senescence.

Enhancer elements are essential for tissue-specific gene regulation during mammalian development. Although these regulatory elements are often distant from their target genes, they affect gene expression by recruiting transcription factors to specific promoter regions. Because of this long-range action, the annotation of enhancer element-target promoter pairs remains elusive. Here, we developed a novel analysis methodology that takes advantage of Hi-C data to comprehensively identify these interactions throughout the human genome. To do this, we used a geometric distribution-based model to identify DNA-DNA interaction hotspots that contact gene promoters with high confidence. We observed that these promoter-interacting hotspots significantly overlap with known enhancer-associated histone modifications and DNase I hypersensitive sites. Thus, we defined thousands of candidate enhancer elements by incorporating these features, and found that they have a significant propensity to be bound by p300, an enhancer binding transcription factor. Furthermore, we revealed that their target genes are significantly bound by RNA Polymerase II and demonstrate tissue-specific expression. Finally, we uncovered that these elements are generally found within 1 Mb of their targets, and often regulate multiple genes. In total, our study presents a novel high-throughput workflow for confident, genome-wide discovery of enhancer-target promoter pairs, which will significantly improve our understanding of these regulatory interactions.
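As a rough illustration of how a geometric distribution can flag interaction hotspots, the sketch below scores the read counts of genomic bins that contact a promoter. The abstract does not give the model's details, so everything here is an assumption: the function names, the use of the mean count to fit the distribution, and the significance cutoff `alpha` are hypothetical and are not the authors' method.

```python
def geometric_tail_pvalue(count, mean_count):
    """Upper-tail probability P(X >= count) under a geometric null.

    Assumes X takes values 0, 1, 2, ... with P(X = k) = (1 - p)**k * p,
    whose mean is (1 - p) / p, so p is fitted as 1 / (1 + mean_count).
    """
    p = 1.0 / (1.0 + mean_count)
    return (1.0 - p) ** count


def call_hotspots(counts, alpha=0.01):
    """Return indices of bins whose contact count is unexpectedly high.

    `counts` holds hypothetical per-bin read counts; `alpha` is an
    illustrative cutoff, not a value taken from the study.
    """
    if not counts:
        return []
    mean_count = sum(counts) / len(counts)
    return [
        i for i, c in enumerate(counts)
        if geometric_tail_pvalue(c, mean_count) < alpha
    ]


# Made-up counts: one bin stands out against a low background.
example_counts = [1, 0, 2, 1, 0, 1, 30, 1, 0, 2]
hotspots = call_hotspots(example_counts)
```

A real analysis would also need to account for genomic distance between bins and for multiple testing, neither of which this sketch attempts.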

IMPORTANCE: Genetic variants associated with susceptibility to late-onset Alzheimer disease are known for individuals of European ancestry, but whether the same or different variants account for the genetic risk of Alzheimer disease in African American individuals is unknown. Identification of disease-associated variants helps identify targets for genetic testing, prevention, and treatment.
OBJECTIVE: To identify genetic loci associated with late-onset Alzheimer disease in African Americans.
DESIGN, SETTING, AND PARTICIPANTS: The Alzheimer Disease Genetics Consortium (ADGC) assembled multiple data sets representing a total of 5896 African Americans (1968 case participants, 3928 control participants) 60 years or older that were collected between 1989 and 2011 at multiple sites. The association of Alzheimer disease with genotyped and imputed single-nucleotide polymorphisms (SNPs) was assessed in case-control and in family-based data sets. Results from individual data sets were combined to perform an inverse variance-weighted meta-analysis, first with genome-wide analyses and subsequently with gene-based tests for previously reported loci.
MAIN OUTCOMES AND MEASURES: Presence of Alzheimer disease according to standardized criteria.
RESULTS: Genome-wide significance in fully adjusted models (sex, age, APOE genotype, population stratification) was observed for a SNP in ABCA7 (rs115550680, allele = G; frequency, 0.09 cases and 0.06 controls; odds ratio [OR], 1.79 [95% CI, 1.47-2.12]; P = 2.2 × 10⁻⁹), which is in linkage disequilibrium with SNPs previously associated with Alzheimer disease in Europeans (0.8 < D' < 0.9). The effect size for the SNP in ABCA7 was comparable with that of the APOE ε4-determining SNP rs429358 (allele = C; frequency, 0.30 cases and 0.18 controls; OR, 2.31 [95% CI, 2.19-2.42]; P = 5.5 × 10⁻⁴⁷). Several loci previously associated with Alzheimer disease but not reaching significance in genome-wide analyses were replicated in gene-based analyses accounting for linkage disequilibrium between markers and correcting for number of tests performed per gene (CR1, BIN1, EPHA1, CD33; 0.0005 < empirical P < 0.001).
CONCLUSIONS AND RELEVANCE: In this meta-analysis of data from African American participants, Alzheimer disease was significantly associated with variants in ABCA7 and with other genes that have been associated with Alzheimer disease in individuals of European ancestry. Replication and functional validation of these findings are needed before this information is used in clinical settings.
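The inverse variance-weighted meta-analysis named in the study design can be illustrated with a minimal sketch. It assumes a fixed-effect model applied to per-data-set log odds ratios and their standard errors; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(estimates, std_errors):
    """Pool effect estimates with fixed-effect inverse-variance weights.

    Each data set is weighted by 1 / SE**2, so more precise data sets
    contribute more. Returns (pooled estimate, pooled SE, z, two-sided p).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_value


# Made-up log odds ratios and standard errors for three data sets.
log_ors = [0.55, 0.62, 0.48]
ses = [0.20, 0.15, 0.25]
pooled, pooled_se, z, p_value = inverse_variance_meta(log_ors, ses)
pooled_or = math.exp(pooled)  # back-transform to the odds ratio scale
```

Pooling on the log odds ratio scale and exponentiating afterwards keeps the estimate symmetric around no effect, which is the usual reason for this design choice.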

Survival in infants younger than 1 year who have acute lymphoblastic leukemia (ALL) is inferior whether MLL is rearranged (R) or germline (G). MLL translocations confer chemotherapy resistance, and infants experience excess complications. We characterized in vitro sensitivity to the pan-antiapoptotic BCL-2 family inhibitor obatoclax mesylate in diagnostic leukemia cells from 54 infants with ALL/bilineal acute leukemia because of the role of prosurvival BCL-2 proteins in resistance, their imbalanced expression in infant ALL, and evidence of obatoclax activity with a favorable toxicity profile in early adult leukemia trials. Overall, half maximal effective concentrations (EC50s) were lower than 176 nM (the maximal plasma concentration [Cmax] with recommended adult dose) in 76% of samples, whether in MLL-AF4, MLL-ENL, or other MLL-R or MLL-G subsets, and regardless of patients’ poor prognostic features. However, MLL status and partner genes correlated with EC50. Combined approaches including flow cytometry, Western blot, obatoclax treatment with death pathway inhibition, microarray analyses, and/or electron microscopy indicated a unique killing mechanism involving apoptosis, necroptosis, and autophagy in MLL-AF4 ALL cell lines and primary MLL-R and MLL-G infant ALL cells. This in vitro obatoclax activity and its multiple killing mechanisms across molecular cytogenetic subsets provide a rationale to incorporate a similarly acting compound into combination strategies to combat infant ALL.

To characterize the role of rare complete human knockouts in autism spectrum disorders (ASDs), we identify genes with homozygous or compound heterozygous loss-of-function (LoF) variants (defined as nonsense and essential splice sites) from exome sequencing of 933 cases and 869 controls. We identify a 2-fold increase in complete knockouts of autosomal genes with low rates of LoF variation (≤ 5% frequency) in cases and estimate a 3% contribution to ASD risk by these events, confirming this observation in an independent set of 563 probands and 4,605 controls. Outside the pseudoautosomal regions on the X chromosome, we similarly observe a significant 1.5-fold increase in rare hemizygous knockouts in males, contributing to another 2% of ASDs in males. Taken together, these results provide compelling evidence that rare autosomal and X chromosome complete gene knockouts are important inherited risk factors for ASD.

Although neuritic plaques and neurofibrillary tangles in older adults are correlated with cognitive impairment and severity of dementia, it has long been recognized that the relationship is imperfect, as some people exhibit normal cognition despite high levels of Alzheimer’s disease (AD) pathology. We compared the cellular, synaptic, and biochemical composition of midfrontal cortices in female subjects from the Religious Orders Study who were stratified into three subgroups: (1) pathological AD with normal cognition (“AD-Resilient”), (2) pathological AD with AD-typical dementia (“AD-Dementia”), and (3) pathologically normal with normal cognition (“Normal Comparison”). The AD-Resilient group exhibited preserved densities of synaptophysin-labeled presynaptic terminals and synaptopodin-labeled dendritic spines compared with the AD-Dementia group, and increased densities of glial fibrillary acidic protein astrocytes compared with both the AD-Dementia and Normal Comparison groups. Further, in a discovery-type antibody microarray protein analysis, we identified a number of candidate protein abnormalities that were associated with a particular diagnostic group. These data characterize cellular and synaptic features and identify novel biochemical targets that may be associated with resilient cognitive brain aging in the setting of pathological AD.

The frequency and clinical and pathological characteristics associated with the Gly206Ala presenilin 1 (PSEN1) mutation in Puerto Rican and non-Puerto Rican Hispanics were evaluated at the University of Pennsylvania’s Alzheimer’s Disease Center. DNAs from all cohort subjects were genotyped for the Gly206Ala PSEN1 mutation. Carriers and non-carriers with neurodegenerative disease dementias were compared for demographic, clinical, psychometric, and biomarker variables. Nineteen (12.6%) of 151 unrelated subjects with dementia were discovered to carry the PSEN1 Gly206Ala mutation. Microsatellite marker genotyping determined a common ancestral haplotype for all carriers. Carriers were all of Puerto Rican heritage with significantly younger age of onset, but otherwise were clinically and neuropsychologically comparable to non-carriers with AD. Three subjects had extensive topographic and biochemical biomarker assessments that were also typical of non-carriers with AD. Neuropathological examination in one subject revealed severe, widespread plaque and tangle pathology without other meaningful disease lesions. The PSEN1 Gly206Ala mutation is notably frequent in unrelated Puerto Rican immigrants with dementia in Philadelphia. Considered together with the increased prevalence and mortality of AD reported in Puerto Rico, these high rates may reflect hereditary risk concentrated on the island, which warrants further study.

To discover susceptibility genes of late-onset Alzheimer’s disease (LOAD), we conducted a 3-stage genome-wide association study (GWAS) using three populations: Japanese from the Japanese Genetic Consortium for Alzheimer Disease (JGSCAD), Koreans, and Caucasians from the Alzheimer Disease Genetic Consortium (ADGC). In Stage 1, we evaluated data for 5,877,918 genotyped and imputed SNPs in Japanese cases (n = 1,008) and controls (n = 1,016). Genome-wide significance was observed with 12 SNPs in the APOE region. Seven SNPs from other distinct regions with p-values <2×10⁻⁵ were genotyped in a second Japanese sample (885 cases, 985 controls), and evidence of association was confirmed for one SORL1 SNP (rs3781834, P = 7.33×10⁻⁷ in the combined sample). Subsequent analysis combining results for several SORL1 SNPs in the Japanese, Korean (339 cases, 1,129 controls) and Caucasians (11,840 AD cases, 10,931 controls) revealed genome-wide significance with rs11218343 (P = 1.77×10⁻⁹) and rs3781834 (P = 1.04×10⁻⁸). SNPs in previously established AD loci in Caucasians showed strong evidence of association in Japanese including rs3851179 near PICALM (P = 1.71×10⁻⁵) and rs744373 near BIN1 (P = 1.39×10⁻⁴). The associated allele for each of these SNPs was the same as in Caucasians. These data demonstrate for the first time genome-wide significance of LOAD with SORL1 and confirm the role of other known loci for LOAD in Japanese. Our study highlights the importance of examining associations in multiple ethnic populations.

Several recent gene expression studies identified hundreds of genes whose expression is correlated with age in the brain and other human tissues. However, these studies used linear models of age correlation, which are not well equipped to model abrupt changes associated with particular ages. We developed a computational algorithm for age estimation in which the expression of each gene is treated as a dichotomized biomarker for whether the subject is older or younger than a particular age. In addition, for each age-informative gene our algorithm identifies the age threshold with the most drastic change in expression level, which allows us to associate genes with particular age periods. Analysis of human aging brain expression datasets from three frontal cortex regions showed that different pathways undergo transitions at different ages, and the distribution of pathways and age thresholds varies across brain regions. Our study reveals age-correlated expression changes at particular age points and allows one to estimate the age of an individual with better accuracy than previously published methods.
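The idea of finding, for one gene, the age threshold with the most drastic expression change can be sketched as follows. This is only an illustration under stated assumptions: "most drastic" is scored here as the absolute difference in mean expression between younger and older subjects, and the function name, the `min_group` parameter, and the example data are hypothetical rather than the published algorithm.

```python
from statistics import mean


def best_age_threshold(ages, expression, min_group=2):
    """Find the age cut that best separates low from high expression.

    Tries every split between consecutive distinct ages, keeping at
    least `min_group` subjects on each side, and scores a split by the
    absolute difference in mean expression (an assumed scoring rule).
    Returns (threshold_age, shift), or None if no valid split exists.
    """
    pairs = sorted(zip(ages, expression))
    best = None
    for i in range(min_group, len(pairs) - min_group + 1):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot place a threshold between equal ages
        younger = [value for _, value in pairs[:i]]
        older = [value for _, value in pairs[i:]]
        shift = abs(mean(older) - mean(younger))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or shift > best[1]:
            best = (threshold, shift)
    return best


# Made-up data: expression jumps between ages 40 and 50.
ages = [20, 30, 40, 50, 60, 70]
expression = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
result = best_age_threshold(ages, expression)
```

Once a threshold is chosen, the gene acts as a binary indicator (above or below the cut), which is what lets abrupt, non-linear changes be captured where a linear age correlation would smooth them away.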