Blog

Posttranscriptional chemical modification of RNA bases is a widespread and physiologically relevant regulator of RNA maturation, stability, and function. While modifications are best characterized in short, noncoding RNAs such as tRNAs, growing evidence indicates that mRNAs and long noncoding RNAs (lncRNAs) are likewise modified. Here, we apply our high-throughput annotation of modified ribonucleotides (HAMR) pipeline to identify and classify modifications that affect Watson-Crick base pairing at three different levels of the Arabidopsis thaliana transcriptome (polyadenylated, small, and degrading RNAs). We find modifications of this type primarily within uncapped, degrading mRNAs and lncRNAs, suggesting they are a cause or consequence of RNA turnover. Additionally, modifications within stable mRNAs tend to occur in alternatively spliced introns, suggesting they regulate splicing. Furthermore, these modifications target mRNAs with coherent functions, including stress responses. Thus, our comprehensive analysis across multiple RNA classes yields insights into the functions of covalent RNA modifications in plant transcriptomes.

INTRODUCTION: The dynamic range of cerebrospinal fluid (CSF) amyloid β (Aβ1-42) measurements does not parallel cognitive changes in Alzheimer’s disease (AD) and cognitively normal (CN) subjects across different studies. Therefore, identifying novel proteins to characterize symptomatic AD samples is important.
METHODS: Proteins were profiled using a multianalyte platform by Rules Based Medicine (MAP-RBM). Due to underlying heterogeneity and unbalanced sample sizes, we combined subjects (344 AD and 325 CN) from three cohorts: Alzheimer’s Disease Neuroimaging Initiative, Penn Center for Neurodegenerative Disease Research of the University of Pennsylvania, and Knight Alzheimer’s Disease Research Center at Washington University in St. Louis. We focused on samples whose cognitive and amyloid status was consistent. We performed linear regression (adjusting for age, gender, number of APOE e4 alleles, and cohort) to identify amyloid-related proteins for symptomatic AD subjects in this, the largest CSF-based MAP-RBM study to date. ANOVA and Tukey’s test were used to evaluate whether these proteins were related to changes in cognitive impairment as measured by the Mini-Mental State Examination (MMSE).
RESULTS: Seven proteins were significantly associated with Aβ1-42 levels in the combined cohort (false discovery rate adjusted P < .05), of which lipoprotein a (Lp(a)), prolactin (PRL), resistin, and vascular endothelial growth factor (VEGF) showed a consistent direction of association across all individual cohorts. VEGF was strongly associated with MMSE scores, followed by pancreatic polypeptide and immunoglobulin A (IgA), suggesting they may be related to staging of AD.
DISCUSSION: Lp(a), PRL, IgA, and tissue factor/thromboplastin have never been reported for AD diagnosis in previous individual CSF-based MAP-RBM studies. Although some of our reported analytes are related to AD pathophysiology, the roles of the others in symptomatic AD samples warrant further exploration.
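
The "false discovery rate adjusted P < .05" criterion above can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure. The abstract does not say which FDR method was used, so Benjamini-Hochberg is an assumption here, and the p-values below are made up purely for illustration; they are not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases (monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-protein p-values (illustrative only, not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
q_values = benjamini_hochberg(example_p)

# Proteins that would pass an FDR-adjusted threshold of 0.05.
significant = [i for i, q in enumerate(q_values) if q < 0.05]
```

Note that a raw p-value below 0.05 (such as 0.039 in the toy list) does not necessarily survive the adjustment, which is why the adjusted threshold is stricter than a per-test one.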

Whales have 1000-fold more cells than humans and mice have 1000-fold fewer; however, cancer risk across species does not increase with the number of somatic cells and the lifespan of the organism. This observation is known as Peto’s paradox. How much would evolution have to change the parameters of somatic evolution in order to equalize the cancer risk between species that differ by orders of magnitude in size? Analysis of previously published models of colorectal cancer suggests that a two- to three-fold decrease in the mutation rate or stem cell division rate is enough to reduce a whale’s cancer risk to that of a human. Similarly, the addition of one to two required tumour-suppressor gene mutations would also be sufficient. We surveyed mammalian genomes and did not find a positive correlation of tumour-suppressor genes with increasing body mass and longevity. However, we found evidence of the amplification of TP53 in elephants, MAL in horses and FBXO31 in microbats, which might explain Peto’s paradox in those species. Exploring parameters that evolution may have fine-tuned in large, long-lived organisms will help guide future experiments to reveal the underlying biology responsible for Peto’s paradox and guide cancer prevention in humans.
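
The scaling argument above can be made concrete with a deliberately simplified toy calculation. It assumes that lifetime cancer risk is roughly proportional to N · u^k, where N is the number of cells at risk, u is the per-cell mutation rate, and k is the number of rate-limiting mutations. This is a textbook multistage approximation that ignores lifespan and tissue structure, not the published colorectal cancer models analysed in the study, and all function and parameter names are hypothetical.

```python
def toy_risk(cells, mutation_rate, required_mutations):
    """Toy multistage risk, assumed proportional to N * u**k (small-risk limit)."""
    return cells * mutation_rate ** required_mutations


def required_fold_decrease(cell_fold_increase, required_mutations):
    """Fold reduction in mutation rate that offsets a fold increase in cell number.

    Holding N * u**k constant when N grows by a factor F requires
    u to shrink by a factor F**(1/k).
    """
    return cell_fold_increase ** (1.0 / required_mutations)


# How much would the mutation rate need to drop to offset 1000-fold more cells,
# for a range of assumed numbers of rate-limiting mutations?
for k in (4, 5, 6, 7):
    fold = required_fold_decrease(1000, k)
    print(f"k={k}: mutation rate must fall about {fold:.2f}-fold")
```

Under this toy model the required reduction shrinks quickly as k grows, which gives some intuition for why a modest change in mutation rate, or one or two extra required mutations, can compensate for a very large difference in cell number. The exact figures depend on the model and should not be read as the study's estimates.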

Corticobasal degeneration (CBD) is a neurodegenerative disorder affecting movement and cognition, definitively diagnosed only at autopsy. Here, we conduct a genome-wide association study (GWAS) in 152 CBD cases and 3,311 controls, with 67 CBD cases and 439 controls in a replication stage. Associations identified in the meta-analysis were at 17q21 at MAPT (P=1.42 × 10⁻¹²), 8p12 at lnc-KIF13B-1, a long non-coding RNA (rs643472; P=3.41 × 10⁻⁸), and 2p22 at SOS1 (rs963731; P=1.76 × 10⁻⁷). Testing for association of CBD with top progressive supranuclear palsy (PSP) GWAS single-nucleotide polymorphisms (SNPs) identified associations at MOBP (3p22; rs1768208; P=2.07 × 10⁻⁷) and MAPT H1c (17q21; rs242557; P=7.91 × 10⁻⁶). We previously reported SNP/transcript level associations with rs8070723/MAPT, rs242557/MAPT, and rs1768208/MOBP and herein identified an association with rs963731/SOS1. We identify new CBD susceptibility loci and show that CBD and PSP share a genetic risk factor other than MAPT at 3p22 MOBP (myelin-associated oligodendrocyte basic protein).

We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate: slightly better than SATé trees, and substantially better than trees from other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

We implemented a high-throughput identification pipeline for promoter-interacting enhancer elements to streamline the workflow from mapping raw Hi-C reads, through identifying DNA-DNA interacting fragments with high confidence and quality control and detecting histone modifications and DNase hypersensitivity enrichment in putative enhancer elements, to ultimately extracting possible intra- and inter-chromosomal enhancer-target gene relationships.
AVAILABILITY AND IMPLEMENTATION: This software package is designed to run on high-performance computing clusters with Oracle Grid Engine. The source code is freely available under the MIT license for academic and nonprofit use. The source code and instructions are available at the Wang lab website (http://wanglab.pcbi.upenn.edu/hippie/). It is also provided as an Amazon Machine Image to be used directly on Amazon Cloud with minimal installation.
CONTACT: lswang@mail.med.upenn.edu or bdgregor@sas.upenn.edu
SUPPLEMENTARY INFORMATION: Supplementary Material is available at Bioinformatics online.

IMPORTANCE: Recently, a rare variant in the amyloid precursor protein gene (APP) was described in a population from Iceland. This variant, in which alanine is replaced by threonine at position 673 (A673T), appears to protect against late-onset Alzheimer disease (AD). We evaluated the frequency of this variant in AD cases and cognitively normal controls to determine whether this variant will significantly contribute to risk assessment in individuals in the United States.
OBJECTIVE: To determine the frequency of the APP A673T variant in a large group of elderly cognitively normal controls and AD cases from the United States and in 2 case-control cohorts from Sweden.
DESIGN, SETTING, AND PARTICIPANTS: Case-control association analysis of variant APP A673T in US and Swedish white individuals comparing AD cases with cognitively intact elderly controls. Participants were ascertained at multiple university-associated medical centers and clinics across the United States and Sweden by study-specific sampling methods. They were from case-control studies, community-based prospective cohort studies, and studies that ascertained multiplex families from multiple sources.
MAIN OUTCOMES AND MEASURES: Genotypes for the APP A673T variant were determined using the Infinium HumanExome V1 Beadchip (Illumina, Inc) and by TaqMan genotyping (Life Technologies).
RESULTS: The A673T variant genotypes were evaluated in 8943 US AD cases, 10 480 US cognitively normal controls, 862 Swedish AD cases, and 707 Swedish cognitively normal controls. We identified 3 US individuals heterozygous for A673T, including 1 AD case (age at onset, 89 years) and 2 controls (age at last examination, 82 and 77 years). The remaining US samples were homozygous for the alanine (A673) allele. In the Swedish samples, 3 controls were heterozygous for A673T and all AD cases were homozygous for the A673 allele. We also genotyped a US family previously reported to harbor the A673T variant and found a mother-daughter pair, both cognitively normal at ages 72 and 84 years, respectively, who were both heterozygous for A673T; however, all individuals with AD in the family were homozygous for A673.
CONCLUSIONS AND RELEVANCE: The A673T variant is extremely rare in US cohorts and does not play a substantial role in risk for AD in this population. This variant may be primarily restricted to Icelandic and Scandinavian populations.
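
As a rough illustration of how the carrier counts reported above could be compared between cases and controls, the following sketch implements a two-sided Fisher exact test using only the standard library. The abstract does not state which test, if any, was applied to these counts, so the choice of Fisher's exact test and the function name are assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases and controls; columns are carriers and non-carriers.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(col1, row1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts taken from the RESULTS above.
# US: 1 heterozygote among 8943 AD cases, 2 among 10 480 controls.
p_us = fisher_exact_two_sided(1, 8943 - 1, 2, 10480 - 2)
# Sweden: 0 heterozygotes among 862 AD cases, 3 among 707 controls.
p_sweden = fisher_exact_two_sided(0, 862, 3, 707 - 3)
```

With so few carriers, a test of this kind has very little power in either cohort, which is consistent with the conclusion that the variant is too rare to contribute substantially to risk assessment in these populations.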

Hippocampal sclerosis of aging (HS-Aging) is a common high-morbidity neurodegenerative condition in elderly persons. To understand the risk factors for HS-Aging, we analyzed data from the Alzheimer’s Disease Genetics Consortium and correlated the data with clinical and pathologic information from the National Alzheimer’s Coordinating Center database. Overall, 268 research volunteers with HS-Aging and 2,957 controls were included; detailed neuropathologic data were available for all. The study focused on single-nucleotide polymorphisms previously associated with HS-Aging risk: rs5848 (GRN), rs1990622 (TMEM106B), and rs704180 (ABCC9). Analyses of a subsample that was not previously evaluated (51 HS-Aging cases and 561 controls) replicated the associations of previously identified HS-Aging risk alleles. To test for evidence of gene-gene interactions and genotype-phenotype relationships, pooled data were analyzed. The risk for HS-Aging diagnosis associated with these genetic polymorphisms was not secondary to an association with either Alzheimer disease or dementia with Lewy body neuropathologic changes. The presence of multiple risk genotypes was associated with a trend for additive risk for HS-Aging pathology. We conclude that multiple genes play important roles in HS-Aging, which is a distinctive neurodegenerative disease of aging.

TAR DNA-binding protein 43 (TDP-43) is normally a nuclear RNA-binding protein that exhibits a range of functions including regulation of alternative splicing, RNA trafficking, and RNA stability. However, in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration with TDP-43 inclusions (FTLD-TDP), TDP-43 is abnormally phosphorylated, ubiquitinated, and cleaved, and is mislocalized to the cytoplasm where it forms distinctive aggregates. We previously developed a mouse model expressing human TDP-43 with a mutation in its nuclear localization signal (ΔNLS-hTDP-43) so that the protein preferentially localizes to the cytoplasm. These mice did not exhibit a significant number of cytoplasmic aggregates, but did display dramatic changes in gene expression as measured by microarray, suggesting that cytoplasmic TDP-43 may be associated with a toxic gain-of-function. Here, we analyze new RNA-sequencing data from the ΔNLS-hTDP-43 mouse model, together with published RNA-sequencing data obtained previously from TDP-43 antisense oligonucleotide (ASO) knockdown mice to investigate further the dysregulation of gene expression in the ΔNLS model. This analysis reveals that the transcriptomic effects of the overexpression of the ΔNLS-hTDP-43 transgene are likely due to a gain of cytoplasmic function. Moreover, cytoplasmic TDP-43 expression alters transcripts that regulate chromatin assembly, the nucleolus, lysosomal function, and histone 3′ untranslated region (UTR) processing. These transcriptomic alterations correlate with observed histologic abnormalities in heterochromatin structure and nuclear size in transgenic mouse and human brains.

The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.