{"id":15,"date":"2021-10-13T17:14:57","date_gmt":"2021-10-13T17:14:57","guid":{"rendered":"https:\/\/www.lisanwanglab.org\/yyee\/?page_id=15"},"modified":"2022-01-04T16:22:00","modified_gmt":"2022-01-04T21:22:00","slug":"software","status":"publish","type":"page","link":"https:\/\/www.lisanwanglab.org\/yyee\/software\/","title":{"rendered":"Software\/Database"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">ADVP: Alzheimer&#8217;s Disease Variant Portal <\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Alzheimer\u2019s Disease Variants Portal (ADVP) is a harmonized collection of high-quality and suggestive genetic association findings curated from the literature. This resource allows the public community to easily&nbsp;<strong>browse<\/strong>,&nbsp;<strong>search<\/strong>&nbsp;and&nbsp;<strong>understand<\/strong>&nbsp;Alzheimer\u2019s Disease genetics reported across &gt;80 cohorts and 8 populations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">ADVP aims to answer questions such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>What are the population-specific variants associated with Alzheimer&#8217;s Disease (AD)?<\/li><li>What genes are reported to be associated with AD risk?<\/li><li>What genetic variants are reported to be associated with AD endophenotypes and neuropathology?<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This work is described in &#8230; . ADVP is a valuable resource for investigators to quickly and systematically explore high-confidence AD genetic findings and provides insights into population-specific AD genetic architecture. 
ADVP is continually maintained and enhanced by NIAGADS and is freely accessible at <a href=\"https:\/\/advp.niagads.org\">https:\/\/advp.niagads.org<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">FILER: a framework for harmonizing and querying large-scale functional genomics knowledge<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The functional genomics repository (FILER) is a functional genomics database developed by NIAGADS that offers the most comprehensive harmonized, extensible, indexed, searchable collection of human functional genomics data, spanning &gt;20 data sources. Features include: 1) one place to access all data, with queries by tissue\/cell type, biosample type, assay, data type and data collection; 2) support for reproducible research; 3) integration with high-throughput genetic and genomic analysis workflows; 4) harmonized data in uniform, consistent formats. This work is described in &#8230; . For examples of how to use FILER, please check the <a rel=\"noreferrer noopener\" href=\"https:\/\/tf.lisanwanglab.org\/FILER\/filer_tutorial.pdf\" target=\"_blank\">FILER webserver tutorial<\/a>. For installing and using a stand-alone FILER instance, please refer to the <a href=\"http:\/\/bitbucket.org\/wanglab-upenn\/FILER\" target=\"_blank\" rel=\"noreferrer noopener\">FILER Bitbucket repository<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">SparkINFERNO: a scalable high-throughput pipeline for inferring molecular mechanisms of non-coding genetic variants<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Spark-based INFERence of the molecular mechanisms of NOn-coding genetic variants (SparkINFERNO) is a scalable bioinformatics pipeline for characterizing non-coding GWAS association findings. SparkINFERNO prioritizes causal variants underlying GWAS association signals and reports relevant regulatory elements, tissue contexts and plausible target genes they affect. 
To achieve this, the SparkINFERNO algorithm integrates GWAS summary statistics with a large-scale collection of functional genomics datasets spanning enhancer activity, transcription factor binding, expression quantitative trait loci and other functional datasets across more than 400 tissues and cell types. Scalability is achieved by an underlying API implemented using Apache Spark and Giggle-based genomic indexing. We evaluated SparkINFERNO on large GWASs and showed that SparkINFERNO is more than 60 times more efficient and scales with data size and amount of computational resources. This work is described in&nbsp;<a href=\"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/12\/3879\/5824793\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Kuksa et al. (Bioinformatics, 2020)<\/em><\/a>. SparkINFERNO runs on clusters or a single server with an Apache Spark environment, and is available&nbsp;<a href=\"https:\/\/bitbucket.org\/wanglab-upenn\/SparkINFERNO\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>&nbsp;or&nbsp;<a href=\"https:\/\/hub.docker.com\/r\/wanglab\/spark-inferno\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">HiPR: High-throughput probabilistic RNA structure inference<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">HiPR is a novel method for RNA structure prediction at single-nucleotide resolution that combines high-throughput structure probing data (DMS-seq, DMS-MaPseq) with a novel probabilistic folding algorithm. On validation data spanning a variety of RNA classes, HiPR often increases accuracy for predicting RNA structures, giving researchers new tools to study RNA structure. This work is described in&nbsp;<a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC7327253\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Kuksa et al. (Comput Struct Biotechnol J., 2020)<\/em><\/a>. 
The webserver and complete instructions can be found&nbsp;<a href=\"https:\/\/www.lisanwanglab.org\/HIPR\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>&nbsp;or&nbsp;<a href=\"https:\/\/github.com\/wanglab-upenn\/HiPR\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">VCPA: genomic variant calling pipeline and data management tool for Alzheimer\u2019s Disease Sequencing Project<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">VCPA is the official SNP\/Indel Variant Calling Pipeline and data management tool used for the analysis of whole genome and exome sequencing (WGS\/WES) for the Alzheimer\u2019s Disease Sequencing Project. VCPA consists of two independent but linkable components: pipeline and tracking database. The pipeline, implemented using the Workflow Description Language and fully optimized for the Amazon elastic compute cloud environment, includes steps from aligning raw sequence reads to variant calling using GATK. The tracking database allows users to view job running status in real time and visualize &gt;100 quality metrics per genome. VCPA is functionally equivalent to the CCDG\/TOPMed pipeline. Users can use the pipeline and the dockerized database to process large WGS\/WES datasets on Amazon cloud with minimal configuration. This work is described in&nbsp;<a href=\"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/10\/1768\/5142723\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Leung et al. (Bioinformatics, 2019)<\/em><\/a>. 
VCPA is freely available&nbsp;<a href=\"https:\/\/www.niagads.org\/resources\/tools-and-software\/vcpa\" target=\"_blank\" rel=\"noreferrer noopener\">here.<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">SPAR: small RNA-seq portal for analysis of sequencing experiments<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Small RNA-seq Portal for Analysis of sequencing expeRiments (SPAR) is a user-friendly web server for interactive processing, analysis, annotation and visualization of small RNA sequencing data. SPAR supports sequencing data generated from various experimental protocols, including smRNA-seq, short total RNA sequencing, microRNA-seq, and single-cell small RNA-seq. Additionally, SPAR includes publicly available reference sncRNA datasets from our DASHR database and from ENCODE across 185 human tissues and cell types to produce highly informative small RNA annotations across all major small RNA types and other features such as co-localization with various genomic features, precursor transcript cleavage patterns, and conservation. SPAR allows the user to compare the input experiment against reference ENCODE\/DASHR datasets. SPAR currently supports analyses of human (hg19, hg38) and mouse (mm10) sequencing data. This work is described in&nbsp;<a href=\"https:\/\/academic.oup.com\/nar\/article\/46\/W1\/W36\/4992647\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Kuksa et al. (Nucleic Acids Research Web Server Issue, 2018)<\/em><\/a>. SPAR is freely available&nbsp;<a href=\"https:\/\/www.lisanwanglab.org\/SPAR\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. 
If you prefer to run SPAR at your own site, please download the stand-alone, offline version&nbsp;<a href=\"https:\/\/bitbucket.org\/wanglab-upenn\/workspace\/projects\/SPAR\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">DASHR2 &#8211; DASHR 2.0: integrated database of human small non-coding RNA genes and mature products<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The DASHR v2.0 database is the first to integrate human sncRNA gene and mature product profiles obtained from multiple RNA-seq protocols. Altogether, 185 tissues\/cell types and sncRNA annotations and &gt;800 curated experiments from ENCODE and GEO\/SRA across multiple RNA-seq protocols for both GRCh38\/hg38 and GRCh37\/hg19 assemblies are integrated in DASHR. Moreover, DASHR is the first to contain both known and novel, previously un-annotated sncRNA loci identified by unsupervised segmentation (13 times more loci with 1 678 800 total). Additionally, DASHR v2.0 adds &gt;3 200 000 annotations for non-small RNA genes and other genomic features (long-noncoding RNAs, mRNAs, promoters, repeats). This work is described in&nbsp;<a href=\"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/6\/1033\/5078466\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Kuksa et al. (Bioinformatics, 2018)<\/em><\/a>. The DASHR database and complete instructions can be found&nbsp;<a href=\"https:\/\/lisanwanglab.org\/DASHRv2\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">DASHR &#8211; Database of small human noncoding RNA<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The DASHR database provides information about small non-coding RNAs (sncRNAs) and their expression in different human tissues and cell types. The content of this database derives from curation, annotation, and computational analysis of small RNA sequencing data sets from multiple sources. 
Currently, the database contains information about more than 46,000 sncRNAs in 42 normal human tissues and cell types from over 30 independent studies. This work is described in&nbsp;<a href=\"http:\/\/nar.oxfordjournals.org\/content\/early\/2015\/11\/06\/nar.gkv1188.full\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Leung et al. (Nucleic Acids Research Database Issue, 2016)<\/em><\/a>. The DASHR database and complete instructions can be found&nbsp;<a href=\"http:\/\/lisanwanglab.org\/DASHR\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. <a href=\"http:\/\/github.com\/wanglab-upenn\/DASHR\" target=\"_blank\" rel=\"noreferrer noopener\">[Source code: DASHR]<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">CoRAL &#8211; Classification of RNAs by Analysis of Length<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">CoRAL is a machine learning tool \/ package that can predict the precursor class of small noncoding RNAs present in a high-throughput RNA-sequencing dataset. In addition to classification, it also produces information about the features that are the most important for discriminating different populations of small non-coding RNAs. This work is described in&nbsp;<a href=\"http:\/\/nar.oxfordjournals.org\/content\/41\/14\/e137.long\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Leung et al. (Nucleic Acids Research, 2013)<\/em><\/a>. 
Complete instructions and documentation can be found&nbsp;<a href=\"http:\/\/wanglab.pcbi.upenn.edu\/coral\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. <a href=\"http:\/\/wanglab.pcbi.upenn.edu\/coral\/coral-1.1.1.tar.gz\" target=\"_blank\" rel=\"noreferrer noopener\">[Source code: CoRAL]<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">HAMR &#8211; High-throughput Annotation of Modified Ribonucleotides<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">HAMR (High-throughput Annotation of Modified Ribonucleotides) is a web application that allows you to detect and classify modified nucleotides in RNA-seq data. HAMR scans RNA-sequencing data for sites showing potential signatures of nucleotide modification. Users can input particular genomic regions of interest (BED file format) and HAMR will output a table containing the list of sites with nucleotide patterns that deviate from expectation at a statistically significant rate. This work is described in&nbsp;<a href=\"http:\/\/rnajournal.cshlp.org\/content\/19\/12\/1684.long\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Ryvkin et al. (RNA, 2013)<\/em><\/a>. 
The webserver and complete instructions can be found&nbsp;<a href=\"http:\/\/tesla.pcbi.upenn.edu\/hamr\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. <a href=\"https:\/\/github.com\/pryvkin\/hamr\/archive\/v1.2.0.tar.gz\" target=\"_blank\" rel=\"noreferrer noopener\">[Source code: HAMR]<\/a><\/p>\n\n\n\n<div class=\"wp-block-columns section contact-box has-background is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex\" style=\"background-color:#f1f1f1\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h2 class=\"has-medium-font-size wp-block-heading\">CONTACT INFORMATION<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">3700 Hamilton Walk<br>D101 Richards Medical Research Laboratories<br>Perelman School of Medicine<br>University of Pennsylvania<br>Philadelphia PA 19104<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><br><br><br><a href=\"https:\/\/www.google.com\/maps\/place\/3700+Hamilton+Walk,+University+of+Pennsylvania,+Philadelphia,+PA+19104\/@39.9496503,-75.1977488,17z\/data=!3m1!4b1!4m2!3m1!1s0x89c6c659180eecd3:0x9baf8f89d916314f\" target=\"_blank\" rel=\"noreferrer noopener\">View On Map<\/a><br><span class=\"has-icon telephone\">(215) 573-3729<\/span><br><a href=\"mailto:yyee@pennmedicine.upenn.edu\" target=\"_blank\" rel=\"noreferrer noopener\">yyee@pennmedicine.upenn.edu<\/a><\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>ADVP: Alzheimer&#8217;s Disease Variant Portal Alzheimer\u2019s Disease Variants Portal (ADVP) is a harmonized collection of high-quality and suggestive genetic association findings curated from the literature. This resource allows the public community to easily&nbsp;browse,&nbsp;search&nbsp;and&nbsp;understand&nbsp;Alzheimer\u2019s Disease genetics reported across &gt;80 cohorts and 8 populations. 
ADVP aims to answer questions such as: What are the population-specific variants associated [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-15","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/pages\/15","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/comments?post=15"}],"version-history":[{"count":9,"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/pages\/15\/revisions"}],"predecessor-version":[{"id":95,"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/pages\/15\/revisions\/95"}],"wp:attachment":[{"href":"https:\/\/www.lisanwanglab.org\/yyee\/wp-json\/wp\/v2\/media?parent=15"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}