[佳学基因 Genetic Testing] Gene decoding data source: the tximport algorithm for transcript abundance
Transcript abundances and the tximport pipeline
Before we demonstrate how to align and then count RNA-seq fragments, we note that a newer and faster alternative pipeline exists: use a transcript abundance quantification method such as Sailfish [8], Salmon [9], kallisto [10] or RSEM [11] to estimate abundances without aligning reads, then use the tximport package to assemble count and offset matrices for use with Bioconductor differential gene expression packages. This note was added during the revision of this workflow, so the material that follows still covers the generation of count matrices through alignment and read/fragment counting.

Using transcript abundance quantifiers together with tximport to produce gene-level count matrices and normalizing offsets has three advantages: (i) the approach corrects for potential changes in gene length across samples (e.g., from differential isoform usage) [12]; (ii) some of these methods are substantially faster and require less memory and disk space than alignment-based methods; and (iii) fragments that can align to multiple genes with homologous sequence need not be discarded [13]. Transcript abundance quantifiers skip the generation of large files that store read alignments and instead produce smaller files that store estimated abundances, counts, and effective lengths per transcript.

For more details, see the manuscript describing the tximport approach [14] and the tximport package vignette, which covers the software. After running a transcript quantifier and tximport, the entry point back into this workflow is the section on the data object below.
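To make the gene-length correction concrete, here is a minimal sketch of the arithmetic behind gene-level summarization. It is written in Python for illustration only and is not the tximport package's code (tximport is an R package). The function name `summarize_to_gene`, the input layout, and the toy numbers are all hypothetical. The sketch assumes that each transcript comes with an estimated count, an abundance in TPM, and an effective length, and that a gene's per-sample length is the abundance-weighted average of its transcripts' effective lengths. The fallback used when a gene has zero abundance is a simplification chosen here, not something stated in the text above.

```python
from collections import defaultdict


def summarize_to_gene(transcripts, tx2gene):
    """Collapse per-transcript estimates for ONE sample to gene level.

    transcripts: dict tx_id -> (estimated_count, tpm, effective_length)
    tx2gene:     dict tx_id -> gene_id
    Returns:     dict gene_id -> {"count", "tpm", "length"}
    """
    counts = defaultdict(float)
    tpms = defaultdict(float)
    weighted_length = defaultdict(float)
    all_lengths = defaultdict(list)

    for tx_id, (est_count, tpm, eff_len) in transcripts.items():
        gene = tx2gene[tx_id]
        counts[gene] += est_count                # gene count = sum of transcript counts
        tpms[gene] += tpm                        # gene abundance = sum of transcript TPM
        weighted_length[gene] += tpm * eff_len   # numerator of the weighted average
        all_lengths[gene].append(eff_len)

    genes = {}
    for gene in counts:
        if tpms[gene] > 0:
            length = weighted_length[gene] / tpms[gene]
        else:
            # Assumption for this sketch: with no expression, fall back to the
            # unweighted mean of the gene's transcript lengths.
            length = sum(all_lengths[gene]) / len(all_lengths[gene])
        genes[gene] = {"count": counts[gene], "tpm": tpms[gene], "length": length}
    return genes


# Toy data: one gene with a short (1000 bp) and a long (3000 bp) isoform.
tx2gene = {"tx_short": "geneX", "tx_long": "geneX"}

# Sample A mostly expresses the short isoform, sample B mostly the long one.
sample_a = {"tx_short": (90.0, 90.0, 1000.0), "tx_long": (30.0, 10.0, 3000.0)}
sample_b = {"tx_short": (10.0, 10.0, 1000.0), "tx_long": (270.0, 90.0, 3000.0)}

gene_a = summarize_to_gene(sample_a, tx2gene)["geneX"]
gene_b = summarize_to_gene(sample_b, tx2gene)["geneX"]

for name, g in (("A", gene_a), ("B", gene_b)):
    print(name, g["count"], g["tpm"], g["length"], g["count"] / g["length"])
```

In this toy case both samples have the same gene-level abundance (100 TPM), yet the gene-level counts differ (120 versus 280) because sample B uses the longer isoform. Dividing each count by the sample-specific average length (1200 versus 2800) gives the same value in both samples, which is the kind of length difference a normalizing offset is meant to absorb.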
(Editor in charge: 佳学基因)