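[Jiaxue Gene Testing (佳学基因检测)] Starting Data Standards for RNA Sequencing Analysis
Guide to RNA sequencing data analysis:
To promote genetic information technology, all data analysis and operating standards published here by Jiaxue Gene can be carried out with shared or open-source programs. Most open-source programs for RNA sequencing analysis are available from Bioconductor, which supports end-to-end, gene-level differential expression analysis of RNA sequencing data. Jiaxue Gene starts from FASTQ files, shows how they are aligned to the human reference genome, and generates a count matrix that records, for each sample, the number of sequenced fragments assigned to each gene. Jiaxue Gene then guides readers through exploratory data analysis (EDA) to assess data quality and explore relationships between samples, performs differential gene expression analysis, and produces figures suitable for publication in high-impact journals.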
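Introduction to open-source software for RNA sequencing data analysis
Jiaxue Gene is a member of the international open-source software alliance. Bioconductor, a member software repository, has many packages that support the analysis of high-throughput sequence data, including RNA sequencing (RNA-seq). The packages used in this demonstration include core packages maintained by the Bioconductor core team for importing and processing raw sequencing data and for annotating genes in RNA sequencing data. Some of these packages also perform statistical analysis and generate plots of sequence data. Bioconductor is released on a schedule of every 6 months so that all packages in the project work together consistently. The packages used in this workflow are loaded with the library function and can be installed by following the Bioconductor package installation instructions.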
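Starting data for RNA sequencing data analysis
The data used in this workflow come from a real experiment. Airway smooth muscle (ASM) cells were treated with dexamethasone (DEX), a synthetic glucocorticoid steroid with anti-inflammatory effects. In clinical practice, people with asthma use glucocorticoids to reduce airway inflammation. In the experiment, four primary human ASM cell lines were treated with 1 µM dexamethasone for 18 hours. Each of the four cell lines has one treated sample and one untreated control sample. The primary ASM cells were isolated from four aborted lung transplant donors with no chronic illness. ASM cells at passages 4 to 7, maintained in Ham's F12 medium supplemented with 10% FBS, were used in all experiments. For the RNA-Seq and qRT-PCR validation experiments, cells from each donor were treated with 1 µM DEX or a control vehicle for 18 hours.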
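The design described above (four cell lines, each with one dexamethasone-treated sample and one control) can be written down as a small sample table before any analysis starts. The following is a minimal Python sketch; the cell line labels, the sample naming scheme, and the function name `build_sample_table` are hypothetical and are not taken from the original study.

```python
from itertools import product

# Hypothetical labels: the text only states that there are four cell lines,
# each with one dexamethasone-treated sample and one control sample.
CELL_LINES = ["line1", "line2", "line3", "line4"]
TREATMENTS = ["control", "dex"]


def build_sample_table(cell_lines, treatments):
    """Return one row (a dict) per cell line x treatment combination."""
    return [
        {"sample": f"{cell}_{trt}", "cell_line": cell, "treatment": trt}
        for cell, trt in product(cell_lines, treatments)
    ]


sample_table = build_sample_table(CELL_LINES, TREATMENTS)

# Print a tab-separated sample sheet, one line per sample.
for row in sample_table:
    print(row["sample"], row["cell_line"], row["treatment"], sep="\t")
```

Keeping the cell line next to the treatment in every row makes the paired structure explicit, so a later differential expression model can account for differences between donors.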
Preliminary processing of raw reads was performed using Casava 1.8 (Illumina, Inc., San Diego, CA). Subsequently, Taffeta scripts (https://github.com/blancahimes/taffeta) were used to analyze RNA-Seq data, which included use of FastQC [54] (v.0.10.0) to obtain overall QC metrics. Because sequence bias was present in the initial 12 bases on the 5′ end of reads, the first 12 bases of all reads were trimmed with the FASTX Toolkit (v.0.0.13) [55]. FastQC reports for each sample revealed that each was successfully sequenced. Trimmed reads for each sample were aligned to the reference hg19 genome and known ERCC transcripts using TopHat [56] (v.2.0.4), while constraining mapped reads to be within reference hg19 or ERCC transcripts. Additional QC parameters were obtained to assess whether reads were appropriately mapped. Bamtools [57] was used to count the number of mapped reads, including junction spanning reads. The Picard Tools (http://picard.sourceforge.net) RnaSeqMetrics function was used to compute the number of bases assigned to various classes of RNA, according to the hg19 refFlat file available as a UCSC Genome Table. For each sample, Cufflinks [21] (v.2.0.2) was used to quantify ERCC Spike-In and hg19 transcripts based on reads that mapped to the provided hg19 and ERCC reference files. For three samples that contained ERCC Spike-Ins, we created dose response curves (i.e. plots of ERCC transcript FPKM vs. ERCC transcript molecules) following the manufacturer's protocol [58]. Ideally, the slope and R² would equal 1.0. For our samples (Dex.2, Control.4, Dex.4), the slope (R²) values were 0.90 (0.90), 0.92 (0.84), 0.82 (0.86), respectively. Raw read plots were created by displaying bigwig files for each sample in the UCSC Genome Browser.
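To illustrate what trimming the first 12 bases of every read means at the file level, here is a minimal Python sketch that operates on four-line FASTQ records. It is not the FASTX Toolkit implementation; the function names and the example read are made up for illustration.

```python
def trim_fastq_record(record, n_bases=12):
    """Drop the first n_bases from the sequence and quality lines of one record.

    `record` is a (header, sequence, separator, quality) tuple.
    """
    header, seq, plus, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # The quality string must be trimmed in step with the sequence.
    return (header, seq[n_bases:], plus, qual[n_bases:])


def trim_fastq_lines(lines, n_bases=12):
    """Yield trimmed records from an iterable of FASTQ lines (4 per record)."""
    stripped = [line.rstrip("\n") for line in lines]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    for i in range(0, len(stripped), 4):
        yield trim_fastq_record(tuple(stripped[i:i + 4]), n_bases)


# Made-up 20-base read: 12 bases to discard followed by 8 bases to keep.
example_lines = [
    "@read1\n",
    "ACGTACGTACGTTTGGCCAA\n",
    "+\n",
    "IIIIIIIIIIIIFFFFFFFF\n",
]

trimmed = list(trim_fastq_lines(example_lines))
for record in trimmed:
    print("\n".join(record))
```

The sequence and quality lines are cut at the same position so that each remaining base keeps its own quality score.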
Differential expression of genes and transcripts in samples treated with DEX vs. untreated samples was obtained using Cuffdiff [21] (v.2.0.2) with the quantified transcripts computed by Cufflinks (v.2.0.2), while applying bias correction. The CummeRbund [59] R package (v.0.1.3) was used to measure significance of differentially expressed genes and create plots of the results. As a positive control of gene expression, the FPKM values for four housekeeping genes (i.e., B2M, GABARAP, GAPDH, RPL19) were obtained. Each had high FPKM values that did not differ significantly by treatment status [Figure S11]. The NIH Database for Annotation, Visualization and Integrated Discovery (DAVID) was used to perform gene functional annotation clustering using Homo sapiens as background, and default options and annotation categories (Disease: OMIM_DISEASE; Functional Categories: COG_ONTOLOGY, SP_PIR_KEYWORDS, UP_SEQ_FEATURE; Gene_Ontology: GOTERM_BP_FAT, GOTERM_CC_FAT, GOTERM_MF_FAT; Pathway: BBID, BIOCARTA, KEGG_PATHWAY; Protein_Domains: INTERPRO, PIR_SUPERFAMILY, SMART) [28]. The RNA-Seq data are available at the Gene Expression Omnibus Web site (http://www.ncbi.nlm.nih.gov/geo/) under accession GSE52778.
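Because the expression values above are reported as FPKM (fragments per kilobase of transcript per million mapped fragments), a short worked calculation may help. The Python sketch below applies only the textbook definition; Cufflinks estimates abundances with a statistical model and bias correction, so its reported values will generally differ. The function name and all numbers are made up for illustration.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and by (library size / 1e6) gives the 1e9 factor.
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2_000, 20_000_000)
print(f"FPKM = {example_value:.1f}")
```

Normalizing by both transcript length and library size is what allows values to be compared across genes of different lengths and across samples sequenced to different depths.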