[佳学基因] What Is the Role of Structure Analysis in GWAS?
One crucial step in GWAS analysis is to study the population structure (PS). The main reason is that distinct subpopulations, having different population genetic histories, can differ in allele frequencies for many polymorphisms throughout the genome. If the populations also differ in their overall values for the phenotype, any polymorphism whose frequency differs between the two populations will appear associated with the phenotype, even though it is neither causal nor in strong linkage disequilibrium with a causal polymorphism [7–9]. Principal component analysis (PCA) on genotypic data is used to visualize the structure of our populations using the function "svd()" in R.

Population structure accounted for by PCA is limited to correcting for spurious associations at a global level of genetic variation. PS alone therefore does not adequately capture the relatedness between individuals, and this relationship between genotypes (K, the kinship matrix) must also be taken into account in the analysis. Failing to account for PS, K, and potential confounding between phenotype and genotype effects can lead to unrealistic assessments in GWAS analysis.

(Editor in charge: 佳学基因)
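To make the two quantities discussed above concrete, the following is a minimal sketch, not the authors' pipeline. The article uses `svd()` in R; this sketch instead uses plain Python with power iteration so that it has no dependencies. The toy genotype matrix `G`, the helper names, and the VanRaden-style scaling of K are all assumptions made for illustration.

```python
import math

# Hypothetical toy genotypes: rows are individuals, columns are SNPs,
# entries count copies of the alternate allele (0, 1, or 2).
# Individuals 0-2 and 3-5 mimic two subpopulations.
G = [
    [0, 0, 1, 2, 2],
    [0, 1, 0, 2, 1],
    [0, 0, 0, 2, 2],
    [2, 2, 1, 0, 0],
    [2, 1, 2, 0, 1],
    [2, 2, 2, 0, 0],
]


def center_genotypes(G):
    """Subtract twice the allele frequency from each SNP column."""
    n, m = len(G), len(G[0])
    freqs = [sum(row[j] for row in G) / (2 * n) for j in range(m)]
    Z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in G]
    return Z, freqs


def gram(Z):
    """Individuals-by-individuals cross-product matrix Z Z^T."""
    return [[sum(a * b for a, b in zip(zi, zk)) for zk in Z] for zi in Z]


def kinship(G):
    """Assumed scaling (VanRaden-style): K = Z Z^T / (2 * sum p(1 - p))."""
    Z, freqs = center_genotypes(G)
    denom = 2 * sum(p * (1 - p) for p in freqs)
    return [[x / denom for x in row] for row in gram(Z)]


def first_pc(G, iters=200):
    """Leading eigenvector of Z Z^T by power iteration.

    This is the unit-length first left singular vector of Z, i.e. the
    PC1 scores of the individuals up to scale and sign.
    """
    A = gram(center_genotypes(G)[0])
    n = len(A)
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(A[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


K = kinship(G)
pc1 = first_pc(G)
print([round(x, 3) for x in pc1])   # PC1 score per individual
print([round(x, 2) for x in K[0]])  # kinship of individual 0 with everyone
```

On this toy data, individuals 0-2 and 3-5 receive PC1 scores of opposite sign, which is the global structure PCA picks up, while K records pairwise relatedness between individuals. In practice both are typically supplied to the association model, for example principal components as covariates and K as the covariance of a random effect, though the exact model is not specified in the text above.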