Reconstruction of a matrix of genotypic correlations between variants within a gene for joint analysis of imputed and sequenced data
- Authors: Svishcheva G.R.1,2, Kirichenko A.V.1, Belonogova N.M.1, Elgaeva E.E.1,3, Tsepilov A.Y.1, Zorkoltseva I.V.1, Axenovich T.I.1
- Affiliations:
  1. Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
  2. Vavilov Institute of General Genetics, Russian Academy of Sciences
  3. Novosibirsk State University
- Issue: Volume 60, No. 7 (2024)
- Pages: 91-99
- Section: Mathematical Models and Methods
- URL: https://ter-arkhiv.ru/0016-6758/article/view/667236
- DOI: https://doi.org/10.31857/S0016675824070089
- EDN: https://elibrary.ru/BHMPLU
- ID: 667236
Abstract
Combining imputed and sequenced data in a single gene-based association analysis raises the problem of reconstructing the matrix of genotypic correlations. For a given gene, the correlations between genotypes of all imputed variants are known, as are the correlations between genotypes of all sequenced variants, but the correlations between pairs of variants in which one is imputed and the other is sequenced are not. To recover these correlations, we propose an efficient method based on maximising the determinant of the matrix. The method has a number of useful properties and admits an analytical solution for our task. We validated the method by comparing reconstructed correlation matrices with real ones computed from individual genotypes in the UK Biobank. A comparison of gene-based association results obtained with the SKAT, BT and PCA methods on reconstructed and real matrices, using both simulated summary statistics and summary statistics calculated for real phenotypes, showed high reconstruction quality and robustness of the method to different gene structures.
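The abstract does not state the analytical solution, so the following is only a minimal sketch of determinant-maximising completion in general, not the authors' implementation. It assumes a hypothetical layout that the abstract does not confirm: three groups of variants `a`, `b` and `c`, where the correlations within and between `a` and `b`, and within and between `b` and `c`, are known, and only the `a`-`c` block is missing. For this pattern the standard maximum-determinant fill-in is X = R_ab R_bb^{-1} R_bc. The function name, the grouping and the toy genotype data are all illustrative assumptions.

```python
import numpy as np


def max_det_completion(r_aa, r_ab, r_bb, r_bc, r_cc):
    """Fill the unknown a-c block of a correlation matrix (illustrative helper).

    Assumes groups a and c are each fully correlated with a shared group b,
    and that only the a-c block is missing.
    """
    # Maximum-determinant fill-in for this block pattern: X = R_ab R_bb^{-1} R_bc.
    x = r_ab @ np.linalg.solve(r_bb, r_bc)
    return np.block([
        [r_aa,   r_ab,   x],
        [r_ab.T, r_bb,   r_bc],
        [x.T,    r_bc.T, r_cc],
    ])


# Toy "genotype" data: six variables that share one common factor (assumption).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 6))
g = z + 0.6 * z[:, [0]]
true_r = np.corrcoef(g, rowvar=False)

# Hypothetical grouping of the six variables into the three blocks.
a, b, c = [0, 1], [2, 3], [4, 5]
completed = max_det_completion(
    true_r[np.ix_(a, a)], true_r[np.ix_(a, b)],
    true_r[np.ix_(b, b)], true_r[np.ix_(b, c)],
    true_r[np.ix_(c, c)],
)

# Compare the reconstructed block with the block computed from the full data.
max_abs_error = np.max(np.abs(completed[np.ix_(a, c)] - true_r[np.ix_(a, c)]))
print("largest absolute error in the reconstructed block:", max_abs_error)
```

A property of this fill-in is that the inverse of the completed matrix has zeros in the positions of the unknown block, and its determinant is at least as large as that of any other positive definite completion, including the true matrix.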
About the authors
G. Svishcheva
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Vavilov Institute of General Genetics, Russian Academy of Sciences
Corresponding author
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk; 119991, Moscow
A. Kirichenko
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk
N. Belonogova
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk
E. Elgaeva
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk; 630090, Novosibirsk
A. Tsepilov
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk
I. Zorkoltseva
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk
T. Axenovich
Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences
Email: gulsvi@mail.ru
Russia, 630090, Novosibirsk