The problem of selecting relevant descriptors in predicting the toxicity of chemicals
- Authors: Guseva E.A.
- Affiliations: Federal State Autonomous Educational Institution of Higher Education I.M. Sechenov First Moscow State Medical University of the Ministry of Health of the Russian Federation (Sechenov University)
- Issue: No 6 (2023)
- Pages: 413-417
- Section: Original articles
- Published: 15.01.2024
- URL: https://ter-arkhiv.ru/0869-7922/article/view/641547
- DOI: https://doi.org/10.47470/0869-7922-2023-31-6-413-417
- EDN: https://elibrary.ru/kijkhn
- ID: 641547
Abstract
Introduction. Mathematical models are widely used in toxicological studies and can fill the data gaps that arise in chemical safety assessment. Most attention, however, goes to the algorithms used to build models rather than to approaches for choosing the most informative features.
The purpose of this study is to highlight aspects of the problem of selecting useful variables in mathematical modelling.
Material and methods. SMILES strings and molecular descriptors for organothiophosphates were generated in the interactive Google Colaboratory environment with program code based on the RDKit and Mordred software. Features were selected by filtering and by recursive feature elimination using the tools of the scikit-learn library, version 1.2.2. Acute oral toxicity values were taken from official information sources on chemicals. The resulting models underwent internal validation to evaluate their performance.
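The two selection strategies named in the methods can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the study's code: the descriptor matrix is synthetic (`make_regression`), and the filter scoring function (`f_regression`), the decision tree estimator inside the recursive procedure, and the number of retained features (`N_KEEP`) are all chosen only for the example.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a descriptor matrix (assumption):
# rows are compounds, columns are molecular descriptors.
X, y = make_regression(
    n_samples=60, n_features=20, n_informative=4, noise=0.1, random_state=0
)

N_KEEP = 5  # hypothetical number of descriptors to retain

# Filter method: score each descriptor independently against the target
# and keep the N_KEEP best-scoring ones.
filter_selector = SelectKBest(score_func=f_regression, k=N_KEEP).fit(X, y)
filter_idx = filter_selector.get_support(indices=True)

# Wrapper method: recursive feature elimination. The estimator is refitted
# repeatedly and the least important descriptor is dropped at each step.
rfe_selector = RFE(
    estimator=DecisionTreeRegressor(random_state=0),
    n_features_to_select=N_KEEP,
    step=1,
).fit(X, y)
rfe_idx = rfe_selector.get_support(indices=True)

# Reduced descriptor matrices that a downstream model would be trained on.
X_filter = filter_selector.transform(X)
X_rfe = rfe_selector.transform(X)

print("Filter-selected columns:", filter_idx)
print("RFE-selected columns:   ", rfe_idx)
```

The filter step ignores the downstream model, while the recursive step ranks descriptors by how much the fitted estimator relies on them, so the two index sets may differ.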
Results. Models built with recursive feature elimination performed better than models based on descriptors selected by filtering. In particular, the decision tree model for predicting the acute toxicity of organothiophosphates with recursive feature elimination has a high coefficient of determination (R² = 0.91713), a relatively small root-mean-square error (RMSE = 0.35099), and a high leave-one-out cross-validated coefficient of determination (Q²LOO = 0.79756).
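For readers unfamiliar with the three statistics reported above, the sketch below shows one common way to compute them. It rests on assumptions that the abstract does not confirm: a one-descriptor least-squares line stands in for the model, the data are invented toy values, and Q²LOO is taken as 1 − PRESS / SS_tot with the total sum of squares measured around the overall mean of the observed values.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit for a single descriptor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(y_true))

def q2_loo(xs, ys):
    """Leave-one-out Q2: refit without each compound, then predict it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot

# Invented toy data (assumption): one descriptor and a toxicity-like response.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

slope, intercept = fit_line(xs, ys)
r2, rmse = r2_and_rmse(ys, [slope * x + intercept for x in xs])
q2 = q2_loo(xs, ys)
print(f"R2 = {r2:.5f}, RMSE = {rmse:.5f}, Q2_LOO = {q2:.5f}")
```

For a least-squares model, Q²LOO defined this way cannot exceed R², because each compound is predicted by a model that never saw it. A large gap between the two is usually read as a sign of overfitting.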
Limitations. The results can be used only to predict the toxicity of the specified group of chemicals, which share a similar mechanism of action.
Conclusion. Mathematical modelling is a promising tool for assessing the toxicity of chemicals, and it has two sides. On the one hand, it is a quick and convenient resource for screening the toxicity of substances. On the other hand, the model must be trained on reliable research data, and the features that contribute significantly to the performance of the predictive model must be carefully selected.
Compliance with ethical standards. The study does not require the approval of a biomedical ethics committee or other documents.
Conflict of interest. The author declares no conflict of interest.
Funding. The study had no sponsorship.
Date of receipt: September 21, 2023 / Date of acceptance for printing: December 3, 2023 / Date of publication: December 29, 202
About the authors
Ekaterina A. Guseva
Federal State Autonomous Educational Institution of Higher Education I.M. Sechenov First Moscow State Medical University of the Ministry of Health of the Russian Federation (Sechenov University)
Author for correspondence.
Email: guseva_e_a@staff.sechenov.ru
ORCID iD: 0000-0001-8389-7981
Assistant of the Department of Human Ecology and Environmental Hygiene of the Institute of Public Health named after F.F. Erisman, Sechenov First Moscow State Medical University of the Ministry of Health of Russia (Sechenov University), Moscow, 199911, Russian Federation