Machine learning for the prediction of phenols cytotoxicity

(1) * Latifa Douali (Department of Computer Science, Regional Centre of Training and Education (CRMEF) Marrakech-Safi, Morocco)
*corresponding author


Quantitative structure-activity relationships (QSAR) are techniques that help biologists and chemists accelerate the drug design process and understand many biological and chemical mechanisms. Relying on classical statistical methods may limit the accuracy and reliability of the resulting QSAR models. This work uses a machine learning approach to establish a QSAR model for predicting the cytotoxicity of phenols, an issue that concerns many chemists and biologists. In this investigation, the dataset is diverse and the cytotoxicity data are sparse, so a multi-component description of the compounds was adopted. A set of molecular descriptors served as input to a deep neural network (DNN) and was used to train it. The resulting DNN model predicted the cytotoxicity of the phenols with high precision. The correlation coefficient at the fitting stage was 0.943, higher than that of the other statistical methods reported in the literature or developed in the present work, specifically multiple linear regression (MLR) and shallow artificial neural networks (ANN). The predictive capability of the model, estimated by the coefficient of determination on an external prediction dataset, was about 0.739. This finding suggests that many molecular descriptors relevant to describing the compounds, and representing the effects that govern the cytotoxicity of phenols toward Tetrahymena pyriformis, can be used together while avoiding overfitting and the exclusion of outliers.
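The two figures of merit quoted in the abstract, a correlation coefficient at the fitting stage and a coefficient of determination on an external set, can be illustrated with a minimal sketch. The abstract does not state the exact formulas used, so the standard Pearson r and 1 - SS_res/SS_tot are assumed here. The data are synthetic random numbers standing in for molecular descriptors, the model is an ordinary least-squares MLR baseline rather than the DNN of the paper, and all function and variable names are hypothetical.

```python
import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply the fitted linear model to a descriptor matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Synthetic stand-in data (NOT the phenol dataset): 120 "compounds",
# 5 "descriptors", and a noisy linear "activity".
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
true_w = np.array([0.8, -0.5, 0.3, 0.0, 0.1])
y = X @ true_w + 0.2 * rng.normal(size=120)

# Hold out the last 30 rows as an external prediction set.
X_train, y_train = X[:90], y[:90]
X_ext, y_ext = X[90:], y[90:]

coef = fit_mlr(X_train, y_train)
r_fit = pearson_r(y_train, predict_mlr(coef, X_train))
r2_ext = r_squared(y_ext, predict_mlr(coef, X_ext))
print(f"fitting r = {r_fit:.3f}, external R^2 = {r2_ext:.3f}")
```

Scoring the held-out rows separately from the fitting rows is what makes a gap between the two numbers, such as 0.943 versus 0.739 in the abstract, readable as an indication of how well a model generalizes beyond its training compounds.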


Cytotoxicity; Deep neural networks; Phenols; QSAR; Tetrahymena pyriformis; Risk assessment







Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan