Predicting active compounds for lung cancer based on quantitative structure-activity relationships

Hamza Hanafi, Badr Dine Rossi Hassani, M’hamed Aït Kbir


Recent advances in computational and artificial intelligence (AI) methods have improved research results in the field of drug discovery. Machine learning techniques have proven especially effective in this regard, aiding the development of new drug variants and enabling more precise targeting of specific disease mechanisms. In this paper, we propose a quantitative structure-activity relationship (QSAR)-based approach for predicting active compounds related to non-small cell lung cancer. Our approach uses a neural network classifier that learns from the sequential structures and chemical properties of molecules, together with a gradient boosting tree classifier for comparative analysis. To evaluate the contribution of each feature, we use Shapley additive explanations (SHAP) summary plots to perform feature selection. The dataset consists of active and non-active molecules collected from the ChEMBL database. Our results show that the proposed approach is effective at accurately predicting active compounds for lung cancer. Furthermore, our comparative analysis reveals important chemical structures that contribute to the effectiveness of the compounds. Thus, the proposed approach can greatly enhance the drug discovery pipeline and may lead to the development of new and effective treatments for lung cancer.
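As an illustration of the idea behind Shapley additive explanations, the following minimal sketch computes exact Shapley values for a single molecule by enumerating every feature coalition. It is not the authors' model or the SHAP library: the descriptor names (`logp`, `mol_weight`, `h_donors`), the scoring function `toy_activity_score`, and the all-zero baseline are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial

# Hypothetical molecular descriptors, used only for illustration.
FEATURES = ["logp", "mol_weight", "h_donors"]


def toy_activity_score(x):
    """Toy stand-in for a trained classifier's output (an assumption, not the paper's model)."""
    return (
        2.0 * x["logp"]
        + 0.01 * x["mol_weight"]
        - 0.5 * x["h_donors"]
        + 0.1 * x["logp"] * x["h_donors"]  # interaction term
    )


def coalition_value(model, x, baseline, subset):
    """Score a molecule whose features in `subset` come from x and the rest from baseline."""
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return model(mixed)


def shapley_values(model, x, baseline):
    """Exact Shapley values: weighted average of each feature's marginal contribution."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(model, x, baseline, set(subset) | {f})
                without_f = coalition_value(model, x, baseline, set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Hypothetical reference point and molecule.
baseline = {"logp": 0.0, "mol_weight": 0.0, "h_donors": 0.0}
molecule = {"logp": 3.0, "mol_weight": 400.0, "h_donors": 2.0}

phi = shapley_values(toy_activity_score, molecule, baseline)

# Rank features by absolute contribution, the ordering a summary plot would convey.
ranking = sorted(FEATURES, key=lambda f: abs(phi[f]), reverse=True)
```

The contributions sum to the difference between the molecule's score and the baseline score, and the interaction term is split equally between the two features involved. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.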


bioinformatics; diagnosis of cancer; drug discovery; drug target; lung cancer


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).