BEIJING — Tsinghua University recently played host to the finals of the inaugural Global AI Drug R&D Algorithm Competition, a prestigious event that drew attention from academic and industry circles alike. After a fierce showdown among 878 teams hailing from universities, research institutions, and corporations across the globe, the joint team of IceKredit and Nanjing University emerged as a standout force, clinching the coveted third prize.
The competition was co-sponsored by Baidu PaddlePaddle, the Tsinghua University School of Pharmacy, and Lingang Laboratory, with support from the Chinese Pharmaceutical Association and other organizations. A panel of experts and professors in the biopharmaceutical field served as the jury.
The competition drew 1,105 participants in 878 teams worldwide, who made a total of 6,080 algorithm submissions. After a three-month preliminary and semi-final phase, the IceKredit and Nanjing University team advanced to the finals, where it competed against 14 teams from institutions such as Microsoft Research Asia, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai Jiaotong University, Zhejiang University, and Xi’an Jiaotong University. The finals consisted of on-site defenses and in-depth discussions covering strategies for the competition tasks, core theory, data analysis and processing, and algorithmic solutions. The IceKredit-Nanjing University team finished in third place.
The competition focused on a pressing problem: small-molecule drug development against the novel coronavirus. Participants were asked to use artificial intelligence to find potential treatments, applying deep learning and molecular docking to predict and evaluate interactions between small molecules and key proteases. They also assessed whether these molecules could inhibit virus replication within cells, with the goal of identifying promising drug candidates. The initiative aims to spur innovation in drug research and development and to lay a foundation for future disease prevention and treatment.
The collaboration between IceKredit and the Christopher J. Butch research group at Nanjing University began with the preliminary round. There, the team tested a range of conventional machine learning methods, including Bayesian docking, SVM, LightGBM, and GBDT, along with deep neural network models including Transformer-CNN, GCN, and D-MPNN. To predict enzyme activity, the team tried several molecular representations, including 3D molecular conformation data, graph features, and Morgan fingerprints. The SVM model trained on Morgan fingerprints stood out, predicting enzyme activity effectively.
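To illustrate how a fingerprint-based SVM can compare molecules, here is a minimal sketch of the Tanimoto similarity, a measure commonly used as a kernel for SVMs on binary fingerprints. The article does not say which kernel the team used, so this choice is an assumption. The fingerprints below are hypothetical sets of on-bit indices; in practice they would come from a cheminformatics toolkit such as RDKit, and the resulting matrix could be passed to an SVM that accepts a precomputed kernel.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def gram_matrix(fingerprints: Sequence[FrozenSet[int]]) -> List[List[float]]:
    """Pairwise Tanimoto similarities, usable as a precomputed SVM kernel."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]


# Hypothetical fingerprints (illustrative bit indices, not real molecules).
fp_a = frozenset({1, 5, 9, 42})
fp_b = frozenset({1, 5, 17})
fp_c = frozenset({200, 300})

kernel = gram_matrix([fp_a, fp_b, fp_c])
for row in kernel:
    print([round(value, 2) for value in row])
```

Molecules that share many substructure bits score close to 1, and molecules with no bits in common score 0, which gives the SVM a notion of structural similarity to work with.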
In the semi-final round, participants were tasked with predicting molecular activity in Caco-2 cell experiments. The IceKredit-Nanjing University team developed a feature-fusion method that extends the GEM baseline model. It introduced a new MFP encoder that combines graph features and global attention over molecular structure with the local structural information within molecules captured by Morgan fingerprints. The resulting, more complete molecular representation improved the model’s predictive and classification performance, reduced the risk of overfitting, and improved generalization. The team also added dropout and split the training and validation sets by molecular scaffold, so that the model was evaluated on molecules structurally different from those it was trained on.
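The scaffold-based split can be sketched as follows. This is a minimal illustration and not the team's actual code: the function name, the 20% validation fraction, and the scaffold labels are all assumptions. It assumes each molecule's scaffold has already been computed as a string, for example with a toolkit such as RDKit. Molecules are grouped by scaffold, and whole groups are assigned to one side, so no scaffold appears in both sets.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def scaffold_split(
    scaffolds: Sequence[str], valid_fraction: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Split molecule indices so that each scaffold lands in exactly one set."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Visit the largest scaffold groups first; rarer scaffolds tend to
    # end up in validation once the training budget is nearly full.
    ordered = sorted(groups.values(), key=lambda group: (-len(group), group[0]))

    train_budget = (1.0 - valid_fraction) * len(scaffolds)
    train: List[int] = []
    valid: List[int] = []
    for group in ordered:
        if len(train) + len(group) <= train_budget:
            train.extend(group)
        else:
            valid.extend(group)
    return train, valid


# Hypothetical scaffold labels for ten molecules (illustrative only).
scaffolds = [
    "benzene", "benzene", "benzene", "pyridine", "pyridine",
    "indole", "benzene", "furan", "pyridine", "indole",
]
train_idx, valid_idx = scaffold_split(scaffolds, valid_fraction=0.2)
print("train:", sorted(train_idx))
print("valid:", sorted(valid_idx))
```

Because entire scaffold groups are held out, validation scores reflect performance on chemical series the model has not seen, which is usually a harder and more realistic test than a random split.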
Since March 2022, IceKredit and Nanjing University have been collaborating on AI applications in the medical field. Their work focuses on computer-aided drug molecule design, combining artificial intelligence, molecular dynamics simulation, and computational biology with traditional chemistry and biology laboratory work to accelerate the discovery of potential drug molecules.
In just one year, the partnership has produced notable results. In addition to the competition prize, the team recently published an SCI-indexed paper titled “Improving Drug Discovery with a Hybrid Deep Generative Model Using Reinforcement Learning Trained on a Bayesian Docking Approximation.” The method combines deep generative models with reinforcement learning. Using approximate docking scores predicted by a Bayesian regression model, it generates new compounds whose docking scores are 10-20% better than those of similarly sized molecules, while running 130 times faster than conventional docking methods. The approach could make it more efficient to uncover novel chemical structures with potential therapeutic applications.