Applied Machine Learning in Drug Discovery

The Challenge:
Discovering new bioactive small molecules is a lengthy and complex endeavor, especially in the context of antibiotic research. Leveraging the wealth of publicly available chemical and biochemical databases via data-driven modeling holds the promise of accelerating the drug discovery pipeline.
Our Goals:
We aim to use Artificial Intelligence (AI) algorithms to develop new in silico models that identify novel small-molecule antibiotics. Specifically, we seek to create an end-to-end pipeline that combines Generative Deep Learning (DL) and Quantitative Structure-Activity Relationship (QSAR) models to suggest new chemical moieties and optimize them for antibiotic activity and pharmacological properties.
Our Research:
Our projects fall into two distinct areas. On the one hand, we conduct theoretical cheminformatics research to develop new algorithms for molecular property prediction; on the other, we apply these methods to public and in-house datasets to discover new antibiotics. Current projects address the following topics:
- Data valuation algorithms to explain the predictions of QSAR models and identify false positives in the training data.
- Multi-modal self-supervised learning to develop performant QSAR models for endpoints where little or no experimental data is available for training.
- Using alternative sources of information for QSAR modeling to overcome the limits of chemical representations.
Selected Publications
1. Schuh, M. G., Boldini, D., Sieber, S. A. "Synergizing Chemical Structures and Bioassay Descriptions for Enhanced Molecular Property Prediction in Drug Discovery." J. Chem. Inf. Model. (2024).
2. Boldini, D., Grisoni, F., Kuhn, D., Friedrich, L., Sieber, S. A. "Practical guidelines for the use of gradient boosting for molecular property prediction." J. Cheminform. 15, 73 (2023). PMCID: PMC10464382.
3. Schuh, M. G., Boldini, D., Sieber, S. A. "TwinBooster: Synergising Large Language Models with Barlow Twins and Gradient Boosting for Enhanced Molecular Property Prediction." arXiv (2024). arXiv:2401.04478.
4. Boldini, D., Ballabio, D., Consonni, V., Todeschini, R., Grisoni, F., Sieber, S. A. "Effectiveness of molecular fingerprints for exploring the chemical space of natural products." J. Cheminform. 16, 35 (2024). PMCID: PMC10964529.