Therapeutic antibodies are among the most successful classes of biotherapeutics, but the traditional antibody development process is cumbersome and has significant limitations. Using artificial intelligence (AI) to speed up the gathering of and access to information can help scientists develop better drugs faster. However, the application of AI to large-molecule drugs lags far behind its application to small molecules. One reason is the complex structure-function relationship of large molecules; another is the scarcity of large-molecule data, which are difficult to obtain. Now, with breakthroughs in machine learning algorithms, exponential growth in computing power, and the rapid spread of sequencing and biosynthesis technologies, machine learning algorithms have begun to gain access to biological data on an unprecedented scale. Current applications of AI to antibody drug research fall into four main areas, as follows.
1. Discovery of New Antigens
Tumor neoantigens play an important role in cancer diagnosis, adoptive immunotherapy, tumor vaccines, and personalized therapy. The large volume of tumor mutation data obtained by high-throughput sequencing provides the data basis on which machine learning algorithms can discover antigens. A tumor patient can carry tens of thousands of mutated genes, so effectively predicting which neoantigens can elicit an immune response would greatly reduce research time and cost. Deep learning algorithms are integrated into the data processing pipeline to identify candidate neoantigen peptides. In addition, directly identifying neoantigen peptides presented on the cell surface by mass spectrometry can reveal atypical neoantigens that conventional next-generation sequencing (NGS) methods miss, while increasing accuracy.
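As a small illustration of where candidate peptides come from, the sketch below enumerates every peptide window that spans a single amino acid substitution. It is a minimal sketch, not a prediction method: the function name `mutant_peptides`, the example sequence, and the default window lengths of 8 to 11 residues are assumptions chosen for illustration. A real pipeline would pass the resulting peptides to a trained presentation or immunogenicity model.

```python
def mutant_peptides(protein, pos, alt, lengths=(8, 9, 10, 11)):
    """Return all peptides of the given lengths that span a point mutation.

    protein: wild-type amino acid sequence (one-letter codes)
    pos:     0-based index of the mutated residue
    alt:     amino acid that replaces protein[pos]
    lengths: window lengths to enumerate (illustrative default)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    peptides = set()
    for k in lengths:
        # Only keep window starts for which the window still covers pos
        # and does not run past the end of the sequence.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.add(mutated[start:start + k])
    return sorted(peptides)


# Hypothetical example: substitute residue 10 of a made-up sequence.
candidates = mutant_peptides("ACDEFGHIKLMNPQRSTVWY", pos=10, alt="A")
```

Each returned peptide contains the mutated residue, so the list is the raw candidate set that a downstream model would then score and filter.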
2. Discovery of Antibodies
A fundamental requirement for developing antibody drugs is that the antibody binds well to its target. To find such antibodies, researchers typically start with the known amino acid sequence of an antibody and use bacterial or yeast cells to generate a series of new antibodies with variations of that sequence, then evaluate how well these mutants bind the target antigen. The subset of antibodies that perform best undergoes another round of mutation and evaluation, and the cycle is repeated until a set of tightly binding finalists emerges. This process is long and expensive, and many of the resulting antibodies still prove ineffective in clinical trials.
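The mutate, evaluate, and select cycle described above can be sketched as a simple loop. This is a toy simulation under stated assumptions: `mutate`, `evolve`, and the `score` callback are hypothetical names, the library sizes are arbitrary, and the scoring function merely stands in for a wet-lab binding assay rather than modeling one.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen residue substituted."""
    i = rng.randrange(len(seq))
    choices = [a for a in AMINO_ACIDS if a != seq[i]]
    return seq[:i] + rng.choice(choices) + seq[i + 1:]


def evolve(start, score, rounds=5, library_size=200, keep=10, seed=0):
    """Repeat mutate -> evaluate -> select and return the final survivors.

    score is a callable that maps a sequence to a number; here it is a
    placeholder for an experimental binding measurement.
    """
    rng = random.Random(seed)
    parents = [start]
    for _ in range(rounds):
        # Keep the parents in the library so the best score never drops.
        library = set(parents)
        for _ in range(library_size):
            library.add(mutate(rng.choice(parents), rng))
        # Select the top scorers; ties are broken alphabetically.
        parents = sorted(library, key=lambda s: (-score(s), s))[:keep]
    return parents


# Toy stand-in for a binding assay: similarity to a made-up "ideal" sequence.
TARGET = "WYWYWYWY"


def toy_score(seq):
    return sum(a == b for a, b in zip(seq, TARGET))


finalists = evolve("AAAAAAAA", toy_score)
```

In the laboratory, every call to `score` corresponds to expressing and assaying a variant, which is why each round is slow and costly.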
With AI technology, researchers can generate an initial library of hundreds of thousands of possible antibody sequences and feed the dataset into a Bayesian neural network that screens them for affinity to specific protein targets, with much shorter cycle times and lower pre-screening costs. In addition, AI can predict antibody performance, including specificity and affinity as well as developability (immunogenicity, solubility, aggregation tendency, viscosity, and half-life).
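A Bayesian model yields a spread of predictions for each sequence rather than a single number, and that spread can be used during screening. The sketch below assumes the network has already produced several sampled affinity predictions per sequence and ranks candidates by a conservative lower bound (mean minus `k` standard deviations). The network itself is not shown, and the function name `rank_by_lower_bound`, the sequence labels, the scores, and the value of `k` are all illustrative assumptions.

```python
from statistics import mean, stdev


def rank_by_lower_bound(predictions, k=1.0):
    """Rank sequences by mean predicted affinity minus k standard deviations.

    predictions: dict mapping each sequence to a list of at least two
    sampled affinity predictions (higher means stronger predicted binding).
    Returns a list of (sequence, mean, stdev), best candidate first.
    """
    ranked = []
    for seq, samples in predictions.items():
        mu = mean(samples)
        sigma = stdev(samples)
        ranked.append((mu - k * sigma, mu, sigma, seq))
    ranked.sort(reverse=True)
    return [(seq, mu, sigma) for _, mu, sigma, seq in ranked]


# Made-up predictions: SEQ_A has some high samples but is very uncertain.
example = {
    "SEQ_A": [0.90, 0.10, 0.95, 0.20],
    "SEQ_B": [0.60, 0.62, 0.58, 0.60],
    "SEQ_C": [0.30, 0.31, 0.29, 0.30],
}
shortlist = rank_by_lower_bound(example)
```

Penalizing uncertainty favors candidates the model is confident about; a campaign that wants to explore uncertain regions of sequence space could instead add the standard deviation, which is a design choice rather than something the approach dictates.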
3. Optimization of Antibodies
After a lead antibody has been discovered, researchers use bioengineering to improve its properties so that it can be produced efficiently, does not trigger an immune response in vivo, and remains stable for a sufficient amount of time. Each round of optimization typically takes 3 to 6 months, which is time-consuming and labor-intensive.
AI can improve the efficiency and quality of library construction, reduce the number of iterative screens, and make the directed evolution of antibodies more efficient. Combining experiments with machine learning reduces the number of candidate sequences and optimizes the affinity, stability, and solubility of lead antibodies, and even their performance in various physiological activities. For example, language models borrowed from natural language processing can optimize antibody sequences to clinical-level affinity without any explicit model of the target antigen.
4. De Novo Antibody Generation
Current directed evolution and virtual screening of antibodies both start from existing natural antibodies. If proteins with new structures and new functions could be designed from scratch on the basis of biophysical and biochemical principles, drug development would gain many more possibilities. It has been shown that AI can be used to design a variety of highly complex protein conformations that do not exist in nature and are more stable than natural proteins. In addition, AI models can even learn how enzymes should be constructed simply from raw sequence data.
Although integrating AI tools into the antibody drug development process faces some unavoidable obstacles and a great deal of work remains to be done, AI can significantly reduce the time and cost of developing drug candidates and is poised to transform the drug discovery and development process.