Thanks to AI, the world of protein engineering just took a giant leap forward

New AI tool AiCE speeds up protein engineering with better accuracy and lower costs. (CREDIT: IGDB)

The world of protein engineering just took a giant leap forward. A team in China has developed a method that makes designing better proteins faster, cheaper, and easier. Led by Professor Gao Caixia of the Institute of Genetics and Developmental Biology of the Chinese Academy of Sciences, the team created a new approach called AI-informed Constraints for protein Engineering, or AiCE. The work was published in the journal Cell.

Why Protein Engineering Matters

Proteins are the workhorses of life. They build tissues, send signals, fight diseases, and carry out countless processes in your body. Changing the amino acid sequence of a protein can change its function. This is what protein engineering does. It alters proteins to work better or do new jobs, helping in fields like medicine, agriculture, and environmental science.

Traditional protein engineering uses two main approaches. First, there is rational design, where scientists change proteins based on their knowledge of structure and chemistry. Second, there is directed evolution, where proteins undergo many rounds of mutation and selection to find improved variants. Both approaches have limitations.

A graphical abstract of the study. (CREDIT: Cell)

Rational design depends on human expertise and often gets stuck in local fitness peaks, unable to find the global best version. Directed evolution, on the other hand, costs a lot of time and money. It involves making thousands or millions of variants, then testing each one. Even then, finding improved proteins is rare because the protein fitness landscape is rugged and full of peaks and valleys.

Enter AI for Faster Solutions

Recent advances have used AI to help in protein engineering. AI models explore vast sequence spaces and suggest mutations that might improve function. But these methods usually need powerful computers to train large models, and they do not always work well on proteins outside their training set.

The new AiCE approach addresses these problems by using existing inverse folding AI models in a smarter way. Inverse folding models predict amino acid sequences that fit a given protein structure. Where structure prediction goes from sequence to structure, inverse folding goes the other way.

General inverse folding models like ESM-IF1 and ProteinMPNN are trained on large collections of protein structures and sequences. They capture which amino acids tend to fit which structural environments in a stable protein. Because these models are already trained, researchers do not need to build new ones from scratch.

AiCE builds on this by adding structural and evolutionary constraints to the model outputs. This means it not only picks sequences that fit the shape but also checks which mutations are likely to work well in nature. By doing this, AiCE can design better proteins without needing heavy computational resources.

How AiCE Works

AiCE has two main modules. The first is AiCEsingle, which predicts high-fitness single amino acid substitutions. It samples many sequences from the inverse folding model and picks the most promising changes. Benchmark tests showed AiCEsingle outperformed other AI-based methods by 36–90%.

AiCE as an AI-informed approach for protein engineering. (A) Conceptual overview of structure-based, evolution-based, and AI-assisted protein engineering approaches. (B) Schematic of the AiCE approach for the rational design of single and multiple mutations using protein inverse folding models. (CREDIT: Cell)

For example, researchers tested AiCEsingle on 60 deep mutational scanning datasets. These datasets include data from many protein families and functions, like tumor suppression (p53), immune defense (Cas9), and stress response (HSP90). AiCEsingle accurately predicted which mutations would improve protein fitness, even for complex proteins and protein–nucleic acid complexes.

Adding structural constraints alone improved prediction accuracy by 37%. Mutations in flexible regions of a protein tended to be more beneficial than those in rigid regions. Flexible areas can handle changes better, leading to higher chances of success.
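To make the idea of a structural constraint concrete, here is a minimal Python sketch. It is not the authors' implementation: the function name, the mutation strings, the scores, and the set of "flexible" positions are all made up for illustration, and the article does not say how AiCE decides which regions count as flexible.

```python
def filter_by_flexibility(candidates, flexible_positions):
    """Keep candidate substitutions whose position is annotated as flexible.

    candidates: list of (mutation, score) tuples such as ("T3S", 0.75),
        where the mutation string is wild-type residue, 1-based position,
        new residue.
    flexible_positions: set of 1-based positions (an assumed annotation;
        how flexibility is measured is outside this sketch).
    """
    kept = []
    for mutation, score in candidates:
        position = int(mutation[1:-1])  # "T3S" -> 3
        if position in flexible_positions:
            kept.append((mutation, score))
    return kept


# Toy data, invented for this example only.
candidates = [("T3S", 0.75), ("Y5F", 0.50), ("K2R", 0.25)]
flexible = {2, 3}
print(filter_by_flexibility(candidates, flexible))
# -> [('T3S', 0.75), ('K2R', 0.25)]
```

The filter simply drops candidates that fall in rigid positions, which mirrors the observation that changes in flexible regions succeed more often.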

The second module is AiCEmulti. This predicts combinations of multiple mutations. Combining mutations is tricky because changes can interfere with each other, a problem known as negative epistasis. AiCEmulti integrates evolutionary coupling constraints. This means it looks at how amino acids interact and evolve together, picking combinations that are more likely to work.
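As a rough illustration of how a coupling constraint could guide the choice of combinations, the Python sketch below ranks pairs of single mutations by a coupling score between their positions. Everything here is an assumption made for the example: the function names, the toy coupling values, and the rule that a higher score means a more promising pair. The article does not describe how AiCEmulti actually scores combinations.

```python
from itertools import combinations


def position_of(mutation):
    """Extract the 1-based position from a string such as 'T3S'."""
    return int(mutation[1:-1])


def rank_pairs(singles, coupling, min_coupling=0.0):
    """Rank pairs of single mutations by the coupling of their positions.

    singles: list of mutation strings such as "T3S".
    coupling: dict mapping frozenset({pos_a, pos_b}) -> score
        (a hypothetical input; missing pairs are treated as 0.0).
    """
    ranked = []
    for a, b in combinations(singles, 2):
        pa, pb = position_of(a), position_of(b)
        if pa == pb:
            continue  # two substitutions cannot share one position
        score = coupling.get(frozenset((pa, pb)), 0.0)
        if score >= min_coupling:
            ranked.append(((a, b), score))
    # sorted() is stable, so ties keep their original order.
    return sorted(ranked, key=lambda item: -item[1])


# Toy data, invented for this example only.
singles = ["T3S", "Y5F", "K2R", "T3A"]
coupling = {frozenset((3, 5)): 0.8, frozenset((2, 3)): 0.3}
for pair, score in rank_pairs(singles, coupling, min_coupling=0.25):
    print(pair, score)
```

With these toy numbers, the pairs that span positions 3 and 5 rank first, the pairs that span positions 2 and 3 follow, and the uncoupled pair at positions 2 and 5 is dropped.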

Using AiCEmulti, researchers can predict multiple high-fitness mutations quickly and at low cost. This expands its usefulness for engineering proteins with complex functions.

Performance and analysis of AiCEsingle in predicting high-fitness mutations across various proteins. (CREDIT: Cell)

Real-World Results and Applications

The team used AiCE to engineer eight proteins with different structures and uses. These included deaminases, nucleases, nuclear localization sequences, and reverse transcriptases. The success rates ranged from 11% to 88%, which is far higher than random chance.

One exciting result was the development of next-generation base editors. Base editors are gene-editing tools that change specific DNA letters without making double-strand cuts in the DNA, which makes them safer and more precise. Using AiCE, researchers created enABE8e, an adenine base editor with an editing window 50% narrower than before. This means it can target genes with greater accuracy.

They also developed enSdd6-CBE, a cytosine base editor with 1.3 times higher fidelity, and enDdd1-DdCBE, a mitochondrial base editor with 13 times more activity. These advances could help in treating genetic diseases, improving crops, and many other applications.

Figure S1. AiCEsingle under structural constraints can efficiently predict high-fitness mutations in various proteins, related to Figure 2. (CREDIT: Cell)

What Makes AiCE Different

AiCE is simple to use, efficient, and general. It does not need specialized AI models for each task. By adding structural and evolutionary information to existing models, it improves prediction without heavy costs.

In one test, researchers used inverse folding models to sample many possible amino acid sequences for a protein. They then calculated the appearance rate of each amino acid at each position. Mutations with high appearance rates in flexible regions were more likely to improve protein fitness.
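The appearance-rate calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the published code: the sampled sequences are typed in by hand rather than drawn from an inverse folding model, and the function names and the `min_rate` cutoff are assumptions made for the example.

```python
from collections import Counter


def appearance_rates(sampled_seqs):
    """For each position, map amino acid -> fraction of samples showing it."""
    length = len(sampled_seqs[0])
    if any(len(s) != length for s in sampled_seqs):
        raise ValueError("all sampled sequences must have the same length")
    n = len(sampled_seqs)
    rates = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sampled_seqs)
        rates.append({aa: c / n for aa, c in counts.items()})
    return rates


def rank_substitutions(wild_type, sampled_seqs, min_rate=0.5):
    """List (mutation, rate) for non-wild-type residues at or above min_rate."""
    rates = appearance_rates(sampled_seqs)
    hits = []
    for pos, (wt_aa, pos_rates) in enumerate(zip(wild_type, rates)):
        for aa, rate in pos_rates.items():
            if aa != wt_aa and rate >= min_rate:
                hits.append((f"{wt_aa}{pos + 1}{aa}", rate))
    # Highest rate first; ties broken alphabetically by mutation name.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Toy input: a made-up 5-residue wild type and four made-up samples.
wild_type = "MKTAY"
samples = ["MKSAY", "MKSAF", "MRSAY", "MKTAF"]
print(rank_substitutions(wild_type, samples))
# -> [('T3S', 0.75), ('Y5F', 0.5)]
```

In this toy case serine appears at position 3 in three of four samples, so T3S ranks first. A real run would use many more samples and would combine these rates with the flexibility information discussed in the article.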

They analyzed 31 large-scale deep mutational scanning datasets. These covered proteins from viruses, bacteria, and eukaryotes. The results showed that mutations in flexible regions had significantly higher fitness than those in rigid areas (p < 0.0001). Logistic regression showed predictions in flexible regions were 18% more likely to be high fitness.

Further testing on 29 other datasets confirmed these trends. Flexible regions remained the best spots for beneficial mutations.

Multi-method-based evolution of the double-stranded DNA deaminase Ddd1 for environmental fitness. (CREDIT: Cell)

Future Potential and Impact

AiCE shows that by combining AI with structural and evolutionary knowledge, scientists can design better proteins faster and cheaper. This opens doors in many fields.

In precision medicine, AiCE can help design enzymes that fix genetic mutations causing diseases. In agriculture, it can create proteins that make crops resistant to pests or harsh environments. In environmental science, it can build enzymes that break down plastics or pollutants.

Professor Gao and the team believe AiCE will make protein engineering accessible to more researchers. They said, “AiCE offers a versatile, user-friendly mutation-design method that outperforms conventional approaches in efficiency, scalability, and generalizability.”

Their work is not just about building better proteins. It is about building a better way to build proteins. By unlocking the power of existing AI tools and adding smart constraints, they created a method that saves time, money, and effort while delivering strong results.

Closing Thoughts

Protein engineering is like solving a giant puzzle where each piece affects all others. In the past, finding the right combination took years of work. Now, thanks to AiCE, that puzzle may become much easier to solve. The next generation of drugs, crops, and materials could arrive faster than anyone imagined.

This breakthrough from the Chinese Academy of Sciences shows what happens when AI and biology come together. It is not only a win for science but also for the world that depends on it.

Note: The article above was provided by The Brighter Side of News.
