STUDY. The development of mRNA vaccines can be streamlined by involving artificial intelligence

--

Scientists at Princeton University have developed a machine learning model, UTR-LM (5′ untranslated region language model), which could help in the development of highly effective mRNA-based vaccines.

Unlike other AI models that work with messenger RNA or decode other kinds of biological sequences, UTR-LM is the first digital tool to focus on the 5′ untranslated region (5′ UTR) of mRNA.

According to the results of the study, published in the journal Nature Machine Intelligence, UTR-LM outperformed the best benchmarks used in the development of mRNA vaccines, with:

  • a 5% improvement in predicting ribosome load;
  • an 8% improvement in predicting translation efficiency;
  • a 32.5% increase in the efficiency of protein production.

Furthermore, the digital tool was used to design a library of 211 novel 5′ untranslated regions. The sequences were optimized for specific functions (mainly to make translation more efficient) and then evaluated in the laboratory. The results showed an approximately 33% improvement in protein production efficiency compared with the best optimized 5′ UTR sequences currently used in mRNA vaccine development. Increasing the efficiency of protein synthesis, even by a small amount, represents a major advance toward new and improved therapeutic products.

UTR-LM was trained on endogenous 5′ UTRs from various species. Training also incorporated information about the protein synthesis process, as well as the secondary structure and minimum free energy of the sequences. The most difficult part of training the model was creating the database, which was pieced together from smaller data sets from multiple studies.

“Part of the database used to train the model comes from a study that contains measures of efficacy, while another part of the database comes from another study that measured expression level. We also collected unlabeled data from multiple sources. Training such a model consists not only in joining all these sequences, but also in joining the sequences with labels that have been collected so far. This has never been done before,” says study coordinator Mengdi Wang, an expert in machine learning.
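
As a rough illustration of the data-assembly problem Wang describes, the sketch below merges three small, made-up collections of sequences: one labeled with translation efficiency, one labeled with expression level, and one unlabeled. The field names (`te`, `expr`), the helper `merge_datasets`, and the sequences themselves are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UtrRecord:
    """One 5' UTR sequence with whichever labels its source study measured."""
    sequence: str
    te: Optional[float] = None    # translation efficiency (hypothetical field)
    expr: Optional[float] = None  # expression level (hypothetical field)


def merge_datasets(*datasets):
    """Combine records from several sources, keyed by sequence.

    A label missing in one source is filled in from another; sequences
    seen only in unlabeled sources keep None for both labels.
    """
    merged = {}
    for dataset in datasets:
        for rec in dataset:
            # Normalise case and DNA/RNA spelling so duplicates line up.
            seq = rec.sequence.upper().replace("T", "U")
            current = merged.setdefault(seq, UtrRecord(seq))
            if current.te is None:
                current.te = rec.te
            if current.expr is None:
                current.expr = rec.expr
    return list(merged.values())


# Made-up toy data, for illustration only.
efficiency_study = [UtrRecord("GGGAAAUAAGAG", te=1.8)]
expression_study = [UtrRecord("gggaaataagag", expr=0.42),
                    UtrRecord("ACUCUUCUGGUC", expr=0.10)]
unlabeled = [UtrRecord("CCCGGGAAAUUU")]

records = merge_datasets(efficiency_study, expression_study, unlabeled)
```

Here the first sequence ends up with both labels because it appears in two sources, while the unlabeled sequence is kept with no labels so it can still be used for self-supervised training.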

UTR-LM works on broadly the same principle as the large language models (LLMs) behind chatbots like ChatGPT. The main difference is the data they are trained on: while chatbots are trained on billions of pieces of text from the Internet, UTR-LM is trained on hundreds of thousands of biological sequences.
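
To make the comparison with text models more concrete, the sketch below shows how a nucleotide sequence could be turned into tokens and partially masked, so that a model can be trained to recover the hidden bases. This is a generic masked-language-modelling recipe, not the authors' code; the vocabulary, the 15% masking rate, the example sequence, and the function names are all assumptions.

```python
import random

VOCAB = {"<mask>": 0, "A": 1, "C": 2, "G": 3, "U": 4}  # assumed toy vocabulary


def tokenize(sequence):
    """Map each nucleotide to an integer id, one token per base."""
    return [VOCAB[base] for base in sequence.upper().replace("T", "U")]


def mask_tokens(token_ids, mask_rate=0.15, seed=0):
    """Hide a random subset of positions.

    Returns (inputs, targets): inputs has <mask> at the hidden positions,
    targets holds the original id there and -1 everywhere else.
    """
    rng = random.Random(seed)
    n_masked = min(len(token_ids), max(1, round(len(token_ids) * mask_rate)))
    hidden = set(rng.sample(range(len(token_ids)), n_masked))
    inputs = [VOCAB["<mask>"] if i in hidden else t
              for i, t in enumerate(token_ids)]
    targets = [t if i in hidden else -1 for i, t in enumerate(token_ids)]
    return inputs, targets


tokens = tokenize("GGGAAAUAAGAGAGAAAAGA")  # made-up 20-base sequence
inputs, targets = mask_tokens(tokens)
```

A model trained this way only ever sees `inputs` and is scored on how well it predicts the non-negative entries of `targets`, which is the sense in which a sequence model "reads" RNA the way a text model reads sentences.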


The central dogma of biology describes the direction in which genetic information flows: DNA → RNA → protein. Through a process called transcription, messenger RNA copies the genetic information from DNA and carries it to the ribosomes, the cellular structures responsible for protein synthesis, where it is read in a process called translation. Only part of the mRNA contains the genetic code for the future protein, and only that part is translated. The 5′ untranslated region lies at the beginning of the mRNA molecule, ahead of the translated part, and plays a crucial role in regulating translation, which in turn affects how much protein is produced.
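
The layout described above can be sketched in a few lines of code. The example below transcribes a made-up DNA coding strand into mRNA and splits off the 5′ UTR as everything upstream of the start codon. It rests on a deliberate simplification: the first AUG is treated as the translation start, which is not always the case in real transcripts. The function names and the sequence are hypothetical.

```python
def transcribe(coding_strand_dna):
    """Return the mRNA matching a DNA coding strand (T becomes U)."""
    return coding_strand_dna.upper().replace("T", "U")


def split_five_prime_utr(mrna, start_codon="AUG"):
    """Split an mRNA into (5' UTR, remainder beginning at the start codon).

    Simplification: the first AUG is assumed to be the translation start.
    """
    start = mrna.find(start_codon)
    if start == -1:
        raise ValueError("no start codon found")
    return mrna[:start], mrna[start:]


# Made-up sequence, for illustration only.
mrna = transcribe("GGGACTCTTCTGGTCCCCACAATGGCTTAA")
utr, rest = split_five_prime_utr(mrna)
```

Only `rest` would be read by the ribosome in this toy picture; `utr` is never translated, yet it is the part of the molecule that UTR-LM is designed to model.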

Vaccines based on messenger RNA came to the public’s attention with the COVID-19 pandemic, but they have more than 20 years of research behind them. Studies are currently looking into the use of these vaccines in cancer treatment.
