SCIENTIFIC EDUCATIONAL CENTER science idea

The iMolecule research group at Skoltech has developed a tool that uses structural data on RNA and DNA to predict which sites on these molecules are suitable for interaction with candidate drug compounds. Knowing these binding sites makes the search for new drugs, including antivirals, more efficient and better targeted. The tool relies on artificial intelligence and identifies binding sites more accurately than comparable methods because it accounts for how the molecule's spatial configuration affects site accessibility.
For a long time, pharmacologists saw RNA only as an intermediary between the instructions in our genome (DNA) and the functional proteins they encode, and the proteins themselves remained the target of most drugs. Yet RNA is transcribed from 85% of the genome, and only a small fraction of that portion encodes proteins. The remaining, non-coding RNA takes part in gene regulation or performs other functions, often by adopting a particular conformation, that is, a spatial configuration. Since processes involving non-coding RNA can also contribute to disease, RNA sequences, and DNA sequences too, are increasingly being considered as potential drug targets.
"Nucleic acids, DNA and RNA, are involved, for example, in signal transmission and other processes that can be medically affected. This approach, in particular, may be suitable for diseases involving disordered proteins or proteins without available binding sites," explained the head of the study, senior lecturer at Skoltech Peter Popov. —In addition, there are foreign RNA and DNA, for example viral ones - coronavirus, HIV, etc., which are one of the main targets in the fight against pathogens."
To unlock the potential of all these putative drug targets, pharmacologists need tools for screening huge databases of chemical compounds to determine which of them interact with a particular nucleic acid and through which sites.
"Our solution is based on similar work with proteins," Popov explained. - Three-dimensional structures of nucleic acids are encoded in the form of high-dimensional tensors. Then the computer vision algorithm "looks" at tensors and looks for areas similar to binding sites. After detecting the conformation and the binding site, you can start purposeful work on finding drugs. Thus, our work is part of the transition from blind search to rational drug design. The superiority of the latter becomes more noticeable with the growth of connection libraries."
An important advantage of the new solution stems from the fact that DNA and RNA molecules tend to fold and adopt different conformations, and their properties, including the available binding sites, change as they do. Traditional approaches rely on the nucleic acid sequence, that is, the "letter code", and ignore conformation, which is a major drawback.
"In addition, most of the previous methods were applicable only to RNA, and specifically to a single chain. And ours works with DNA and two or more chains, and we can even detect sites that occur "at the junction" when several macromolecules interact," added Igor Kozlovsky, a graduate student at Skoltech, the first author of the work.
"A good example of why conformation should not be ignored is associated with the most common type of HIV," the scientist continued. — He has an RNA site that many drugs target. But although the sequence of nucleic acids is the same, when the conformation of the molecule changes, the set of drugs that can affect it changes. The predictions of our neural network reproduce this effect."
The new solution also has an unexpected application in reverse: instead of identifying binding sites on a potential target, one can start from a problematic active substance. This can be a small molecule, such as a hormone, whose action causes the disease.
"These small molecules can be 'diverted'. To do this, you build a short nucleic acid sequence, called an aptamer, which serves as a target for the problematic hormone or other substance. Obviously, the aptamer must have a binding site, and our solution can be used to design aptamers with stronger binding," Popov concluded.
The article was published in the journal NAR Genomics and Bioinformatics.
PHOTO: The gray figures show six spatial configurations of the same HIV RNA sequence, which is the target of many antiviral agents, such as the compounds shown here as colored beads. The neural network presented by the Skoltech scientists predicts binding sites at the locations marked by large pink spheres. These match the locations of reliably established sites, which are shaded in blue, orange, and other colors. © Igor Kozlovsky, Peter Popov/NAR Genomics and Bioinformatics
Source: skoltech.ru, sci-dig.ru

Certificate of registration of mass media ЭЛ № ФС 77 - 78868 issued by Roskomnadzor on 07.08.2020