A theorem is formulated for genes that lie on one RNA chain and overlap in pairs. The theorem solves the inverse problem: to compute all nucleotide sequences (n.s.'s) corresponding to two protein sequences whose genes overlap on some sections. It is proved that this problem has a unique solution if the protein sequences contain neither leucine nor arginine. When Leu and Arg are present, ambiguous points may occur at the corresponding positions. These positions are determined by specific properties of local overlaps for the codon sets of leucine (L-positions) and arginine (R-positions). Overlapping genes found in the genomes of several viruses are analyzed, for example BSMV, PAMV, φX174, G4, HIV-1, HIV-2, SIVMAC, STLV-IIIAGM, HBV, GSHV, WHV, ASHV. Some of these genomes have more than 50% overlap. From this analysis it was concluded that the choice of nucleotides in RNA at both L- and R-positions is not accidental. These positions may correspond to points of regulatory sequences, and they may also play a significant role in forming certain bonds in the secondary structure of mRNA. It is noted that the positions considered correspond to DNA positions where silent mutations are admissible. They may influence the regulation of protein binding to double-helix DNA, because either an AT pair or a CG pair is admissible at them. Analysis of the overlaps in the closely related genomes φX174 and G4 revealed a silent mutation, and in WHV and HBV, seven silent mutations. Moving to overlaps gives a tremendous reduction in the number of possible coding sequences: for protein B of φX174 the number of such n.s.'s decreased from 10^72 to 4.
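The inverse problem described above can be illustrated on a toy scale. The following is only a brute-force sketch, not the method of the theorem: it assumes the standard genetic code, and the function names `count_codings` and `overlapping_codings`, the `shift` parameter, and the short example proteins are all hypothetical choices made for illustration. It counts the nucleotide sequences that encode one protein on its own, then keeps only those that also encode a second protein in a shifted reading frame.

```python
from itertools import product
from math import prod

# Standard genetic code (assumed), codons ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Invert the table: amino acid -> list of its synonymous codons.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def count_codings(protein):
    """Number of RNA sequences encoding `protein` when no overlap is imposed."""
    return prod(len(AA_TO_CODONS[aa]) for aa in protein)


def overlapping_codings(prot_a, prot_b, shift):
    """RNA sequences that encode prot_a in frame 0 and prot_b from offset `shift`.

    Hypothetical helper: exhaustive search, practical only for short proteins.
    """
    if shift + 3 * len(prot_b) > 3 * len(prot_a):
        raise ValueError("prot_b must lie entirely inside the gene of prot_a")
    solutions = []
    for codons in product(*(AA_TO_CODONS[aa] for aa in prot_a)):
        rna = "".join(codons)
        # Translate the shifted frame and compare it with prot_b.
        shifted = "".join(
            CODON_TO_AA[rna[shift + 3 * n: shift + 3 * n + 3]]
            for n in range(len(prot_b))
        )
        if shifted == prot_b:
            solutions.append(rna)
    return sorted(solutions)


if __name__ == "__main__":
    print(count_codings("MAK"))                 # codings without overlap
    print(overlapping_codings("MAK", "WP", 1))  # codings with the overlap imposed
```

In this toy case neither protein contains Leu or Arg, and the nucleotides inside the overlapped section come out the same in every solution; the only remaining freedom is in the last codon, which lies partly outside the overlap. Leucine and arginine each have six codons in the standard code, which is the source of the extra ambiguity discussed in the abstract.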
Publication language: Russian
Research direction:
Mathematical modelling in current problems of science and technology