Please use this identifier to cite or link to this item:
https://www.arca.fiocruz.br/handle/icict/10104
Type
Article
Copyright
Open access
Sustainable Development Goals
07 Affordable and clean energy
Collections
- CDTS - Artigos de Periódicos [475]
- IOC - Artigos de Periódicos [12823]
THE PURINE BIAS OF CODING SEQUENCES IS DETERMINED BY PHYSICOCHEMICAL CONSTRAINTS ON PROTEINS
Affiliation
Universidad de la República. Facultad de Ciencias. Sección Biomatemática. Montevideo, Uruguay.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Genômica Funcional e Bioinformática. Rio de Janeiro, RJ, Brasil.
Universidad de la República. Facultad de Ciencias. Sección Biomatemática. Montevideo, Uruguay.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Genômica Funcional e Bioinformática. Rio de Janeiro, RJ, Brasil.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Genômica Funcional e Bioinformática. Rio de Janeiro, RJ, Brasil.
Universidad de la República. Facultad de Ciencias. Sección Biomatemática. Montevideo, Uruguay.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Genômica Funcional e Bioinformática. Rio de Janeiro, RJ, Brasil.
Abstract
In this report, we analyzed protein secondary structures in relation to the nucleotide statistics of the three codon positions. The purpose of this investigation was to find which properties at the ribosome, tRNA, or protein level could explain the purine bias (Rrr) observed in coding DNA. We found that the Rrr pattern is the consequence of a regularity (the codon structure) resulting from physicochemical constraints on proteins and thermodynamic constraints on the ribosomal machinery. The physicochemical constraints on proteins mainly come from the hydropathy and molecular weight (MW) of secondary structures as well as the energy cost of amino acid synthesis. These constraints appear through a network of statistical correlations, such as (i) the cost of amino acid synthesis, which favors a higher level of guanine in the first codon position, (ii) the constructive contribution of hydropathy alternation in proteins, (iii) the spatial organization of secondary structure in proteins according to solvent accessibility, (iv) the spatial organization of secondary structure according to amino acid hydropathy, (v) the statistical correlation of MW with protein secondary structures and their overall hydropathy, (vi) the statistical correlation of thymine in the second codon position with hydropathy and the energy cost of amino acid synthesis, and (vii) the statistical correlation of adenine in the second codon position with amino acid complexity and the MW of protein secondary structures. Amino acid physicochemical properties and functional constraints on proteins constitute a code that is translated into a purine bias within the coding DNA via tRNAs. In that sense, the Rrr pattern within coding DNA is the effect of information transfer on nucleotide composition from protein to DNA by selection according to the codon positions. Thus, coding DNA structure and ribosomal machinery co-evolved to minimize the energy cost of protein coding given the functional constraints on proteins.
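As an illustration of the kind of per-position statistic the abstract refers to, the following minimal sketch computes the fraction of purines (A or G) at each of the three codon positions of an in-frame coding sequence. It is not the authors' method: the function name, the handling of ambiguous bases, and the toy sequence are assumptions made for this example, and the sequence is invented rather than taken from the article's data.

```python
from typing import List

PURINES = frozenset("AG")          # R in IUPAC notation
VALID_BASES = frozenset("ACGT")


def purine_fraction_by_codon_position(cds: str) -> List[float]:
    """Return the purine fraction at codon positions 1, 2 and 3.

    Hypothetical helper: assumes `cds` starts in frame. A trailing
    incomplete codon and any non-ACGT symbols are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # drop an incomplete last codon
    purine_counts = [0, 0, 0]
    base_counts = [0, 0, 0]
    for i in range(usable):
        base = seq[i]
        if base not in VALID_BASES:
            continue                   # skip ambiguous symbols such as N
        position = i % 3               # 0, 1, 2 -> codon positions 1, 2, 3
        base_counts[position] += 1
        if base in PURINES:
            purine_counts[position] += 1
    return [p / n if n else 0.0 for p, n in zip(purine_counts, base_counts)]


# Toy in-frame sequence (invented): codons GCA GAT AAA GTT
toy_cds = "GCAGATAAAGTT"
fractions = purine_fraction_by_codon_position(toy_cds)
print(fractions)  # [1.0, 0.5, 0.5]
```

In this toy case the first codon position is entirely purine while the second and third are half purine, which is the shape of imbalance that a position-specific purine bias would produce; real coding sequences would need to be analyzed in bulk before drawing any conclusion.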
Keywords
Genomics
Ancestral codon
Purine bias
Secondary structure
Helix
Sheet
Turn coil
Ribosome
Translation
Energy cost