Base substitution is one of the raw fuels that produce genetic variation and drive evolution. Recent studies have shown that genome components affect mutation patterns to some extent. To infer the correlation between the transition/transversion ratio (Ts/Tv) and the number of immediately adjacent A&T nucleotides, we investigated 3,611,007 Oryza sativa SNPs (including 45,462 coding SNPs and 242,811 intronic SNPs) and 32,019 Arabidopsis SNPs. The results show that Ts/Tv is negatively correlated with the number of immediately adjacent A&T nucleotides in O. sativa and Arabidopsis. We further calculated AT2 (the number of SNPs whose immediately adjacent nucleotides are each either A or T) and AT0 (the number of SNPs whose immediately adjacent nucleotides are each either C or G) for all 6 types of SNPs. In both O. sativa and Arabidopsis, C/G SNPs have the highest AT2/AT0 ratio, which suggests that C/G SNPs may be the type most strongly influenced by adjacent A&T nucleotides. For SNPs in O. sativa, the neighboring effect of A&T nucleotides is limited to 2 nucleotides on both sides; for SNPs in Arabidopsis, the effect extends no more than 4 nucleotides on both sides.
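The quantities defined in this abstract (Ts/Tv grouped by the number of immediately adjacent A or T nucleotides, and AT2/AT0 per SNP type) can be illustrated with a minimal Python sketch. The input layout, a tuple of (left neighbor, allele 1, allele 2, right neighbor), and the function names are assumptions made for illustration only and are not taken from the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (A/G, C/T)."""
    return {a, b} <= PURINES or {a, b} <= PYRIMIDINES


def adjacent_at_count(left, right):
    """Number of immediately adjacent nucleotides (0, 1 or 2) that are A or T."""
    return sum(base in {"A", "T"} for base in (left, right))


def summarize(snps):
    """snps: iterable of (left, allele1, allele2, right) tuples (hypothetical layout).

    Returns two dicts:
      ts_tv   -- Ts/Tv keyed by the number of adjacent A/T nucleotides
      at2_at0 -- AT2/AT0 keyed by SNP type such as "C/G"
    Groups with a zero denominator are left out rather than guessed.
    """
    ts, tv = Counter(), Counter()
    by_type = {}
    for left, a, b, right in snps:
        n_at = adjacent_at_count(left, right)
        if is_transition(a, b):
            ts[n_at] += 1
        else:
            tv[n_at] += 1
        snp_type = "/".join(sorted((a, b)))
        by_type.setdefault(snp_type, Counter())[n_at] += 1
    ts_tv = {k: ts[k] / tv[k] for k in (0, 1, 2) if tv[k]}
    at2_at0 = {t: c[2] / c[0] for t, c in by_type.items() if c[0]}
    return ts_tv, at2_at0
```

Under these assumptions, a negative correlation of the kind reported would show up as `ts_tv[0] > ts_tv[1] > ts_tv[2]` on real data.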
ZHAO Hui, LI Qizhai, LI Jun, ZENG Changqing, HU Songnian, YU Jun
DNA composition dynamics across genomes of diverse taxonomy is a major subject of genome analyses. DNA composition changes are characteristics of both replication and repair machineries. We investigated 3,611,007 single nucleotide polymorphisms (SNPs) generated by comparing two sequenced rice genomes from distant inbred lines (subspecies), including those from 242,811 introns and 45,462 protein-coding sequences (CDSs). Neighboring-nucleotide effects (NNEs) of these SNPs are diverse, depending on structural content-based classifications (genomewide, intronic, and CDS) and sequence context-based categories (A/C, A/G, A/T, C/G, C/T, and G/T substitutions) of the analyzed SNPs. Strong and evident NNEs and nucleotide proportion biases surrounding the analyzed SNPs were observed in the 1-3 bp sequences on both sides of an SNP. Strong biases were also observed around neighboring nucleotides of protein-coding SNPs, which exhibit a periodicity of three in nucleotide content, constrained by a combined effect of codon-related rules and DNA repair mechanisms. Unlike a previous finding in the human genome, we found a negative correlation between the GC contents of chromosomes and the magnitude of the corresponding bias of nucleotide C at the -1 site and G at the +1 site. These results will further our understanding of the mutation mechanism in rice as well as its evolutionary implications.
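As a rough illustration of how nucleotide proportions at sites flanking an SNP (for example C at -1 and G at +1) might be tabulated, the following Python sketch counts bases at each offset. The input format, fixed-length context strings with the SNP at the center, and the function name are hypothetical; comparing these proportions with a genome-wide background to obtain a bias is likewise an assumption and not necessarily the procedure used in the study.

```python
from collections import Counter


def flank_proportions(contexts, flank=3):
    """contexts: strings of length 2*flank + 1 with the SNP at the center
    (hypothetical input layout).

    Returns {offset: {base: proportion}} for offsets -flank..-1 and +1..+flank.
    """
    counts = {off: Counter() for off in range(-flank, flank + 1) if off != 0}
    for seq in contexts:
        if len(seq) != 2 * flank + 1:
            continue  # skip contexts truncated at sequence ends
        for off in counts:
            counts[off][seq[flank + off]] += 1
    return {
        off: {base: n / sum(c.values()) for base, n in c.items()}
        for off, c in counts.items()
        if c
    }
```

Subtracting an assumed background frequency from, say, `flank_proportions(contexts)[-1]["C"]` would give one possible measure of the bias at the -1 site.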