In silico analysis of simple sequence repeats (SSRs) in chloroplast genomes of Glycine species
Microsatellites, also known as simple sequence repeats (SSRs), are short repetitive DNA sequences with repeat units of 1-6 bp; those located in chloroplast genomes are termed cpSSRs. In this work, the chloroplast genomes (cpDNA) of eight species of the genus Glycine (G. canescens, G. cyrtoloba, G. dolichocarpa, G. falcata, G. max, G. soja, G. stenophita, and G. tomentella) were screened for cpSSRs with the MISA Perl script, using minimum repeat numbers of 10 for mononucleotide, 5 for dinucleotide, and 3 for tri-, tetra-, penta-, and hexanucleotide motifs. The frequency, distribution, and putative codon repeats of the cpSSRs were then analyzed. A total of 1273 cpSSRs were identified, of which 413 (32.4%) were located in genic regions and the remaining 67.6% in intergenic regions, with an average density of 1.04 cpSSRs per kb. Trinucleotide repeats (45%) were the most abundant motifs in the plastomes of the Glycine species, followed by mononucleotides (36%) and dinucleotides (11.8%). In genic regions, trimeric repeats were also the most frequent class, accounting for 70.7% of cpSSRs, while mono- and tetrameric repeats accounted for 25.7% and 3.6%, respectively. Interestingly, no di-, penta-, or hexameric repeats were found in coding sequences. The most common motifs in all plastomes were A/T (97.8%) among mononucleotides, AT/AT (98%) among dinucleotides, and AAT/ATT (41.5%) among trinucleotides. Among the chloroplast genes, ycf1 contained the highest number of cpSSRs, and G. cyrtoloba and G. falcata had the largest number of cpSSR-containing genes. The most frequent putative codon repeats in coding sequences encoded glutamic acid (21.2%), followed by serine (15.5%), arginine (8.3%), and phenylalanine (7.8%), across all species. Repeats encoding tryptophan, proline, and aspartic acid were not detected in any of the plastomes.
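The screening thresholds described above can be illustrated with a minimal Python sketch. This is not the MISA script used in the study, only a regular-expression approximation of a perfect-repeat search under the same minimum repeat numbers (10 for mono-, 5 for di-, 3 for tri- to hexanucleotide motifs). The names `MIN_REPEATS`, `is_redundant`, and `find_ssrs` are hypothetical, and the sketch assumes an unambiguous A/C/G/T sequence and ignores compound SSRs.

```python
import re

# Minimum repeat numbers per motif length, taken from the thresholds in the abstract.
MIN_REPEATS = {1: 10, 2: 5, 3: 3, 4: 3, 5: 3, 6: 3}


def is_redundant(motif):
    """True if the motif is itself a repetition of a shorter unit (e.g. 'ATAT' or 'AAA')."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq.

    Illustrative approximation only; start positions are 0-based.
    """
    thresholds = MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for unit_len, min_count in sorted(thresholds.items()):
        # One motif of unit_len bases followed by at least (min_count - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_redundant(motif):
                # e.g. (AA)5 is really a mononucleotide run; step forward one base.
                pos = match.start() + 1
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
            pos = match.end()
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence (an assumption for illustration, not data from the study).
    demo = "GC" + "A" * 10 + "GC" + "AT" * 5 + "GG" + "AAT" * 3 + "C" + "A" * 9 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Counting the hits per motif length and dividing the total by the sequence length in kb would give class proportions and a density comparable in kind to the figures reported above, although exact counts from this sketch may differ from MISA output.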