Comparative Evaluation of Rice SSR Markers on Different Oryza Species

The growing number of rice microsatellite markers permits a comprehensive comparison of allelic variation among markers developed using different methods, with diverse repeat motifs and in variable genomic regions. In this study, a set of 67 microsatellite markers representing all twelve rice chromosomes was compared across worldwide collections of nine species of the genus Oryza. The SSR markers were evaluated for genetic parameters such as the number of alleles amplified per primer, observed heterozygosity, gene diversity, rare allele frequency and Polymorphic Information Content (PIC). Among the microsatellite markers assessed, the highest overall degree of genetic diversity was recorded for markers containing dinucleotide repeat motifs. Employing such highly informative markers in future molecular characterization and diversity studies of Oryza species is therefore advisable. The unique alleles generated by these polymorphic markers could also play a significant role in conservation, population genetics studies and identification of both wild and cultivated Oryza species.


Introduction
Asian cultivated rice (Oryza sativa L.), which feeds more than one third of the world's population, belongs to the genus Oryza [1]. A variety of local landraces and cultivars of rice are found in the indica and japonica subspecies [2,3]. Besides these two huge reservoirs of rice germplasm, over 20 wild Oryza species constitute an exceptionally valuable gene pool for rice improvement [4]. Thus, knowledge of the level of genetic diversity and of species relationships in the genus Oryza is essential for efficient strategies targeting collection, conservation and introgression of useful genes into cultivated rice [5]. In rice breeding, hybridization between parental lines with a defined genetic distance, followed by selection, is a common approach [1]. Today, information on the extent of such genetic relationships between genotypes is generated for effective breeding programs by diverse molecular marker-based techniques [6].
Molecular markers are powerful tools for assessing genetic variation and elucidating genetic relationships within and among plant species [7][8][9]. In the genus Oryza, they have been used to identify accessions [10], determine genetic structure and patterns of diversity [11], and optimize the assembly of core collections [12]. Earlier marker types such as RAPD, ISSR and AFLP have been used very frequently for fingerprinting and characterization of varieties and germplasm accessions [13]. Because they can be used without prior genomic information on the target crop, they were considered markers of choice [6]. Since 2000, however, locus-specific Simple Sequence Repeat (SSR) markers have been preferred for cultivar identification in many crops, including rice [14].
SSRs are highly frequent 1 to 6 bp repeated motifs distributed throughout the nuclear [15], chloroplast [16,17] and mitochondrial [18] genomes. Unique SSR profiles of rice cultivars can be generated using a few primers covering all 12 chromosomes [19]. In total, 18,828 Class 1 di-, tri- and tetranucleotide SSRs, representing 47 distinctive motif families, have been identified and annotated in the rice genome [1]. The published high-density linkage map indicated an abundance of microsatellite markers (an average of 51 hypervariable SSRs per Mb), with the highest density on chromosome 3 (55.8 SSRs per Mb) and the lowest on chromosome 4 (41.0 SSRs per Mb) [20].
Such neutral, co-dominant SSR markers have merits such as fast assay, technical simplicity, high polymorphism and stability [21]. In addition, their reproducibility and cross-species transferability make them valuable tools for genetic diversity studies [9]. Among the most widely used DNA marker types, SSR markers are applied in gene mapping [22,23], establishment of genetic relationships [15,24], construction of fingerprints [23,25], genetic purity testing [26], molecular evolution studies [21], varietal identification and heterosis utilization [9]. For instance, clustering data generated from SSR markers were used to identify the hybrid with the highest heterosis from intercrosses between different and distantly related Oryza species [1].
Variation in the molecular basis of polymorphism and in distribution across the genome means that different SSR marker types give different views of a given population structure [6]. The high degree of length polymorphism, or variable allele size, of different SSR markers may also be attributed to chromosomal rearrangements during genome evolution, differing numbers of repeats in the SSR regions or the replication slippage mechanism [21]. Differences in the level of SSR polymorphism may also be associated with gene conservation, the source of the SSRs (genomic or EST-SSR) and/or the nature of the SSR repeat unit (such as di-, tri- and tetranucleotide) [6]. In addition, [27] reported that markers with perfect dinucleotide (GA and CT) repeat motifs show a high level of variation among rice genotypes.
Many studies across diverse types of molecular markers have reported SSRs as markers with highly significant allelic variation [28]. Allelic variation at the tested SSR loci defines the polymorphism among varieties, or the polymorphism information content (PIC) value [1]. Hence, the utility of SSR markers for genetic diversity assessment, population structure studies and crop improvement depends on the PIC they provide [6]. Consequently, prior selection of highly informative and polymorphic rice SSR markers is critical for conserving rice genotypes, revealing the gene pool of rice landraces and unlocking valuable genes for breeding purposes [29]. The present study was therefore conducted to compare 67 widely used rice SSR markers (RMs) for the assessment of genetic variability and population structure among 426 samples of 9 Oryza species. Evaluating the SSR markers over worldwide collections of Oryza species for statistical and genetic parameters such as the number of alleles amplified per primer, gene diversity, heterozygosity and PIC value identified the markers that are more informative and promising for future genetic investigation of both wild and cultivated rice types.

Plant material
A total of 426 rice samples, hereafter referred to as "accessions", were included in this study: 35 accessions of cultivated rice germplasm from the Yunnan University gene bank (Table 1), 31 accessions of 7 AA and CC genome wild Oryza species from IRRI (Table 2) and 360 collections from 12 Ethiopian populations of the African wild rice (O. longistaminata) (Table 3).

Genomic DNA extraction and polymerase chain reaction (PCR)
Total genomic DNA was extracted from fresh leaves using the CTAB protocol described by [30]. The quality of the extracted DNA was determined by electrophoresis in a 1% agarose gel, and quantification was done using a spectrophotometer. For SSR analyses, the extracted DNA samples were diluted to 20 ng/μl using TE (Tris-EDTA) and stored at -20 °C.
A total of 67 nuclear SSR markers covering the 12 rice chromosomes were used in this study (Table 4). These polymorphic SSR markers were selected for amplification according to the reports of [27,31,32]. Polymerase chain reaction (PCR) was done using a 10 μl reaction mixture in a 96-well plate. Each reaction mixture contained 4 μl
Key: pi is the frequency of the i-th allele, ∑pi² is the sum of squared allele frequencies in the population, mean He is the average He across the populations, mean Ho is the average Ho across the populations and Ht is the total expected heterozygosity.
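The quantities named in the key can be illustrated with a small calculation. The following is a minimal Python sketch, not the software used in the study; the allele calls and function names are hypothetical. It assumes the usual gene diversity definition, He = 1 - ∑pi², which is consistent with the key but is not spelled out in the text.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} for a flat list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def expected_heterozygosity(freqs):
    """Expected heterozygosity (gene diversity): He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical allele calls at a single SSR locus (labels are arbitrary).
alleles = ["1", "1", "2", "2", "2", "3", "1", "2"]
freqs = allele_frequencies(alleles)
he = expected_heterozygosity(freqs)
print(freqs, he)
```

Averaging He over populations would give the "mean He" of the key, while computing it from pooled allele frequencies would give Ht.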

Overall allelic diversity
A total of 440 different and reproducible alleles were detected at the 67 microsatellite loci. The number of alleles per locus ranged from 2 to 19, with an average of 6.49. The highest numbers of alleles (19, 14, 13, 13 and 12) were produced by RM225, RM207, RM184, RM206 and RM209, respectively, and the smallest number (2) by RM22 and RM171 (Table 5). The frequency of the major allele at each locus ranged from 34% (RM1) to 95% (RM60). On average, 66% of the total alleles of the 67 SSR markers were common or major alleles.

Polyacrylamide gel electrophoresis, SSR alleles scoring and analysis
The amplified PCR products (10 μl) were mixed with 3 μl of bromophenol blue loading dye, electrophoresed in an 8.0% polyacrylamide gel and detected using silver staining as described by [31]. The size of the most intensely amplified product was determined from its migration relative to the size standard (50 bp DNA ladder). The different alleles of a marker were identified on the basis of their size (length in base pairs, bp). Because SSR markers are co-dominant, the amplified bands representing the different alleles were scored as genotypes: bands were recorded as 11, 22, 33, ... to represent homozygous genotypes or as 12, 13, 23, ... to indicate heterozygous genotypes. For each marker, '?' was used for missing data.
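The scoring scheme above translates directly into a data structure. The sketch below is a hypothetical Python illustration of how such codes could be parsed and the observed heterozygosity derived from them; it assumes single-character allele labels (so a code is exactly two characters), which would need a different encoding for loci with more than nine alleles. The function names and sample codes are not from the study.

```python
def parse_genotype(code):
    """Split a two-character genotype code into its alleles; '?' means missing."""
    if code == "?":
        return None
    if len(code) != 2:
        raise ValueError(f"unexpected genotype code: {code!r}")
    return code[0], code[1]


def observed_heterozygosity(codes):
    """Proportion of heterozygous genotypes among the non-missing calls."""
    genotypes = [g for g in map(parse_genotype, codes) if g is not None]
    if not genotypes:
        return float("nan")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


# Hypothetical scores for eight accessions at one marker.
codes = ["11", "12", "22", "?", "13", "33", "11", "22"]
ho = observed_heterozygosity(codes)
print(ho)
```

Missing calls are excluded from the denominator, so Ho here is computed over seven scored accessions rather than eight.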
The number of alleles per locus, number of rare alleles, expected and observed heterozygosity and F-statistics (Fis, Fit and Fst) were calculated using GenAlEx 6.502 [33]. The major allele frequency and PIC of each marker were computed using PowerMarker version 3.25 [34], with PIC calculated according to [35]. To detect unique alleles, the number of accessions carrying a specific allele was counted for each locus, and alleles with a frequency of less than 5% were considered unique alleles for that locus.
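As an illustration of the PIC calculation, the following Python sketch assumes the widely used Botstein-type definition, PIC = 1 - ∑pi² - ∑∑ 2pi²pj² (summed over allele pairs i < j). The exact formula of [35] is not reproduced in the text, so this choice is an assumption; some rice SSR studies instead use the simpler 1 - ∑pi². The function name and frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for allele frequencies summing to 1.

    Assumed definition: PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


# Hypothetical allele frequencies at one locus.
example = [0.5, 0.3, 0.2]
print(round(pic(example), 4))
```

A monomorphic locus gives PIC = 0 under this definition, and PIC rises as alleles become more numerous and more evenly distributed.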
Differences in plant material, population size and species composition make direct comparison of this study with others difficult. This study nevertheless estimates the genetic diversity of each marker in terms of parameters such as the mean number of alleles, genetic diversity, PIC value and number of rare alleles over different Oryza species.
The mean number of alleles detected in the present study (6.49) is comparable to that noted by earlier researchers on African rice [38]. According to [39], the number of alleles in African wild rice is high (from 11 to 16, with an average of 14 alleles). Thus, the inclusion of a large amount of African wild rice material in this study could explain the greater diversity relative to previous studies, which reported 1 to 8 alleles with an average of 4.58 for various classes of microsatellites [40] and 3 to 9 alleles with an average of 4.53 per locus for 30 microsatellite markers [41]. Apart from the diverse Oryza species assessed in this study, such variability in the number of detected alleles per locus might be associated with the specificity of the markers [1].
In this study, all of the markers with the highest numbers of alleles (RM225 = 19, RM207 = 14, RM184 = 13, RM206 = 13 and RM209 = 12) have dinucleotide repeat motifs: (CT)18, (CT)25, (GA)19, (CA)7 and (CT)18, respectively. Although 3 of these 5 markers have (CT) motifs, [27] reported that markers with the (GA) repeat motif displayed a higher level of variation among rice genotypes than markers with (CT) motifs. Our results, however, show that markers with perfect dinucleotide repeat motifs, irrespective of the dinucleotide type ((CT), (GA) or (CA)), are potentially the best markers for molecular characterization and diversity analysis of different Oryza species.
The effective number of alleles in this study was 103.84 in total; per locus it varied from 0.85 (RM14) to 2.3 (RM1), with an average of 1.54. This range is narrow compared with that of the actual number of alleles, which varied between 2 and 19. This implies that the number of tested samples strongly influences the number of identified alleles but has little impact on the effective number of alleles. It also indicates the higher reliability of the effective number of alleles for practical genetic diversity analysis [42][43][44]. Of all the SSR markers assessed in this study, RM225, RM209, RM207 and RM84 produced Na greater than 10 and Ne greater than 2. Because such markers showed greater genetic diversity, they could be suitable tools for assessing genetic diversity within and among Oryza members. The high actual and effective numbers of alleles recorded at such loci also contribute to their high levels of expected heterozygosity (on average 0.46) [45].
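The contrast between the actual (Na) and effective (Ne) number of alleles can be made concrete with a short example. The Python sketch below assumes the standard single-population definition Ne = 1 / ∑pi²; the frequencies are hypothetical and are chosen only to show how many low-frequency alleles raise Na while barely changing Ne.

```python
def effective_number_of_alleles(freqs):
    """Ne = 1 / sum(p_i^2) for allele frequencies that sum to 1."""
    return 1.0 / sum(p * p for p in freqs)


# Hypothetical locus: one common allele plus six low-frequency alleles.
skewed = [0.70] + [0.05] * 6
na = len(skewed)
ne = effective_number_of_alleles(skewed)

# Hypothetical locus with four equally frequent alleles, where Ne equals Na.
even = [0.25] * 4
ne_even = effective_number_of_alleles(even)
print(na, round(ne, 2), ne_even)
```

In the skewed case seven alleles are observed but Ne stays just under 2, which is why Ne is less sensitive than Na to the detection of additional rare alleles in larger samples.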
However, all loci in this study except RM159 and RM184 showed a lower observed heterozygosity (Ho, mean = 0.12) than expected heterozygosity (He, mean = 0.27), suggesting a clear departure from Hardy-Weinberg equilibrium [46]. This departure can be attributed to forces akin to inbreeding within groups [47] or to a lack of distinctly isolated Oryza populations [46]. Such heterozygote deficiency, or deviation from Hardy-Weinberg expectation, was also indicated by the relatively high Fis value (0.62). The average Fst of 0.59 implies that 59% of the total genetic variation lies among populations.
The fixation index (Fis), total inbreeding coefficient (Fit), genetic differentiation (Fst) and gene flow (Nm) were calculated and are given in Table 5. The effective number of alleles was 103.84 in total, and per locus Ne varied from 0.85 (RM14) to 2.3 (RM1) with an average of 1.54. When all populations were pooled as one for each locus, the observed heterozygosity (Ho) ranged from 0 to 0.5, with an average of 0.12. Across the 67 SSR markers, the level of genetic diversity (He) ranged from 0.06 (RM126) to 0.48 (RM207), with an average of 0.27.
In this study, the observed heterozygosity (0.12) was less than the expected heterozygosity (0.27). Across all microsatellite loci assayed, the mean degree of genetic differentiation among populations (Fst), Wright's fixation index (Fis), the total inbreeding coefficient (Fit) and gene flow (Nm) were 0.59, 0.62, 0.83 and 0.2, respectively (Table 5).
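The relationship between the heterozygosity estimates and the F-statistics can be sketched as follows. This Python example assumes the standard heterozygosity-based definitions, Fis = (mean He - mean Ho) / mean He, Fit = (Ht - mean Ho) / Ht and Fst = (Ht - mean He) / Ht; these are assumed here to correspond to what was computed for each locus, and the input values are hypothetical. Because the reported figures are averages over loci, plugging in the overall means need not reproduce them exactly.

```python
def f_statistics(mean_ho, mean_he, ht):
    """Return (Fis, Fit, Fst) from mean Ho, mean He and total He (Ht)."""
    fis = (mean_he - mean_ho) / mean_he  # inbreeding within populations
    fit = (ht - mean_ho) / ht            # total inbreeding
    fst = (ht - mean_he) / ht            # differentiation among populations
    return fis, fit, fst


# Hypothetical heterozygosity values for a single locus.
fis, fit, fst = f_statistics(mean_ho=0.10, mean_he=0.25, ht=0.50)
print(fis, fit, fst)

# The three statistics are linked by (1 - Fit) = (1 - Fis) * (1 - Fst).
identity_gap = (1 - fit) - (1 - fis) * (1 - fst)
```

The identity in the last line shows why high within-population inbreeding combined with strong differentiation yields a high total inbreeding coefficient.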

PIC values and unique alleles
Based on their allelic diversity and frequency, the PIC values of the 67 SSR markers varied greatly. The highest PIC value (0.72) was recorded for RM225 and RM1. As shown in Table 5, the average PIC value was 0.41, and RM60 (0.09) and RM22 (0.1) were the SSR markers with the lowest PIC values. About 38.9% of the SSR markers used in this study showed PIC values higher than 0.5 and were thus highly informative.
Of the 440 alleles generated by the 67 microsatellite markers, 33.63% (148) had a frequency ≤ 0.05 and were classed as rare alleles. The maximum number of unique alleles (12) was found for RM225, and the average number of rare alleles per marker was 2.18 (Table 5). Moreover, 77.8% of the markers with higher PIC values (> 0.60) had at least one unique allele. Overall, RM207 and RM225 were the most informative markers, as they identified rare alleles and produced the highest numbers of alleles (14 and 19, respectively).
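The rare-allele criterion can be expressed as a simple frequency filter. The Python sketch below is illustrative only: the allele sizes and counts are hypothetical, and the threshold is taken as frequency ≤ 0.05 as stated above (the function name and its `threshold` parameter are assumptions, not part of any software cited in the study).

```python
def rare_alleles(allele_counts, threshold=0.05):
    """Return the alleles whose frequency at a locus is <= threshold."""
    total = sum(allele_counts.values())
    return sorted(
        allele for allele, count in allele_counts.items()
        if count / total <= threshold
    )


# Hypothetical allele counts (keyed by fragment size in bp) at one locus.
counts = {"150": 120, "154": 60, "158": 12, "162": 5, "166": 3}
rare = rare_alleles(counts)
print(rare)
```

Here the 158 bp allele has a frequency of 0.06 and is therefore not counted as rare, while the two least frequent alleles are.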

Discussion
Microsatellites are PCR-based DNA markers that are widely used in genetic diversity analysis, varietal identification and germplasm characterization of rice [36]. However, they differ considerably from one another in their distribution across the genome and in their level of polymorphism [37]. Hence, their application in rice germplasm characterization and improvement depends on the reliability of the information they provide [6]. This comparative evaluation of rice SSR marker polymorphism also showed differences in their power to reveal genetic variation over a diverse set of Oryza species and subspecies.
The mating system and its consequence of high intrapopulation inbreeding (Fis = 0.62) could be major factors in the high total inbreeding (Fit = 0.83) [48]. In the present study, a low number of migrants per generation (Nm = 0.2) was estimated. In fact, pollen viability in the genus Oryza is generally limited to a few minutes [49]. Thus, dispersal of whole plants by programmed abscission followed by floating downstream might be responsible for the observed Nm value.
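The link between differentiation and the number of migrants per generation can be illustrated with a short calculation. The Python sketch below assumes the common island-model approximation Nm = (1/Fst - 1) / 4; the text does not state which estimator was used, so this is an assumption, and the function name is hypothetical.

```python
def gene_flow_nm(fst):
    """Estimate migrants per generation as Nm = (1 / Fst - 1) / 4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# Using the mean Fst reported in this study.
nm = gene_flow_nm(0.59)
print(round(nm, 2))
```

Under this approximation an Fst of 0.59 corresponds to Nm of about 0.17, in line with the reported value of 0.2 once rounded to one decimal place.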
According to [50], the PIC value is a function of both allelic diversity and frequency. The sensitivity of the genotyping method and the location of primers in the genome largely affect PIC [6]. Thus, the PIC value, which reflects allele frequency and diversity among accessions, can vary from one SSR locus to another [51]. In the present study, PIC values for the 67 SSR markers ranged from 0.09 (RM60) to 0.72 (RM1 and RM225), with a mean of 0.41 (Table 5). This mean level of polymorphism is consistent with PIC values reported in previous works [2,52,53].
According to [54], markers with a PIC value above 0.5 are regarded as highly polymorphic. About 38.9% of the SSR markers used in this study had PIC values exceeding 0.5 (Table 5). Such high PIC values may be associated with the highly co-dominant expression or presence of multiple alleles [19]. The highly polymorphic markers identified in this study are highly informative for genetic studies and for detecting more alleles at a specific locus [55].
The number of unique alleles in a population (private allelic richness) is now widely considered in many conservation and population genetics applications [56], in distinguishing different species and populations of a species [57] and in inferring the evolutionary history of a population [58]. As indicated in Table 5, the maximum and average numbers of unique alleles in this study were 12 (RM225) and 2.18, respectively. This wide variation in the number of private alleles might reflect variable periods of genetic isolation during the evolutionary history of Oryza species [48]. The markers used in this study and their association with rare alleles could be utilized by plant breeders and geneticists in marker-assisted selection programs [5]. Moreover, the possible relationship of such unique alleles with diverse quantitative trait locus (QTL) regions should be studied [51].