Exploring elite alleles for seed isoflavones concentration in soybean by association analysis

REGULAR ARTICLE
Wang et al.: Association analysis isoflavones soybean SSR. Emir. J. Food Agric, Vol 28, Issue 2, 2016
135 soybean varieties
S1 Zhonghuang4
S2 Zhonghuang5
S3 Zhonghuang7
S4 Zhonghuang10
S5 Zhonghuang18
S6 Zhongpin94
S7 Zhongpin6034
S8 Zhongpin661
S9 Zhongzuo96
S10 Zhongzuo97
S11 Zhongzuoyc17
S12 Zhongzuo92


INTRODUCTION
Soybean (Glycine max) has a long history as a domesticated plant. It originated in China in the eleventh century BC (Stephen Barnes 2010) and became a popular crop in China and East Asia, where it has long been cultivated as an important nutritional component of the diet and used in many foods, such as soybean oil, soybean sprouts, paste, soymilk and tofu (Kim EH et al. 2006). Today, soybean is gaining acceptance in many countries as one of the best sources of vegetable protein and oil, because its beans contain about 40% protein and 20% oil. By 2012, the annual world planting area had reached 108.749 million hectares and production had risen to 267.999 million tons.
In recent decades, several studies have shown that some components of soybean confer health benefits. Regular consumption of soybean foods can reduce the incidence of breast, colon, and prostate cancers (Isanga J et al. 2008), prevent heart disease and osteoporosis (Messina M 2005), lower plasma cholesterol (Hsu CS et al. 2001), and reduce menopausal symptoms (TY Tai et al. 2012). Isoflavones, naturally occurring plant chemicals belonging to the polyphenols, are recognized as most likely among the components responsible for the health benefits of soybean, and may play a role in the treatment or prevention of a range of hormone-dependent conditions (Messina M et al. 2006). These discoveries have led to the development and application of many functional foods and food supplements based on soybean isoflavones.
Isoflavones belong to a group of compounds that share a basic structure consisting of two benzene rings joined by a three-carbon bridge. In soybean seed and soybean products, isoflavones exist as aglycones (daidzein, genistein, and glycitein), 7-O-β-glucosides, and two glucoside conjugate forms, acetylglucosides and malonylglucosides. Daidzein and genistein, the most abundant isoflavones in soybeans, have chemical structures similar to estradiol (Knight DC and Eden JA 1996); both are strong antioxidants and account for 80% of the antioxidant potential of soybean (Arora A et al. 2000; Hsu CS et al. 2001; Hwang J et al. 2001). They have efficiently controlled or inhibited the growth of human breast cancer cell lines in culture (Lee HP 1991). These biological properties have drawn more attention to soybean seed isoflavones and increased interest in changing the isoflavone concentrations of practical soybean varieties.

Abstract: Soybean isoflavones are valuable in certain medicines, cosmetics, foods and feeds. Selection for high isoflavone content in seeds, along with agronomic traits, is a goal of many soybean breeders. Association mapping is a useful alternative to linkage mapping for detecting marker-phenotype associations, and association analysis can be used to test for associations between molecular markers and a target phenotype. The main objective of this study was to identify simple sequence repeat (SSR) markers associated with the isoflavone content of soybean seeds. Four quality traits were evaluated in 135 soybean cultivar accessions from China, and the 135 accessions were genotyped with 100 SSR markers. Analysis of population structure revealed three subgroups in the population. A total of 31 marker-trait associations related to the four traits were identified. According to these results, association analysis can be an effective method for QTL mapping and can help breeders develop new approaches for improving the isoflavone content of soybean.
Many studies have shown that soybean seed isoflavone concentrations fluctuate greatly, because many biotic and abiotic factors influence their synthesis and accumulation. For example, isoflavones have been determined over a wide concentration range (0.8-1135.0 mg/kg for daidzein, 1.9-1442.5 mg/kg for genistein and 0.5-54.6 mg/kg for glycitein) (Beatrix Preinerstorfer et al. 2004). Lucimara et al. reported that isoflavone concentrations were influenced by the cytoplasm and the nuclear genes of the maternal parent (Lucimara Chiari et al. 2006). Nevertheless, beyond genetic factors, environmental interactions strongly influence isoflavone concentration in seeds, through factors such as soybean variety, cultivation year, cultivation location, and temperature (Hoeck JA et al. 2000; Lee SJ et al. 2002; Mebrahtu T et al. 2004; Murphy SE et al. 2009; Juan Jose et al. 2009). Juan et al. further found that isoflavone accumulation in seeds was influenced by multiple interacting genetic loci (Beatrix Preinerstorfer and Gerhard Sontag 2004); moreover, when plants are grown in variable environments, epistasis has been considered an important source of genetic variation.
Although seed isoflavone content is attributable to genetic and environmental causes as well as g × e interaction, the heritability of a variety remains a key factor that limits phenotypic change across diverse environments. Hence, the main aim of the present study was to investigate the variation in isoflavone concentrations in soybean seeds across cropping years. This information may suggest ways to exploit the heritability of isoflavone concentrations in soybean varieties, which is very important for soybean breeders aiming to develop varieties with high isoflavone concentrations.

MATERIALS AND METHODS

Plant materials
A total of 135 soybean (Glycine max) varieties from different regions were cultivated at the same location in Beijing during 2012-2014 (Table S1). The soil was a silt clay loam in every year. Each plot measured 3 × 1.5 m and consisted of three rows (3 m long, 0.5 m between rows), and the experiment used a completely randomized block design with three replicates. Fertilizers were applied prior to plowing at the recommended rates of 6.2, 4.6 and 5.0 kg per 666.67 m² for N, P₂O₅ and K₂O, respectively. Soybean seeds were sown on June 15-16 each year. Each variety was harvested at full maturity from each replicate in each crop year; only the middle row of each plot was harvested for seed samples, which were then stored in a freezer below -18°C until analyzed for isoflavone concentration. Whole seed samples were analyzed for isoflavones at the Beijing Key Laboratory of New Technology in Agricultural Application, Beijing University of Agriculture.

Table S1: Soybean varieties in our study
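The fertilizer rates in this section are given per 666.67 m², which is the area of one Chinese mu (1/15 ha). As a reading aid, the short sketch below converts those rates to kg per hectare. The function and constant names are illustrative and do not come from the original study.

```python
# Area unit used in the text: 666.67 m^2 (one mu); one hectare is 10,000 m^2.
MU_IN_M2 = 666.67
HECTARE_IN_M2 = 10_000.0


def per_mu_to_per_ha(rate_kg_per_mu: float) -> float:
    """Convert an application rate from kg per 666.67 m^2 to kg per hectare."""
    return rate_kg_per_mu * HECTARE_IN_M2 / MU_IN_M2


# Rates reported in the text for N, P2O5 and K2O (kg per 666.67 m^2).
rates = {"N": 6.2, "P2O5": 4.6, "K2O": 5.0}
for nutrient, rate in rates.items():
    print(f"{nutrient}: {per_mu_to_per_ha(rate):.1f} kg/ha")
```

Rounded to one decimal place, this gives about 93.0, 69.0 and 75.0 kg/ha for N, P₂O₅ and K₂O, respectively.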

Isoflavone extraction and quantification
Isoflavone concentrations were determined by HPLC as described by Vyn et al. (2002). Approximately 1 g of sample was mixed with 5 mL of 100% methanol in a 20 mL plastic bottle, sonicated for 1 h to speed dissolution, and then left to stand for 24 h at room temperature.
A 1.5 mL aliquot of the methanol solution was transferred to a centrifuge tube and centrifuged for 20 min at 14,000 r·min⁻¹ in a refrigerated centrifuge. The supernatant was filtered through a 0.45 μm nylon syringe filter (Whatman No. 42) for HPLC analysis.
The HPLC system consisted of an Agilent 1200 liquid chromatography pump and a detector (Agilent Technologies Co. Ltd). The analytical column was a TC-C18 (250 mm × 4.6 mm, 5 μm), and UV absorption was measured at 254 nm.

Population structure
Population structure was estimated with STRUCTURE v2.3.2 (Pritchard JK et al. 2000). The number of hypothetical subpopulations (K) was set from 2 to 9, with a burn-in period of 50,000 iterations followed by 500,000 Markov Chain Monte Carlo (MCMC) replications. Five independent runs were performed for each K. The admixture model of STRUCTURE, which allows for population mixture and correlated allele frequencies, was used. The most appropriate K value was evaluated from lnP(D) in the STRUCTURE output (Evanno et al. 2005). For the most appropriate K, the Q-matrices of the five runs were integrated using the CLUMPP software (Jakobsson and Rosenberg 2007).
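The choice of K from lnP(D) following Evanno et al. (2005) can be illustrated with a minimal sketch of the ∆K statistic. It assumes ∆K is computed as the absolute second-order difference of the mean lnP(D) across replicate runs, divided by the standard deviation of lnP(D) at that K. The lnP(D) values below are invented for illustration and are not the study's data, and `delta_k` is a hypothetical helper name.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Return Delta K for every K that has both K-1 and K+1 available.

    lnp maps each K to the lnP(D) values of its replicate STRUCTURE runs.
    """
    means = {k: mean(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            # |L(K+1) - 2 L(K) + L(K-1)| on the per-K means
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp[k])
    return result


# Hypothetical lnP(D) values, five replicate runs per K (illustration only).
example = {
    2: [-5000, -5002, -4998, -5001, -4999],
    3: [-4700, -4702, -4698, -4701, -4699],
    4: [-4650, -4660, -4640, -4655, -4645],
    5: [-4620, -4640, -4600, -4630, -4610],
}
dk = delta_k(example)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

With these made-up values the largest ∆K falls at K = 3, because lnP(D) rises sharply up to K = 3 and flattens afterwards while the run-to-run spread stays small.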

Association mapping
For marker-trait association, a structured association approach was implemented with a general linear model (GLM) in TASSEL 2.1 (Bradbury PJ et al. 2007). To correct for spurious associations, the Q-matrix was included in the model. The significance threshold (P value) for association between markers and traits was 0.001. The phenotypic variance explained (PVE) by each significantly associated locus was evaluated from the R² values of the markers (Zhang J et al. 2011).
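The idea behind a structure-corrected GLM can be sketched as a comparison of two least-squares fits: a reduced model with only the Q-matrix covariates, and a full model that adds the marker alleles. This is a simplified illustration under stated assumptions, not TASSEL's implementation: the marker R² is taken here as the extra sum of squares explained by the marker divided by the total sum of squares, the P value is omitted because it needs an F-distribution routine, and all data and function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def rss(X: List[List[float]], y: Sequence[float]) -> float:
    """Residual sum of squares of the ordinary least-squares fit of y on X."""
    n, p = len(X), len(X[0])
    # Normal equations A b = c, with A = X'X and c = X'y.
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    c = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    # Gaussian elimination with partial pivoting (assumes full column rank).
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for k in range(col, p):
                A[r][k] -= f * A[col][k]
            c[r] -= f * c[col]
    # Back substitution for the coefficients b.
    b = [0.0] * p
    for r in reversed(range(p)):
        b[r] = (c[r] - sum(A[r][k] * b[k] for k in range(r + 1, p))) / A[r][r]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))


def marker_test(trait: Sequence[float], q_matrix: Sequence[Sequence[float]],
                alleles: Sequence[str]) -> Tuple[float, float]:
    """Return (F statistic, marker R^2) for one marker, adjusting for Q."""
    n = len(trait)
    # Reduced model: intercept plus K-1 Q columns (each Q row sums to 1).
    reduced = [[1.0] + [float(v) for v in q[:-1]] for q in q_matrix]
    # Full model: add one indicator per allele, first allele as reference.
    levels = sorted(set(alleles))
    full = [row + [1.0 if a == lev else 0.0 for lev in levels[1:]]
            for row, a in zip(reduced, alleles)]
    rss_reduced, rss_full = rss(reduced, trait), rss(full, trait)
    grand_mean = sum(trait) / n
    ss_total = sum((v - grand_mean) ** 2 for v in trait)
    df_marker = len(levels) - 1
    df_resid = n - len(full[0])
    ss_marker = rss_reduced - rss_full
    f_stat = (ss_marker / df_marker) / (rss_full / df_resid)
    return f_stat, ss_marker / ss_total


# Hypothetical data: 8 accessions, K = 3 membership coefficients, one SSR marker.
q_matrix = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1],
            [0.1, 0.1, 0.8], [0.05, 0.15, 0.8], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]
alleles = ["A", "B", "A", "B", "A", "B", "A", "B"]
trait = [1500, 1900, 1400, 1850, 1550, 2000, 1450, 1950]
f_stat, r2 = marker_test(trait, q_matrix, alleles)
print(f"F = {f_stat:.1f}, marker R^2 = {r2:.3f}")
```

In practice the F statistic would be converted to a P value and compared with the 0.001 threshold used in this study, and the test would be repeated for each marker-trait pair.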

Statistical analysis
The soybeans were cultivated using a completely randomized design with three replicates. The HPLC analysis of isoflavones was repeated three times for each variety. Analyses of variance for all data were performed using the general linear model procedure in SPSS 17.0. The pooled mean values were separated on the basis of least significant differences at the 0.05 probability level.

Phenotypic analysis of isoflavone content
Across the soybean varieties, the mean total isoflavone (TI) content was 1579.1). The high coefficients of variation (CV) indicated wide phenotypic variation among accessions, which made the panel suitable for association analysis.
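For readers unfamiliar with the CV, it is the sample standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on invented values. The numbers are purely illustrative and are not measurements from this study.

```python
from statistics import mean, stdev
from typing import Sequence


def cv_percent(values: Sequence[float]) -> float:
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical total isoflavone contents for a handful of accessions.
ti_values = [980.0, 1420.0, 1610.0, 2050.0, 2700.0]
print(f"CV = {cv_percent(ti_values):.1f}%")
```

A larger CV means the accessions differ more from one another relative to their average, which is what gives an association panel its power to detect marker effects.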

Allelic diversity and population structure
A total of 100 SSR markers were used to detect polymorphisms in all soybean varieties. A key issue for association analysis is the estimation of population structure, because unaccounted structure can result in spurious associations between phenotypes and markers. The Q-matrix from STRUCTURE can help to reduce the risk of false positives arising from population structure (Bradbury et al. 2007).
The same one hundred SSR markers were used to estimate the population structure. The average lnP(D) value for each K (from 1 to 8) is shown in Fig. S1, and the inflection point appeared at K = 3. According to lnP(D), the population was classified into three subpopulations containing 65, 16 and 66 accessions, respectively (Fig. S2).
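One simple way to turn a Q-matrix into subpopulation sizes is to assign each accession to the subpopulation with its largest membership coefficient. The text does not state which assignment rule was used, so the rule, the function name and the small Q-matrix below are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def assign_subpopulations(q_matrix: Sequence[Sequence[float]]) -> List[int]:
    """Assign each accession to the 1-based subpopulation with the largest Q value."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in q_matrix]


# Hypothetical membership coefficients for six accessions at K = 3.
q_example = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.10, 0.70],
    [0.60, 0.30, 0.10],
    [0.05, 0.15, 0.80],
    [0.45, 0.10, 0.45],  # tie: the first maximum wins under this rule
]
labels = assign_subpopulations(q_example)
sizes = Counter(labels)
print(labels, dict(sizes))
```

Accessions with no clearly dominant coefficient, such as the last row, are often treated as admixed instead of being forced into one group; a membership threshold would be one way to handle them.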

DISCUSSION
Soybean seed isoflavones have many uses in foods, medicines, cosmetics, and animal farming (Brouns F 2002). Breeders are therefore increasingly focused on improving seed isoflavone content in soybean cultivars. Fendou53 (2721.44 µg/g) had the highest isoflavone content of all soybean varieties across the three years. Choi et al. (1996) reported that total isoflavone content ranged from 458 to 3309 µg/g across locations within the same year for single or multiple soybean cultivars (Wang H and Murphy P 1994; Choi JS et al. 1996). In our study, the TI values are given in Table 1. Meanwhile, although seed isoflavone content is attributable to genetic and environmental causes as well as g × e or year × location interactions, these studies showed that the main genotypic effects on total and individual isoflavones are important for effective cultivar improvement.
Of the 100 SSR markers selected for association analysis, twelve loci associated with DZ, fourteen with GC, twelve with GT, and eight with TI were mapped onto eleven, eleven, ten and eight linkage groups (LGs), respectively. These association loci explained 4.1-10.6% of the phenotypic variation for total isoflavone. Most of the variation explained was <30%. In soybean seeds, the low level of phenotypic variation explained by association analysis was similar to that in other studies (Njiti VK 1999; Meksem K et al. 2001; Kassem MA et al. 2004; Kassem MA et al. 2006; Primomo VS et al. 2005).
In this study, thirty-one SSR markers were associated with DZ, GC, GT and TI, and among them one marker (Satt540 in LG M) was detected for DZ, GT and TI at the same time. Wang Yan et al. (2014) used 'Zhongdou27' (high isoflavone) × 'Jiunong20' (low isoflavone) to identify eQTL underlying expression of four gene families encoding isoflavone synthetic enzymes involved in the phenylpropanoid pathway; the loci they reported included the Satt540 marker. Primomo et al. (2005) found that the Satt540 marker was associated with differences in isoflavone content in soybean seeds (Primomo VS et al. 2005). Zeng Guoliang et al. (2009) also found that the Satt540 marker was associated with GC, GT and TI.
The repeated detection of Satt540 across traits and studies suggests that it was only weakly influenced by genetic background and environment.
Satt540 has also been associated with certain foliar resistances, such as resistance to aphids and to white mold (Li Y et al. 2007; Guo XM et al. 2008). Furthermore, isoflavones in leaves that protect soybeans from pests or pathogenic microbes may be transported to the seed (Morris PF et al. 1991; Benhamou N et al. 1999). Therefore, the Satt540 marker in this region could represent a major seed isoflavone content locus. Another SSR marker (Satt546 in LG D1b) was associated with GC, GT and TI. This is a novel finding related to seed isoflavone content: a QTL for expression of a gene family encoding an isoflavone synthetic enzyme, C4H (cinnamate-4-hydroxylase), has been reported in this region of the soybean genome (Yan Wang et al. 2014).
The present study investigated a number of SSR markers associated with DZ, GC, GT and TI, and predicted some SSR markers related to isoflavone content in soybean seeds.
Seed isoflavonoids display a broad range of variation, and their synthesis and accumulation are affected by many biotic and abiotic factors. Considerable advances have been made in these studies. The markers associated with seed isoflavones in our study could promote marker-assisted selection (MAS) in breeding programs and provide an efficient method for developing soybean cultivars.

CONCLUSION
In summary, 31 marker-trait associations related to the four traits were identified, including the SSR markers Satt540 and Satt546, which have been reported previously. The results also suggest that the use of the SSR markers Satt540 and Satt546 could provide an efficient method for developing high-isoflavone soybean cultivars.

Fig. S1. Values of ∆K, with its modal value used to detect the true K of three subgroups (K = 3).

Fig. S2. Population was classified into three subpopulations.
In the HPLC analysis, the mobile phase consisted of solvents A and B. Solvent A was 40% methanol and solvent B was 0.1% glacial acetic acid (pH 3.22). The injection time was 20 min with a 10 μL sample and a solvent flow rate of 1 mL·min⁻¹.