2010). The GWA analyses can be done on individual-level data or on single-trait GWA summary statistics only. While this allows for a relatively simple statistical model, the interwoven nature of gene expression means that many traits are correlated with each other (Sodini et … Recent advancements such as a reference bean genome sequence (Schmutz et al. These locations can represent similar environments, such as regional crop production sites that experience somewhat similar weather patterns, or diverse sites that cross national or continental boundaries. Since the three locations were considered different environments with potentially different heat stress conditions, the phenotypic data were transformed using the Z transformation, and the data were combined into a single MLM GWAS analysis (Figure 3A). Yield is the primary target for genetic improvement, and an important genetic goal is to understand the response of yield to a specific stress across locations. The snpEff database was used to describe potential effects of SNPs within the ±50 kb interval of a peak SNP. The first dense genotyping tool was the 6k Illumina Infinium SNP assay (Song et al. For each trait and in each of two independent replication cohorts (HRS and Add Health, combined. This supports other observations that the diversity of the Mesoamerican race is greater than that found within Andean genotypes. Combining the single-trait GWAS in a multi-trait analysis resulted in 563 and 263 significant SNPs at significance thresholds of P < 10⁻⁵ and P < 5 × 10⁻⁷, respectively. The first MTMM analysis evaluated DTF measured in HN and PR under heat stress conditions in 2016 (Figure 5A). That population was used to identify candidate genes for production (Moghaddam et al.
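The ±50 kb interval around a peak SNP mentioned above can be illustrated with a minimal sketch. It assumes gene models are available as simple records with a chromosome, start, and end; the record layout, the function name `genes_in_window`, and the gene identifiers are hypothetical and are not taken from snpEff or from the study's pipeline.

```python
def genes_in_window(peak_chrom, peak_pos, genes, flank=50_000):
    """Return ids of genes overlapping [peak_pos - flank, peak_pos + flank].

    `genes` is a list of dicts with keys "id", "chrom", "start", "end"
    (a hypothetical layout chosen for illustration only).
    """
    window_start = peak_pos - flank
    window_end = peak_pos + flank
    hits = []
    for gene in genes:
        same_chrom = gene["chrom"] == peak_chrom
        # Two intervals overlap when each one starts before the other ends.
        overlaps = gene["start"] <= window_end and gene["end"] >= window_start
        if same_chrom and overlaps:
            hits.append(gene["id"])
    return hits
```

Any gene that overlaps the window even partially is reported here; whether the original analysis required full containment is not stated, so that choice is an assumption.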
We introduce Multi-Trait Analysis of GWAS (MTAG), a method for joint analysis of summary statistics from GWASs of different traits, possibly from overlapping samples. This is a direct advantage for a project with more limited resources because statistically sound results can reveal important genetic relationships that would not have been detected with an MLM analysis with smaller panel sizes. That trend was observed here with a Pearson correlation of r = -0.35 between the two traits. When the same gene is involved in domestication, recent research has shown that convergent evolution produced unique alleles in each gene pool that were associated with the domesticated phenotype (Kwak et al. This gene is the bean homolog of Arabidopsis HOS15, a gene associated with histone deacetylation and epigenetic control of flowering (Zhu et al. 2012). These analyses provide a statistical framework for multiple tests that can reveal common genetic effects that affect two traits or one trait in two environments. 2016). 2014), genotype-by-sequencing methods (GBS; Schröder et al.
Moderately sized Bean Abiotic Stress Evaluation (BASE) panels, consisting of genotypes appropriate for production in Central America and Africa, were assembled. 2008) to calculate that number of markers, which in turn was used to determine our P-value cutoff of -log10(P) = 4.1. 2017; Minkoff et al. Simulations showed that the multi-trait GWAS method could provide increased power in detecting pleiotropic loci affecting more than one trait, and can estimate effects of QTS without bias. These independent evolutionary paths have also affected marker development and deployment, most notably for disease resistance markers, where quite often a specific marker is only diagnostic in one gene pool (Miklas et al. 2012) GWAS approaches to discover genetic factors associated with several phenotypic traits. This begins with a determination of the genetic correlation of the response in the two locations. In gene regions harboring … This SNP peak is located in one of the two major clusters of malectin/receptor-like kinase genes in the common bean genome. The peak SNP in each analysis was located at Pv04:4,665,828 bp and accounted for 8.9 and 7.8% of the variation, respectively, for the heat and drought trials. 2012). Collectively, these three SNPs accounted for 17.6% of the observed variation. For example, an MA diversity panel (MDP; n∼300) was developed for the USDA-funded BeanCAP project that consisted of germplasm grown in the major US production regions from the 1930s to the 2000s (Moghaddam et al.
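The link between an effective number of independent markers and the -log10(P) = 4.1 cutoff mentioned above can be sketched as a Bonferroni-style calculation. This is an illustration only: the family-wise error rate of 0.05 is an assumption (the text does not state it), and the function names are hypothetical rather than part of simpleM.

```python
import math


def neg_log10_threshold(m_eff, alpha=0.05):
    """-log10 of the Bonferroni-style cutoff alpha / m_eff.

    m_eff is the effective number of independent tests; alpha = 0.05
    is an assumed family-wise error rate, not a value from the text.
    """
    return -math.log10(alpha / m_eff)


def implied_m_eff(neg_log10_p, alpha=0.05):
    """Effective number of tests implied by a -log10(P) cutoff."""
    return alpha * 10 ** neg_log10_p
```

Under the assumed alpha of 0.05, a cutoff of 4.1 would correspond to roughly 630 effective independent tests; a different alpha would imply a different count.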
Within these regions, significant SNPs were located in three candidate gene models (Phvul.003G179500, Phvul.003G187400, and Phvul.011G159200). The primary role of HOS15 is the regulation of flowering under cold stress. 2016) protocol were pooled, and new SNP calls made. An example is DTF and DTM, two traits often found to be correlated. Multi-trait methods have already been successfully used to identify QTL sustaining genetic correlations in beef cattle, such as growth and intake components of feed efficiency [12] as well as stature, fatness, and reproduction [13, 14]. 2014) in distinct locations to form two distinct domesticated clades. The highest level of expression for this gene was noted in flower buds relative to other developmental and anatomical tissues (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Pvulgaris). From a plant breeding perspective, the development of molecular markers that are functional across locations should be possible. The peak SNP for yield (Pv03:41,096,424 bp; P = 9.05E-8) is located on the distal end of chromosome Pv03 and explains 14% of the variation in yield (Table S2). 2017), and domestication traits such as increased leaf and seed size (Schmutz et al. Manhattan and QQ plots were generated from SNPs with MAF > 0.05 using the mhtplot function from the R package gap (Zhao 2007). BWA-MEM (Li 2013), and Samtools (Li et al. Many genetic variants identified in genome-wide association studies (GWAS) are associated with multiple, sometimes seemingly unrelated, traits. Only one region, Pv11:47.1 Mb, was found to have a common effect that exceeded the Bonferroni threshold (Table 3). Common bean (Phaseolus vulgaris L.) is the most important and affordable food legume for over 80 million poor people in regions of Latin America, the Caribbean, and Eastern and Southern Africa. The density of SNPs is essentially equal across the full genome of the two gene pools with an enrichment of SNPs in the heterochromatic regions.
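The MAF > 0.05 filter applied before plotting can be sketched as follows. The sketch assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2) with `None` for missing calls; that coding and the function names are assumptions made for illustration, not the format used in the study.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency for one biallelic SNP.

    `genotypes` holds alternate-allele counts (0, 1 or 2) per individual,
    with None marking a missing call (an assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0  # no information: treat as monomorphic
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps, threshold=0.05):
    """Keep SNPs whose MAF is strictly greater than `threshold`."""
    return {
        name: calls
        for name, calls in snps.items()
        if minor_allele_frequency(calls) > threshold
    }
```

Missing calls are dropped from the denominator here; other pipelines may handle them differently.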
This tree was developed with the 5,637 SNPs shared between the MA and Andean SNP data sets. These HapMaps were based on 381,092,199 GBS reads across 469 MA genotypes and 280,085,901 GBS reads across 325 Andean genotypes. This result suggests that genetic factors that are common or show a significant interaction effect between the two heat stress environments may be discovered. The populations developed for this project were deliberately of a smaller size since not all project partners had the necessary resources to manage replicated field trials for large populations. Again, because the trials were under two stress conditions, Z-transformed data were evaluated. Consistent with the individual trials, the peak SNP accounted for 8.4% of the variation (Table S2). We tested that premise on two data sets. Pearson phenotypic, genetic and environmental correlations and joint heritability estimates for the DTF HN 2016 & DTF PR 2016 and DTF PR 2016 & DTM PR 2016 combinations. Significant associations for days to flower measured in heat conditions in Nacaome, Honduras (HN) and Juana Diaz, PR (PR) on the BASE_Meso panel in 2016. This was expected because beans grown in the target Central American region are almost exclusively from race Mesoamerica of the MA gene pool. In this way, we can pool the Z data across locations or stresses to discover common factors affecting the trait. GWAS analysis for SPAD rating for BASE_Meso grown under heat (D) and drought (E) in Puerto Rico in separate trials in 2016.
These SNP data sets can also serve as a base to build much larger SNP sets such as those developed for maize (Glaubitz et al. The peak SNP discovered in a joint MLM analysis for yield over years in the HN and PR heat stress environments is located in gene model Phvul.003G187400. For DTF, this correlation was high (r = 0.96) and very significant (Table 1), and without environmental effects. Days to flower in Honduras (Trait 1) and Puerto Rico (Trait 2), grown in 2016. 2015). One persistent challenge when searching for important genetic factors related to a trait of interest is performance across locations. The standard score (or Z transformation) is ideal for this purpose because phenotypic values are scaled relative to the variation at the location. GWAS experiments are also revealing that adaptation to environmental stress conditions evolved differentially in the two gene pools, as exemplified by the discovery that distinct genetic factors are associated with the response to flooding in the two gene pools (Soltani et al. Eight Andean genotypes (green in Figure 2B), including G13654, G2377, G23829, SAB_6292, SEQ_11, 754_3 and 379_PI_203934, were grouped with BASE_Meso genotypes despite being selected as members of the BASE_Andean panel. The authors would also like to thank Rian Lee (North Dakota State University) and Sujan Mamidi (Hudson Alpha Institute of Biotechnology) for their lab support and professional guidance. DUF538 proteins are putative chlorophyll hydrolyzing enzymes that function in the ROS detoxification system when the plant is exposed to heat and drought stress (Gholizadeh et al.
mtag (Multi-Trait Analysis of GWAS) is a Python-based command line tool for jointly analyzing multiple sets of GWAS summary statistics as described by Turley et al. Here we applied the simpleM algorithm (Gao et al. This Pv08 cluster contains multiple paralogs of the bean COK-4 gene, and one of the bean paralogs was recently shown to rescue Arabidopsis mutant FER lines susceptible to Pseudomonas syringae (Azevedo et al. 1993, 1996) while being monomorphic in the other pool regardless of whether the genotype is resistant or susceptible. Field M. phaseolina infection data were collected on the BASE_120 population grown in PR in 2014 under heat stress and in 2015 under drought stress. For both traits, the pooled data identified the same significant peak SNP regions that were observed in the individual analyses with the untransformed data. The major SNP peak under heat (Figure 4D) was located at Pv09:17,981,113 (P = 1.08E-6) and accounted for 12.7% of the phenotypic variation (Table S2). The interaction model identifies SNPs that act differentially for the two traits or locations. 2014).
Multi-trait analyses, such as polygenic risk scores, offer insights into shared and distinct aetiology among different phenotypes, such as ADHD, autism, schizophrenia, eating disorders and obesity. 2011). Maximum likelihood phylogenetic tree of 769 genotypes from Andean and Middle American gene pools using 5,637 loci with LD < 0.1. The MTMM analysis of DTF data from the BASE_Meso population grown in HN and PR under heat in 2016 showed the full joint analysis outperformed individual marginal analyses (Figure S2). The utility of multi-trait mixed model (MTMM) GWAS analysis (Korte et al. MTAG accounts for both sources of overlap. In Arabidopsis, BIM1 functions in the brassinosteroid pathway to regulate flowering through its interaction with SPL8 to promote anthesis (Xing et al. The phenotypic variation explained by a significant marker was described as a likelihood-ratio-based R2 (R2LR; Sun et al. In one case, it is useful when comparing two locations and searching for SNPs associated with differential (or GxE) effects or SNPs that condition a common response in both locations. To maximize the number of SNPs for the haplotype maps, sequencing reads from multiple GBS libraries consisting of individuals with either MA or Andean parentage were pooled. GWAS were performed for each trait in each location under different stress conditions using untransformed data. The A allele at the peak SNP was associated with lower disease incidence in the two trials. This peak QTL region is located in a cluster of chitinase genes. Given proper flowering conditions, this trait can be an indicator of yield potential. 2015b), disease resistance (Zuiderveen et al.
This model contains a DUF538 domain that is a key signature of the DUF538 superfamily whose members are well-known stress-related proteins in plants (Gholizadeh 2016). 2018). Candidate genes were selected within a ±50 kb interval of the peak SNP within a GWAS peak region. Only processed reads with a quality score ≥ 20 and a minimum trimmed length of 180 bp were used for mapping. For joint analyses of a phenotype with data from multiple stresses or locations, the data were transformed prior to the GWAS analysis to a standard scale using the statistical Z-transform (the ratio of the deviation of the individual phenotypic value from the population mean to the population standard deviation of the experiment in which the observation was collected). 2018; Stegmann et al. Therefore, the results from STRUCTURE analysis confirmed the two BASE panels represent distinct populations and are appropriate for studies designed to investigate the genetic factors controlling important agronomic traits within each gene pool. Across the multi-trait methods, mvGWAS had a slightly higher true-positive detection rate than the PC1 GWAS when all of the simulated trait heritabilities were either 0.9 or 0.5. Overall, a total of 155 genotypes from the MA gene pool, 147 Andean genotypes, and 5 tepary bean (Phaseolus acutifolius) genotypes from the BASE germplasm collection were evaluated in three separate panels (Table S1). The 1,882 SNPs were also used to develop a bifurcated ML phylogenetic tree that demonstrated the two populations were clustered into two separate clades (Figure 2B). Final subpopulation graphics were produced by the Distruct 1.1 program.
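The Z-transform defined above (deviation from the experiment mean divided by the experiment standard deviation) can be sketched as follows. The record layout of (experiment, genotype, value) tuples and the function names are hypothetical. The use of the population standard deviation follows the wording of the definition; whether the original analysis used the population or sample form is not confirmed.

```python
from collections import defaultdict
from statistics import mean, pstdev


def z_transform(values):
    """Scale values to mean 0 and SD 1 within one experiment."""
    mu = mean(values)
    sd = pstdev(values)  # population SD, following the definition in the text
    if sd == 0:
        raise ValueError("no phenotypic variation in this experiment")
    return [(v - mu) / sd for v in values]


def pool_by_experiment(records):
    """Z-transform each experiment separately, then pool the results.

    `records` is a list of (experiment, genotype, value) tuples
    (a hypothetical layout). Returns (experiment, genotype, z) tuples.
    """
    grouped = defaultdict(list)
    for experiment, genotype, value in records:
        grouped[experiment].append((genotype, value))
    pooled = []
    for experiment, rows in grouped.items():
        z_scores = z_transform([value for _, value in rows])
        for (genotype, _), z in zip(rows, z_scores):
            pooled.append((experiment, genotype, z))
    return pooled
```

Scaling within each experiment before pooling means that a genotype's value reflects its rank relative to that trial's variation, which is what allows trials with different stress intensities to be combined in one analysis.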
2016), GWAS (Moghaddam et al. 2008). Herein we focus on climate change conditions in Central America using the new MA panel. The optimum number of subpopulations was k = 2 (Figure 2A) and corresponds to the two BASE panels. Domestication within each of the clades involved between 748 (Andean) and 1748 (MA) genes, but only 59 genes were shared (Schmutz et al. Of these significant factors, none exhibited an interaction effect; rather, many were found to be common between the two environments. Initial abiotic stress tolerance studies in common bean used bi-parental populations (Blair et al. This platform proved useful for the discovery of many important agronomic traits, primarily with bi-parental mapping studies of common bean (Mukeshimana et al. Because the exact function of DUF538 proteins is yet unknown, the genetic association of this gene as a yield factor under heat stress may provide a link between cytosolic protection (Gholizadeh 2016) and yield performance. The full model also outperformed the individual marginal analyses when DTF and DTM data were considered jointly.
Available at Figshare: https://doi.org/10.25387/g3.7965305. The MA HapMap contained 205,293 SNPs.