A Multi-Omics Database for Tomato Research and Breeding

Mar 11, 2019 | Blogs, Life Science Research, Metabolomics


Ace. Beefsteak. Big Boy. Kumato. Early Girl. Roma. Sun Gold. San Marzano. These are just a few of the thousands of varieties of tomato plants available today. And while all of these varieties may be very different with respect to crop yield, disease resistance, fruit shape, color, and size, these traits (and more) were mainly cultivated through phenotype selection.

More recently, breeders have experimented with molecular breeding strategies, which became more effective once the tomato genome was completed over five years ago. Now scientists at the Chinese Academy of Agricultural Sciences in Shenzhen and the Huazhong Agricultural University in Wuhan, in collaboration with other scientists in China, the United States, Germany, and Bulgaria, have established a multi-omics database for tomatoes consisting of metabolomics, vari-omics (mutation data), and transcriptomics data in addition to genomic data. By combining LC-MS/MS metabolomics data, transcription data, and SNP data, they were able to compare concentrations of metabolites with mutation frequencies and gene expression data. As described in their paper “Rewiring of the Fruit Metabolome in Tomato Breeding,” the combined database not only provides a framework for conducting tomato breeding more scientifically, but also provides a foundation for understanding how domestication and subsequent cultivation of the tomato plant resulted in profound changes to the fruit phenotype (as observed through the metabolome) that were often unintentional and unexpected.

Take for example fruit size. Wild type tomatoes are small berry-like fruits. Domestication and subsequent cultivation for larger fruit have resulted in the “beefsteak” type tomatoes of today that are over 100x larger than their ancestors. A closer look at the metabolomes with respect to the variomes of tomato species representing wild type “PIM” (S. pimpinellifolium), domesticated “CER” (S. lycopersicum var. cerasiforme) and cultivated “BIG” (S. lycopersicum) populations shows that only about 30% of the metabolome is likely associated with genes responsible for larger fruit. Thus, the majority of the metabolite differences observed between PIM-CER and CER-BIG were likely due to the linkage of other genes with fruit weight genes during the breeding process.

What about taste? Another fascinating finding involved fruit bitterness. Steroidal glycoalkaloids (SGAs) are one of the main classes of secondary metabolites found in tomatoes, and they protect the plant from predators (insects, bacteria, etc.). However, because of the bitterness they impart to the tomato fruit, they have been selected against during cultivation. In this study, the researchers were able to map this selection through their multi-omics database. They show that one particular SNP, which results in a V-to-A substitution in a glycoalkaloid metabolism protein, was associated with altered content of 8 SGAs. Additionally, the frequency of the allele for lower SGA content increased from 0% in PIM to 26.3% in CER to 57.3% in BIG, which is consistent with selection for less bitter fruits over the course of domestication and cultivation. Here, the metabolomics data was pivotal for understanding how bitterness was deselected by breeding SGAs out of modern tomato varieties, and how a single mutation was responsible for regulating a large number of these SGAs.
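To make the allele frequency figures above concrete, here is a minimal sketch of how such a frequency could be computed from diploid genotype calls. The function name and the genotype lists are hypothetical and chosen only for illustration; they are not data from the paper, and the authors' actual variant-calling pipeline is not described in this post.

```python
def alt_allele_frequency(genotypes):
    """Return the alternate-allele frequency for a list of diploid genotypes.

    Each genotype is the number of alternate alleles an accession carries:
    0 (homozygous reference), 1 (heterozygous), or 2 (homozygous alternate).
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    # Every diploid accession contributes two alleles to the denominator.
    return sum(genotypes) / (2 * len(genotypes))


# Hypothetical genotype calls per population (illustrative only).
groups = {
    "PIM": [0, 0, 0, 0],
    "CER": [0, 0, 1, 0, 2, 0],
    "BIG": [2, 2, 0, 1, 2, 0],
}

for name, calls in groups.items():
    print(f"{name}: {alt_allele_frequency(calls):.1%}")
```

A rising frequency from the wild group to the cultivated group, as in this toy data, is the kind of pattern the authors report for the lower-SGA allele.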

And then there’s the matter of color. Tomatoes are red, right? While tomatoes are typically thought of as red, they actually come in a variety of colors. For example, pink tomatoes are especially popular in Asian countries. Using LC-MS/MS again, the researchers performed metabolomics studies on pink vs. red tomato populations and found 122 metabolites that differed between the two. Among the 13,361 triple relationships across the multi-omics data set, a previously identified signature SNP within the SlMYB12 transcription factor could be associated with 56 of these metabolites and 69 genes. Next, SlMYB12 knockout mutants were created. In these mutants, 152 metabolites were found to be altered. As expected, the concentrations of flavonoids were altered since they are responsible for the red color. However, other chemicals were also changed, including glycoalkaloids, polyamines, polyphenols, and primary metabolites. Thus, SlMYB12 is directly or indirectly involved in more than just flavonoid metabolism and appears to be a “major hub gene” of fruit metabolism. As the authors state, this is an excellent example of how selection for one trait can have a major impact on seemingly unrelated traits that can potentially impact fruit quality.
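The idea of a "triple relationship" can be illustrated with a toy sketch. It assumes, for illustration only, that a triple is a SNP, a metabolite, and a gene whose measurements are all pairwise correlated across accessions. The paper's actual statistics are not reproduced here; the 0.8 threshold, the function names, and all of the values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def is_triple(genotype, metabolite, expression, threshold=0.8):
    """True if all three pairwise |correlations| reach the (assumed) threshold."""
    pairs = [
        (genotype, metabolite),    # SNP vs. metabolite level
        (genotype, expression),    # SNP vs. gene expression
        (metabolite, expression),  # metabolite level vs. gene expression
    ]
    return all(abs(pearson(a, b)) >= threshold for a, b in pairs)


# Hypothetical measurements for six accessions (illustrative only).
genotype = [0, 0, 1, 1, 2, 2]                   # alternate-allele counts
metabolite = [9.8, 10.1, 6.9, 7.2, 4.1, 3.8]    # relative metabolite content
expression = [5.0, 5.2, 3.6, 3.4, 2.1, 1.9]     # relative transcript level
unrelated = [1.0, 3.0, 2.0, 3.0, 1.0, 2.0]      # metabolite with no clear link

print(is_triple(genotype, metabolite, expression))
print(is_triple(genotype, unrelated, expression))
```

A real analysis would use association tests with multiple-testing correction rather than a fixed correlation cutoff. The sketch only shows how three data layers measured on the same accessions can be linked through a shared variant.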

There is so much information in this paper that it is impossible to review it all in this blog. Through their work, the authors have shown that a multi-omics database can be used as both a research tool and a production tool for the improvement of tomato crops. With it, they were able to understand the root cause behind differences in plant phenotypes and the evolution of different cultivars through the breeding process. Additionally, by linking metabolomics data with genomic data, the authors were able to map out how breeding for specific traits may unintentionally impact other fruit quality parameters. Indeed, this is a shining example of how a multi-omics database workflow can be used to understand plant evolution and improve food production.


  1. Zhu et al., 2018, Cell 172, 249–261, January 11, 2018. © 2017 Elsevier Inc. https://doi.org/10.1016/j.cell.2017.12.019

Questions for the Authors

  1. With respect to the metabolomics data and the discussion of coefficients of variation, you state that the phenotypic variation of the PIM group you observed was substantially lower than the CER group and the BIG group which is inconsistent with the level of genetic diversity in the sub-populations. Could some of that discrepancy be explained because there were simply fewer varieties representing the PIM group for your studies (31) vs. CER and BIG (124 and 287 respectively)?

    Yes, that explanation is reasonable and is likely the main one. Also, the detected metabolomes cover only part of the metabolites in the entire fruit and do not necessarily mirror the variation seen at the genomic level.
  2. Your studies around introgression of the tomato mosaic virus resistance gene were fascinating. In particular, you found 346 metabolites that were altered between resistant (R) and susceptible (S) genotypes. Comparing these to the 589 metabolites that were significantly different between PIM and BIG (indicative of the domestication process), there were 52 metabolites that were increased in the R pool but were decreased from PIM to BIG, and 75 metabolites that were decreased in the R pool but were increased from PIM to BIG. In other words, the fate of these metabolites is reversed by resistance breeding using wild introgression. Do you have any insight into what these metabolites are? What does this mean with regard to future breeding strategies?

    These metabolites will be studied in the future. In the breeding process, when introducing alleles from wild resources, we must consider not only the function of the gene but also its other effects, including hitchhiking effects.
  3. As you describe in your article, due to their bitterness, SGAs were generally selected against for tomato cultivation. But because SGAs have a protective property against predators, this would theoretically leave the fruit more vulnerable. Were you able to detect any changes that would suggest that the plants can compensate for the loss of SGAs by other mechanisms? For example, did you see any correlation with regard to an increase in fruit skin thickness or toughness with decreasing SGA content either phenotypically or in the metabolome data?

    We do not have much more information about the balance between SGA content and protective properties, but studies have shown that during fruit ripening, SGAs are transformed through a series of modifications that alter the function of the metabolite, affecting activity or transport (the metabolite modifications will be reviewed and published in the Plant Cell). We are also exploring similar questions in studies of cucumber bitterness (Science 346, 1084-1088).
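
As a footnote to the first question above, the coefficient of variation (CV) is the standard deviation divided by the mean. The sketch below shows the calculation for a single metabolite across the three groups. All values are hypothetical and are not taken from the paper; it is only meant to show what is being compared when CVs of groups with very different numbers of accessions are set side by side.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # statistics.stdev uses the n - 1 (sample) denominator and
    # requires at least two values.
    return statistics.stdev(values) / mean


# Hypothetical relative contents of one metabolite per group (illustrative only).
content = {
    "PIM": [10.0, 11.0, 9.5, 10.5],
    "CER": [4.0, 12.0, 7.5, 15.0, 9.0, 6.0],
    "BIG": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0],
}

for group, values in content.items():
    cv = coefficient_of_variation(values)
    print(f"{group}: n={len(values)}, CV={cv:.2f}")
```

With only a handful of accessions, a group's CV estimate is less stable, which is one reason a smaller PIM sample could appear less variable than its genetic diversity would suggest.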
