A Smart Way to Profit from the Wealth of Biobanks

Apr 8, 2019 | Blogs, Life Science Research, Proteomics


Microflow LC with SWATH® Acquisition for Digitizing Biobanks

What if you could access thousands of high-quality samples for your research? What if these samples were well-annotated biological specimens? And what if they were carefully segmented into just the specific disease you were interested in studying?

Enter the 21st century and the age of the Biobank.

But what exactly is a biobank? Biobanks are repositories of biological specimens collected for research purposes. These specimens range from different types of tissue (fresh, frozen, or formalin-fixed paraffin-embedded samples) to different types of body fluids (blood, serum, plasma, urine, saliva, CSF). Specimens can be collected at the point of surgery, for example by saving tissue during excision of a tumor, or through a more regimented process such as asking for blood donations from people who all share a specific phenotype. Some biobanks draw on people in a specific population or region, and others are dedicated to a particular disease. However, many biobanks are more generic and host specimens for a range of diseases and purposes. Today, researchers can find samples representing a whole host of diseases including cardiovascular, metabolic, infectious, inflammatory, neurodegenerative, and even extremely rare diseases, in addition to many types of cancers.

Within the last 20 years, biobanks have been steadily growing in popularity, number, and size to the point where they now play a vital role in biomedical research. According to recent estimates, there are hundreds of biobanks around the world, found in Europe, North America, Asia, Australia, and even the Middle East. While many of these repositories are privately held or for use only by researchers within their own organizations, many others offer their specimens publicly.1 Together, these repositories safeguard millions of biological specimens, with the largest biobanks alone holding nearly 20 million samples.2 Need brain tissue samples for glioblastoma? There’s a biobank for that. Need blood samples for a twins study? There’s a biobank for that. Want to study thyroid lesions as a direct consequence of exposure to radioactive iodine in fallout from the 1986 Chernobyl nuclear accident? There’s a biobank for that too.3

So how can a proteomics researcher take advantage of this remarkable resource? Genomic researchers have been using biobank specimens for years for genome-wide association studies (GWAS) to look for genetic variations such as single nucleotide polymorphisms (SNPs) that are associated with certain diseases or phenotypes. These microarray experiments have been embraced around the world because they enable thousands or even millions of samples to be interrogated with ease and high throughput.

At first glance, however, a similar endeavor at the protein level appears more challenging. While there are certainly many proteomics-based workflows in use today, a proteome-wide interrogation of large numbers of samples would need to be high throughput. Additionally, it would need to be highly sensitive, yet robust and easy to implement. And finally, like an array of an entire human genome on a chip, it would need to be comprehensive and reproducible.

Enter Microflow LC with SWATH Acquisition.

SWATH Acquisition is a data-independent acquisition (DIA) strategy that differs from more traditional data-dependent acquisition (DDA) strategies. With SWATH Acquisition on a TripleTOF® mass spectrometer, essentially a complete fragment ion map of all detectable species is created for each sample. Studies have shown that SWATH Acquisition is capable of consistently detecting and reproducibly quantifying thousands of proteins from cell lines, not only run-to-run within a single laboratory, but also across multiple laboratories, instruments, and operators located globally.4 The technique is highly sensitive with excellent quantitative accuracy and, similar to a genome on a chip, SWATH Acquisition for proteomics is comprehensive and reproducible.

However, SWATH Acquisition also depends upon liquid chromatography (LC) for separation of peptides prior to mass analysis. Proteomics experiments have historically relied on nanoflow LC for the high sensitivity it affords, but nanoflow LC can be slow and tedious. Microflow LC, by contrast, can provide enhanced workflow robustness and sample throughput with minimal compromise on overall workflow sensitivity.5 Depending upon the study objectives, users can tailor the throughput vs. depth equation to suit their needs. For example, an investigation of how very fast LC gradients affect the number of proteins and peptides quantified reproducibly and with high confidence found only minimal losses when the gradient length was halved from 40 min to 20 min. Even very fast gradients of 5 minutes still reproducibly produce high numbers of identified and quantified proteins.6 Thus, for larger cohort studies, a quick interrogation with a 5-10 min gradient to detect medium- to high-abundance proteins may be enough to help stratify the samples.
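The throughput side of this trade-off is simple arithmetic. Below is a minimal Python sketch, assuming continuous 24-hour acquisition and a hypothetical fixed 10-minute overhead per injection (loading, washing, re-equilibration). Neither assumption comes from the cited technical notes, and the function names are illustrative only, so treat the numbers as a planning aid rather than instrument specifications.

```python
import math

MINUTES_PER_DAY = 24 * 60


def samples_per_day(gradient_min: float, overhead_min: float) -> int:
    """Whole injections that fit into 24 h of continuous acquisition."""
    cycle_min = gradient_min + overhead_min  # total time per injection
    return int(MINUTES_PER_DAY // cycle_min)


def days_for_cohort(n_samples: int, gradient_min: float, overhead_min: float) -> int:
    """Days of continuous acquisition needed for a cohort, rounded up."""
    return math.ceil(n_samples / samples_per_day(gradient_min, overhead_min))


if __name__ == "__main__":
    OVERHEAD_MIN = 10.0  # hypothetical per-injection overhead, not a measured value
    COHORT_SIZE = 1000   # hypothetical biobank cohort size
    for gradient in (40, 20, 10, 5):
        per_day = samples_per_day(gradient, OVERHEAD_MIN)
        days = days_for_cohort(COHORT_SIZE, gradient, OVERHEAD_MIN)
        print(f"{gradient:>2} min gradient: {per_day} samples/day, "
              f"{days} days for {COHORT_SIZE} samples")
```

Substituting a measured overhead for your own LC setup shows how quickly the acquisition time for a large cohort shrinks as the gradient shortens, which can then be weighed against the depth of coverage the study needs.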

Putting it all together adds up to a powerful workflow for proteome-wide association studies of biobank specimens. As recently demonstrated by researchers at SCIEX and Biognosys, microflow LC SWATH Acquisition of 105 FFPE colon cancer biobank specimens identified and quantified ~4,500 proteins across healthy and diseased samples in only 5 days.7 Based on the quantitative differences between the detected proteins, the analysis stratified the diseased samples into three subtypes. Additionally, by overlaying ontology information, molecular functions and biological processes associated with up- and down-regulated proteins could be identified.

So when you’re planning your next phase of research, remember the vast resource of biobank specimens. Capitalize on this resource to accelerate your proteomics studies using microflow LC and SWATH Acquisition. Ultimately, the protein-disease correlations that this workflow can uncover may point to new targets and treatments and a better understanding of health and disease.


  1. Global Biobank Directory, Tissue Banks and Biorepositories: https://specimencentral.com/biobank-directory/
  2. 10 Largest Biobanks in the World: Biobanking.com: https://www.biobanking.com/10-largest-biobanks-in-the-world/
  3. Chernobyl Tissue Bank: https://www.chernobyltissuebank.com/
  4. Collins, B. C.*, Hunter, C.*, Liu, Y.*, Stefani, T., Chan, D., Zhang, H., Bader, S., Moritz, R. L., Schilling, B., Gibson, B. W., Krisp, C., Molloy, M. P., Hou, G., Lin, L., Liu, S., Hirayama, M., Ohtsuki, S., Selevsek, N., Schlapbach, R., Tzeng, S.-C., Held, J. M., Larsen, B., Gingras, A.-C., and Aebersold, R. “Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of SWATH-mass spectrometry,” Nature Communications (2017), 8.
  5. Technical Note: “Microflow SWATH Acquisition for Industrialized Quantitative Proteomics,” Christie L Hunter, Nick Morrice.
  6. Technical Note: “Accelerating SWATH Acquisition for Protein Quantitation – Up to 100 Samples per Day,” Christie Hunter, Nick Morrice, Zuzana Demianovac.
  7. Technical Note: “Digitizing the Proteomes From Big Tissue Biobanks,” Jan Muntel, Nick Morrice, Roland M. Bruderer, Lukas Reiter.

