Microbiomics Bioinformatics

Bioinformatic Services for Metagenomics and Metatranscriptomics

ORB offers comprehensive processing and analysis of metagenomic and metatranscriptomic sequencing data using the most up-to-date bacterial genome databases and industry-leading bioinformatics tools. Whether you are interested only in our metagenomic services for taxonomic classification of the microbiome or are ready to use our extensive metatranscriptomic services to elucidate its functional profile, ORB can help you reach and surpass your research goals. Please see below for a detailed description of our metagenomic and metatranscriptomic services and our example analysis packages!

The impact of the microbiome on environments ranging from soil to host organisms, including plants and humans, is being uncovered at an accelerating rate, thanks in large part to advances in massively parallel sequencing technologies. These technologies (offered by ORB) have enabled profiling of the DNA composition of the microbiome (metagenomics) and of its RNA composition (metatranscriptomics) at unprecedented scales, allowing for the discovery of invaluable taxonomic and functional profiles of the microbial community.

Metagenomics Bioinformatics

ORB’s metagenomic analysis services provide bacterial taxonomic classification with the highest degree of confidence. Data processing begins with sensitive trimming and filtering of sequencing reads to remove low-quality nucleotides and sequences derived from PCR primers. The filtered reads are processed through the QIIME2 pipeline, first to assign reads to unique amplicon sequence variants (ASVs) and then to assign the ASVs to operational taxonomic units using updated public databases. Bacterial diversity data are subjected to further analysis according to the client’s experimental design and goals. The bulleted list below summarizes some of the properties of the microbial population that can be examined:

  • Taxonomic Abundance – Relative frequency of each detected bacterial genus in all samples (Figure 1).
  • Alpha Diversity – Diversity within a sample.
    • Richness – Measure of the number of different bacterial species present in a sample (Figure 2).
    • Evenness – Measure of homogeneity of bacterial species within a sample.
    • Phylogenetic Diversity – Describes the minimum total length of all the phylogenetic branches required to span a given set of taxa.
  • Beta Diversity – Dissimilarity between sample pairs.
    • Bray-Curtis – Measure of the extent to which two samples share the same bacterial species in the same proportions.
    • UniFrac – Analysis of the taxonomic similarity between samples. Accounts for the phylogenetic relationships and abundance of observed species.
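
Several of the properties listed above can be illustrated with a few lines of Python. The sketch below is only an illustration and not ORB's pipeline, which runs QIIME2: the function names are hypothetical, the counts are assumed to be per-taxon read counts for one sample, and Pielou's index is assumed as the evenness measure because the list above does not name one. The phylogeny-based measures (phylogenetic diversity and UniFrac) need a tree and are left out.

```python
import math

def relative_frequency(counts):
    """Fraction of a sample's reads assigned to each taxon."""
    total = sum(counts)
    return [c / total for c in counts]

def richness(counts):
    """Number of taxa observed (non-zero count) in a sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon entropy (natural log) of the taxon proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's evenness (assumed index): Shannon entropy / ln(richness)."""
    s = richness(counts)
    if s <= 1:
        return 0.0  # evenness is not informative for a single taxon
    return shannon(counts) / math.log(s)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0

# Made-up counts for two samples over the same four taxa.
sample_a = [40, 30, 20, 10]
sample_b = [10, 0, 60, 30]
print(richness(sample_a), richness(sample_b))
print(round(pielou_evenness(sample_a), 3))
print(round(bray_curtis(sample_a, sample_b), 3))
```

Bray-Curtis is 0 for identical samples and 1 for samples that share no taxa, which is why it is read as a dissimilarity rather than a similarity.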
Figure 1: The relative frequency of each detected genus in mouse jejunum, human saliva, and human stool samples.

Example rDNA Analysis
Figure 2: Graphical representation of taxon richness using Faith’s phylogenetic diversity indices for each sample (shown in group boxplots).

For an internal method validation study, ORB performed 16S rDNA V3/V4 amplicon sequencing on a set of 16 microbiome samples derived from either mouse or human (4 human stool, 6 human saliva, 6 mouse jejunum). DNA was isolated with the ZymoBIOMICS DNA kit using bead-beating lysis times of 10, 20, or 40 minutes to compare lysis efficiencies and to optimize sample processing conditions. The isolated DNA underwent 16S rDNA V3/V4 PCR amplification followed by a second round of PCR designed to add Illumina index sequences to the ends of the amplicons for compatibility with the Illumina platform. Sequencing was performed on the NextSeq 500, and the data were analyzed for comprehensive taxonomic classification using the strategies described above. Taxonomic abundance and richness of the demonstration results are depicted in Figures 1 and 2, respectively.

Download a 16S rDNA metagenomics demonstration results package here!


Metatranscriptomic Bioinformatics

ORB's comprehensive metatranscriptome analysis pipeline powers studies of diverse microbial populations based on RNA sequencing from biospecimen and environmental samples. ORB’s computational and statistical strategies include targeted genome alignment, basic taxonomic classification, and de novo transcriptome assembly. These research tools provide an unprecedented opportunity to examine gene regulation for many microbial species simultaneously and to determine which genes encoded in a metagenome are actually transcribed. Each of ORB’s metatranscriptome bioinformatic service packages can be combined with analysis of human epithelial cell transcripts to enhance the understanding of a host-microbiome relationship. Contact us with your microbiome research objectives to determine which strategy or combination of analyses is most appropriate for your study!

Figure 3: Classification of stool samples based on alignment and read counting using a 47-species reference genome set.

Figure 4: Snapshot of Krona visualization of Kraken taxonomic assignment and counting data using raw data from a published study examining the effect of diet on the human gut microbiome.

Figure 5: Diagram of ORB’s Trinity analysis pipeline for metatranscriptomic analysis.

Approaches for Metatranscriptome Analysis at ORB

Alignment to targeted microbial reference genomes
Sequencing reads are aligned to bacterial, fungal, or viral reference genomes specific to the research project, e.g., 116 bacterial strains from 47 species representing the metatranscriptome of human stool samples (ORB's human gut reference).

Taxonomic classification using Kraken
Sequencing reads are assigned taxonomic labels using Kraken. Kraken breaks each read into k-mers and assigns a taxon based on a score calculated from the lowest common ancestor (LCA) of each k-mer in the read [3]. The Kraken microbial database mapping k-mers to LCAs was built using all the genomes from organisms in the RefSeq bacterial, viral, and archaeal domains.
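
The k-mer and LCA idea described above can be sketched in a few lines of Python. This is a toy illustration only and not Kraken's implementation: the taxonomy, the k-mer table, and the function names are all hypothetical, k is set to 4 for readability, and ties are broken by first occurrence rather than by the rule Kraken applies.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Escherichia": "Bacteria",
    "E. coli": "Escherichia",
    "Salmonella": "Bacteria",
}

# Hypothetical database mapping each k-mer to the LCA of the genomes
# that contain it.
KMER_TO_LCA = {
    "ACGT": "E. coli",
    "CGTA": "Escherichia",
    "GTAC": "Bacteria",
    "TTTT": "Salmonella",
}

def kmers(read, k):
    """All overlapping substrings of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def path_to_root(taxon):
    """The taxon followed by all of its ancestors up to the root."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path

def classify(read, k=4):
    """Label a read with the hit taxon whose root-to-taxon path collects
    the most k-mer hits; return None if no k-mer is in the database."""
    hits = Counter(KMER_TO_LCA[km] for km in kmers(read, k) if km in KMER_TO_LCA)
    if not hits:
        return None

    def path_score(taxon):
        return sum(hits[t] for t in path_to_root(taxon))

    return max(hits, key=path_score)

print(classify("ACGTAC"))
print(classify("GGGGGG"))
```

In the first read, the three k-mers hit E. coli, Escherichia, and Bacteria, which all lie on one path, so the most specific taxon on that path collects all three hits and becomes the label. The second read has no k-mer in the table and stays unclassified.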

De Novo Transcriptome Assembly
When the species, genera, and even phyla present in samples are largely unknown, ORB employs Trinity for efficient de novo full-length transcriptome reconstruction in the complete absence of a reference genome (see ORB's Trinity analysis pipeline in Figure 5). This strategy not only accurately reflects sample composition but also enables potential discovery of new genes and species [4].

Example Analysis Package

In a demonstration study, ORB downloaded metatranscriptomic RNA-sequencing reads from human stool samples and processed them using ORB’s human gut metatranscriptome analysis pipeline. The sequencing data were generated as part of the study "Diet Rapidly and Reproducibly Alters the Human Gut Microbiome" (David et al. 2014 Nature. 505(7484):559-563).
David et al. recruited 11 subjects to consume a diet composed entirely of animal or plant products. Each diet was consumed for 5 consecutive days by the study volunteers, who were observed for 4 days before each diet arm to measure normal eating habits. Microbial community-wide gene expression was measured using metatranscriptomic RNA-seq. Samples from 1 baseline timepoint and 2 timepoints on each diet were selected for RNA-seq.
To evaluate microbial community gene expression, quality-filtered and trimmed sequencing reads were aligned to ORB's human gut reference database (see the classification of stool samples based on this approach in Figure 3). Sequencing reads were also assigned taxonomy using Kraken, independent of the human gut reference alignment. The average breakdown of microbes in the animal diet samples is shown in Figure 4.

Download a metatranscriptomics demonstration results package here!

Raw Data Files Provided for Microbiomic Bioinformatics
  • Raw sequencing reads in FASTQ format
  • Annotated assemblies
  • Tables of annotated raw and RPKM (reads per kilobase of transcript per million mapped reads) counts
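
For readers unfamiliar with RPKM, the normalization can be written as a one-line function. The sketch below uses the standard definition (read count scaled by gene length in kilobases and by millions of mapped reads); the function name and the example numbers are made up for illustration, and it is not taken from ORB's pipeline.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Equivalent to read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

# Made-up example: 500 reads on a 2,000 bp gene in a library of
# 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Dividing by gene length and library size makes counts comparable between genes of different lengths and between samples sequenced to different depths.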