Long Non-Coding RNA Analysis

Figure 1. Crystal structure of the triple-helical stability element at the 3' end of MALAT1.

Long non-coding RNA (lncRNA) molecules are a diverse class of RNAs that are generally defined as being longer than 200 nucleotides and lacking protein-coding potential1. Although the importance of lncRNAs was initially overlooked, studies now indicate that lncRNAs play roles in many cellular processes that support normal function and cell viability2,3. Furthermore, regulatory lncRNAs affect the expression of other genes at the level of transcription, post-transcription, or translation4. LncRNAs can regulate gene function by binding DNA5 and proteins6. Some regulatory lncRNAs are expressed only at certain stages of organismal development, including during the determination of cell fate and early embryonic development7,8. The development and progression of many diseases now appear to involve lncRNA molecules; to date, over 300 lncRNAs have been implicated in over 200 diseases9.

Ocean Ridge Biosciences’ RNA sequencing workflow can easily be adjusted to incorporate lncRNA analysis. Interrogation can cover the types (antisense, intergenic, overlapping, intronic, bidirectional, and processed), chromosomal locations, biological functions, and disease associations of lncRNAs. Reports can also include protein-lncRNA interactions and genomic variations in lncRNA loci.

ORB’s lncRNA Analysis Pipeline

Figure 2. A schematic diagram representing ORB’s lncRNA analysis pipeline.

ORB’s lncRNA workflow (Figure 2) starts with alignment of passed-filter sequencing reads to the species-specific sequences from the RNAcentral database using Bowtie 2. Reads aligning to each lncRNA are counted using BEDtools. For all ncRNAs having at least one aligned read, annotations are added using RNAcentral’s RESTful API. An important step in analyzing RNA sequencing data is filtering out transcripts that are not reliably expressed. ORB’s lncRNA pipeline retains only ncRNAs with more than 50 mapped reads in at least 25% of samples.
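The expression filter described above can be illustrated with a short Python sketch. This is not ORB's code: the function name `filter_expressed`, the dictionary layout, and the transcript identifiers are hypothetical, and only the two thresholds (more than 50 reads, at least 25% of samples) come from the text.

```python
def filter_expressed(counts, min_reads=50, min_fraction=0.25):
    """Keep transcripts with more than `min_reads` reads in at least
    `min_fraction` of samples.

    counts: dict mapping a transcript ID to a list of per-sample read counts
            (a hypothetical layout chosen for this illustration).
    """
    kept = {}
    for transcript_id, per_sample in counts.items():
        if not per_sample:
            continue  # no samples, nothing to evaluate
        # Count samples that strictly exceed the read threshold.
        n_passing = sum(1 for c in per_sample if c > min_reads)
        if n_passing >= min_fraction * len(per_sample):
            kept[transcript_id] = per_sample
    return kept


# Toy example with 9 samples: 25% of 9 is 2.25, so 3 passing samples are needed.
example_counts = {
    "ncRNA_A": [60, 70, 80, 0, 0, 0, 0, 0, 0],  # 3 samples above 50
    "ncRNA_B": [60, 70, 0, 0, 0, 0, 0, 0, 0],   # only 2 samples above 50
    "ncRNA_C": [50] * 9,                         # exactly 50 is not "more than 50"
}
reliably_expressed = filter_expressed(example_counts)
```

With nine samples, as in the demonstration study below, a transcript needs more than 50 reads in at least three samples to pass under this reading of the rule.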

Demonstration of ORB’s lncRNA Expression Analysis

ORB performed a full analysis of traditional gene expression as well as ncRNA transcript expression using RNA sequencing data with a mean of 79 million 100-nucleotide paired-end reads per sample (39.5 million reads per direction) generated by an Illumina HiSeq-2000 instrument. The data were obtained from the Gene Expression Omnibus and come from a study in which the human colon cancer HT-29 cell line was treated with two different conditions of 5-aza-deoxy-cytidine for 5 days, in triplicate10. Figure 3 shows the experimental design.

Figure 3. The HT-29 study analyzed 5-aza-deoxy-cytidine at 5 and 10 µM for 5 days in triplicate.

Classification of detectable non-coding RNA in HT-29 cells was based on mapping to the RNAcentral database, which divides non-coding RNA into lncRNA, rRNA, antisense RNA, snoRNA, SRP RNA, snRNA, microRNA, and precursor RNA. LncRNA made up more than half of the detectable non-coding RNAs in this study, with 5,329 unique molecules, and rRNA comprised the next largest category, with 3,296 (Figure 4). Each remaining category had fewer than 270 detectable ncRNAs.

Figure 4. Classification of detectable non-coding RNA in HT-29 cells based on mapping to the RNAcentral database.
Figure 5A. Separation of sample means based on their PC1 and PC2 composite value.
Figure 5B. Correlation of individual gene means to PC1 and PC2 pattern.

Principal component analysis (PCA) performed on RPKM data from 816 significant (FDR < 0.1) ncRNA transcripts showed excellent separation of the samples by treatment (Figure 5). Over 100 ncRNAs showed a substantial correlation with the first two principal components (Figures 6 and 7), which correspond to a measure of the difference between 0 µM 5-Aza and 5 µM 5-Aza.
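For readers less familiar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) scales a raw read count by transcript length and sequencing depth. The sketch below shows the standard formula only; the function name and the example numbers are illustrative, and the text does not state how ORB defines the total mapped read count.

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_nt * total_mapped_reads)
    """
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


# Illustrative numbers: 500 reads on a 2,000-nt transcript
# in a library with 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```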


Figure 6. Positive Principal Components
Figure 7. Negative Principal Components

The PCA settings were:
  • Matrix type - Covariance
  • Number of principal components - 3
  • Fold change threshold for clusters - 2
  • Correlation threshold for clusters - 0.9
  • Transcripts to analyze - all 816 transcripts
  • Number of transcripts to display - 100%
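As a rough illustration of what these settings mean, the following NumPy sketch runs a covariance-matrix PCA on a transcripts-by-samples matrix, keeps three components, and flags transcripts whose correlation with a component reaches 0.9 in either direction. It is a minimal sketch under stated assumptions, not the software ORB used: the function names are hypothetical, the fold-change threshold is not implemented, and treating the correlation threshold as a cut-off on Pearson correlation with the component scores is an assumption.

```python
import numpy as np


def pca_covariance(expr, n_components=3):
    """Covariance-matrix PCA on a transcripts x samples expression matrix.

    Returns (scores, correlations, explained_ratio):
      scores          - samples x components coordinates
      correlations    - transcripts x components Pearson r with each score
      explained_ratio - fraction of total variance per component
    """
    x = np.asarray(expr, dtype=float).T        # samples x transcripts
    centered = x - x.mean(axis=0)              # center only (covariance, not correlation)
    cov = np.cov(centered, rowvar=False)       # transcripts x transcripts
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    scores = centered @ eigvecs

    # Pearson correlation of every transcript with every component score.
    num = centered.T @ scores
    denom = np.outer(np.linalg.norm(centered, axis=0),
                     np.linalg.norm(scores, axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        correlations = np.where(denom > 1e-12, num / denom, 0.0)

    explained_ratio = eigvals / np.trace(cov)
    return scores, correlations, explained_ratio


def correlated_transcripts(correlations, component=0, threshold=0.9):
    """Row indices positively and negatively correlated with one component."""
    col = correlations[:, component]
    positive = np.where(col >= threshold)[0]
    negative = np.where(col <= -threshold)[0]
    return positive, negative
```

The sign of a principal component is arbitrary, so which transcripts land in the positive or negative group can flip between runs or libraries; only the grouping itself is meaningful.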

Contact Us to discuss how ORB's long non-coding RNA analysis service can advance your research!


  1. Kung JTY, Colognori D, and Lee JT (2013) Long Noncoding RNAs: Past, Present, and Future. Genetics 193(3):651-669; DOI: 10.1534/genetics.112.146704
  2. Batista PJ, Chang HY (2013) Long noncoding RNAs: cellular address codes in development and disease. Cell 152(6):1298-1307.
  3. Li T, Mo X, Fu L, Xiao B, Guo J (2016) Molecular mechanisms of long noncoding RNAs on gastric cancer. Oncotarget 7(8):8601.
  4. Yoon JH, Abdelmohsen K, Gorospe M (2013) Posttranscriptional gene regulation by long noncoding RNA. J Mol Biol. 425(19):3723-30. doi: 10.1016/j.jmb.2012.11.024.
  5. Mondal T, Subhash S, Vaid R, Enroth S, Uday S, Reinius B, Mitra S, Mohammed A, James AR, Hoberg E, Moustakas A, Gyllensten U, Jones SJ, Gustafsson CM, Sims AH, Westerlund F, Gorab E, Kanduri C (2015) MEG3 long noncoding RNA regulates the TGF-β pathway genes through formation of RNA-DNA triplex structures. Nat Commun. 6:7743. doi: 10.1038/ncomms8743.
  6. Han P, Li W, Lin CH, Yang J, Shang C, Nurnberg ST, Jin KK, Xu W, Lin CY, Lin CJ, Xiong Y, Chien HC, Zhou B, Ashley E, Bernstein D, Chen PS, Chen HS, Quertermous T, Chang CP (2014) A long noncoding RNA protects the heart from pathological hypertrophy. Nature 514(7520):102-6. doi: 10.1038/nature13596.
  7. Cesana M, Cacchiarelli D, Legnini I, Santini T, Sthandier O, Chinappi M, Tramontano A, Bozzoni I (2011) A Long Noncoding RNA Controls Muscle Differentiation by Functioning as a Competing Endogenous RNA. Cell 147(4):947
  8. Fatica A and Bozzoni I (2014) Long non-coding RNAs: new players in cell differentiation and development. Nat Rev Genet. 15(1):7-21. doi: 10.1038/nrg3606.
  9. Shi X, Sun M, Liu H, Yao Y, Song Y (2013) Long non-coding RNAs: a new frontier in the study of human diseases. Cancer Letters 339(2):159-166.
  10. Xu X, Zhang Y, Williams J, Antoniou E, McCombie WR, Wu S, Zhu W, Davidson NO, Denoya P, Li E (2013) Parallel comparison of Illumina RNA-Seq and Affymetrix microarray platforms on transcriptomic profiles generated from 5-aza-deoxy-cytidine treated HT-29 colon cancer cells and simulated datasets. BMC Bioinformatics 14 (Suppl 9):S1. doi: 10.1186/1471-2105-14-S9-S1.