
TIN & Gene Body Coverage Outputs

RustQC includes two complementary RNA integrity analyses:

  • TIN (Transcript Integrity Number) — reimplementation of RSeQC’s tin.py, measuring transcript-level coverage uniformity via Shannon entropy
  • Gene body coverage — Qualimap-compatible output showing coverage distribution along normalized gene body positions

Both run automatically as part of rustqc rna in the same single-pass BAM scan.

TIN measures the uniformity of read coverage across a transcript. A TIN score of 100 means perfectly uniform coverage; lower scores indicate degradation or bias. TIN is computed using Shannon entropy of read-start coverage at sampled positions along each transcript’s exonic regions.
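The entropy calculation can be sketched in a few lines of Python. This follows the formula published for TIN (Wang et al., 2016), TIN = 100 × exp(H) / k, where k is the number of sampled positions and H is the Shannon entropy of the normalized coverage over positions with non-zero coverage. The function name and input layout are illustrative assumptions, not RustQC's internal API.

```python
import math

def tin_score(coverage):
    """Sketch of a TIN score from coverage at k sampled positions.

    `coverage` is a list of non-negative counts, one per sampled position.
    Illustrative only; not RustQC's internal implementation.
    """
    k = len(coverage)
    nonzero = [c for c in coverage if c > 0]
    if k == 0 or not nonzero:
        return 0.0
    total = sum(nonzero)
    # Shannon entropy (natural log) of the normalized coverage distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in nonzero)
    # exp(entropy) acts as the effective number of evenly covered positions.
    return 100.0 * math.exp(entropy) / k
```

Under this formula, perfectly even coverage over all sampled positions scores 100, and even coverage over only half of the positions scores 50.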

TIN is particularly useful for:

  • Detecting RNA degradation — degraded samples show low TIN scores across most transcripts
  • Sample QC — median TIN provides a single-number summary of RNA integrity, complementing RIN (RNA Integrity Number) from Bioanalyzer/TapeStation
  • Identifying problematic samples — samples with unusually low median TIN should be flagged for potential exclusion

TIN output files use the BAM file stem as a prefix and are written to a rseqc/tin/ subdirectory under the output directory.

  • rseqc/
    • tin/
      • sample.tin.xls: Per-transcript TIN scores
      • sample.summary.txt: Summary statistics (mean, median, stdev)

File: <sample>.tin.xls

A tab-separated file with one row per gene (the longest transcript is used as the representative), reporting its TIN score:

Column    Description
geneID    Gene identifier from the annotation
chrom     Chromosome
tx_start  Transcript start position
tx_end    Transcript end position
TIN       Transcript Integrity Number (0-100)

Transcripts below the minimum coverage threshold are excluded from the output. TIN values are formatted to 2 decimal places.

Example:

geneID chrom tx_start tx_end TIN
ENSG00000227232.6 chr1 14695 24886 74.86
ENSG00000268903.1 chr1 135140 135895 60.32
ENSG00000269981.1 chr1 137681 137965 85.89
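Despite the .xls extension, the file is plain tab-separated text, so it can be read with a standard CSV reader. The sketch below uses Python's csv module; the function name read_tin_table is a hypothetical helper, and the in-memory example simply reuses the rows shown here.

```python
import csv
import io

def read_tin_table(handle):
    """Read a <sample>.tin.xls-style table into a list of typed dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "geneID": row["geneID"],
            "chrom": row["chrom"],
            "tx_start": int(row["tx_start"]),
            "tx_end": int(row["tx_end"]),
            "TIN": float(row["TIN"]),
        })
    return rows

# In-memory stand-in for a real file, using the example rows (tab-separated).
example = (
    "geneID\tchrom\ttx_start\ttx_end\tTIN\n"
    "ENSG00000227232.6\tchr1\t14695\t24886\t74.86\n"
    "ENSG00000268903.1\tchr1\t135140\t135895\t60.32\n"
    "ENSG00000269981.1\tchr1\t137681\t137965\t85.89\n"
)
rows = read_tin_table(io.StringIO(example))
```

For a file on disk, pass an open handle instead, for example `open(path, newline="")`.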

File: <sample>.summary.txt

A single-row tab-separated summary of TIN statistics across all transcripts:

Column       Description
Bam_file     Path to the input BAM file
TIN(mean)    Mean TIN across all transcripts
TIN(median)  Median TIN
TIN(stdev)   Standard deviation of TIN

Example:

Bam_file TIN(mean) TIN(median) TIN(stdev)
sample.bam 72.55 83.89 26.14

As a rough guide, median TIN can be interpreted as follows:

Median TIN  Interpretation
> 70        Good RNA integrity
50-70       Moderate degradation
30-50       Significant degradation; consider excluding
< 30        Severe degradation

A bimodal distribution of per-transcript TIN scores (some very high, some very low) can indicate selective degradation of certain transcript classes rather than global degradation.
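The interpretation bands can be applied programmatically when screening many samples. The following is a minimal sketch that takes per-transcript TIN scores, computes the median, and maps it to a band; the function name and label strings are illustrative, and the cut-offs are the guideline values from this page rather than hard rules.

```python
import statistics

def classify_median_tin(tin_values):
    """Return (median, label) using the guideline bands for median TIN."""
    median = statistics.median(tin_values)
    # Check the most severe band first so each median gets exactly one label.
    if median < 30:
        label = "severe degradation"
    elif median < 50:
        label = "significant degradation"
    elif median <= 70:
        label = "moderate degradation"
    else:
        label = "good RNA integrity"
    return median, label
```

Because the median hides the shape of the distribution, it is still worth plotting the per-transcript scores when a sample sits near a band boundary.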

tin:
  enabled: true
  sample_size: 100   # Equally-spaced positions to sample per transcript
  min_coverage: 10   # Minimum read-start count to compute TIN

TIN uses the shared --mapq / -q flag for mapping quality filtering.
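To illustrate what sample_size controls, here is one possible way to pick equally spaced positions along a transcript's concatenated exons. This is a sketch under assumptions: exons are given as half-open (start, end) intervals, the helper name is hypothetical, and RustQC's exact sampling scheme may differ (for instance in how exon boundaries are handled).

```python
def sample_exonic_positions(exons, sample_size):
    """Pick up to `sample_size` equally spaced genomic positions from exons.

    `exons` is a list of half-open (start, end) intervals. Illustrative only.
    """
    # Flatten the exons into one ordered list of exonic positions.
    positions = [p for start, end in sorted(exons) for p in range(start, end)]
    if len(positions) <= sample_size:
        return positions
    # Step through the flattened exonic coordinates at a constant stride.
    step = len(positions) / sample_size
    return [positions[int(i * step)] for i in range(sample_size)]
```

Sampling in exonic (spliced) coordinates rather than genomic coordinates keeps introns from consuming sample positions.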

Gene body coverage profiles show the distribution of read coverage along normalized transcript positions from 5’ to 3’. Each transcript is divided into 100 equal percentile bins, and coverage depth is accumulated across all expressed genes. This reveals systematic biases such as:

  • 3’ bias — typical of degraded RNA or oligo-dT primed libraries
  • 5’ bias — can indicate incomplete reverse transcription
  • Uniform coverage — ideal result, expected from high-quality libraries
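The binning step can be sketched as follows. The code assumes each transcript's per-base coverage has already been extracted in 5' to 3' order (so minus-strand transcripts are reversed beforehand); the function and variable names are illustrative and not part of RustQC.

```python
N_BINS = 100

def add_transcript_to_profile(profile, coverage_5p_to_3p):
    """Accumulate one transcript's per-base coverage into percentile bins."""
    length = len(coverage_5p_to_3p)
    if length == 0:
        return
    for i, depth in enumerate(coverage_5p_to_3p):
        # Map base index i (0-based) to its percentile bin in 0..99.
        bin_index = i * N_BINS // length
        profile[bin_index] += depth

# Two toy transcripts with flat coverage accumulate into one flat profile.
profile = [0] * N_BINS
add_transcript_to_profile(profile, [1] * 200)
add_transcript_to_profile(profile, [2] * 1000)
```

In this simple scheme, transcripts shorter than 100 bases leave some bins untouched, and longer transcripts contribute more total depth because the profile is cumulative rather than per-transcript normalized.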

Gene body coverage output is written to a qualimap/ subdirectory in a format compatible with Qualimap rnaseq output, which MultiQC can parse directly.

  • qualimap/
    • coverage_profile_along_genes_(total).txt: Coverage profile (100 bins)
    • rnaseq_qc_results.txt: Qualimap-compatible QC summary

File: coverage_profile_along_genes_(total).txt

A tab-separated file with 100 rows, one per percentile bin (0-99), showing the cumulative coverage depth at each normalized position along the gene body:

Column    Description
Position  Percentile position along gene body (0.0 to 99.0)
Coverage  Cumulative read depth at this position across all genes

File: rnaseq_qc_results.txt

A Qualimap rnaseq-compatible text file with four sections:

  • Input — BAM file name
  • Reads alignment — total reads, alignments, secondary, aligned to genes, ambiguous, no feature assigned
  • Reads genomic origin — exonic, intronic, and intergenic base counts with percentages; overlapping exon count
  • Transcript coverage profile — 5’ bias, 3’ bias, and 5’-3’ bias ratios

Example:

>>>>>>> Input
bam file = sample
>>>>>>> Reads alignment
reads aligned = 175,097,721
total alignments = 185,718,543
secondary alignments = 10,620,822
>>>>>>> Reads genomic origin
exonic = 744,941,713 (8.29%)
intronic = 837,462,849 (9.32%)
intergenic = 7,406,659,875 (82.40%)
>>>>>>> Transcript coverage profile
5' bias = 1.20
3' bias = 0.92
5'-3' bias = 1.24

genebody_coverage:
  enabled: true   # Whether to produce gene body coverage output

Gene body coverage requires GTF annotation (--gtf mode) — it is not available in BED-only mode because it requires transcript-level exon structure to normalize positions along the gene body.

RustQC’s TIN output uses the same file format and column names as RSeQC’s tin.py. The key difference is that RustQC reports one row per gene (using the longest transcript as representative), while RSeQC reports one row per transcript. TIN scores are computed using the same Shannon entropy formula.

The Qualimap-compatible output files are designed for direct parsing by MultiQC’s Qualimap rnaseq module. The coverage profile format and QC results sections match Qualimap’s output structure.
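For ad hoc inspection outside MultiQC, the sectioned "key = value" layout of rnaseq_qc_results.txt can be read with a short parser. This is a minimal sketch based only on the example layout shown on this page; the function name is hypothetical, and values are kept as strings because they include thousands separators and percentages.

```python
def parse_qualimap_results(text):
    """Parse '>>>>>>> Section' headers and 'key = value' lines into dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">>>>>>>"):
            # A new section starts; strip the marker to get its name.
            current = line.lstrip(">").strip()
            sections[current] = {}
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][key.strip()] = value.strip()
    return sections

example = """>>>>>>> Input
bam file = sample
>>>>>>> Reads alignment
reads aligned = 175,097,721
total alignments = 185,718,543
>>>>>>> Transcript coverage profile
5' bias = 1.20
3' bias = 0.92
5'-3' bias = 1.24
"""
results = parse_qualimap_results(example)
```

Numeric fields can then be converted as needed, for example `int(results["Reads alignment"]["reads aligned"].replace(",", ""))`.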

  • TIN: Wang L, Nie J, Sicotte H, et al. Measure transcript integrity using RNA-seq data. BMC Bioinformatics. 2016;17:58.
  • RSeQC: Wang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 2012;28(16):2184-2185.
  • Qualimap: Garcia-Alcalde F, et al. Qualimap: evaluating next-generation sequencing alignment data. Bioinformatics. 2012;28(20):2678-2679.