
featureCounts Benchmark

RustQC includes built-in gene-level read counting that produces output compatible with the Subread featureCounts format. This page compares the counting performance and output accuracy against standalone Rsubread featureCounts.

Large benchmark input: GM12878 REP1 — a 10 GB paired-end RNA-seq BAM aligned to GRCh38 (63,086 genes).

Tool                             Runtime
Rsubread featureCounts           3m 39s
RustQC (all tools, single pass)  3m 56s

RustQC’s single pass includes featureCounts-compatible counting alongside dupRadar duplication analysis and all 7 RSeQC tools, whereas the traditional workflow requires running each tool separately. The standalone featureCounts timing was measured in Docker under x86 emulation on an ARM Mac.

RustQC’s read counting uses the same algorithm as Subread featureCounts:

  • Feature type: exon-level features grouped by gene_id
  • Overlap detection: at least 1 base pair overlap
  • Strand awareness: configurable via the -s / --strandedness flag
  • Multi-mapping: counts are tracked separately for the unique and multi-mapper columns
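
The counting rules listed above can be illustrated with a minimal Python sketch. This is not RustQC's or Subread's implementation: the toy annotation, the function names, and the simplified forward-only strand check are all assumptions made for illustration, and the sketch assumes that a read overlapping more than one gene is left unassigned as ambiguous.

```python
# Hypothetical toy annotation: (chrom, start, end, strand, gene_id).
# Coordinates are 1-based and inclusive, as in GTF.
EXONS = [
    ("chr1", 100, 200, "+", "geneA"),
    ("chr1", 300, 400, "+", "geneA"),
    ("chr1", 380, 500, "-", "geneB"),
]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def assign_read(chrom, start, end, strand, exons, stranded=False):
    """Assign a single-block read to a gene, or to an Unassigned_* category."""
    genes = set()
    for e_chrom, e_start, e_end, e_strand, gene_id in exons:
        if e_chrom != chrom:
            continue
        # Simplified strand check (assumption): forward-stranded only.
        if stranded and e_strand != strand:
            continue
        if overlaps(start, end, e_start, e_end):
            # Exons are grouped by gene_id, so each gene is recorded once.
            genes.add(gene_id)
    if not genes:
        return "Unassigned_NoFeatures"
    if len(genes) > 1:
        return "Unassigned_Ambiguous"
    return genes.pop()


# A read touching the last base of an exon still counts as an overlap.
print(assign_read("chr1", 200, 250, "+", EXONS))
# The same read can be ambiguous unstranded but resolved when stranded.
print(assign_read("chr1", 390, 395, "+", EXONS))
print(assign_read("chr1", 390, 395, "+", EXONS, stranded=True))
```

A linear scan over every exon is used here only for readability; a real counter would use an interval index and would also handle spliced and paired-end reads.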

Metric                   Rsubread featureCounts  RustQC      Exact match
allCounts (unique)       14,654,579              14,654,579  100%
filteredCounts (unique)  3,599,832               3,599,832   100%
allCountsMulti           16,089,488              16,089,488  100%
filteredCountsMulti      4,503,920               4,503,920   100%

Gene-level read counts are identical across all 63,086 genes. Assignment statistics (Assigned, Unassigned_NoFeatures, Unassigned_Ambiguous) match exactly.
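
A gene-by-gene check of this kind can be sketched in a few lines of Python. The sketch assumes both tools write the usual featureCounts counts layout (a leading `#` comment line, a header row, then one row per gene with the count in the last column) and a single sample per file; the function names and file paths are hypothetical.

```python
import csv
import io


def read_counts(handle):
    """Parse a featureCounts-style counts table into {gene_id: count}."""
    # Skip the leading '#' comment line(s) written before the header.
    rows = csv.reader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
    )
    next(rows)  # header: Geneid, Chr, Start, End, Strand, Length, <sample>
    # Assumption: one sample per file, so the count is the last column.
    return {row[0]: int(row[-1]) for row in rows if row}


def diff_counts(a, b):
    """Return {gene_id: (count_a, count_b)} for genes whose counts differ."""
    return {
        gene: (a.get(gene), b.get(gene))
        for gene in sorted(set(a) | set(b))
        if a.get(gene) != b.get(gene)
    }


# Usage with hypothetical file names:
# with open("rsubread.counts.tsv") as fa, open("rustqc.counts.tsv") as fb:
#     mismatches = diff_counts(read_counts(fa), read_counts(fb))
#     print(len(mismatches), "genes differ")
```

An empty result from `diff_counts` means every gene present in either file has the same count in both.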

The output format is directly compatible with downstream tools such as DESeq2 and MultiQC.

Beyond the standard featureCounts counts file and summary, RustQC also produces:

  • Biotype counts (.biotype_counts.tsv) — per-biotype read count summaries
  • Biotype MultiQC bargraph (.biotype_counts_mqc.tsv) — ready for MultiQC visualization
  • rRNA percentage (.biotype_counts_rrna_mqc.tsv) — rRNA fraction for MultiQC general statistics

Generating these in the traditional workflow requires additional scripting after the featureCounts run.
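
As a rough idea of what that extra scripting involves, the following Python sketch sums gene-level counts per biotype and derives an rRNA percentage. It is an illustration under stated assumptions, not RustQC's code: the function names are hypothetical, the gene-to-biotype mapping is assumed to have been extracted from the annotation beforehand, only the biotype labelled `rRNA` is treated as rRNA, and the percentage is taken over all counted reads.

```python
from collections import Counter


def biotype_counts(gene_counts, gene_biotype):
    """Sum gene-level counts per biotype.

    gene_counts:  {gene_id: count}
    gene_biotype: {gene_id: biotype}, e.g. taken from the GTF attributes
    """
    totals = Counter()
    for gene_id, count in gene_counts.items():
        # Genes missing from the mapping are grouped under "unknown".
        totals[gene_biotype.get(gene_id, "unknown")] += count
    return totals


def rrna_percent(totals, rrna_label="rRNA"):
    """Percentage of counted reads that fall in the rRNA biotype."""
    total = sum(totals.values())
    return 100.0 * totals.get(rrna_label, 0) / total if total else 0.0


# Toy example with made-up gene IDs and counts.
counts = {"g1": 90, "g2": 10, "g3": 0}
biotypes = {"g1": "protein_coding", "g2": "rRNA", "g3": "rRNA"}
totals = biotype_counts(counts, biotypes)
print(dict(totals), rrna_percent(totals))
```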