Case Studies
This section provides three case studies demonstrating how to use STRmie-HD depending on the sequencing design:
Paired-End (PE) reads – Combine R1/R2 into single continuous reads using an external merging tool, then run STRmie-HD.
Single-End (SE) reads – Unpaired reads, such as single-end Illumina reads or long reads from Nanopore/PacBio. Run STRmie-HD directly on gzipped FASTQ/FASTA files.
Oxford Nanopore long reads – Run STRmie-HD in Nanopore mode (--nanopore).
All three flows converge on the same STRmie-HD pipeline and produce the same interactive HTML report for visual inspection.
Case Study 1 – Paired-End Reads (PE) with PEAR
When working with paired-end data, we recommend merging reads upstream with PEAR to obtain single, high-confidence sequences spanning the CAG repeat region.
Example paired-end test files are provided in the repository under tests/example_file/paired_end_file/.
These can be used to reproduce the workflow immediately after cloning the repository.
Generic PEAR command
pear -f forward_R1.fastq.gz \
-r reverse_R2.fastq.gz \
-v 10 \
-o output_prefix
Parameters
-f forward_R1.fastq.gz → path to the forward read (R1).
-r reverse_R2.fastq.gz → path to the reverse read (R2).
-v MIN_OVERLAP → minimum overlap length (in bp) required to confidently merge a pair (default: 10).
-o output_prefix → output filename prefix; PEAR will produce files such as output_prefix.assembled.fastq (merged reads).
Alternative merging tool: FLASH (Fast Length Adjustment of Short reads) can also be used for merging paired-end reads before running STRmie-HD.
After merging, use the merged reads as input for STRmie-HD (see Running the Complete Pipeline below).
Example workflow with test data (Paired-End)
Running PEAR
pear -f tests/example_file/paired_end_file/ID1732-HTT-E10-56-HD17401-A001_S55_L001_R1.fastq.gz \
-r tests/example_file/paired_end_file/ID1732-HTT-E10-56-HD17401-A001_S55_L001_R2.fastq.gz \
-v 10 \
-o tests/example_file/HD17401
The same procedure can be repeated for the other test samples provided in tests/example_file/paired_end_file/ (e.g., HD4501 and HD3903) by replacing the corresponding input filenames and output prefix.
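Repeating the PEAR call for each sample can be scripted. The following is a minimal sketch, not part of STRmie-HD: the function names sample_id and merge_all_pairs are hypothetical, and it assumes that every test file follows the naming pattern of the example above (an "HD<digits>" sample ID in the name, with _R1.fastq.gz and _R2.fastq.gz suffixes).

```shell
#!/bin/sh
# Sketch: merge every R1/R2 pair in a directory with PEAR.
# Assumption: file names contain an "HD<digits>" sample ID and end in
# _R1.fastq.gz / _R2.fastq.gz, like the example file shown above.

# Print the first "HD<digits>" token found in a file name.
sample_id() {
  basename "$1" | grep -oE 'HD[0-9]+' | head -n 1
}

# merge_all_pairs INPUT_DIR OUTPUT_DIR
# Runs PEAR once per R1 file, using the sample ID as output prefix.
merge_all_pairs() {
  in_dir=$1
  out_dir=$2
  for r1 in "$in_dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                 # glob matched nothing
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz"      # matching reverse read
    id=$(sample_id "$r1")
    if [ ! -e "$r2" ] || [ -z "$id" ]; then
      echo "skipping $r1 (no R2 mate or no sample ID)" >&2
      continue
    fi
    pear -f "$r1" -r "$r2" -v 10 -o "$out_dir/$id"
  done
}
```

With the test data, the call would be merge_all_pairs tests/example_file/paired_end_file tests/example_file, which writes one <sample>.assembled.fastq file per pair.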
Organize merged reads and prepare output folders
mkdir tests/example_file/assembled_reads
mv tests/example_file/*.assembled.fastq tests/example_file/assembled_reads/
gzip tests/example_file/assembled_reads/*
mkdir tests/example_file/strmie_output
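Before running the pipeline, it can be useful to confirm that merging actually produced reads. The sketch below is an optional sanity check and not part of STRmie-HD; the function names are hypothetical, and it assumes standard four-line FASTQ records, which is what PEAR writes.

```shell
#!/bin/sh
# Sketch: count merged reads per sample after the gzip step above.
# Assumption: four lines per FASTQ record.

# Print the number of reads in one gzipped FASTQ file.
count_fastq_reads() {
  gzip -cd "$1" | awk 'END { print int(NR / 4) }'
}

# report_merged_reads DIR
# Print "<file><TAB><reads>" for every *.assembled.fastq.gz in DIR
# and warn on stderr about files that contain no reads.
report_merged_reads() {
  for fq in "$1"/*.assembled.fastq.gz; do
    [ -e "$fq" ] || { echo "no merged files in $1" >&2; return 1; }
    n=$(count_fastq_reads "$fq")
    printf '%s\t%s\n' "$fq" "$n"
    [ "$n" -gt 0 ] || echo "warning: $fq is empty" >&2
  done
}
```

For the test data this would be called as report_merged_reads tests/example_file/assembled_reads. A sample with very few merged reads may indicate that the minimum overlap (-v) was too strict for that library.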
Running the Complete Pipeline
strmie --mode Complete_Pipeline \
-f tests/example_file/assembled_reads/ \
-o tests/example_file/strmie_output
Case Study 2 – Single-End Reads (SE)
For single-end datasets, no merging is required. STRmie-HD can ingest FASTQ.gz (preferred, includes quality) or FASTA.gz files directly.
Example single-end test files are provided in the repository under tests/input_file/.
These can be used directly after cloning the repository, without additional preprocessing.
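When using your own data instead of the test files, a quick check that every input is a gzipped FASTQ or FASTA file can catch problems early. This is a minimal sketch with hypothetical function names, not an STRmie-HD feature; it assumes the format can be recognised from the first character of the decompressed file ("@" for FASTQ, ">" for FASTA).

```shell
#!/bin/sh
# Sketch: classify gzipped inputs as FASTQ or FASTA before a run.
# Assumption: the first character of the decompressed file identifies
# the format ("@" = FASTQ header, ">" = FASTA header).

# Print "fastq", "fasta", or "unknown" for one .gz file.
detect_format() {
  first=$(gzip -cd "$1" 2>/dev/null | head -n 1 | cut -c 1)
  case "$first" in
    "@") echo fastq ;;
    ">") echo fasta ;;
    *)   echo unknown ;;   # not gzip, empty, or another format
  esac
}

# check_input_dir DIR
# Print "<format><TAB><file>" for each .gz file in DIR; return
# non-zero if the directory has no .gz files or any is unrecognised.
check_input_dir() {
  status=0
  for f in "$1"/*.gz; do
    [ -e "$f" ] || { echo "no .gz files in $1" >&2; return 1; }
    fmt=$(detect_format "$f")
    printf '%s\t%s\n' "$fmt" "$f"
    [ "$fmt" != unknown ] || status=1
  done
  return "$status"
}
```

For example, check_input_dir tests/input_file lists the detected format of each test file.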
Example workflow with test data (Single-End)
Prepare directories for STRmie-HD output
mkdir tests/example_file/strmie_output_se
Running the Complete Pipeline
strmie --mode Complete_Pipeline \
-f tests/input_file/ \
-o tests/example_file/strmie_output_se
Case Study 3 – Oxford Nanopore reads
For noisy long reads, STRmie-HD provides an integrated Nanopore mode.
Example Nanopore test files are provided under tests/example_file/nanopore_file/.
These can be used directly after cloning the repository, without additional preprocessing.
Example workflow with test data (Nanopore)
Prepare the output directory
mkdir tests/example_file/strmie_output_nanopore
Run STRmie-HD in Nanopore mode
strmie --mode Complete_Pipeline \
-f tests/example_file/nanopore_file/ \
-o tests/example_file/strmie_output_nanopore \
--nanopore
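The three case studies differ only in the input directory, the output directory, and the optional --nanopore flag. A small wrapper can capture that pattern. This is a sketch only: run_strmie and the DRY_RUN variable are hypothetical helpers, and the wrapper uses no strmie options beyond those shown in this section.

```shell
#!/bin/sh
# Sketch: one wrapper for the Complete_Pipeline runs shown above.
# run_strmie INPUT_DIR OUTPUT_DIR [extra strmie options...]
# Set DRY_RUN=1 to print the strmie command instead of executing it
# (the output directory is still created in that case).
run_strmie() {
  in_dir=$1
  out_dir=$2
  shift 2
  mkdir -p "$out_dir"   # create the output directory if needed
  # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
  ${DRY_RUN:+echo} strmie --mode Complete_Pipeline \
    -f "$in_dir" \
    -o "$out_dir" \
    "$@"
}
```

The Nanopore run above would then read run_strmie tests/example_file/nanopore_file/ tests/example_file/strmie_output_nanopore --nanopore, and the single-end run is the same call without the trailing flag.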
Running the Index Calculation
ℹ️ Guidance: If you intend to run the Index_Calculation mode, see the HTML report Step by Step workflow section. It explains how to use the interactive HTML report to manually correct allele peaks and export a curated table for recalculating indices.