# Case Studies

This section provides three case studies demonstrating how to use **STRmie-HD** depending on the sequencing design:  

1. **Paired-End (PE) reads** – Combine R1/R2 into single continuous reads using an external merging tool, then run STRmie-HD.  
2. **Single-End (SE) reads** – Unpaired reads, such as Illumina single-end reads or Nanopore/PacBio long reads. No merging is needed: run STRmie-HD directly on gzipped FASTQ/FASTA files.
3. **Oxford Nanopore long reads** – Run STRmie-HD in **Nanopore mode** (`--nanopore`).

All three workflows converge on the same STRmie-HD pipeline and produce the same interactive HTML report for visual inspection.  

---

## Case Study 1 – Paired-End Reads (PE) with PEAR

When working with paired-end data, we recommend merging reads upstream with **[PEAR](https://anaconda.org/bioconda/pear)** to obtain single, high-confidence sequences spanning the CAG repeat region.  

> Example paired-end test files are provided in the repository under `tests/example_file/paired_end_file/`.  
> These can be used to reproduce the workflow immediately after cloning the repository.


***Generic PEAR command***  
```bash
pear -f forward_R1.fastq.gz \
     -r reverse_R2.fastq.gz \
     -v 10 \
     -o output_prefix
```

**Parameters**
- `-f forward_R1.fastq.gz` → path to the **forward** read (R1).  
- `-r reverse_R2.fastq.gz` → path to the **reverse** read (R2).  
- `-v 10` → **minimum overlap length** (in bp) required to merge a pair; 10 is also PEAR's default.  
- `-o output_prefix` → output filename prefix; PEAR will produce files such as `output_prefix.assembled.fastq` (merged reads).  
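
To check how many read pairs PEAR merged, one option is to count the records in `output_prefix.assembled.fastq`. The helper below is a minimal sketch, not part of PEAR or STRmie-HD: `count_fastq_reads` is a hypothetical name, and it assumes the standard four-lines-per-record FASTQ layout (no wrapped sequence lines).

```bash
# Hypothetical helper: count records in a FASTQ file (plain or gzipped).
# Assumes each record spans exactly 4 lines.
count_fastq_reads() {
    local file="$1"
    if [[ "$file" == *.gz ]]; then
        gzip -cd "$file"      # decompress to stdout
    else
        cat "$file"
    fi | awk 'END { print int(NR / 4) }'
}
```

For example, `count_fastq_reads output_prefix.assembled.fastq` prints the number of merged reads; comparing it with the count for `forward_R1.fastq.gz` gives the fraction of pairs that were merged.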

---

Alternative merging tool: **[FLASH (Fast Length Adjustment of Short reads)](https://anaconda.org/conda-forge/flash)** can also be used for merging paired-end reads before running STRmie-HD.  

After merging, use the merged reads as input for STRmie-HD (see **Running the Complete Pipeline** below).

---

### Example workflow with test data (Paired-End)

***Running PEAR***
```bash
pear -f tests/example_file/paired_end_file/ID1732-HTT-E10-56-HD17401-A001_S55_L001_R1.fastq.gz \
     -r tests/example_file/paired_end_file/ID1732-HTT-E10-56-HD17401-A001_S55_L001_R2.fastq.gz \
     -v 10 \
     -o tests/example_file/HD17401
```

The same procedure can be repeated for the other test samples provided in `tests/example_file/paired_end_file/` (e.g., HD4501 and HD3903) by replacing the input filenames and the output prefix.
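
If several samples need merging, the PEAR call can be wrapped in a loop. The sketch below is one possible way to do it; the function names `r2_for` and `merge_all_pairs` are hypothetical, and it assumes every pair follows the `_R1.fastq.gz` / `_R2.fastq.gz` naming of the test files. Unlike the example above, it uses the full sample basename as the output prefix rather than a short ID such as `HD17401`.

```bash
# Hypothetical helper: print the R2 path that pairs with a given R1 path.
r2_for() {
    printf '%s\n' "${1%_R1.fastq.gz}_R2.fastq.gz"
}

# Hypothetical helper: run PEAR on every R1/R2 pair found in a directory.
merge_all_pairs() {
    local in_dir="$1" out_dir="$2" r1 r2 prefix
    for r1 in "$in_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue                     # glob matched nothing
        r2="$(r2_for "$r1")"
        if [ ! -e "$r2" ]; then
            echo "Skipping $r1: no matching R2 file" >&2
            continue
        fi
        prefix="$out_dir/$(basename "${r1%_R1.fastq.gz}")"
        pear -f "$r1" -r "$r2" -v 10 -o "$prefix" || return 1
    done
}
```

Calling `merge_all_pairs tests/example_file/paired_end_file tests/example_file` writes one `<basename>.assembled.fastq` per sample into `tests/example_file/`, which is where the next step looks for merged reads.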

***Organize merged reads and prepare output folders***

```bash
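# Collect the merged reads in a dedicated folder
mkdir -p tests/example_file/assembled_reads
mv tests/example_file/*.assembled.fastq tests/example_file/assembled_reads/

# Compress them: STRmie-HD reads gzipped FASTQ/FASTA files
gzip tests/example_file/assembled_reads/*

# Create the output folder for STRmie-HD
mkdir -p tests/example_file/strmie_output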
```

***Running the Complete Pipeline***

```bash
strmie --mode Complete_Pipeline \
       -f tests/example_file/assembled_reads/ \
       -o tests/example_file/strmie_output
```
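
Once the run finishes, the interactive HTML report can be located from the command line. This is a sketch under two assumptions that are not confirmed here: that the report is written somewhere under the output folder, and that it has an `.html` extension. `list_html_reports` is a hypothetical helper name.

```bash
# Hypothetical helper: list every .html file under an output folder.
# Assumes the report is saved with an .html extension (not confirmed here).
list_html_reports() {
    local out_dir="$1"
    find "$out_dir" -type f -name '*.html' | sort
}

# Example call (left commented out so the block only defines the helper):
# list_html_reports tests/example_file/strmie_output
```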
---


## Case Study 2 – Single-End Reads (SE)

For single-end datasets, no merging is required. STRmie-HD can ingest **FASTQ.gz** (preferred, includes quality) or **FASTA.gz** files directly.  

> Example single-end test files are provided in the repository under `tests/input_file/`.  
> These can be used directly after cloning the repository, without additional preprocessing.
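
Before launching the pipeline on your own single-end data, it can be useful to confirm that every input file is a valid gzip archive. The following is a minimal sketch using `gzip -t`; `check_gz_inputs` is a hypothetical helper, not a STRmie-HD command.

```bash
# Hypothetical helper: test the integrity of every .gz file in a folder.
# Returns 0 only if at least one file is found and all files pass.
check_gz_inputs() {
    local in_dir="$1" f status=0
    for f in "$in_dir"/*.gz; do
        if [ ! -e "$f" ]; then
            echo "No .gz files found in $in_dir" >&2
            return 1
        fi
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "CORRUPT $f"
            status=1
        fi
    done
    return "$status"
}
```

`check_gz_inputs tests/input_file` prints one status line per file and returns a non-zero exit code if any file fails the check.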

### Example workflow with test data (Single-End)

***Prepare directories for STRmie-HD output***

```bash
mkdir tests/example_file/strmie_output_se
```

***Running the Complete Pipeline***
```bash
strmie --mode Complete_Pipeline \
       -f tests/input_file/ \
       -o tests/example_file/strmie_output_se
```

---
## Case Study 3 – Oxford Nanopore reads

For noisy long reads, STRmie-HD provides an integrated Nanopore mode.

> Example Nanopore test files are provided under: `tests/example_file/nanopore_file/`.  
> These can be used directly after cloning the repository, without additional preprocessing.

### Example workflow with test data (Nanopore)

***Prepare the output directory***

```bash
mkdir tests/example_file/strmie_output_nanopore
```

***Run STRmie-HD in Nanopore mode***
```bash
strmie --mode Complete_Pipeline \
       -f tests/example_file/nanopore_file/ \
       -o tests/example_file/strmie_output_nanopore \
       --nanopore
```

---

## Running the Index Calculation

> ℹ️ Guidance: If you intend to run the **Index_Calculation** mode, see the {ref}`html-report-step-by-step-workflow` section.
> There you will find instructions on how to use the interactive HTML report to manually correct allele peaks and export a curated table for recalculating indices.