After deploying over 50 Nextflow pipelines for genomics labs, pharmaceutical companies, and research hospitals, we’ve learned that most teams don’t need to build pipelines from scratch. The nf-core community maintains 141+ production-ready Nextflow pipelines covering nearly every common bioinformatics workflow.
The challenge isn’t finding a Nextflow pipeline—it’s choosing the right one, configuring it for your infrastructure, and getting it production-ready. This guide covers everything you need to know about Nextflow pipelines: what’s available, how to evaluate them, and how to deploy them successfully.
What is a Nextflow Pipeline?
A Nextflow pipeline is a workflow written in Nextflow, a domain-specific language designed for data-intensive computational pipelines. Each Nextflow pipeline defines a series of processes that transform input data (typically FASTQ files, BAMs, or other genomics formats) into outputs like variant calls, gene counts, or quality reports.
What makes Nextflow pipelines powerful:
- Portability — The same pipeline runs on your laptop, HPC cluster, or cloud (AWS, Google Cloud, Azure) without code changes
- Reproducibility — Containerized execution (Docker/Singularity) ensures identical results across environments
- Scalability — Automatic parallelization handles anything from a few samples to thousands
- Resumability — Failed runs resume from the last checkpoint, saving compute time and cost
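Concretely, a pipeline is a set of processes wired together by channels. Here is a minimal DSL2 sketch; the process, parameters, and file names are illustrative, not taken from any real pipeline:

```groovy
#!/usr/bin/env nextflow
// Hypothetical example: count the reads in each FASTQ file
params.reads = 'data/*.fastq.gz'

process COUNT_READS {
    input:
    path fastq

    output:
    path "${fastq.baseName}.count"

    script:
    """
    echo \$(( \$(zcat ${fastq} | wc -l) / 4 )) > ${fastq.baseName}.count
    """
}

workflow {
    Channel.fromPath(params.reads) | COUNT_READS
}
```

Each process declares its inputs, outputs, and a script; Nextflow handles scheduling, staging files between processes, and parallelizing across samples.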
If you’re new to Nextflow, start with our Nextflow tutorial for the fundamentals before diving into specific pipelines.
nf-core: The Gold Standard for Nextflow Pipelines
nf-core is a community effort that maintains curated, peer-reviewed Nextflow pipelines following strict best practices. When evaluating any Nextflow pipeline, nf-core pipelines should be your first choice because they guarantee:
| Standard | What It Means |
|---|---|
| Tested releases | Every release passes automated CI/CD tests |
| Documentation | Comprehensive usage docs, parameter descriptions, and tutorials |
| Container support | Docker and Singularity containers for every tool |
| Standardized inputs | Consistent samplesheet formats across pipelines |
| Active maintenance | Regular updates, security patches, and community support |
| Seqera Platform ready | Compatible with Seqera Platform for enterprise management |
As of February 2026, nf-core offers 141 pipelines across genomics, proteomics, imaging, and more.
Running Your First nf-core Pipeline
```bash
# Test any nf-core pipeline with minimal test data
nextflow run nf-core/rnaseq -profile test,docker

# Run with your data
nextflow run nf-core/rnaseq \
    -profile docker \
    --input samplesheet.csv \
    --genome GRCh38 \
    --outdir results
```
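The samplesheet passed to `--input` is a plain CSV. For nf-core/rnaseq it looks roughly like this (sample names and paths are placeholders; check the pipeline docs for the exact columns your version expects):

```csv
sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,ctrl_rep1_R1.fastq.gz,ctrl_rep1_R2.fastq.gz,auto
TREATED_REP1,treated_rep1_R1.fastq.gz,treated_rep1_R2.fastq.gz,auto
```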
Nextflow Pipelines by Category
RNA Sequencing Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/rnaseq | Comprehensive RNA-seq with STAR, HISAT2, or Salmon | Standard bulk RNA-seq experiments |
| nf-core/rnafusion | RNA fusion detection | Cancer transcriptomics, fusion gene discovery |
| nf-core/differentialabundance | Differential expression analysis | Downstream DE analysis from counts |
| nf-core/scrnaseq | Single-cell RNA-seq | 10x Genomics, Smart-seq2, Drop-seq data |
nf-core/rnaseq is the most popular Nextflow pipeline, with the largest contributor base and most active development. It supports multiple alignment and quantification strategies:
- STAR + Salmon — Recommended for most use cases
- STAR + RSEM — When you need transcript-level quantification
- HISAT2 + featureCounts — Lower memory requirements
- Salmon (pseudo-alignment) — Fastest option for gene-level counts
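The route is selected with the pipeline's `--aligner` and `--pseudo_aligner` parameters. A sketch, based on recent nf-core/rnaseq releases (verify the accepted values in the parameter docs for your version):

```bash
# STAR + Salmon (the default)
nextflow run nf-core/rnaseq -profile docker --input samplesheet.csv --outdir results \
    --aligner star_salmon

# HISAT2 for lower memory requirements
nextflow run nf-core/rnaseq -profile docker --input samplesheet.csv --outdir results \
    --aligner hisat2

# Salmon pseudo-alignment only, skipping genome alignment entirely
nextflow run nf-core/rnaseq -profile docker --input samplesheet.csv --outdir results \
    --pseudo_aligner salmon --skip_alignment
```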
Variant Calling Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/sarek | Germline and somatic variant calling | WGS, WES, targeted panels |
| nf-core/raredisease | Rare disease variant analysis | Clinical genetics, Mendelian disorders |
| nf-core/nanoseq | Oxford Nanopore data processing | Long-read sequencing analysis |
| nf-core/viralrecon | Viral genome analysis | SARS-CoV-2, influenza, other viral genomes |
nf-core/sarek handles the complete variant calling workflow from raw reads to annotated variants. It supports:
- Germline variant calling (GATK HaplotypeCaller, DeepVariant, Strelka2)
- Somatic variant calling for tumor/normal pairs
- Structural variant detection
- Copy number variation analysis
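Variant callers are selected with sarek's `--tools` parameter. A sketch, using tool names from recent sarek releases (confirm against the parameter docs for your version):

```bash
# Germline calling with two callers
nextflow run nf-core/sarek \
    -profile docker \
    --input samplesheet.csv \
    --tools haplotypecaller,deepvariant \
    --outdir results
```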
Epigenomics Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/chipseq | ChIP-seq peak calling and QC | Histone modifications, transcription factor binding |
| nf-core/atacseq | ATAC-seq chromatin accessibility | Open chromatin profiling |
| nf-core/cutandrun | CUT&RUN/CUT&Tag analysis | Low-input chromatin profiling |
| nf-core/methylseq | Bisulfite sequencing | DNA methylation analysis |
Metagenomics Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/mag | Metagenome-assembled genomes | Microbiome assembly and binning |
| nf-core/taxprofiler | Taxonomic classification | Multi-tool taxonomic profiling |
| nf-core/ampliseq | 16S/ITS amplicon analysis | Microbiome diversity studies |
Proteomics Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/quantms | Quantitative mass spectrometry | Label-free and labeled proteomics |
| nf-core/mhcquant | MHC peptide identification | Immunopeptidomics |
Utility Pipelines
| Pipeline | Description | When to Use |
|---|---|---|
| nf-core/fetchngs | Download from SRA/ENA/DDBJ | Fetching public sequencing data |
| nf-core/demultiplex | Illumina demultiplexing | Processing raw BCL files |
| nf-core/fastqc | Quality control | Quick QC for any FASTQ data |
Evaluating Nextflow Pipelines
Not all Nextflow pipelines are created equal. Here’s how we evaluate pipelines before recommending them for production:
1. Check nf-core Status First
Always check if an nf-core version exists. The nf-core pipeline list is searchable by keyword.
```bash
# List all nf-core pipelines
nf-core list

# Filter the list by keyword
nf-core list rnaseq
```
2. Evaluate Non-nf-core Pipelines
For pipelines outside nf-core, assess these criteria:
| Criterion | What to Check | Red Flags |
|---|---|---|
| Maintenance | Last commit date, open issues | No updates in 12+ months |
| Documentation | README, usage examples, parameter docs | Missing input/output specifications |
| Containerization | Docker/Singularity support | Hardcoded paths, no containers |
| Testing | CI/CD configuration, test profiles | No automated tests |
| Versioning | Semantic versioning, releases | Only “main” branch, no tags |
| Community | GitHub stars, forks, contributors | Single developer, no community |
3. The Awesome-Nextflow Repository
awesome-nextflow curates community Nextflow pipelines outside nf-core. Notable pipelines include:
- bactopia — Complete bacterial genome analysis
- master_of_pores — Nanopore direct RNA sequencing
- YAMP — Metagenomic analysis
- nf-kmer-similarity — K-mer based sample comparison
Deploying Nextflow Pipelines in Production
Running a Nextflow pipeline locally is straightforward. Running it reliably in production across thousands of samples requires careful configuration.
Configuration Hierarchy
Nextflow uses a layered configuration system:
1. $HOME/.nextflow/config (user defaults)
2. nextflow.config (pipeline directory)
3. -c custom.config (command line)
4. --params-file params.yaml (parameters only)
Later configurations override earlier ones.
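The `-params-file` mechanism keeps run parameters in a version-controlled file instead of on the command line. A sketch with placeholder values:

```yaml
# params.yaml
input: samplesheet.csv
genome: GRCh38
outdir: results
```

Run it with `nextflow run nf-core/rnaseq -profile docker -params-file params.yaml`.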
Essential Production Configuration
Create a custom.config for your environment:
```groovy
// custom.config for production deployment

// Resource limits for your environment
params {
    max_cpus   = 64
    max_memory = '256.GB'
    max_time   = '168.h'
}

// Process-specific resources
process {
    // Default resources
    cpus   = { 4 * task.attempt }
    memory = { 8.GB * task.attempt }
    time   = { 4.h * task.attempt }

    // Error handling with resource scaling
    errorStrategy = { task.exitStatus in [143,137,104,134,139,140] ? 'retry' : 'finish' }
    maxRetries    = 3

    // High-memory processes
    withLabel: 'high_memory' {
        memory = { 128.GB * task.attempt }
    }

    // Long-running processes
    withLabel: 'long_runtime' {
        time = { 24.h * task.attempt }
    }
}

// Execution platform
profiles {
    slurm {
        process.executor       = 'slurm'
        process.queue          = 'normal'
        process.clusterOptions = '--account=myproject'

        // Throttle job submission
        executor {
            queueSize       = 100
            submitRateLimit = '10/1min'
        }
    }
    aws {
        process.executor  = 'awsbatch'
        aws.region        = 'us-east-1'
        aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'

        // Spot instances for cost savings
        process.queue = 'spot-queue'
    }
}

// Singularity for HPC (no Docker daemon required)
singularity {
    enabled    = true
    autoMounts = true
    cacheDir   = '/scratch/singularity_cache'
}

// Reporting
timeline.enabled = true
report.enabled   = true
trace.enabled    = true
```
Run with your configuration:
```bash
nextflow run nf-core/rnaseq \
    -profile slurm,singularity \
    -c custom.config \
    --input samplesheet.csv \
    --outdir results \
    -resume
```
Best Practices for Production Pipelines
Based on our experience deploying Nextflow pipelines across HPC clusters and cloud environments:
1. Always use containers
```groovy
// Pin container digests for reproducibility
process ALIGN {
    container 'quay.io/biocontainers/bwa@sha256:abc123...'
}
```
2. Configure proper error handling
Exit code 137 (SIGKILL) typically indicates an out-of-memory kill, and 143 (SIGTERM) usually means the scheduler terminated a job that exceeded its limits. Configure automatic retries with increased resources:
```groovy
process {
    memory        = { 8.GB * task.attempt }
    errorStrategy = { task.exitStatus in [137,143] ? 'retry' : 'terminate' }
    maxRetries    = 3
}
```
3. Throttle job submission
Large pipelines can overwhelm HPC schedulers:
```groovy
executor {
    queueSize       = 100
    submitRateLimit = '10/1min'
}
```
4. Use value channels for reference files
Queue channels are consumed once. Use value channels for files needed by multiple processes:
```groovy
ch_reference = Channel.value(file(params.reference))
```
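The difference matters as soon as two processes need the same reference. A sketch with hypothetical process names:

```groovy
// Queue channel: its single item is consumed by the first process that reads it
ch_ref_once = Channel.fromPath(params.reference)

// Value channel: can be read any number of times
ch_reference = Channel.value(file(params.reference))

workflow {
    ALIGN(ch_reads, ch_reference)          // first consumer
    CALL_VARIANTS(ALIGN.out, ch_reference) // works: the value channel is reusable
}
```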
5. Enable execution reports
```groovy
timeline.enabled = true
timeline.file    = "${params.outdir}/pipeline_info/timeline.html"
report.enabled   = true
report.file      = "${params.outdir}/pipeline_info/report.html"
trace.enabled    = true
trace.file       = "${params.outdir}/pipeline_info/trace.txt"
```
6. Set JVM heap size on HPC
Prevent Nextflow from consuming excessive memory on shared nodes:
```bash
export NXF_OPTS='-Xms1g -Xmx4g'
nextflow run pipeline.nf
```
Build vs. Buy: When to Create Custom Pipelines
With 141+ nf-core pipelines available, when should you build custom Nextflow pipelines?
Use Existing Pipelines When:
- ✅ An nf-core pipeline covers your analysis type
- ✅ You need standard outputs (counts, VCFs, BAMs)
- ✅ Reproducibility and community support matter
- ✅ You want to leverage ongoing maintenance and updates
Build Custom Pipelines When:
- ✅ Your workflow combines non-standard tools
- ✅ You need proprietary algorithms or methods
- ✅ Existing pipelines don’t support your data type
- ✅ You require specialized QC or reporting
The Hybrid Approach
Most production environments use a combination:
- nf-core pipelines for standard analyses (RNA-seq, variant calling)
- Custom pipelines for proprietary methods or specialized workflows
- nf-core modules as building blocks in custom pipelines
The nf-core modules repository provides 1,200+ individual process definitions you can import into custom pipelines:
```groovy
// Modules are first installed into your pipeline with the nf-core CLI,
// e.g. `nf-core modules install fastqc`, then included from the local copy
include { FASTQC  } from './modules/nf-core/fastqc/main'
include { MULTIQC } from './modules/nf-core/multiqc/main'

workflow {
    FASTQC(ch_reads)
    MULTIQC(FASTQC.out.zip.collect())
}
```
For teams building custom pipelines, our Nextflow tutorial covers DSL2 syntax, channels, operators, and best practices.
Nextflow Pipeline Monitoring with Seqera Platform
For enterprise deployments, the Seqera Platform (formerly Nextflow Tower) provides:
- Real-time monitoring — Track pipeline progress across all compute environments
- Launch UI — Start pipelines without command-line access
- Compute environments — Manage AWS Batch, Google Cloud, Azure, HPC from one interface
- Cost tracking — Monitor cloud spend per pipeline and project
- Team collaboration — Shared workspaces with role-based access
Enable Seqera Platform monitoring:
```bash
# Set your access token
export TOWER_ACCESS_TOKEN=your-token

# Run with Seqera Platform monitoring
nextflow run nf-core/rnaseq -with-tower
```
Common Nextflow Pipeline Errors
“WARN: Input tuple does not match input set cardinality”
Cause: Channel emitting wrong structure for process input.
Solution: Use .view() to debug channel contents:
```groovy
ch_reads
    .view { "DEBUG: $it" }
```
“Process terminated with exit code 137”
Cause: Out of memory (OOM kill).
Solution: Increase memory allocation:
```groovy
process MEMORY_INTENSIVE {
    memory { 16.GB * task.attempt }
    errorStrategy 'retry'  // maxRetries only takes effect with a retry strategy
    maxRetries 3
}
```
“Unable to acquire lock”
Cause: Another Nextflow instance running in the same directory.
Solution: Use separate work directories or wait for the other run:
```bash
nextflow run pipeline.nf -work-dir /scratch/work_run2
```
Pipeline stuck at “executor > local”
Cause: Executor not configured for your cluster.
Solution: Specify the correct executor profile:
```bash
nextflow run pipeline.nf -profile slurm
```
Comparing Nextflow to Other Pipeline Tools
Nextflow isn’t the only workflow management system. How does it compare?
| Feature | Nextflow | Snakemake | WDL/Cromwell |
|---|---|---|---|
| Language | Groovy DSL | Python-like | WDL |
| Cloud native | Excellent | Requires setup | Good (Terra) |
| HPC support | Excellent | Good | Moderate |
| Community pipelines | 141+ (nf-core) | Snakemake Catalog | Dockstore |
| Learning curve | Moderate | Lower | Higher |
| Enterprise support | Seqera | Community | Broad Institute |
For a detailed comparison, see our Nextflow vs Snakemake guide.
Getting Started Checklist
Ready to run Nextflow pipelines? Follow this checklist:
- Install Nextflow — Follow our Conda installation guide or use `curl -s https://get.nextflow.io | bash`
- Install Docker or Singularity — Required for containerized execution
- Test with example data — Run `nextflow run nf-core/rnaseq -profile test,docker`
- Prepare your samplesheet — Follow the nf-core samplesheet format
- Configure for your environment — Create a custom.config for HPC/cloud
- Run with -resume — Always enable resumability
Get Expert Nextflow Pipeline Support
Selecting the right Nextflow pipeline is just the first step. Configuring pipelines for your specific infrastructure, optimizing performance, and troubleshooting production issues requires deep expertise in both bioinformatics and cloud/HPC systems.
Our Nextflow managed services team helps genomics organizations:
- Evaluate and select pipelines for your specific research or clinical workflows
- Configure production deployments across AWS Batch, Google Cloud, Azure, or on-premise HPC
- Optimize pipeline performance to reduce compute costs and processing time
- Build custom pipelines when existing solutions don’t fit your requirements
- Integrate with Seqera Platform for enterprise monitoring and collaboration
- Provide ongoing support with 24/7 monitoring for mission-critical workflows
We’ve deployed Nextflow pipelines processing petabytes of genomics data for pharmaceutical companies, clinical labs, and research institutions.
Talk to our Nextflow experts →
Related Resources
- Nextflow Tutorial 2026: Complete Guide
- How to Install Nextflow with Conda
- Nextflow vs Snakemake Comparison
- Top Workflow Automation Tools
External Resources: