
Nextflow Pipelines: We've Deployed 50+ in Production—Here's the Best for Every Use Case

Engineering Team

After deploying over 50 Nextflow pipelines for genomics labs, pharmaceutical companies, and research hospitals, we’ve learned that most teams don’t need to build pipelines from scratch. The nf-core community maintains 141+ production-ready Nextflow pipelines covering nearly every common bioinformatics workflow.

The challenge isn’t finding a Nextflow pipeline—it’s choosing the right one, configuring it for your infrastructure, and getting it production-ready. This guide covers everything you need to know about Nextflow pipelines: what’s available, how to evaluate them, and how to deploy them successfully.


What is a Nextflow Pipeline?

A Nextflow pipeline is a workflow written in Nextflow, a domain-specific language designed for data-intensive computational pipelines. Each Nextflow pipeline defines a series of processes that transform input data (typically FASTQ files, BAMs, or other genomics formats) into outputs like variant calls, gene counts, or quality reports.

What makes Nextflow pipelines powerful:

  • Portability — The same pipeline runs on your laptop, HPC cluster, or cloud (AWS, Google Cloud, Azure) without code changes
  • Reproducibility — Containerized execution (Docker/Singularity) ensures identical results across environments
  • Scalability — Automatic parallelization handles anything from a few samples to thousands
  • Resumability — Failed runs resume from the last checkpoint, saving compute time and cost

If you’re new to Nextflow, start with our Nextflow tutorial for the fundamentals before diving into specific pipelines.


nf-core: The Gold Standard for Nextflow Pipelines

nf-core is a community effort that maintains curated, peer-reviewed Nextflow pipelines following strict best practices. When evaluating any Nextflow pipeline, nf-core pipelines should be your first choice because they guarantee:

Standard              | What It Means
Tested releases       | Every release passes automated CI/CD tests
Documentation         | Comprehensive usage docs, parameter descriptions, and tutorials
Container support     | Docker and Singularity containers for every tool
Standardized inputs   | Consistent samplesheet formats across pipelines
Active maintenance    | Regular updates, security patches, and community support
Seqera Platform ready | Compatible with Seqera Platform for enterprise management

As of February 2026, nf-core offers 141 pipelines across genomics, proteomics, imaging, and more.

Running Your First nf-core Pipeline

# Test any nf-core pipeline with minimal test data
nextflow run nf-core/rnaseq -profile test,docker

# Run with your data
nextflow run nf-core/rnaseq \
    -profile docker \
    --input samplesheet.csv \
    --genome GRCh38 \
    --outdir results
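
The --input argument points to a CSV samplesheet describing your samples. For nf-core/rnaseq the sheet follows this shape (file names here are placeholders; check the docs of your pipeline release for the exact columns):

```csv
sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,control_rep1_R1.fastq.gz,control_rep1_R2.fastq.gz,auto
TREATED_REP1,treated_rep1_R1.fastq.gz,treated_rep1_R2.fastq.gz,auto
```

For single-end data, leave the fastq_2 column empty for that row.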

Nextflow Pipelines by Category

RNA Sequencing Pipelines

Pipeline                      | Description                                        | When to Use
nf-core/rnaseq                | Comprehensive RNA-seq with STAR, HISAT2, or Salmon | Standard bulk RNA-seq experiments
nf-core/rnafusion             | RNA fusion detection                               | Cancer transcriptomics, fusion gene discovery
nf-core/differentialabundance | Differential expression analysis                   | Downstream DE analysis from counts
nf-core/scrnaseq              | Single-cell RNA-seq                                | 10x Genomics, Smart-seq2, Drop-seq data

nf-core/rnaseq is the most popular Nextflow pipeline, with the largest contributor base and most active development. It supports multiple alignment and quantification strategies:

  • STAR + Salmon — Recommended for most use cases
  • STAR + RSEM — When you need transcript-level quantification
  • HISAT2 + featureCounts — Lower memory requirements
  • Salmon (pseudo-alignment) — Fastest option for gene-level counts
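
The strategy is selected with nf-core/rnaseq's --aligner and --pseudo_aligner parameters (parameter names as in recent releases; confirm against the pipeline's parameter documentation):

```shell
# STAR + Salmon (the default)
nextflow run nf-core/rnaseq -profile docker \
    --input samplesheet.csv --outdir results \
    --aligner star_salmon

# Salmon pseudo-alignment only, skipping genome alignment entirely
nextflow run nf-core/rnaseq -profile docker \
    --input samplesheet.csv --outdir results \
    --pseudo_aligner salmon --skip_alignment
```

The other strategies follow the same pattern: --aligner star_rsem or --aligner hisat2.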

Variant Calling Pipelines

Pipeline            | Description                          | When to Use
nf-core/sarek       | Germline and somatic variant calling | WGS, WES, targeted panels
nf-core/raredisease | Rare disease variant analysis        | Clinical genetics, Mendelian disorders
nf-core/nanoseq     | Oxford Nanopore data processing      | Long-read sequencing analysis
nf-core/viralrecon  | Viral genome analysis                | SARS-CoV-2, influenza, other viral genomes

nf-core/sarek handles the complete variant calling workflow from raw reads to annotated variants. It supports:

  • Germline variant calling (GATK HaplotypeCaller, DeepVariant, Strelka2)
  • Somatic variant calling for tumor/normal pairs
  • Structural variant detection
  • Copy number variation analysis
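
Callers are selected through sarek's --tools parameter (tool names as in recent sarek releases; check the sarek usage docs for the full list):

```shell
# Germline calling on WGS data with two callers for comparison
nextflow run nf-core/sarek -profile docker \
    --input samplesheet.csv \
    --outdir results \
    --tools haplotypecaller,deepvariant
```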

Epigenomics Pipelines

Pipeline          | Description                      | When to Use
nf-core/chipseq   | ChIP-seq peak calling and QC     | Histone modifications, transcription factor binding
nf-core/atacseq   | ATAC-seq chromatin accessibility | Open chromatin profiling
nf-core/cutandrun | CUT&RUN/CUT&Tag analysis         | Low-input chromatin profiling
nf-core/methylseq | Bisulfite sequencing             | DNA methylation analysis

Metagenomics Pipelines

Pipeline            | Description                  | When to Use
nf-core/mag         | Metagenome-assembled genomes | Microbiome assembly and binning
nf-core/taxprofiler | Taxonomic classification     | Multi-tool taxonomic profiling
nf-core/ampliseq    | 16S/ITS amplicon analysis    | Microbiome diversity studies

Proteomics Pipelines

Pipeline         | Description                    | When to Use
nf-core/quantms  | Quantitative mass spectrometry | Label-free and labeled proteomics
nf-core/mhcquant | MHC peptide identification     | Immunopeptidomics

Utility Pipelines

Pipeline            | Description                 | When to Use
nf-core/fetchngs    | Download from SRA/ENA/DDBJ  | Fetching public sequencing data
nf-core/demultiplex | Illumina demultiplexing     | Processing raw BCL files
nf-core/fastqc      | Quality control             | Quick QC for any FASTQ data

Evaluating Nextflow Pipelines

Not all Nextflow pipelines are created equal. Here’s how we evaluate pipelines before recommending them for production:

1. Check nf-core Status First

Always check if an nf-core version exists. The nf-core pipeline list is searchable by keyword.

# List all nf-core pipelines
nf-core list

# Search for specific functionality
nf-core list --keywords rnaseq

2. Evaluate Non-nf-core Pipelines

For pipelines outside nf-core, assess these criteria:

Criterion        | What to Check                          | Red Flags
Maintenance      | Last commit date, open issues          | No updates in 12+ months
Documentation    | README, usage examples, parameter docs | Missing input/output specifications
Containerization | Docker/Singularity support             | Hardcoded paths, no containers
Testing          | CI/CD configuration, test profiles     | No automated tests
Versioning       | Semantic versioning, releases          | Only “main” branch, no tags
Community        | GitHub stars, forks, contributors      | Single developer, no community
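
Several of these signals can be checked quickly from a local clone. A sketch, assuming the candidate pipeline has been cloned into ./candidate-pipeline (a hypothetical directory name):

```shell
cd candidate-pipeline
git log -1 --format='last commit: %cr'   # maintenance: no commits in 12+ months is a red flag
git tag --list                           # versioning: an empty list means no tagged releases
ls .github/workflows 2>/dev/null         # testing: is any CI configuration present?
```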

3. The Awesome-Nextflow Repository

The awesome-nextflow repository curates community Nextflow pipelines and resources outside nf-core, and is worth searching when no nf-core pipeline covers your analysis.


Deploying Nextflow Pipelines in Production

Running a Nextflow pipeline locally is straightforward. Running it reliably in production across thousands of samples requires careful configuration.

Configuration Hierarchy

Nextflow uses a layered configuration system:

1. $HOME/.nextflow/config         (user defaults)
2. nextflow.config                (pipeline directory)
3. -c custom.config               (command line)
4. -params-file params.yaml       (parameters only)

Later configurations override earlier ones; parameters passed directly on the command line (--input, --outdir, and so on) take precedence over a -params-file.
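
A -params-file keeps pipeline parameters version-controlled alongside your analysis. A minimal sketch, with keys mirroring the rnaseq parameters used earlier in this guide:

```yaml
# params.yaml — pipeline parameters only, no process or executor settings
input: samplesheet.csv
genome: GRCh38
outdir: results
```

Run it with: nextflow run nf-core/rnaseq -profile docker -params-file params.yaml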

Essential Production Configuration

Create a custom.config for your environment:

// custom.config for production deployment

// Resource limits for your environment
params {
    max_cpus   = 64
    max_memory = '256.GB'
    max_time   = '168.h'
}

// Process-specific resources
process {
    // Default resources
    cpus   = { 4 * task.attempt }
    memory = { 8.GB * task.attempt }
    time   = { 4.h * task.attempt }

    // Error handling with resource scaling
    errorStrategy = { task.exitStatus in [143,137,104,134,139,140] ? 'retry' : 'finish' }
    maxRetries = 3

    // High-memory processes
    withLabel: 'high_memory' {
        memory = { 128.GB * task.attempt }
    }

    // Long-running processes
    withLabel: 'long_runtime' {
        time = { 24.h * task.attempt }
    }
}

// Execution platform
profiles {
    slurm {
        process.executor = 'slurm'
        process.queue = 'normal'
        process.clusterOptions = '--account=myproject'

        // Throttle job submission
        executor {
            queueSize = 100
            submitRateLimit = '10/1min'
        }
    }

    aws {
        process.executor = 'awsbatch'
        aws.region = 'us-east-1'
        aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'

        // Spot instances for cost savings
        process.queue = 'spot-queue'
    }
}

// Singularity for HPC (no Docker daemon required)
singularity {
    enabled = true
    autoMounts = true
    cacheDir = '/scratch/singularity_cache'
}

// Reporting
timeline.enabled = true
report.enabled = true
trace.enabled = true

Run with your configuration:

nextflow run nf-core/rnaseq \
    -profile slurm,singularity \
    -c custom.config \
    --input samplesheet.csv \
    --outdir results \
    -resume

Best Practices for Production Pipelines

Based on our experience deploying Nextflow pipelines across HPC clusters and cloud environments:

1. Always use containers

// Specify container digests for reproducibility
process ALIGN {
    container 'quay.io/biocontainers/bwa@sha256:abc123...'
}

2. Configure proper error handling

Exit code 137 means the process received SIGKILL, typically from the kernel's out-of-memory killer; 143 means SIGTERM, typically a scheduler enforcing resource limits. Configure automatic retries with increased resources:

process {
    memory = { 8.GB * task.attempt }
    errorStrategy = { task.exitStatus in [137,143] ? 'retry' : 'terminate' }
    maxRetries = 3
}

3. Throttle job submission

Large pipelines can overwhelm HPC schedulers:

executor {
    queueSize = 100
    submitRateLimit = '10/1min'
}

4. Use value channels for reference files

A queue channel emits each item only once, so pairing per-sample reads with a one-item reference channel would execute a single task. A value channel is re-read for every task, which is what you want for reference files:

ch_reference = Channel.value(file(params.reference))
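
A sketch of the difference in a DSL2 workflow (the ALIGN process and parameter names are hypothetical):

```groovy
// ch_reads is a queue channel: one item per sample, each consumed once.
// ch_reference is a value channel: re-emitted for every ALIGN task, so
// ALIGN runs once per sample instead of stopping after the first.
ch_reads     = Channel.fromFilePairs(params.reads)
ch_reference = Channel.value(file(params.reference))

workflow {
    ALIGN(ch_reads, ch_reference)
}
```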

5. Enable execution reports

timeline.enabled = true
timeline.file = "${params.outdir}/pipeline_info/timeline.html"

report.enabled = true
report.file = "${params.outdir}/pipeline_info/report.html"

trace.enabled = true
trace.file = "${params.outdir}/pipeline_info/trace.txt"

6. Set JVM heap size on HPC

Prevent Nextflow from consuming excessive memory on shared nodes:

export NXF_OPTS='-Xms1g -Xmx4g'
nextflow run pipeline.nf

Build vs. Buy: When to Create Custom Pipelines

With 141+ nf-core pipelines available, when should you build custom Nextflow pipelines?

Use Existing Pipelines When:

  • ✅ An nf-core pipeline covers your analysis type
  • ✅ You need standard outputs (counts, VCFs, BAMs)
  • ✅ Reproducibility and community support matter
  • ✅ You want to leverage ongoing maintenance and updates

Build Custom Pipelines When:

  • ✅ Your workflow combines non-standard tools
  • ✅ You need proprietary algorithms or methods
  • ✅ Existing pipelines don’t support your data type
  • ✅ You require specialized QC or reporting

The Hybrid Approach

Most production environments use a combination:

  1. nf-core pipelines for standard analyses (RNA-seq, variant calling)
  2. Custom pipelines for proprietary methods or specialized workflows
  3. nf-core modules as building blocks in custom pipelines

The nf-core modules repository provides 1,200+ individual process definitions you can import into custom pipelines:

// After `nf-core modules install fastqc multiqc`, modules are
// installed under modules/nf-core/ and included by relative path
include { FASTQC  } from './modules/nf-core/fastqc/main'
include { MULTIQC } from './modules/nf-core/multiqc/main'

workflow {
    FASTQC(ch_reads)
    MULTIQC(FASTQC.out.zip.collect())  // exact MULTIQC inputs vary by module version
}

For teams building custom pipelines, our Nextflow tutorial covers DSL2 syntax, channels, operators, and best practices.


Nextflow Pipeline Monitoring with Seqera Platform

For enterprise deployments, the Seqera Platform (formerly Nextflow Tower) provides:

  • Real-time monitoring — Track pipeline progress across all compute environments
  • Launch UI — Start pipelines without command-line access
  • Compute environments — Manage AWS Batch, Google Cloud, Azure, HPC from one interface
  • Cost tracking — Monitor cloud spend per pipeline and project
  • Team collaboration — Shared workspaces with role-based access

Enable Seqera Platform monitoring:

# Set your access token
export TOWER_ACCESS_TOKEN=your-token

# Run with Tower monitoring
nextflow run nf-core/rnaseq -with-tower

Common Nextflow Pipeline Errors

“WARN: Input tuple does not match input set cardinality”

Cause: Channel emitting wrong structure for process input.

Solution: Use .view() to debug channel contents:

ch_reads
    .view { "DEBUG: $it" }

“Process terminated with exit code 137”

Cause: Out of memory (OOM kill).

Solution: Increase memory allocation:

process MEMORY_INTENSIVE {
    memory { 16.GB * task.attempt }
    errorStrategy 'retry'    // required for maxRetries to take effect
    maxRetries 3
}

“Unable to acquire lock”

Cause: Another Nextflow instance running in the same directory.

Solution: Use separate work directories or wait for the other run:

nextflow run pipeline.nf -work-dir /scratch/work_run2

Pipeline stuck at “executor > local”

Cause: Executor not configured for your cluster.

Solution: Specify the correct executor profile:

nextflow run pipeline.nf -profile slurm

Comparing Nextflow to Other Pipeline Tools

Nextflow isn’t the only workflow management system. How does it compare?

Feature             | Nextflow       | Snakemake         | WDL/Cromwell
Language            | Groovy DSL     | Python-like       | WDL
Cloud native        | Excellent      | Requires setup    | Good (Terra)
HPC support         | Excellent      | Good              | Moderate
Community pipelines | 141+ (nf-core) | Snakemake Catalog | Dockstore
Learning curve      | Moderate       | Lower             | Higher
Enterprise support  | Seqera         | Community         | Broad Institute

For a detailed comparison, see our Nextflow vs Snakemake guide.


Getting Started Checklist

Ready to run Nextflow pipelines? Follow this checklist:

  • Install Nextflow — Follow our Conda installation guide or use curl -s https://get.nextflow.io | bash
  • Install Docker or Singularity — Required for containerized execution
  • Test with example data — Run nextflow run nf-core/rnaseq -profile test,docker
  • Prepare your samplesheet — Follow nf-core samplesheet format
  • Configure for your environment — Create custom.config for HPC/cloud
  • Run with -resume — Always enable resumability

Get Expert Nextflow Pipeline Support

Selecting the right Nextflow pipeline is just the first step. Configuring pipelines for your specific infrastructure, optimizing performance, and troubleshooting production issues requires deep expertise in both bioinformatics and cloud/HPC systems.

Our Nextflow managed services team helps genomics organizations:

  • Evaluate and select pipelines for your specific research or clinical workflows
  • Configure production deployments across AWS Batch, Google Cloud, Azure, or on-premise HPC
  • Optimize pipeline performance to reduce compute costs and processing time
  • Build custom pipelines when existing solutions don’t fit your requirements
  • Integrate with Seqera Platform for enterprise monitoring and collaboration
  • Provide ongoing support with 24/7 monitoring for mission-critical workflows

We’ve deployed Nextflow pipelines processing petabytes of genomics data for pharmaceutical companies, clinical labs, and research institutions.

Talk to our Nextflow experts →


