92 commits
dfd1c4b
Initial commit for this pipeline. Work in progress.
kockan Sep 18, 2025
434ca9a
Quick second pass to pick some more reasonable test defaults and add …
kockan Sep 18, 2025
3bc51cf
Typo
kockan Sep 18, 2025
e57eddb
Start from either a ubam or a pair of fastqs
kockan Sep 22, 2025
00985ed
Add task for FastQC
kockan Sep 24, 2025
db3f1c2
Update .dockstore.yml
kockan Sep 24, 2025
20ea403
Fix typo
kockan Sep 24, 2025
dc9bc6b
More typos
kockan Sep 24, 2025
470441c
More typos
kockan Sep 24, 2025
e275015
Use select_first to be able to use optional inputs for tasks in case …
kockan Sep 24, 2025
0ef3dd8
Try optional outputs
kockan Sep 24, 2025
c8cd45b
Refactor
kockan Sep 25, 2025
5dfda69
Test dockerfile changes
kockan Sep 25, 2025
69e9d85
Test dockerfile changes
kockan Sep 25, 2025
91ffec9
Add correct molecular index tag option to ExtractUmisFromBam
kockan Sep 25, 2025
6aca35b
Change I/O naming convention for bwa-mem task
kockan Sep 25, 2025
7abcb4c
Fix typo
kockan Sep 25, 2025
59c1d96
Add reference index and dictionary to tasks that require them
kockan Sep 25, 2025
5487cd7
Add index for final bam to task GenotypeSNPsHuman
kockan Sep 25, 2025
a717ee9
Documentation missed the bcftools version for docker image. Let's use…
kockan Sep 25, 2025
1252681
Decide on output naming convention and consistent command-line argume…
kockan Sep 26, 2025
c0fde25
Fix typos
kockan Sep 26, 2025
994c070
Refactoring and small improvements
kockan Sep 27, 2025
8ef3015
Add task DetermineHPVStatus
kockan Sep 28, 2025
8cdea5f
Fix typo
kockan Sep 28, 2025
a54a892
Cleanup comments, fix typos
kockan Sep 28, 2025
20fe418
Add metrics collection tasks
kockan Oct 11, 2025
fff0fe6
Test resource optimizations, first attempt
kockan Oct 17, 2025
a93fecd
Test resource optimizations, first attempt, use_ssd can not be an opt…
kockan Oct 17, 2025
982eb56
Fix syntax
kockan Oct 17, 2025
253b950
Add missing MAPQ filtering and secondary strain selection tasks
kockan Oct 18, 2025
0b35b8f
Add option to soft-clip supplementary alignments
kockan Oct 18, 2025
5f1fbe5
Add task for HPV integration breakpoint detection
kockan Oct 21, 2025
f1458b8
Debug
kockan Oct 21, 2025
6e262f7
Fix path to script
kockan Oct 21, 2025
2feefac
Add Sublineages task, first pass
kockan Oct 21, 2025
0c771a2
Add Sublineages task, continued
kockan Oct 22, 2025
c94b62d
Fix variable name
kockan Oct 22, 2025
56d768c
Fix output file
kockan Oct 22, 2025
613e1bf
Add task for high risk HPV SNPs, first attempt
kockan Oct 22, 2025
5d1163b
Cleanup old typos
kockan Oct 23, 2025
1af3305
Refactoring...
kockan Oct 23, 2025
7421363
Refactoring...
kockan Oct 23, 2025
7e03186
Fix typo
kockan Oct 23, 2025
5e16428
Fix heredoc indentation issue
kockan Oct 23, 2025
4c474be
Fix empty file checking logic
kockan Oct 23, 2025
9d5b135
Heredoc indentation
kockan Oct 23, 2025
39d6994
Heredoc indentation
kockan Oct 23, 2025
9c16093
Collect UMI grouping metrics
kockan Oct 25, 2025
8a81e6e
Revert: Collect UMI grouping metrics. Option doesn't exist for fgbio …
kockan Oct 25, 2025
2cf0937
Multi-threading is also not available in the current pipeline version…
kockan Oct 25, 2025
9073597
Fix runtime attribute on HPVDeepSeek main WDL. Initial commit for var…
kockan Oct 26, 2025
f2ca7e6
Update .dockstore.yml
kockan Oct 27, 2025
44153f8
Change variable name
kockan Oct 27, 2025
fe3a49b
Fix typo
kockan Oct 27, 2025
471d3e2
Prepare WDL for initial test run with a minimal set of required resou…
kockan Oct 27, 2025
33f1ba2
Workaround for MergeBamAlignment read group ID issue
kockan Oct 28, 2025
d084dae
Keep read groups consistent at every step, before and after merging a…
kockan Oct 29, 2025
8edaddf
Changes to the HPVDeepSeek genotyping WDL should make the previous wo…
kockan Oct 29, 2025
2c7c65e
Collect UMI duplication metrics first attempt. Also add more read gro…
kockan Oct 29, 2025
5150b79
Add remaining somatic variant calling filtering tasks
kockan Oct 30, 2025
d09e423
Start integrating somatic variant calling filters into the workflow. …
kockan Oct 30, 2025
d95484c
Fix typo
kockan Oct 30, 2025
5b0cc0a
Test out new resource pack for RunMappingFilter
kockan Oct 31, 2025
852f01c
Refactor workflow into proper submodules
kockan Nov 1, 2025
c87c91f
Fix typo
kockan Nov 1, 2025
8c99156
Fix outputs
kockan Nov 1, 2025
d5970bf
Fix broken indentation due to refactoring.
kockan Nov 2, 2025
bf23637
Now that the initial testing phase is over, let us use separate targe…
kockan Nov 3, 2025
10c49d7
Add initial version of the normalization for ctHPVDNA counts
kockan Nov 8, 2025
61e7521
Add normalization outputs
kockan Nov 8, 2025
1e71728
Fix variable substitutions
kockan Nov 8, 2025
5ccbe58
Skip samtools coverage header
kockan Nov 8, 2025
2acb72b
Fix order of operations
kockan Nov 8, 2025
c5268c7
Try bc for floating point arithmetic in bash
kockan Nov 8, 2025
40a5646
No bc in docker. Try python one-liner.
kockan Nov 8, 2025
0b88d39
Add missing file extension
kockan Nov 8, 2025
809d2f1
Fix typo
kockan Nov 8, 2025
9dcdfcf
Only accept fastqs as input.
kockan Nov 9, 2025
a8e4865
Add option to call duplex consensus reads.
kockan Nov 13, 2025
323aeff
Use paired strategy for grouping UMIs if calling duplex consensus
kockan Nov 13, 2025
2658138
Add optional CollectDuplexSeqMetrics if we're using duplex seq.
kockan Nov 14, 2025
cf6a28e
Move umi_grouped_bam to workflow output for temporary debugging
kockan Nov 17, 2025
4f65fd0
Output raw bams to table; mainly for debugging. Should be removed in …
kockan Nov 18, 2025
e317652
Output raw bams to table; mainly for debugging. Should be removed in …
kockan Nov 18, 2025
c2c778e
Output raw bams to table; mainly for debugging. Should be removed in …
kockan Nov 18, 2025
58d9ae1
Add downsampling option
kockan Nov 23, 2025
59425e7
Optional input
kockan Nov 23, 2025
487c3b3
Fix downsample probability calculation
kockan Nov 23, 2025
43ff990
Fix index output extension
kockan Nov 23, 2025
0d748a3
Test out some of the recommendations by miniwdl check
kockan Dec 6, 2025
5fbd5fd
More miniwdl check changes
kockan Dec 7, 2025
15 changes: 15 additions & 0 deletions .dockstore.yml
@@ -153,3 +153,18 @@ workflows:
  - name: CNVControlEventsQC
    subclass: WDL
    primaryDescriptorPath: /gCNV/CNVControlEventsQC.wdl
  - name: HPVDeepSeek
    subclass: WDL
    primaryDescriptorPath: /HPVDeepSeek/HPVDeepSeek.wdl
  - name: HPVDeepSeekGenotyping
    subclass: WDL
    primaryDescriptorPath: /HPVDeepSeek/HPVDeepSeekGenotyping.wdl
  - name: HPVDeepSeekSomaticVariantCalling
    subclass: WDL
    primaryDescriptorPath: /HPVDeepSeek/HPVDeepSeekSomaticVariantCalling.wdl
  - name: HPVDeepSeekTertiaryAnalysis
    subclass: WDL
    primaryDescriptorPath: /HPVDeepSeek/HPVDeepSeekTertiaryAnalysis.wdl
  - name: HPVDeepSeekNormalization
    subclass: WDL
    primaryDescriptorPath: /HPVDeepSeek/HPVDeepSeekNormalization.wdl
183 changes: 183 additions & 0 deletions HPVDeepSeek/HPVDeepSeek.wdl
@@ -0,0 +1,183 @@
version 1.0

import "HPVDeepSeekGenotyping.wdl" as HPVDeepSeekGenotyping
import "HPVDeepSeekSomaticVariantCalling.wdl" as HPVDeepSeekSomaticVariantCalling
import "HPVDeepSeekTertiaryAnalysis.wdl" as HPVDeepSeekTertiaryAnalysis

workflow HPVDeepSeek {
    input {
        # HPVDeepSeekGenotyping inputs
        String output_basename
        File r1_fastq
        File r2_fastq
        File human_snp_targets_bed
        File reference
        File reference_fai
        File reference_dict
        File bwa_idx_amb
        File bwa_idx_ann
        File bwa_idx_bwt
        File bwa_idx_pac
        File bwa_idx_sa
        File capture_targets_bed
        File bait_interval_list
        File target_interval_list
        String bait_set_name
        String read_group_id
        String read_group_sample_name
        String read_group_library_name = "LB_TEST"
        String read_group_platform = "ILLUMINA"
        String read_group_platform_unit = "PU_TEST"
        String read_group_description = "KAPA_TE"

        # HPVDeepSeekSomaticVariantCalling inputs
        File target_intervals
        File gnomad
        File gnomad_idx
        File pon
        File pon_idx
        File variants_for_contamination
        File variants_for_contamination_idx
        File realignment_index_bundle
        String mapping_filter_python_script = "/usr/filter_alt_ref_positions.py"
        File blastdb_nhr
        File blastdb_nin
        File blastdb_nsq
        String blastn_path = "/usr/blastn_2.2.30+"
        File funcotator_data_source
        Boolean run_alignment_artifact_filter = false

        # HPVDeepSeekTertiaryAnalysis inputs
        File high_risk_snps_hpv
        File hpv16_sublineages
    }

    call HPVDeepSeekGenotyping.HPVDeepSeekGenotyping {
        input:
            output_basename = output_basename,
            r1_fastq = r1_fastq,
            r2_fastq = r2_fastq,
            human_snp_targets_bed = human_snp_targets_bed,
            reference = reference,
            reference_fai = reference_fai,
            reference_dict = reference_dict,
            bwa_idx_amb = bwa_idx_amb,
            bwa_idx_ann = bwa_idx_ann,
            bwa_idx_bwt = bwa_idx_bwt,
            bwa_idx_pac = bwa_idx_pac,
            bwa_idx_sa = bwa_idx_sa,
            capture_targets_bed = capture_targets_bed,
            bait_interval_list = bait_interval_list,
            target_interval_list = target_interval_list,
            bait_set_name = bait_set_name,
            read_group_id = read_group_id,
            read_group_sample_name = read_group_sample_name,
            read_group_library_name = read_group_library_name,
            read_group_platform = read_group_platform,
            read_group_platform_unit = read_group_platform_unit,
            read_group_description = read_group_description
    }

    call HPVDeepSeekSomaticVariantCalling.HPVDeepSeekSomaticVariantCalling {
        input:
            output_basename = output_basename,
            tumor_bam = HPVDeepSeekGenotyping.final_bam,
            tumor_bai = HPVDeepSeekGenotyping.final_bam_index,
            target_intervals = target_intervals,
            reference = reference,
            reference_fai = reference_fai,
            reference_dict = reference_dict,
            gnomad = gnomad,
            gnomad_idx = gnomad_idx,
            pon = pon,
            pon_idx = pon_idx,
            variants_for_contamination = variants_for_contamination,
            variants_for_contamination_idx = variants_for_contamination_idx,
            realignment_index_bundle = realignment_index_bundle,
            mapping_filter_python_script = mapping_filter_python_script,
            blastdb_nhr = blastdb_nhr,
            blastdb_nin = blastdb_nin,
            blastdb_nsq = blastdb_nsq,
            blastn_path = blastn_path,
            funcotator_data_source = funcotator_data_source,
            run_alignment_artifact_filter = run_alignment_artifact_filter
    }

    call HPVDeepSeekTertiaryAnalysis.HPVDeepSeekTertiaryAnalysis {
        input:
            output_basename = output_basename,
            tumor_bam = HPVDeepSeekGenotyping.final_bam,
            tumor_bai = HPVDeepSeekGenotyping.final_bam_index,
            high_risk_snps_hpv = high_risk_snps_hpv,
            reference = reference,
            reference_fai = reference_fai,
            reference_dict = reference_dict,
            hpv16_sublineages = hpv16_sublineages
    }

    output {
        # HPVDeepSeekGenotyping outputs
        File raw_bam = HPVDeepSeekGenotyping.raw_bam
        File raw_bam_index = HPVDeepSeekGenotyping.raw_bam_index
        File final_bam = HPVDeepSeekGenotyping.final_bam
        File final_bam_index = HPVDeepSeekGenotyping.final_bam_index
        File umi_grouped_bam = HPVDeepSeekGenotyping.umi_grouped_bam
        File umi_group_data = HPVDeepSeekGenotyping.umi_group_data
        File umi_duplication_metrics = HPVDeepSeekGenotyping.umi_duplication_metrics
        File vcf = HPVDeepSeekGenotyping.vcf
        File coverage = HPVDeepSeekGenotyping.coverage
        String top_hpv_contig = HPVDeepSeekGenotyping.top_hpv_contig
        Int top_hpv_num_reads = HPVDeepSeekGenotyping.top_hpv_num_reads
        Float top_hpv_coverage = HPVDeepSeekGenotyping.top_hpv_coverage
        Boolean is_hpv_positive = HPVDeepSeekGenotyping.is_hpv_positive
        String secondary_hpv_types = HPVDeepSeekGenotyping.secondary_hpv_types
        File fastp_report_html = HPVDeepSeekGenotyping.fastp_report_html
        File fastp_report_json = HPVDeepSeekGenotyping.fastp_report_json
        File pre_trimmed_r1_fastqc_html = HPVDeepSeekGenotyping.pre_trimmed_r1_fastqc_html
        File pre_trimmed_r2_fastqc_html = HPVDeepSeekGenotyping.pre_trimmed_r2_fastqc_html
        File post_trimmed_r1_fastqc_html = HPVDeepSeekGenotyping.post_trimmed_r1_fastqc_html
        File post_trimmed_r2_fastqc_html = HPVDeepSeekGenotyping.post_trimmed_r2_fastqc_html
        File pre_consensus_alignment_summary_metrics = HPVDeepSeekGenotyping.pre_consensus_alignment_summary_metrics
        File pre_consensus_flagstat = HPVDeepSeekGenotyping.pre_consensus_flagstat
        File pre_consensus_insert_size_metrics = HPVDeepSeekGenotyping.pre_consensus_insert_size_metrics
        File pre_consensus_insert_size_plot = HPVDeepSeekGenotyping.pre_consensus_insert_size_plot
        File pre_consensus_ontarget_reads = HPVDeepSeekGenotyping.pre_consensus_ontarget_reads
        File pre_consensus_hs_metrics = HPVDeepSeekGenotyping.pre_consensus_hs_metrics
        File pre_consensus_per_base_coverage = HPVDeepSeekGenotyping.pre_consensus_per_base_coverage
        File post_consensus_alignment_summary_metrics = HPVDeepSeekGenotyping.post_consensus_alignment_summary_metrics
        File post_consensus_flagstat = HPVDeepSeekGenotyping.post_consensus_flagstat
        File post_consensus_insert_size_metrics = HPVDeepSeekGenotyping.post_consensus_insert_size_metrics
        File post_consensus_insert_size_plot = HPVDeepSeekGenotyping.post_consensus_insert_size_plot
        File post_consensus_ontarget_reads = HPVDeepSeekGenotyping.post_consensus_ontarget_reads
        File post_consensus_hs_metrics = HPVDeepSeekGenotyping.post_consensus_hs_metrics
        File post_consensus_per_base_coverage = HPVDeepSeekGenotyping.post_consensus_per_base_coverage
        File? family_sizes = HPVDeepSeekGenotyping.family_sizes
        File? duplex_family_sizes = HPVDeepSeekGenotyping.duplex_family_sizes
        File? duplex_yield_metrics = HPVDeepSeekGenotyping.duplex_yield_metrics
        File? umi_counts = HPVDeepSeekGenotyping.umi_counts
        File? duplex_qc = HPVDeepSeekGenotyping.duplex_qc

        # HPVDeepSeekSomaticVariantCalling outputs
        File unfiltered_vcf = HPVDeepSeekSomaticVariantCalling.unfiltered_vcf
        File unfiltered_vcf_idx = HPVDeepSeekSomaticVariantCalling.unfiltered_vcf_idx
        File mutect2_stats = HPVDeepSeekSomaticVariantCalling.mutect2_stats
        File filter_mutect_calls_stats = HPVDeepSeekSomaticVariantCalling.filter_mutect_calls_stats
        File filtered_vcf = HPVDeepSeekSomaticVariantCalling.filtered_vcf
        File filtered_vcf_idx = HPVDeepSeekSomaticVariantCalling.filtered_vcf_idx
        File funcotated_maf = HPVDeepSeekSomaticVariantCalling.funcotated_maf

        # HPVDeepSeekTertiaryAnalysis outputs
        File analysis_log = HPVDeepSeekTertiaryAnalysis.analysis_log
        File breakpoints = HPVDeepSeekTertiaryAnalysis.breakpoints
        File detailed_integration_summary = HPVDeepSeekTertiaryAnalysis.detailed_integration_summary
        File integration_breakpoints = HPVDeepSeekTertiaryAnalysis.integration_breakpoints
        File integration_summary = HPVDeepSeekTertiaryAnalysis.integration_summary
        File multiple_sequence_alignment = HPVDeepSeekTertiaryAnalysis.multiple_sequence_alignment
        File phylip_formatted_msa = HPVDeepSeekTertiaryAnalysis.phylip_formatted_msa
        File phylogenetic_tree_stats = HPVDeepSeekTertiaryAnalysis.phylogenetic_tree_stats
        File phylogenetic_tree = HPVDeepSeekTertiaryAnalysis.phylogenetic_tree
        File phylogenetic_tree_visualization = HPVDeepSeekTertiaryAnalysis.phylogenetic_tree_visualization
        File sublineage_call = HPVDeepSeekTertiaryAnalysis.sublineage_call
        File high_risk_snps_found = HPVDeepSeekTertiaryAnalysis.high_risk_snps_found
    }
}
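
Note on the `File?` outputs in HPVDeepSeek.wdl (`family_sizes`, `duplex_family_sizes`, `duplex_yield_metrics`, `umi_counts`, `duplex_qc`): the sub-workflow that produces them is not shown in this diff, so how they become optional there is an assumption. One common WDL 1.0 pattern that yields optional outputs is a call wrapped in an `if` block, with `select_first` supplying a fallback where a defined value is needed. The sketch below illustrates that pattern only; the task `CountLines`, the workflow `OptionalOutputSketch`, the input `run_optional_step`, and the Docker image are all hypothetical and not part of this PR.

```wdl
version 1.0

# Hypothetical task, for illustration only; not part of this PR.
task CountLines {
    input {
        File infile
    }

    command <<<
        wc -l < ~{infile} > line_count.txt
    >>>

    output {
        File line_count = "line_count.txt"
    }

    runtime {
        docker: "ubuntu:22.04"
    }
}

workflow OptionalOutputSketch {
    input {
        File infile
        Boolean run_optional_step = false
    }

    # Outputs of a call made inside an `if` block are optional outside of it.
    if (run_optional_step) {
        call CountLines {
            input:
                infile = infile
        }
    }

    output {
        # Undefined when run_optional_step is false, hence the `File?` type.
        File? line_count = CountLines.line_count
        # select_first returns the first defined element, so this falls back to infile.
        File line_count_or_input = select_first([CountLines.line_count, infile])
    }
}
```

A parent workflow that re-exports such a value has to declare it as `File?` too, which matches how the five optional outputs are declared above.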