When should I use one of these features?
Both external sub-DAGs and splices allow you to compose a large workflow from various sub-pieces that are defined in individual DAG files. This is the basic motivation for using either external sub-DAGs or splices: you want to create a single workflow from a number of DAG files, either because the smaller DAG files already exist, or because it's easier to deal with sub-parts of the workflow. (One use case might be that you have sub-workflows that you want to combine in different ways to make different overall workflows.)
Some reasons to use external sub-DAGs or splices:
- Create a workflow from separate sub-workflows
- Dynamically create parts of the workflow (external sub-DAGs only)
- Re-try multiple nodes as a unit (external sub-DAGs only)
- Short-circuit parts of the workflow (external sub-DAGs only)
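As a rough illustration of the first item, the sketch below builds the text of a top-level DAG file that chains sub-workflow files together, using either the SPLICE or the SUBDAG EXTERNAL keyword. The helper name compose_dag and the file names prepare.dag and analyze.dag are hypothetical; only the DAG file keywords (SPLICE, SUBDAG EXTERNAL, PARENT, CHILD) come from DAGMan itself.

```python
def compose_dag(sub_dags, use_splices=True):
    """Return the text of a top-level DAG file that runs sub_dags in order.

    sub_dags is a list of (node_name, dag_file) pairs; the names used in
    this sketch are made up for illustration.
    """
    keyword = "SPLICE" if use_splices else "SUBDAG EXTERNAL"
    lines = [f"{keyword} {name} {dag_file}" for name, dag_file in sub_dags]
    # Chain the sub-workflows: each one becomes the parent of the next.
    for (parent, _), (child, _) in zip(sub_dags, sub_dags[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pieces = [("Prepare", "prepare.dag"), ("Analyze", "analyze.dag")]
    print(compose_dag(pieces, use_splices=True))
    print(compose_dag(pieces, use_splices=False))
```

The two generated files differ only in the keyword on the first lines, which is why the same sub-workflow files can usually be combined either way.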
Here's a table comparing external sub-DAGs and splices. In each row, the first value applies to external sub-DAGs and the second to splices; a trailing entry, where present, is a note on that row.
|Feature||External sub-DAGs||Splices||Notes|
|Ability to incorporate separate sub-workflow files||yes||yes|
|Rescue DAG(s) created upon failure||yes||yes|
|DAG recovery (e.g., from submit machine crash)||yes||yes|
|Creates multiple DAGMan instances in the queue||yes||no|
|Possible combinatorial explosion of dependencies (see below)||no||yes||Until we implement socket nodes for splices|
|Sub-workflow files must exist at submission||no||yes|
|PRE/POST scripts allowed on sub-workflows||yes||no||Until we implement socket nodes for splices|
|Ability to retry sub-workflows||yes||no|
|Job/script throttling applies across entire workflow||no||yes|
|Separate job/script throttles for each sub-workflow||yes||no|
|Node categories can apply across entire workflow||no||yes|
|Ability to set priority on sub-workflows as nodes||yes||no|
|Ability to reduce workflow memory footprint||yes?||no||If used properly|
|Ability to have separate final nodes in sub-workflows||yes||no|
|Ability to abort sub-workflows individually||yes||no|
|Ability to associate variables with sub-workflow nodes||yes||no|
|Ability to configure sub-workflows individually||yes||no||Can be good or bad|
|Separate node status files, etc., for sub-workflows||yes||no|
|A single halt file or condor_hold suspends the entire workflow||no||yes|
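Because the DAG file of an external sub-DAG does not have to exist when the top-level workflow is submitted, it can be generated while the workflow is running. The following is a minimal sketch of such a generator, not a definitive implementation; the function names, the node naming scheme, the macro name index, and the submit file name work.sub are all assumptions made for illustration.

```python
def make_sub_dag(num_nodes, submit_file="work.sub"):
    """Return DAG file text containing num_nodes independent nodes.

    Every node uses the same (hypothetical) submit file and receives its
    index through a VARS macro named "index".
    """
    lines = []
    for i in range(num_nodes):
        lines.append(f"JOB Node{i} {submit_file}")
        lines.append(f'VARS Node{i} index="{i}"')
    return "\n".join(lines) + "\n"


def write_sub_dag(path, num_nodes):
    """Write the generated sub-DAG to path, e.g. from a PRE script."""
    with open(path, "w") as handle:
        handle.write(make_sub_dag(num_nodes))


if __name__ == "__main__":
    print(make_sub_dag(2))
```

A top-level DAG could then declare the node with SUBDAG EXTERNAL and attach a PRE script (SCRIPT PRE) that calls write_sub_dag before the sub-DAG is started. This only works with external sub-DAGs, since the files for splices must exist at submission.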
Possible combinatorial explosion of dependencies
When one splice is the immediate parent of another splice, an extremely large number of dependencies can be created. This is because every "terminal" node of the parent splice becomes a parent of every "initial" node in the child splice. (A "terminal" node is a node that has no children within its splice, and an "initial" node is a node that has no parents within its splice.) So, for example, if the parent splice has 1000 "terminal" nodes and the child splice has 1000 "initial" nodes, 1000 x 1000 = 1 million dependencies will be created.
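To make the counting concrete, here is a small sketch that models a splice as a list of node names plus a list of (parent, child) edges within the splice, finds its initial and terminal nodes, and enumerates the dependencies created between two splices. The data representation and function names are invented for this example and are not how DAGMan stores a workflow internally.

```python
def initial_nodes(nodes, edges):
    """Nodes that have no parents within the splice."""
    children = {child for _, child in edges}
    return [n for n in nodes if n not in children]


def terminal_nodes(nodes, edges):
    """Nodes that have no children within the splice."""
    parents = {parent for parent, _ in edges}
    return [n for n in nodes if n not in parents]


def cross_splice_dependencies(parent_splice, child_splice):
    """Pair every terminal node of the parent with every initial node of the child."""
    p_nodes, p_edges = parent_splice
    c_nodes, c_edges = child_splice
    return [(t, i)
            for t in terminal_nodes(p_nodes, p_edges)
            for i in initial_nodes(c_nodes, c_edges)]


if __name__ == "__main__":
    # Parent splice: A -> B and A -> C, so the terminal nodes are B and C.
    parent = (["A", "B", "C"], [("A", "B"), ("A", "C")])
    # Child splice: X -> Z and Y -> Z, so the initial nodes are X and Y.
    child = (["X", "Y", "Z"], [("X", "Z"), ("Y", "Z")])
    deps = cross_splice_dependencies(parent, child)
    print(len(deps))  # 2 terminal nodes x 2 initial nodes = 4
```

The count is always the product of the two node counts, so it grows quadratically when both splices are wide.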
Should I use external sub-DAGs or splices?
The simple answer is that, unless you need one of the features that's available with external sub-DAGs but not with splices (see the table above), you should use splices. Splices are generally simpler and have less overhead than external sub-DAGs (unless the workflow is specifically designed to minimize the external sub-DAG overhead). Also, workflow-wide throttling is generally more useful than separate throttles for sub-parts of the workflow.
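The rule of thumb above can be written down as a tiny decision helper. This is only a sketch of the reasoning: the feature labels are shortened, made-up names for rows of the comparison table where external sub-DAGs say "yes" (or avoid a restriction) and splices do not, and the function name recommend is hypothetical.

```python
# Capabilities that the comparison table lists for external sub-DAGs but
# not for splices. The labels are invented shorthand for this sketch.
EXTERNAL_ONLY = {
    "files_created_at_run_time",
    "pre_post_scripts",
    "retry",
    "separate_throttles",
    "priority",
    "reduced_memory_footprint",
    "separate_final_nodes",
    "individual_abort",
    "node_variables",
    "individual_config",
    "separate_status_files",
}


def recommend(required_features):
    """Suggest a mechanism given the features a workflow needs.

    Labels that are not in EXTERNAL_ONLY are treated as available with
    splices and do not change the recommendation.
    """
    needed = set(required_features) & EXTERNAL_ONLY
    return "external sub-DAGs" if needed else "splices"


if __name__ == "__main__":
    print(recommend(["retry"]))
    print(recommend(["workflow_wide_throttling"]))
```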
How to use external sub-DAGs to reduce workflow memory footprint
Note: This document is valid for HTCondor version 8.5.5.