Specifying Software Requirements
Overview
Teaching: 10 min
Exercises: 0 min
Questions
How do I specify requirements/dependencies for a job?
What level of detail should I provide for a software requirement?
Objectives
Learn how to write software requirement descriptions.
Learn how to use SciCrunch to retrieve a unique identifier for a tool/version that is required.
Tool descriptions are often written for a specific version of a piece of
software. To make it easier for others to use your descriptions, you can include
a SoftwareRequirement field in the hints section. This also helps to avoid
confusion about which version of a tool the description was written for.
cwlVersion: v1.0
class: CommandLineTool
label: "InterProScan: protein sequence classifier"
doc: |
  Version 5.21-60 can be downloaded here:
  https://github.com/ebi-pf-team/interproscan/wiki/HowToDownload
  Documentation on how to run InterProScan 5 can be found here:
  https://github.com/ebi-pf-team/interproscan/wiki/HowToRun
requirements:
  ResourceRequirement:
    ramMin: 10240
    coresMin: 3
  SchemaDefRequirement:
    types:
      - $import: InterProScan-apps.yml
hints:
  SoftwareRequirement:
    packages:
      interproscan:
        specs: [ "https://identifiers.org/rrid/RRID:SCR_005829" ]
        version: [ "5.21-60" ]
inputs:
  proteinFile:
    type: File
    inputBinding:
      prefix: --input
  applications:
    type: InterProScan-apps.yml#apps[]?
    inputBinding:
      itemSeparator: ','
      prefix: --applications
baseCommand: interproscan.sh
arguments:
  - valueFrom: $(inputs.proteinFile.nameroot).i5_annotations
    prefix: --outfile
  - valueFrom: TSV
    prefix: --formats
  - --disable-precalc
  - --goterms
  - --pathways
  - valueFrom: $(runtime.tmpdir)
    prefix: --tempdir
outputs:
  i5Annotations:
    type: File
    format: iana:text/tab-separated-values
    outputBinding:
      glob: $(inputs.proteinFile.nameroot).i5_annotations
$namespaces:
  iana: https://www.iana.org/assignments/media-types/
  s: http://schema.org/
$schemas:
  - https://schema.org/docs/schema_org_rdfa.html
s:license: "https://www.apache.org/licenses/LICENSE-2.0"
s:copyrightHolder: "EMBL - European Bioinformatics Institute"
In this example, the software requirement being described is InterProScan version 5.21-60.
hints:
  SoftwareRequirement:
    packages:
      interproscan:
        specs: [ "https://identifiers.org/rrid/RRID:SCR_005829" ]
        version: [ "5.21-60" ]
Depending on your CWL runner, these hints may be used to check that the required software is installed and available before the job is run. To enable these checks with the reference implementation (cwltool), use its dependency resolvers configuration.
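As a rough illustration of what a dependency resolvers configuration can look like, here is a minimal sketch. It assumes the resolver type and key names used in the example in the cwltool README; the file name and the base_path directory are hypothetical, so check the documentation for your runner version before relying on it.

```yaml
# dependency-resolvers-conf.yml (hypothetical file name)
# Assumption: the "galaxy_packages" resolver type and the "base_path" key
# follow the example given in the cwltool README.
- type: galaxy_packages
  # Hypothetical directory. The resolver is expected to look for a setup
  # script per package and version beneath it, for example
  # /opt/cwl-deps/interproscan/5.21-60/env.sh
  base_path: /opt/cwl-deps
```

In the cwltool README, a file like this is passed to cwltool with the --beta-dependency-resolvers-configuration option. The option carries a beta prefix, so confirm its name with cwltool --help for the version you have installed.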
As well as a version number, a unique identifier for the tool is given as a URI,
in the form of an RRID. Resources with RRIDs can be looked up in the SciCrunch
registry, which provides a portal for finding, tracking, and referring to
scientific resources consistently. If you want to specify a tool as a
SoftwareRequirement, search for the tool on SciCrunch and use the RRID that it
has been assigned in the registry. (A tutorial is available if you want to add a
tool to SciCrunch.) You can use this RRID to refer to the tool (via
identifiers.org) in the specs field of your requirement description. If the tool
has no RRID, the next best choices, in order of preference, are the DOI for the
main tool citation and the URL to the tool.
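The order of preference described above could be sketched as follows. The specs field takes a list, so several identifiers can be given together. The RRID is the one from the example in this section; the DOI is a placeholder rather than a real citation, and the project URL is assumed from the wiki links in the tool's doc field.

```yaml
hints:
  SoftwareRequirement:
    packages:
      interproscan:
        version: [ "5.21-60" ]
        specs:
          # 1. Preferred: the RRID from SciCrunch, referenced via identifiers.org
          - https://identifiers.org/rrid/RRID:SCR_005829
          # 2. DOI of the main tool citation (placeholder, NOT a real DOI)
          - https://doi.org/10.0000/placeholder
          # 3. URL of the tool itself (assumed project page)
          - https://github.com/ebi-pf-team/interproscan
```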
Key Points
Software requirements should be specified under hints:SoftwareRequirement.