eido command line usage

To use the command line application, you only need a path to a project configuration file, which is passed to the eido command as a positional argument.

For this tutorial, let's grab a PEP from a public example repository that describes a few PRO-seq test samples:

git clone https://github.com/databio/ppqc.git --branch cfg2
Cloning into 'ppqc'...
remote: Enumerating objects: 119, done.
remote: Counting objects: 100% (119/119), done.
remote: Compressing objects: 100% (78/78), done.
remote: Total 119 (delta 64), reused 93 (delta 41), pack-reused 0
Receiving objects: 100% (119/119), 74.66 KiB | 2.20 MiB/s, done.
Resolving deltas: 100% (64/64), done.

cd ppqc
export DATA=$HOME

PEP inspection

First, let's use eido inspect to inspect a PEP.

  • To inspect the entire Project object, just provide the path to the project configuration file.
eido inspect peppro_paper.yaml
Project 'PEPPRO' (/Users/mstolarczyk/Uczelnia/UVA/code/eido/docs_jupyter/ppqc/peppro_paper.yaml)
22 samples (showing first 20): K562_PRO-seq, K562_RNA-seq_10, K562_RNA-seq_20, K562_RNA-seq_30, K562_RNA-seq_40, K562_RNA-seq_50, K562_RNA-seq_60, K562_RNA-seq_70, K562_RNA-seq_80, K562_RNA-seq_90, K562_GRO-seq, HelaS3_GRO-seq, Jurkat_ChRO-seq_1, Jurkat_ChRO-seq_2, HEK_PRO-seq, HEK_ARF_PRO-seq, H9_PRO-seq_1, H9_PRO-seq_2, H9_PRO-seq_3, H9_treated_PRO-seq_1
Sections: name, sample_table, looper, sample_modifiers, pep_version

  • To inspect specific samples, provide one or more sample names via the optional -n/--sample-name argument.
eido inspect peppro_paper.yaml -n K562_PRO-seq K562_RNA-seq_10
Sample 'K562_PRO-seq' in Project (/Users/mstolarczyk/Uczelnia/UVA/code/eido/docs_jupyter/ppqc/peppro_paper.yaml)

sample_name:                    K562_PRO-seq
sample_desc:                    K562 PRO-seq
treatment:                      none
replicate:                      1
toggle:                         1
protocol:                       PRO
organism:                       human
read_type:                      SINGLE
cell_type:                      K562
purpose:                        gold standard
umi_status:                     FALSE

...                           (showing first 10)

Sample 'K562_RNA-seq_10' in Project (/Users/mstolarczyk/Uczelnia/UVA/code/eido/docs_jupyter/ppqc/peppro_paper.yaml)

sample_name:            K562_RNA-seq_10
sample_desc:            90% K562 PRO-seq + 10% K562 RNA-seq
treatment:              none
replicate:              1
toggle:                 1
protocol:               PRO
organism:               human
read_type:              SINGLE
cell_type:              K562
purpose:                mRNA contamination; FRiF/PRiF
umi_status:             FALSE

...                   (showing first 10)

PEP validation

Next, let's use eido to validate this project against the generic PEP schema. You only need to provide the path to the project configuration file and a schema (via the -s option).

eido validate peppro_paper.yaml -s http://schema.databio.org/pep/2.0.0.yaml -e
Validation successful

Any PEP should validate against that schema, which describes the generic PEP format. We can go one step further and validate it against the PEPPRO schema, which describes PRO-seq projects specifically for this pipeline:

eido validate peppro_paper.yaml -s http://schema.databio.org/pipelines/ProseqPEP.yaml
Validation successful

This project would not validate against a different pipeline's schema.

Following jsonschema, eido produces comprehensive error messages that include the objects that did not pass validation. When validating PEPs that include many samples, you can use the -e/--exclude-case option to limit the error output to just the human-readable message. The example below uses this option:

eido validate peppro_paper.yaml -s http://schema.databio.org/pipelines/bedmaker.yaml -e
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/eido", line 10, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/eido/eido.py", line 259, in main
    validate_project(p, args.schema, args.exclude_case)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/eido/eido.py", line 171, in validate_project
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/eido/eido.py", line 155, in _validate_object
    raise jsonschema.exceptions.ValidationError(e.message)
jsonschema.exceptions.ValidationError: 'input_file_path' is a required property
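
To make the error above more concrete, here is a minimal Python sketch of what a "required property" check amounts to, and of how an option like -e/--exclude-case can shorten the report. It is a simplified stand-in that uses only the standard library and is not eido's or jsonschema's implementation; the helper names missing_required and format_error, the shortened sample dictionary, and the layout of the long-form message are assumptions made for illustration.

```python
# Simplified stand-in for a JSON Schema "required" check; not eido's code.
def missing_required(sample, required):
    """Return the required attribute names that the sample lacks."""
    return [name for name in required if name not in sample]


def format_error(sample, name, exclude_case=False):
    """Build an error message, optionally omitting the offending object."""
    message = f"{name!r} is a required property"
    if exclude_case:
        # Short form: only the human-readable message.
        return message
    # Long form: the message followed by the object that failed validation.
    return f"{message}\n\nOn instance:\n    {sample!r}"


# Hypothetical, shortened sample using attributes from the inspect output.
sample = {"sample_name": "K562_PRO-seq", "protocol": "PRO", "organism": "human"}
# Assumed list of required attributes, based on the error message above.
required = ["sample_name", "input_file_path"]

errors = [
    format_error(sample, name, exclude_case=True)
    for name in missing_required(sample, required)
]
print("\n".join(errors))
```

With exclude_case=False the same call would also print the whole sample object, which is the part that becomes hard to read in projects with many samples and attributes.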

Optionally, to validate just the config part of the PEP or a specific sample, use the -c/--just-config or -n/--sample-name argument, respectively. Please refer to the help for more details:

eido -h
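
When validation is run from a script, for example against several schemas in turn, the command lines shown in this tutorial can be assembled programmatically. The sketch below is one possible way to do that in Python; build_validate_command is a hypothetical helper, not part of eido, and it only uses the flags mentioned in this tutorial (-s, -n, -c, -e). Whether -n and -c may be combined is not covered here, so check eido -h before passing both.

```python
def build_validate_command(config, schema, sample_name=None,
                           just_config=False, exclude_case=False):
    """Assemble an `eido validate` argument list (hypothetical helper)."""
    command = ["eido", "validate", config, "-s", schema]
    if sample_name is not None:
        # Validate a single sample only.
        command += ["-n", sample_name]
    if just_config:
        # Validate only the config part of the PEP.
        command.append("-c")
    if exclude_case:
        # Limit error output to the human-readable message.
        command.append("-e")
    return command


command = build_validate_command(
    "peppro_paper.yaml",
    "http://schema.databio.org/pep/2.0.0.yaml",
    just_config=True,
    exclude_case=True,
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run from the Python standard library on a machine where eido is installed; building a list rather than a single string avoids shell quoting problems with paths and URLs.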