Usage

The sections below give an overview of the API patterns used by the ESP Python client and of the low-level concepts behind how the module works. If you only want to use the command line tool, see the Command Line section of the documentation. If you’re interested in API documentation, see the Client API page for the Python client, or the REST API page for the ESP REST API.

At a high level, the ESP Python client was designed to simplify interacting with the backend service. It provides a set of business models that mirror the backend models the application uses to store data.

Let’s start with a couple of examples of how we might use the client API, and then we’ll dig into low-level details later in the documentation. For our first example, we’re going to use the client API to update values in a worksheet. As a gentle introduction, we’re just going to copy the sample names into a new column (where the ESP prefix could then be changed) and push the result:

>>> # setting up connection
>>> import esp
>>> esp.options(
>>>     email='user@localhost',
>>>     password='password'
>>> )
>>> from esp.models import Experiment
>>>
>>> exp = Experiment('My Experiment')
>>> sheet = exp.protocols[0]
>>> sheet['New Sample ID'] = sheet['Sample ID']
>>> sheet.push()

In a second example, we’re going to do something a bit more complex. Let’s say we want to quickly get a list of the names and uuids of all Workflows that have a particular sample type associated with them. With the esp client API, you only need a couple of lines of code:

>>> from esp.models import Workflow
>>>
>>> for obj in Workflow.all():
>>>     if 'Illumina Sample' in obj.sample_types:
>>>         print(': '.join([obj.name, obj.uuid]))
MiSeq Sequencing: da8ea29e-5291-4058-a82e-a91f4a2dc6fc
HiSeq Sequencing: c1954976-0434-4387-806d-352e68f88d5e

Finally, let’s say we want to do a more complex task related to learning something about how people are using the system. Let’s figure out who our most frequent user of the system is:

>>> from esp.models import User
>>>
>>> users = User.all()
>>> # sort in descending order so the most frequent user comes first
>>> frequent_users = sorted(users, key=lambda x: len(x.meta.last_seen), reverse=True)
>>> print(frequent_users[0].name)
admin

Or, let’s say we want to run a pipeline task to query all historical values of a particular QC value from a specific project and figure out some basic stats about it (information that can be embedded in a Report):

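>>> from functools import reduce
>>> import numpy
>>> from esp.models import Project
>>>
>>> obj = Project('Miseq Sequencing')
>>> # one Qubit column per experiment, combined into a single list of values
>>> columns = [x.protocol('My QC Protocol')['Qubit'] for x in obj.experiments]
>>> data = list(reduce(lambda x, y: x + y, columns))
>>> print([numpy.mean(data), numpy.std(data)])
[0.88, 0.1]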
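
The reduce call above flattens one list of values per experiment into a single list before summarizing it. A minimal sketch of the same flatten-then-summarize step, using only the standard library and made-up readings in place of real worksheet columns (the values and the name per_experiment are illustrative assumptions, not ESP data):

```python
from itertools import chain
from statistics import mean, pstdev

# Hypothetical stand-in for the per-experiment 'Qubit' columns:
# one list of QC values per experiment.
per_experiment = [[0.9, 0.8], [1.0], [0.7, 0.9, 1.1]]

# Flatten the per-experiment lists into a single list of values.
data = list(chain.from_iterable(per_experiment))

# pstdev is the population standard deviation, which matches numpy.std's default (ddof=0).
summary = [round(mean(data), 2), round(pstdev(data), 2)]
print(summary)
```

chain.from_iterable avoids building an intermediate list at every step, which matters more as the number of experiments grows.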

The flexibility of the ESP client allows for a lot of complex interactions to be simplified with its fluid API and object model. To jump into more examples of how to leverage the API, see the Examples page.

Configuration

Before going too in-depth about what the module can do, we need to talk a bit about configuration. Connections to ESP are made implicitly when the first request is made to the server. For instance, you can start querying the ESP database with just:

>>> from esp.models import Sample
>>> obj = Sample('ESP000001')
>>> print(obj.uuid) # no explicit connections required

Implicit connections will be made using all default values for host, port, email, etc. These defaults can be overridden with user-specified parameters using the options() command. The following shows how to use this function (along with the values for all options set by default).

>>> import esp
>>> esp.options(
>>>     host='localhost',          # host for connecting to ESP
>>>     port='8002',               # port where ESP is available
>>>     email='admin@localhost',   # email for user
>>>     password='password',       # password for user.
>>>     ssl=True,                  # use https instead of http
>>> )
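
As an illustration of how these options fit together, the sketch below composes a base URL from host, port, and ssl. The helper name base_url is hypothetical and is not part of the esp module; it only shows that ssl=True selects https rather than http.

```python
def base_url(host='localhost', port='8002', ssl=True):
    """Compose a base URL from connection options (hypothetical helper, not part of esp)."""
    scheme = 'https' if ssl else 'http'
    return '{}://{}:{}'.format(scheme, host, port)


print(base_url())
print(base_url(host='esp.example.com', port=443))
print(base_url(ssl=False))
```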

All options can alternatively be specified in a config file located at: ~/.lab7/client.yml. This config file must be in accordance with the YAML specification, and will be loaded whenever esp-client is used in a python shell or script. An example config with all of the application defaults is as follows:

host: localhost             # host where ESP is available
port: 8002                  # port where ESP is available
email: admin@localhost      # email to connect to ESP with
password: password          # password to connect to ESP with
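
With defaults, a config file, and explicit options() arguments all available, it helps to think of them as layered sources. The sketch below assumes that explicit arguments win over config file values, which in turn win over defaults; that precedence is an assumption made for illustration and is not confirmed here. The function resolve_options and the dictionary from_file are hypothetical.

```python
from collections import ChainMap

# Defaults as listed in the example config above.
DEFAULTS = {'host': 'localhost', 'port': 8002,
            'email': 'admin@localhost', 'password': 'password'}


def resolve_options(file_config=None, **explicit):
    """Merge option sources; earlier maps win (assumed precedence, not confirmed for esp)."""
    return dict(ChainMap(explicit, file_config or {}, DEFAULTS))


# Values that might come from parsing ~/.lab7/client.yml (hypothetical contents).
from_file = {'host': 'esp.example.com', 'email': 'user@example.com'}
options = resolve_options(from_file, email='me@example.com')
print(options['host'], options['port'], options['email'])
```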

Within a task script, you can set up your connection to ESP (via cookies file) using simply:

>>> # within task_script.py
>>> import os
>>> import esp
>>> esp.options(
>>>     url=os.getenv('LAB7_API_SERVER'),
>>>     cookies=os.getenv('LAB7_COOKIE_FILE')
>>> )
>>>
>>> # now query normally
>>> from esp.models import Sample
>>> obj = Sample('ESP000001')
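
Because os.getenv returns None for an unset variable, a task script run outside of ESP would pass None for both options. A minimal sketch of a guard that fails early instead; the helper task_options is hypothetical, and only the two environment variable names come from the example above:

```python
import os


def task_options(environ=os.environ):
    """Collect connection options from the task environment, failing early if one is unset."""
    names = {'url': 'LAB7_API_SERVER', 'cookies': 'LAB7_COOKIE_FILE'}
    missing = [var for var in names.values() if not environ.get(var)]
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return {key: environ[var] for key, var in names.items()}


# Example with a fake environment; a task script would rely on the real os.environ.
fake_env = {'LAB7_API_SERVER': 'http://localhost:8002',
            'LAB7_COOKIE_FILE': '/tmp/cookies.txt'}
print(task_options(fake_env))
```

The returned dictionary has the same keys as the options() call above, so it could be passed as keyword arguments.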

Token-based Authentication

You can also authenticate your connection to ESP via an API token. First, generate an API token via the UI by clicking your user icon in the top right corner of the ESP interface. Then click User Settings -> Security tab and Generate API Key. After you have a token, you can connect to ESP without specifying a username or password:

>>> import esp
>>> esp.options(
>>>     host='localhost',          # host for connecting to ESP
>>>     port='8002',               # port where ESP is available
>>>     token='<TOKEN>',           # replace <TOKEN> with an API token (file path or string)
>>>     ssl=True,                  # use https instead of http
>>> )
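
The token option accepts either a file path or the token string itself. How the client tells the two apart is not described here; the sketch below shows one plausible approach (treat the value as a path only if such a file exists). The helper resolve_token is hypothetical and not part of the esp module.

```python
import os
import tempfile


def resolve_token(token):
    """Return the token text, reading it from disk when `token` names an existing file."""
    path = os.path.expanduser(token)
    if os.path.isfile(path):
        with open(path) as handle:
            return handle.read().strip()
    return token.strip()


# Demonstrate both forms with a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    token_file = os.path.join(folder, 'esp.token')
    with open(token_file, 'w') as handle:
        handle.write('abc123\n')
    from_file = resolve_token(token_file)

from_string = resolve_token('abc123')
print(from_file == from_string)
```

Keeping the token in a file rather than in the script keeps it out of version control.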

Interaction Model

Most objects in the API share the same interaction model, so they behave consistently. Where a model needs different behavior, its business object overrides the shared behavior to provide model-specific context.

The sections below provide details on how to instantiate, create, and access data on each of the models provided within the API.

Instantiation

To query the database for an existing entry and create an object with it, you simply instantiate the object with its name or uuid:

>>> from esp.models import Sample
>>> obj = Sample('ESP000001')
>>> # queries are performed lazily when a property is accessed
>>> print(obj.uuid)
'b05b12b9-e348-4cb8-b763-bce2f52cd37a'
>>>
>>> # alternatively:
>>> obj = Sample('b05b12b9-e348-4cb8-b763-bce2f52cd37a')
>>> print(obj.name)
'ESP000001'

Note

Previous versions of ESP used the terms Sample and SampleType instead of Entity and EntityType. The Python client currently uses the old Sample* nomenclature for Entity* objects in the system; it will be deprecated in favor of Entity and EntityType in future versions.

If the object doesn’t exist, an exception will be thrown when properties are accessed for the first time:

>>> obj = Sample('none')
>>> print(obj.uuid)
Traceback (most recent call last):
    ...
KeyError: 'uuid'

You can also call the exists() method to see if the object exists in the ESP database:

>>> obj = Sample('none')
>>> print(obj.exists())
False
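
The behavior described above (no query at construction, a lookup on first access, a KeyError for missing entries, and an exists() check) can be sketched with a toy class. This is an illustration of the lazy-loading pattern only, not the esp implementation; LazyRecord and its BACKEND dictionary are hypothetical.

```python
class LazyRecord:
    """Toy model that defers its lookup until data is first needed."""

    # Hypothetical stand-in for the backend database.
    BACKEND = {
        'ESP000001': {'name': 'ESP000001',
                      'uuid': 'b05b12b9-e348-4cb8-b763-bce2f52cd37a'},
    }

    def __init__(self, ident):
        self.ident = ident
        self._data = None   # nothing is fetched at construction time
        self.queries = 0    # counts lookups, to make the laziness visible

    @property
    def data(self):
        if self._data is None:   # the first access triggers the single lookup
            self.queries += 1
            self._data = self.BACKEND.get(self.ident, {})
        return self._data

    def __getitem__(self, key):
        return self.data[key]    # raises KeyError when the entry is missing

    def exists(self):
        return bool(self.data)


obj = LazyRecord('ESP000001')
print(obj.queries)
print(obj['uuid'])
print(obj.queries)
print(LazyRecord('none').exists())
```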

This pattern is the same across all other objects in the system:

>>> Workflow('My Workflow')
>>> Protocol('My Protocol')
>>> SampleType('My SampleType')
>>> Project('My Project')
>>> Experiment('My Experiment')
... etc ...

Property Access

After the object is instantiated, you can access properties on the entry by simply querying for attributes or items:

>>> obj = Sample('ESP000001')
>>> print('\n'.join([
>>>     obj.uuid,
>>>     obj['uuid'],
>>>     obj.name,
>>>     obj.owner
>>> ]))
da695f33-976e-4965-80cb-7653feeef798
da695f33-976e-4965-80cb-7653feeef798
ESP000001
system admin (admin@localhost)
>>>
>>> # the summary() method will give an organized description of the sample
>>> obj.summary()
name: ESP000001
uuid: da695f33-976e-4965-80cb-7653feeef798
desc: Description for this sample.
owner: system admin (admin@localhost)
tags: ['sample', 'experiment', 'esp']
sample_type_uuid: 77777777-7777-4014-b700-053300000000
created_at: 2018-07-02T17:40:21.474150Z
updated_at: 2018-07-02T17:40:21.474150Z
in_workflow_instance: True
...

These access methods are proxies for accessing information in an internal data dictionary, which contains all results from the API request originally made for the sample:

>>> print(obj.data)
{
    'name': 'ESP000001',
    'desc': 'Description for sample.',
    'uuid': 'da695f33-976e-4965-80cb-7653feeef798',
    'sample_type_uuid': '77777777-7777-4014-b700-053300000000',
    'in_workflow_instance': True,
    'updated_at': '2018-07-02T17:40:21.474150Z',
    'created_at': '2018-07-02T17:40:21.474150Z',
    'owner': 'system admin (admin@localhost)',
    'tags': [],
    ...
}
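
A minimal sketch of how attribute and item access can both proxy to one internal dictionary, using Python's __getattr__ and __getitem__ hooks. The Record class is hypothetical and only illustrates the pattern; the actual client may implement it differently.

```python
class Record:
    """Toy object exposing a data dictionary through attributes and items."""

    def __init__(self, data):
        self.data = dict(data)

    def __getitem__(self, key):
        return self.data[key]

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails, so `data` itself is unaffected.
        try:
            return self.__dict__['data'][name]
        except KeyError:
            raise AttributeError(name) from None


obj = Record({'name': 'ESP000001', 'uuid': 'da695f33-976e-4965-80cb-7653feeef798'})
print(obj.uuid == obj['uuid'] == obj.data['uuid'])
```

Raising AttributeError for unknown keys keeps built-ins such as hasattr() working as expected.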

For details on what data each model can provide, see the API documentation for the application at <esp-host>:8002/main/static/util.html.

For any models linked to other types of models (e.g. Workflows contain Protocols), child objects will be implicitly instantiated when certain properties are accessed. Here’s an example of that pattern for a Workflow object:

>>> from esp.models import Workflow
>>> obj = Workflow('Illumina Sequencing')
>>>
>>> # obj.protocols is a list of protocol objects
>>> print(obj.protocols)
[<Protocol(name=Set Samples)>,
 <Protocol(name=Create Illumina Library)>,
 <Protocol(name=Setup Illumina Library Pools)>,
 <Protocol(name=Pool Illumina Libraries)>,
 <Protocol(name=Illumina Runsheet)>]
>>>
>>> # ... and you can access other properties on those objects
>>> print(list(map(lambda x: x.name, obj.protocols[1].variables)))
['Sample ID Sequence', 'Parent ID', 'Sample ID', 'Note', 'Index Type', 'I7 Index ID', 'I7 Index', 'I5 Index ID', 'I5 Index', 'Complete']
>>>
>>> # going one more level down (Workflow > Protocol > SampleType)
>>> print(obj.protocols[1].sample_type)
<SampleType(name=Illumina Sample)>
>>>
>>> print(obj.protocols[1].sample_type.uuid)
16caf7c6-326e-4cf0-b971-6804df417d97
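
The parent-to-child pattern shown above can be sketched with two toy classes, where the child objects are built only when the property is read. These classes are stand-ins written for illustration and share only their names with the esp models.

```python
class Protocol:
    """Toy child object built from a raw record."""

    def __init__(self, data):
        self.data = data
        self.name = data['name']

    def __repr__(self):
        return '<Protocol(name={})>'.format(self.name)


class Workflow:
    """Toy parent object that wraps its child records on demand."""

    def __init__(self, data):
        self.data = data

    @property
    def protocols(self):
        # Child objects are created when the property is read, not when Workflow is built.
        return [Protocol(item) for item in self.data.get('protocols', [])]


wf = Workflow({
    'name': 'Illumina Sequencing',
    'protocols': [{'name': 'Set Samples'}, {'name': 'Create Illumina Library'}],
})
print(wf.protocols)
print(wf.protocols[1].name)
```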

For a comprehensive list of property links across each model, see the API section of the documentation.

Updates

To update the metadata for an object and have that change propagate to the backend, use the push() method after data have been updated.

>>> from esp.models import Sample
>>> obj = Sample('ESP000001')
>>> print([obj.desc, obj.tags])
['', []]
>>> obj.desc = 'My custom description.'
>>> obj.tags.append('custom')
>>> # ... other operations ...
>>> obj.push()
>>>
>>> # re-query the sample to confirm the change was saved
>>> obj = Sample('ESP000001')
>>> print([obj.desc, obj.tags])
['My custom description.', ['custom']]

Currently, this push() method is only configured to work for simple updates (name, description, tags, etc.). For more complex updates, clear the existing entry using drop() and re-create the object using your new configuration:

>>> from esp.models import Protocol
>>> obj = Protocol('Set Samples')
>>> obj.drop()
>>>
>>> obj = Protocol.create({"name": "Set Samples", "variables": [{...}]})

For Experiment objects, the pattern for updating data is slightly different. Since Experiment objects contain tabular data, a more natural interaction model for those data is with a pandas.DataFrame object. With an Experiment, you can access data like so:

>>> from esp.models import Experiment
>>> exp = Experiment('Iris Experiment')
>>> ws = exp.protocol('Iris Features')
>>> ws.head()
sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
              5.1               3.5                1.4               0.2
              4.9               3.0                1.4               0.2
              4.7               3.2                1.3               0.2
              4.6               3.1                1.5               0.2
              5.0               3.6                1.4               0.2

Accordingly, because each worksheet behaves like a pandas DataFrame, you can set values for columns like so:

>>> ws['sepal length (cm)'] += 4
>>> ws.head()
sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
              9.1               3.5                1.4               0.2
              8.9               3.0                1.4               0.2
              8.7               3.2                1.3               0.2
              8.6               3.1                1.5               0.2
              9.0               3.6                1.4               0.2

To save the changes to the worksheet, use the save() method on your edited worksheet object:

>>> ws.save()

This will automatically detect the delta between the original copy of the DataFrame and the data currently in the worksheet. Only that delta is pushed to the backend (for efficiency).
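
To make the idea of a delta concrete, here is a minimal sketch that compares two tables stored as plain dictionaries of columns and reports only the changed cells. The function worksheet_delta is hypothetical; the client's own change detection works on DataFrames and may differ in detail.

```python
def worksheet_delta(original, edited):
    """Return {(row, column): new_value} for cells that differ between two tables.

    Both tables are dicts mapping a column name to a list of row values.
    """
    delta = {}
    for column, values in edited.items():
        before = original.get(column, [])
        for row, value in enumerate(values):
            # A cell counts as changed if it is new or its value differs.
            if row >= len(before) or before[row] != value:
                delta[(row, column)] = value
    return delta


original = {'sepal length (cm)': [5.1, 4.9], 'sepal width (cm)': [3.5, 3.0]}
edited = {'sepal length (cm)': [9.1, 4.9], 'sepal width (cm)': [3.5, 3.0]}
print(worksheet_delta(original, edited))
```

Only one of the four cells differs here, so only one value would need to be sent.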

Object Creation

To create objects from structured metadata (see the sections below for more in-depth examples of this type of metadata), we can use the create() classmethod on our models:

>>> from esp.models import Experiment
>>> # from a dictionary containing metadata
>>> obj = Experiment.create({
>>>     'name': 'E001',
>>>     'project': 'My Project',
>>>     'workflow': 'Illumina Sequencing',
>>>     'samples': {
>>>         'count': 2,
>>>     }
>>> })
>>> obj.submit() # submit the experiment to the lab

For examples of how to create each type of model, see the API and Examples sections.

Imports

In addition to creating objects via JSON directly, we can also import model definitions from configuration files. For example, an experiment-config.yml file might look like:

name: MS001
workflow: Illumina Sequencing
samples:
  count: 2

And we can create a new Experiment object with this config directly via:

>>> obj = Experiment.create('/path/to/experiment-config.yml')
>>> obj.submit() # submit the experiment

With imported configuration files, you can also nest definitions. Here’s a comprehensive example of a workflow definition that defines new SampleType, Pipeline, Task, and Protocol objects:

Illumina Sequencing:

  desc: Illumina Sequencing Workflow

  tags:
    - illumina
    - sequencing

  sample_types:
    - Illumina Sample:
       desc: Sample type for illumina sequencing runs.
       tags:
         - illumina
         - sequencing
       sequences:
         - ILLUMINA SEQUENCE
       variables:
         - Sample Type: Illumina Library
         - Sample Source:
           - Buccal
           - Tissue

  links:
      Analyze Sequencing Results:
          Instrument: '{{ column_value("Instrument", "Create Illumina Library") }}'

  protocols:

    # placeholder until sample protocol can be first
    - Set Samples:
        protocol: standard
        desc: Placeholder for workflows that begin with a sample protocol.
        variables:
          - Note:
              rule: text

    # create library and assign index codes
    - Create Illumina Library:
        desc: Create Illumina library and assign index codes.
        protocol: sample
        sample: Illumina Sample
        relation: 1-to-1
        tags:
          - illumina
          - sequencing
        variables:
          - Instrument:
              rule: text
              value: MiSeq, HiSeq 2000/2500
          - Index Type:
              rule: dropdown
              required: true
              dropdown: '{{ ilmn_kits() }}'
          - I7 Index ID:
              rule: dropdown
              required: true
              dropdown: '{{ ilmn_adapter_names(column_value("Index Type"), position="i7") }}'
          - I7 Index:
              rule: text
              read_only: true
              value: '{{ ilmn_adapter_seq(column_value("Index Type"), column_value("I7 Index ID")) }}'
          - I5 Index ID:
              rule: dropdown
              required: true
              dropdown: '{{ ilmn_adapter_names(column_value("Index Type"), position="i5") }}'
          - I5 Index:
              rule: text
              read_only: true
              value: '{{ ilmn_adapter_seq(column_value("Index Type"), column_value("I5 Index ID"), position="i5") }}'

    # bioinformatics
    - Analyze Sequencing Results:
        protocol: pipeline
        desc: Run bioinformatics pipelines to analyze sequencing data.
        pipeline:
          Miseq Analysis Pipeline:
            desc: Run script to generate illumina runsheet.
            tasks:
              - BWA Align:
                  desc: Align fastq files to reference.
                  cmd: "echo 'This is a bam created with the command: bwa mem /data/{{ Run }}/{{ Sample }}.fq.gz' > {{ Sample }}.bam"
                  files:
                    - Aligned BAM:
                        file_type: bam
                        filename_template: "{{ Sample }}.bam"
              - GATK Unified Genotyper:
                  desc: Use GATK to genotype bam file.
                  cmd: "echo 'This is a vcf created with the command: gatk UnifiedGenotyper -I {{ Sample }}.bam -R /references/{{ Reference }}.fa' > {{ Sample }}.vcf"
                  files:
                    - Genotype VCF:
                        file_type: vcf
                        filename_template: "{{ Sample }}.vcf"
            deps:
              GATK Unified Genotyper: BWA Align

        variables:
          - Instrument:
              rule: text
              read_only: true
          - Run:
              rule: text
              required: true
              pipeline_param: true
          - Sample:
              rule: text
              pipeline_param: true
          - Reference:
              rule: dropdown
              pipeline_param: true
              default: GRCh38
              dropdown:
                - GRCh37
                - GRCh38

With the esp client, all of these definitions can be imported into the ESP database with just:

>>> from esp.models import Workflow
>>> Workflow.create('/path/to/workflow/config.yml')

This will implicitly call the create() method for each nested data structure, and return a Workflow object that is ready to operate on.
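
One way to picture the nested creation is as a walk over the parsed config that visits children before their parents. The sketch below does this for a trimmed-down version of the definition above, already parsed into Python dictionaries. The function creation_order and the children-first ordering are assumptions made for illustration; the order the client actually uses is not stated here.

```python
def creation_order(name, definition):
    """List (kind, name) pairs for a nested workflow definition, children before parents."""
    order = []
    for entry in definition.get('sample_types', []):
        order.extend(('SampleType', key) for key in entry)
    for entry in definition.get('protocols', []):
        for protocol_name, protocol in entry.items():
            for pipeline_name, pipeline in protocol.get('pipeline', {}).items():
                for task in pipeline.get('tasks', []):
                    order.extend(('Task', key) for key in task)
                order.append(('Pipeline', pipeline_name))
            order.append(('Protocol', protocol_name))
    order.append(('Workflow', name))
    return order


# Trimmed-down, already-parsed version of the YAML definition (illustrative only).
definition = {
    'sample_types': [{'Illumina Sample': {'desc': 'Sample type for sequencing runs.'}}],
    'protocols': [
        {'Set Samples': {'protocol': 'standard'}},
        {'Analyze Sequencing Results': {
            'protocol': 'pipeline',
            'pipeline': {'Miseq Analysis Pipeline': {
                'tasks': [{'BWA Align': {}}, {'GATK Unified Genotyper': {}}],
            }},
        }},
    ],
}

for kind, item in creation_order('Illumina Sequencing', definition):
    print(kind, '-', item)
```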

Exports

Most models within the ESP Python client also support exporting their data model to the YAML config format described above. Here is an example of exporting an existing Workflow object from the ESP database:

>>> from esp.models import Workflow
>>> wf = Workflow('My Workflow')
>>> wf.export('/path/to/exported/workflow.yml', deep=False)

Currently, only “shallow” export is supported, where relationships are represented via the name of the external entity without embedding the complete entity. For example, to export a workflow, you must export each task within the workflow, and then export the workflow. Support for “deep” export will be added soon.
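
The difference between a shallow and a deep representation can be sketched on plain dictionaries: a shallow form keeps only the names of related entities. The function shallow and the choice of nested keys are hypothetical, and the exact layout of an exported file is not shown here, so treat this as an illustration of the concept only.

```python
def shallow(definition, nested_keys=('protocols', 'sample_types')):
    """Replace embedded child definitions with their names (sketch of a 'shallow' form)."""
    result = dict(definition)
    for key in nested_keys:
        if key in result:
            # Each child is either a one-entry mapping {name: details} or already a name.
            result[key] = [next(iter(child)) if isinstance(child, dict) else child
                           for child in result[key]]
    return result


deep = {
    'desc': 'Illumina Sequencing Workflow',
    'protocols': [
        {'Set Samples': {'protocol': 'standard'}},
        {'Create Illumina Library': {'protocol': 'sample'}},
    ],
}
print(shallow(deep))
```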

Variables

Variable types are used throughout the system for different types of models (e.g. Protocol, SampleType), and allow users to specify how structured metadata will be entered into the system. A few reserved keywords are available across all variable types. These keywords and their functions are as follows:

  • rule - The type of variable/column to use.

  • value - The default value for the variable/column.

  • onchange - JavaScript code to run when the column value is changed in the UI.

An overview (with examples) of the variables available in the system is as follows:

  • numeric - Numeric variable type

    Numeric Column:
      rule: numeric
      value: 1.2
    
  • string - String variable type

    String Column:
      rule: string
      value: foobar
    
  • date - Variable type for date metadata.

    Date Column:
      rule: date
      value: 21Jul2019
    
  • checkbox - Variable type for boolean checkbox metadata.

    Checkbox Column:
      rule: checkbox
      value: no
    
  • dropdown - Variable type for a picklist. Available options are specified via the dropdown parameter.

    Dropdown Column:
      rule: dropdown
      dropdown:
        - one
        - two
      value: one
    
  • multiselect - Variable type for a multiple selection picklist. Available options are specified via the dropdown parameter.

    Multiple Picklist Column:
      rule: multiselect
      dropdown:
        - one
        - two
      value: one
    
  • attachment - Variable type for an attached file.

    Attachment Column:
      rule: attachment
    
  • location - Variable type for specifying a location in the system. Values take the format Location: Slot.

    Location Column:
      rule: location
      container: Freezer
    
  • barcode - Variable type for entering/rendering barcode information. Available barcode types are 1D, QR, and Mini Data Matrix. Also included is an optional parameter for setting the sample barcode (resource_barcode: true).

    Barcode Column:
      rule: barcode
      barcode_type: QR
      resource_barcode: true
    
  • approval - Variable type for requiring sign-off approval to complete a protocol. For this variable type, developers need to specify a workgroup that can be used to select users that can approve a column.

    Approval Column:
      rule: approval
      workgroup: Lab A
    
  • link - Variable type for embedded link. Link details must be specified via stringified JSON containing name and url arguments.

    Link Column:
      rule: link
      value: '{ "name": "L7", "url": "https://l7informatics.com", "target": "_blank" }'
    
  • resource_link - Variable type for an embedded link to an internal resource object in the ESP database. Link types can be to Samples, Projects, Experiments, Files, etc.

    Resource Link Column:
      rule: resource_link
      resource_link_type: Sample
    
  • itemqtyadj - Variable type for embedding inventory quantity change adjustments in LIMS workflows.

    Item Quantity Column:
      rule: itemqtyadj
      item_type: Ethanol
    
  • instructions - Variable type for embedded instructions for a LIMS workflow segment. Instructions can be embedded in Markdown format and will be automatically rendered into HTML.

    Instructions Column:
      rule: instructions
      instructions: |+
    
        # My Instruction List
        * Step foo
        * Step bar
    

See the Client API and Examples pages of the documentation for more context and comprehensive examples of how to set different types of variables.
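
Before importing a config, it can be useful to sanity-check variable definitions against the parameter descriptions above. The sketch below checks three things: that a rule is present, that picklist rules supply a dropdown parameter whose options include the default value, and that a link value is JSON with name and url. The function check_variable is hypothetical, and these checks reflect a reading of this page rather than validation performed by ESP itself.

```python
import json


def check_variable(name, spec):
    """Return a list of problems found in one variable definition (empty if none)."""
    problems = []
    rule = spec.get('rule')
    if rule is None:
        problems.append('{}: missing rule'.format(name))
    if rule in ('dropdown', 'multiselect'):
        options = spec.get('dropdown')
        if options is None:
            problems.append('{}: {} needs a dropdown parameter'.format(name, rule))
        elif isinstance(options, list) and 'value' in spec and spec['value'] not in options:
            problems.append('{}: default {!r} is not an option'.format(name, spec['value']))
    if rule == 'link' and 'value' in spec:
        try:
            link = json.loads(spec['value'])
        except ValueError:
            link = {}
        if not isinstance(link, dict):
            link = {}
        missing = [key for key in ('name', 'url') if key not in link]
        if missing:
            problems.append('{}: link value lacks {}'.format(name, ', '.join(missing)))
    return problems


good = {'rule': 'dropdown', 'dropdown': ['one', 'two'], 'value': 'one'}
bad = {'rule': 'dropdown', 'dropdown': ['one', 'two'], 'value': 'three'}
print(check_variable('Dropdown Column', good))
print(check_variable('Broken Column', bad))
```

Dropdown options given as an expression string, as in the workflow example earlier, are deliberately not checked because their values are only known at runtime.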