=====
Usage
=====

The sections below give an overview of the API patterns used by the ESP Python client and the low-level concepts behind how the module works. If you're just looking to use the command line tool, see the `Command Line <./cli.html>`_ section of the documentation. Alternatively, if you're interested in API documentation, see the `Client API <./api.html>`_ page for the Python client documentation, or the `REST API <./rest.html>`_ page for ESP REST API documentation.

At a high level, the ESP Python client was designed to simplify the process of interacting with the backend service. It provides a set of business models that mirror the backend models managing how the application stores data. Let's start with a couple of examples of how we might use the client API, and then we'll dig into low-level details later in the documentation.

For our first example, we're going to use the client API to update values in a worksheet. As a really gentle introduction, we're just going to copy the sample IDs into a new column and save the worksheet:

.. code-block:: python

    >>> # setting up connection
    >>> import esp
    >>> esp.options(
    >>>     email='user@localhost',
    >>>     password='password'
    >>> )
    >>> from esp.models import Experiment
    >>>
    >>> exp = Experiment('My Experiment')
    >>> sheet = exp.protocols[0]
    >>> sheet['New Sample ID'] = sheet['Sample ID']
    >>> sheet.push()

In a second example, we're going to do something a bit more complex. Let's say we want to quickly get a list of the names and UUIDs of all Workflows that have a particular sample type associated with them. With the client API, you only need a couple of lines of code:

.. code-block:: python

    >>> from esp.models import Workflow
    >>>
    >>> for obj in Workflow.all():
    >>>     if 'Illumina Sample' in obj.sample_types:
    >>>         print(': '.join([obj.name, obj.uuid]))
    MiSeq Sequencing: da8ea29e-5291-4058-a82e-a91f4a2dc6fc
    HiSeq Sequencing: c1954976-0434-4387-806d-352e68f88d5e

Finally, let's say we want to do a more complex task related to learning something about how people are using the system. Let's figure out who our most frequent user of the system is:

.. code-block:: python

    >>> from esp.models import User
    >>>
    >>> users = User.all()
    >>> # sort in descending order so the most frequent user comes first
    >>> frequent_users = sorted(users, key=lambda x: len(x.meta.last_seen), reverse=True)
    >>> print(frequent_users[0].name)
    admin

Or, let's say we want to run a pipeline task to query all historical values of a particular QC value from a specific project and figure out some basic stats about it (information that can be embedded in a Report):

.. code-block:: python

    >>> from functools import reduce
    >>> from esp.models import Project
    >>> import numpy
    >>>
    >>> obj = Project('Miseq Sequencing')
    >>> data = list(reduce(
    >>>     lambda x, y: x + y,
    >>>     [x.protocol('My QC Protocol')['Qubit'] for x in obj.experiments]
    >>> ))
    >>> print([numpy.mean(data), numpy.std(data)])
    [0.88, 0.1]

The flexibility of the ESP client allows for a lot of complex interactions to be simplified with its fluid API and object model. To jump into more examples of how to leverage the API, see the `Examples <./examples.html>`_ page.

Configuration
=============

Before going too in-depth about what the module can do, we need to talk a bit about configuration. Connections to ESP are made implicitly when the first request is made to the server. For instance, you can start querying the ESP database with just:

.. code-block:: python

    >>> from esp.models import Sample
    >>> obj = Sample('ESP000001')
    >>> print(obj.uuid)  # no explicit connections required

Implicit connections will be made using all default values for ``host``, ``port``, ``email``, etc.
These defaults can be overridden with user-specified parameters using the ``options()`` function. The following shows how to use this function (along with the values for all options set by default).

.. code-block:: python

    >>> import esp
    >>> esp.options(
    >>>     host='localhost',         # host for connecting to ESP
    >>>     port='8002',              # port where ESP is available
    >>>     email='admin@localhost',  # email for user
    >>>     password='password',      # password for user
    >>>     ssl=True,                 # use https instead of http
    >>> )

All options can alternatively be specified in a config file located at ``~/.lab7/client.yml``. This config file must be valid YAML, and will be loaded whenever esp-client is used in a Python shell or script. An example config with all of the application defaults is as follows:

.. code-block:: yaml

    host: localhost           # host where ESP is available
    port: 8002                # port where ESP is available
    email: admin@localhost    # email to connect to ESP with
    password: password        # password to connect to ESP with

Within a task script, you can set up your connection to ESP (via cookies file) using simply:

.. code-block:: python

    >>> # within task_script.py
    >>> import os
    >>> import esp
    >>> esp.options(
    >>>     url=os.getenv('LAB7_API_SERVER'),
    >>>     cookies=os.getenv('LAB7_COOKIE_FILE')
    >>> )
    >>>
    >>> # now query normally
    >>> from esp.models import Sample
    >>> obj = Sample('ESP000001')

Token-based Authentication
--------------------------

You can also authenticate your connection to ESP via an API token. First, generate an API token via the UI by clicking your user icon in the top right corner of the ESP interface. Then click User Settings -> Security tab and Generate API Key. After you have a token, you can connect to ESP without specifying a username or password:

.. code-block:: python

    >>> import esp
    >>> esp.options(
    >>>     host='localhost',  # host for connecting to ESP
    >>>     port='8002',       # port where ESP is available
    >>>     token='',          # replace with an API token (file path or string)
    >>>     ssl=True,          # use https instead of http
    >>> )

Interaction Model
=================

Most of the objects in the API work the same way and accordingly share the same interaction model for consistent behavior. Where a model needs different behavior, the business objects provide model-specific overrides. The sections below provide details on how to instantiate, create, and access data on each of the models provided within the API.

Instantiation
=============

To query the database for an existing entry and create an object with it, you simply instantiate the object with its name or UUID:

.. code-block:: python

    >>> from esp.models import Sample
    >>> obj = Sample('ESP000001')
    >>> # queries are performed lazily when a property is accessed
    >>> print(obj.uuid)
    b05b12b9-e348-4cb8-b763-bce2f52cd37a
    >>>
    >>> # alternatively:
    >>> obj = Sample('b05b12b9-e348-4cb8-b763-bce2f52cd37a')
    >>> print(obj.name)
    ESP000001

.. note:: Previous versions of ESP used the words ``Sample`` and ``SampleType`` instead of ``Entity`` and ``EntityType``. The Python client currently uses the old ``Sample*`` style nomenclature for ``Entity*`` objects in the system, which will be deprecated in favor of ``Entity`` and ``EntityType`` in future versions.

If the object doesn't exist, an exception will be thrown when properties are accessed for the first time:

.. code-block:: python

    >>> obj = Sample('none')
    >>> print(obj.uuid)
    Traceback (most recent call last):
    ...
    KeyError: 'uuid'

You can also call the ``exists()`` method to see if the object exists in the ESP database:

.. code-block:: python

    >>> obj = Sample('none')
    >>> print(obj.exists())
    False

This pattern is the same across all other objects in the system:

.. code-block:: python

    >>> Workflow('My Workflow')
    >>> Protocol('My Protocol')
    >>> SampleType('My SampleType')
    >>> Project('My Project')
    >>> Experiment('My Experiment')
    ... etc ...

Property Access
===============

After the object is instantiated, you can access properties on the entry by simply querying for attributes or items:

.. code-block:: python

    >>> obj = Sample('ESP000001')
    >>> print('\n'.join([
    >>>     obj.uuid,
    >>>     obj['uuid'],
    >>>     obj.name,
    >>>     obj.owner
    >>> ]))
    da695f33-976e-4965-80cb-7653feeef798
    da695f33-976e-4965-80cb-7653feeef798
    ESP000001
    system admin (admin@localhost)
    >>>
    >>> # the summary() method will give an organized description of the sample
    >>> obj.summary()
    name: ESP000001
    uuid: da695f33-976e-4965-80cb-7653feeef798
    desc: Description for this sample.
    owner: system admin (admin@localhost)
    tags: ['sample', 'experiment', 'esp']
    sample_type_uuid: 77777777-7777-4014-b700-053300000000
    created_at: 2018-07-02T17:40:21.474150Z
    updated_at: 2018-07-02T17:40:21.474150Z
    in_workflow_instance: True
    ...

These access methods are proxies for accessing information in an internal ``data`` dictionary, which contains all results from the API request originally made for the sample:

.. code-block:: python

    >>> print(obj.data)
    {
        'name': 'ESP000001',
        'desc': 'Description for sample.',
        'uuid': 'da695f33-976e-4965-80cb-7653feeef798',
        'sample_type_uuid': '77777777-7777-4014-b700-053300000000',
        'in_workflow_instance': True,
        'updated_at': '2018-07-02T17:40:21.474150Z',
        'created_at': '2018-07-02T17:40:21.474150Z',
        'owner': 'system admin (admin@localhost)',
        'tags': [],
        ...
    }

For details on what information each model can provide, see the API documentation for the application at `:8002/main/static/util.html`.

For any models linked to other types of models (e.g. Workflows contain Protocols), child objects will be implicitly instantiated when certain properties are accessed. Here's an example of that pattern for a Workflow object:

.. code-block:: python

    >>> from esp.models import Workflow
    >>> obj = Workflow('Illumina Sequencing')
    >>>
    >>> # obj.protocols is a list of protocol objects
    >>> print(obj.protocols)
    [, , , , ]
    >>>
    >>> # ... and you can access other properties on those objects
    >>> print(list(map(lambda x: x.name, obj.protocols[1].variables)))
    ['Sample ID Sequence', 'Parent ID', 'Sample ID', 'Note', 'Index Type', 'I7 Index ID', 'I7 Index', 'I5 Index ID', 'I5 Index', 'Complete']
    >>>
    >>> # going one more level down (Workflow > Protocol > SampleType)
    >>> print(obj.protocols[1].sample_type)
    >>>
    >>> print(obj.protocols[1].sample_type.uuid)
    16caf7c6-326e-4cf0-b971-6804df417d97

For a comprehensive list of property links across each model, see the `API <./api.html>`_ section of the documentation.

Updates
=======

To update the metadata for an object and have that change propagate to the backend, use the ``push()`` method after the data have been updated.

.. code-block:: python

    >>> from esp.models import Sample
    >>> obj = Sample('ESP00001')
    >>> print([obj.desc, obj.tags])
    ['', []]
    >>> obj.desc = 'My custom description.'
    >>> obj.tags.append('custom')
    >>> # ... other operations ...
    >>> obj.push()
    >>>
    >>> obj = Sample('ESP00001')
    >>> print([obj.desc, obj.tags])
    ['My custom description.', ['custom']]

Currently, this ``push()`` method is only configured to work for simple updates (name, descriptions, tags, etc ...). For more complex updates, clear the existing entry using ``drop()`` and re-create the object using your new configuration:

.. code-block:: python

    >>> from esp.models import Protocol
    >>> obj = Protocol('Set Samples')
    >>> obj.drop()
    >>>
    >>> obj = Protocol.create({"name": "Set Samples", "variables": [{...}]})

For ``Experiment`` objects, the pattern for updating data is slightly different. Since ``Experiment`` objects contain tabular data, a more natural interaction model for those data is a ``pandas.DataFrame`` object. With an ``Experiment``, you can access data like so:

.. code-block:: python

    >>> from esp.models import Experiment
    >>> exp = Experiment('Iris Experiment')
    >>> ws = exp.protocol('Iris Features')
    >>> ws.head()
    sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
                  5.1               3.5                1.4               0.2
                  4.9               3.0                1.4               0.2
                  4.7               3.2                1.3               0.2
                  4.6               3.1                1.5               0.2
                  5.0               3.6                1.4               0.2

Accordingly, because each worksheet behaves like a ``pandas`` DataFrame, you can set values for columns like so:

.. code-block:: python

    >>> ws['sepal length (cm)'] += 4
    >>> ws.head()
    sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
                  9.1               3.5                1.4               0.2
                  8.9               3.0                1.4               0.2
                  8.7               3.2                1.3               0.2
                  8.6               3.1                1.5               0.2
                  9.0               3.6                1.4               0.2

To save the changes to the worksheet, use the ``save()`` method on your edited worksheet object:

.. code-block:: python

    >>> ws.save()

This will automatically detect the delta between the original copy of the DataFrame and the data currently in the worksheet. Only that delta is pushed to the backend (for efficiency).

Object Creation
===============

To create objects from structured metadata (see the sections below for more in-depth examples of this type of metadata), we can use the ``create()`` classmethod on our models:

.. code-block:: python

    >>> from esp.models import Experiment
    >>> # from a dictionary containing metadata
    >>> obj = Experiment.create({
    >>>     'name': 'E001',
    >>>     'project': 'My Project',
    >>>     'workflow': 'Illumina Sequencing',
    >>>     'samples': {
    >>>         'count': 2,
    >>>     }
    >>> })
    >>> obj.submit()  # submit the experiment to the lab

For examples of how to create each type of model, see the `API <./api.html>`_ and `Examples <./examples.html>`_ sections.

Imports
=======

In addition to creating objects via JSON directly, we can also import model definitions from configuration files. For example, an ``experiment-config.yml`` file might look like:

.. code-block:: yaml

    name: MS001
    workflow: Illumina Sequencing
    samples:
        count: 2

And we can create a new ``Experiment`` object with this config directly via:

.. code-block:: python

    >>> obj = Experiment.create('/path/to/experiment-config.yml')
    >>> obj.submit()  # submit the experiment

With imported configuration files, you can also nest definitions. Here's a comprehensive example of a workflow definition that defines new SampleType, Pipeline, Task, and Protocol objects:

.. code-block:: yaml

    Illumina Sequencing:
        desc: Illumina Sequencing Workflow
        tags:
            - illumina
            - sequencing
        sample_types:
            - Illumina Sample:
                desc: Sample type for illumina sequencing runs.
                tags:
                    - illumina
                    - sequencing
                sequences:
                    - ILLUMINA SEQUENCE
                variables:
                    - Sample Type: Illumina Library
                    - Sample Source:
                        - Buccal
                        - Tissue
        links:
            Analyze Sequencing Results:
                Instrument: '{{ column_value("Instrument", "Created Illumina Library") }}'
        protocols:
            # placeholder until sample protocol can be first
            - Set Samples:
                protocol: standard
                desc: Placeholder for workflows that begin with a sample protocol.
                variables:
                    - Note:
                        rule: text

            # create library and assign index codes
            - Create Illumina Library:
                desc: Create Illumina library and assign index codes.
                protocol: sample
                sample: Illumina Sample
                relation: 1-to-1
                tags:
                    - illumina
                    - sequencing
                variables:
                    - Instrument:
                        rule: text
                        value: MiSeq, HiSeq 2000/2500
                    - Index Type:
                        rule: dropdown
                        required: true
                        dropdown: '{{ ilmn_kits() }}'
                    - I7 Index ID:
                        rule: dropdown
                        required: true
                        dropdown: '{{ ilmn_adapter_names(column_value("Index Type"), position="i7") }}'
                    - I7 Index:
                        rule: text
                        read_only: true
                        value: '{{ ilmn_adapter_seq(column_value("Index Type"), column_value("I7 Index ID")) }}'
                    - I5 Index ID:
                        rule: dropdown
                        required: true
                        dropdown: '{{ ilmn_adapter_names(column_value("Index Type"), position="i5") }}'
                    - I5 Index:
                        rule: text
                        read_only: true
                        value: '{{ ilmn_adapter_seq(column_value("Index Type"), column_value("I5 Index ID"), position="i5") }}'

            # bioinformatics
            - Analyze Sequencing Results:
                protocol: pipeline
                desc: Run bioinformatics pipelines to analyze sequencing data.
                pipeline:
                    Miseq Analysis Pipeline:
                        desc: Run script to generate illumina runsheet.
                        tasks:
                            - BWA Align:
                                desc: Align fastq files to reference.
                                cmd: "echo 'This is a bam created with the command: bwa mem /data/{{ Run }}/{{ Sample }}.fq.gz' > {{ Sample }}.bam"
                                files:
                                    - Aligned BAM:
                                        file_type: bam
                                        filename_template: "{{ Sample }}.bam"
                            - GATK Unified Genotyper:
                                desc: Use GATK to genotype bam file.
                                cmd: "echo 'This is a vcf created with the command: gatk UnifiedGenotyper -I {{ Sample }}.bam -R /references/{{ Reference }}.fa' > {{ Sample }}.vcf"
                                files:
                                    - Genotype VCF:
                                        file_type: vcf
                                        filename_template: "{{ Sample }}.vcf"
                        deps:
                            GATK Unified Genotyper: BWA Align
                variables:
                    - Instrument:
                        rule: text
                        read_only: true
                    - Run:
                        rule: text
                        required: true
                        pipeline_param: true
                    - Sample:
                        rule: text
                        pipeline_param: true
                    - Reference:
                        rule: dropdown
                        pipeline_param: true
                        default: GRCh38
                        dropdown:
                            - GRCh37
                            - GRCh38

With the ESP client, all of these definitions can be imported into the ESP database with just:

.. code-block:: python

    >>> from esp.models import Workflow
    >>> Workflow.create('/path/to/workflow/config.yml')

This will implicitly call the ``create()`` method for each nested data structure, and return a ``Workflow`` object that is ready to operate on.

Exports
=======

Most models within the ESP Python client also support exporting their data model to the YAML config format described above. Here is an example of exporting an existing ``Workflow`` object from the ESP database:

.. code-block:: python

    >>> from esp.models import Workflow
    >>> wf = Workflow('My Workflow')
    >>> wf.export('/path/to/exported/workflow.yml', deep=False)

Currently, only "shallow" export is supported, where relationships are represented via the name of the external entity without embedding the complete entity. For example, to export a workflow, you must export each task within the workflow, and then export the workflow. Support for "deep" export will be added soon.
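Because only shallow export is supported, a workflow and the objects it references have to be written out one file at a time. The following is a minimal sketch of that loop, not part of the ESP client. It assumes that child objects are reachable through an attribute such as ``protocols`` and that each child has a ``name`` and accepts ``export(path, deep=False)`` the same way ``Workflow`` does; the helper names and the ``<name>.yml`` file layout are hypothetical.

```python
import os


def _filename(name):
    # hypothetical naming scheme: replace spaces so names map to simple files
    return name.replace(" ", "_") + ".yml"


def export_shallow(workflow, out_dir, children_attr="protocols"):
    """Export each child object first, then the workflow itself.

    Assumption: every child exposes ``name`` and ``export(path, deep=False)``.
    Returns the list of file paths in the order they were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []

    # children go first, so every name the workflow refers to has its own file
    for child in getattr(workflow, children_attr, []):
        path = os.path.join(out_dir, _filename(child.name))
        child.export(path, deep=False)
        written.append(path)

    # the workflow file only references its children by name (shallow export)
    path = os.path.join(out_dir, _filename(workflow.name))
    workflow.export(path, deep=False)
    written.append(path)
    return written
```

With a real connection this would be called as ``export_shallow(Workflow('My Workflow'), '/path/to/exported')``; whether every nested model supports ``export()`` should be checked against the API documentation first.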
Variables
=========

Variable types are used throughout the system for different types of models (e.g. ``Protocol``, ``SampleType``), and allow users to specify how structured metadata will be entered into the system. Generally, there are a few reserved keywords that are available across all variable types. These keywords and their functions are as follows:

* ``rule`` - The type of variable/column to use.
* ``value`` - The default value for the variable/column.
* ``onchange`` - JavaScript code to run when the column value is changed in the UI.

An overview (with examples) of the variables available in the system is as follows:

* ``numeric`` - Numeric variable type.

  .. code-block:: yaml

      Numeric Column:
          rule: numeric
          value: 1.2

* ``string`` - String variable type.

  .. code-block:: yaml

      String Column:
          rule: string
          value: foobar

* ``date`` - Variable type for date metadata.

  .. code-block:: yaml

      Date Column:
          rule: date
          value: 21Jul2019

* ``checkbox`` - Variable type for boolean checkbox metadata.

  .. code-block:: yaml

      Checkbox Column:
          rule: checkbox
          value: no

* ``dropdown`` - Variable type for a picklist. Available options are specified via the ``dropdown`` parameter.

  .. code-block:: yaml

      Dropdown Column:
          rule: dropdown
          dropdown:
              - one
              - two
          value: one

* ``multiselect`` - Variable type for a multiple selection picklist. Available options are specified via the ``dropdown`` parameter.

  .. code-block:: yaml

      Multiple Picklist Column:
          rule: multiselect
          dropdown:
              - one
              - two
          value: one

* ``attachment`` - Variable type for an attached file.

  .. code-block:: yaml

      Attachment Column:
          rule: attachment

* ``location`` - Variable type for specifying a location in the system. Values take the format ``Location: Slot``.

  .. code-block:: yaml

      Location Column:
          rule: location
          container: Freezer

* ``barcode`` - Variable type for entering/rendering barcode information. Available barcode types are ``1D``, ``QR``, and ``Mini Data Matrix``. Also included is an optional parameter for setting the sample barcode (``resource_barcode: true``).

  .. code-block:: yaml

      Barcode Column:
          rule: barcode
          barcode_type: QR
          resource_barcode: true

* ``approval`` - Variable type for requiring sign-off approval to complete a protocol. For this variable type, developers need to specify a workgroup whose users are allowed to approve the column.

  .. code-block:: yaml

      Approval Column:
          rule: approval
          workgroup: Lab A

* ``link`` - Variable type for an embedded link. Link details must be specified via stringified JSON containing ``name`` and ``url`` arguments.

  .. code-block:: yaml

      Link Column:
          rule: link
          value: '{ "name": "L7", "url": "https://l7informatics.com", "target": "_blank" }'

* ``resource_link`` - Variable type for an embedded link to an internal resource object in the ESP database. Link types can be to ``Samples``, ``Projects``, ``Experiments``, ``Files``, etc ...

  .. code-block:: yaml

      Resource Link Column:
          rule: resource_link
          resource_link_type: Sample

* ``itemqtyadj`` - Variable type for embedding inventory quantity adjustments in LIMS workflows.

  .. code-block:: yaml

      Item Quantity Column:
          rule: itemqtyadj
          item_type: Ethanol

* ``instructions`` - Variable type for embedded instructions for a LIMS workflow segment. Instructions can be written in Markdown format and will be automatically rendered into HTML.

  .. code-block:: yaml

      Instructions Column:
          rule: instructions
          instructions: |+
              # My Instruction List

              * Step foo
              * Step bar

See the `Client API <./api.html>`_ and `Examples <./examples.html>`_ pages of the documentation for more context and comprehensive examples of how to set different types of variables.
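Since the ``link`` variable type expects its value as stringified JSON, it can be convenient to generate that string rather than write it by hand. The sketch below uses only the Python standard library; ``link_variable`` is a hypothetical helper, not part of the ESP client, and the dictionary it returns simply mirrors the YAML layout of the ``link`` example above.

```python
import json


def link_variable(column, name, url, target=None):
    """Build a ``link`` variable definition whose value is stringified JSON."""
    payload = {"name": name, "url": url}
    if target is not None:
        # optional key, as in the ``link`` example above
        payload["target"] = target
    return {column: {"rule": "link", "value": json.dumps(payload)}}


definition = link_variable(
    "Link Column", "L7", "https://l7informatics.com", target="_blank"
)
print(definition["Link Column"]["value"])
# prints: {"name": "L7", "url": "https://l7informatics.com", "target": "_blank"}
```

Generating the value with ``json.dumps`` avoids quoting mistakes that are easy to make when the JSON is embedded by hand inside a YAML string.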