
Using Lapdog


Usage

This page walks through how to use Lapdog, from creating a new workspace through to running jobs. Steps which can be taken multiple ways (i.e. using the CLI, UI, or Python) are described in each of those ways.

Creating a Workspace

UI

  1. Expand the workspace menu in the top left of every page
  2. Click on the button at the top of the menu which reads Create new Workspace
  3. Fill in the Namespace and Workspace fields
    • Sometimes the fields won't recognize that you have clicked into them. If this occurs, click in the blank space immediately to the right of the labels. This issue is being investigated
  4. (Optional) Use the drop-down menu to select a parent workspace to clone from
  5. Press the CREATE WORKSPACE button to create the new workspace

CLI

  1. lapdog workspace {namespace} {workspace} -c
    • (Optional) To clone from a parent workspace, provide the parent as an argument to -c in the form namespace/workspace

Python

  1. Create a new WorkspaceManager with the desired namespace and workspace: ws = lapdog.WorkspaceManager(namespace, workspace)
  2. Call ws.create_workspace()
    • (Optional) To clone from a parent workspace, provide another WorkspaceManager as an argument to ws.create_workspace
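
For example, a minimal sketch of the calls above (the namespace and workspace names are placeholders):

```python
import lapdog

# Placeholder names; substitute your own namespace and workspace
ws = lapdog.WorkspaceManager("my-namespace", "my-new-workspace")
ws.create_workspace()

# Or, to clone from an existing parent workspace instead:
parent = lapdog.WorkspaceManager("my-namespace", "parent-workspace")
ws.create_workspace(parent)
```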

Upload data

CLI (Samples only)

  1. lapdog upload {namespace}/{workspace} -f {sample data file}
    • Your sample data file should be a TSV or JSON file containing the desired data
    • Any fields in your input data which contain valid local filepaths will be uploaded to the workspace bucket automatically

Python

  1. Prepare a pandas DataFrame of your entity metadata. The Python module supports uploading all entity types, but each type must be stored in a separate DataFrame
  2. Upload the data: ws.upload_{entity type}s(input_dataframe)
    • For example, if uploading samples, use ws.upload_samples(input_dataframe)
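
A minimal sketch of uploading sample metadata. The column names are placeholders, and indexing the DataFrame by sample ID is an assumption; your schema will depend on your data model:

```python
import lapdog
import pandas as pd

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# Illustrative sample sheet; real columns depend on your data model
samples = pd.DataFrame({
    'sample_id': ['sample_1', 'sample_2'],
    'bam': ['data/sample_1.bam', 'data/sample_2.bam'],
}).set_index('sample_id')

# Fields containing valid local filepaths (e.g. the bam column) should be
# uploaded to the workspace bucket, as described for the CLI above
ws.upload_samples(samples)
```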

Upload Workspace Attributes

CLI

  1. Prepare a JSON file of attributes, in the form {"attribute_name": "attribute_value", ...}
  2. lapdog attributes {namespace}/{workspace} {workspace data file} -f
    • Any attribute values which reference valid local filepaths will be uploaded to the workspace bucket automatically

Python

  1. Update attributes: ws.update_attributes(attribute_dictionary)
    • Any attribute values which reference valid local filepaths will be uploaded to the workspace bucket automatically
    • If you prefer, you may use keyword arguments instead of passing a dictionary: ws.update_attributes(attribute_name="attribute_value", ...)
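
Both calling styles, as a minimal sketch (the attribute name and value are placeholders):

```python
import lapdog

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# Pass a dictionary of attributes...
ws.update_attributes({'reference_genome': 'hg38'})

# ...or the equivalent keyword arguments
ws.update_attributes(reference_genome='hg38')
```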

Upload Methods

Import Existing Methods

If you wish to import a published configuration of an existing method into your workspace, visit the FireCloud Method Repository and export your chosen configuration into your workspace.

UI

  1. From the homepage of the desired workspace, click the VIEW/EDIT METHOD CONFIGURATIONS button
  2. Click UPLOAD NEW METHOD CONFIGURATIONS
  3. Select your method configuration file to upload
    • If your configuration has methodRepoMethod.methodVersion set to "latest", Lapdog will automatically set the version to the latest version of the method
  4. (Optional) Select your method wdl file to upload
    • New uploads will be considered when inferring the latest version of a method
    • The method repo namespace and name will be read from methodRepoMethod in the configuration
  5. Click UPLOAD

CLI

  1. lapdog method {namespace}/{workspace} {method config}
    • (Optional) You can upload a new WDL by adding the argument -w {method WDL}
    • If your configuration has methodRepoMethod.methodVersion set to "latest", Lapdog will automatically set the version to the latest version of the method
    • The method repo namespace and name will be read from methodRepoMethod in the configuration

Python

  1. ws.update_configuration(config_json)
    • (Optional) You can upload a new WDL by providing the filepath as the second argument to ws.update_configuration
    • If your configuration has methodRepoMethod.methodVersion set to "latest", Lapdog will automatically set the version to the latest version of the method
    • The method repo namespace and name will be read from methodRepoMethod in the configuration
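
A minimal sketch of the call above, assuming config_json is the parsed configuration object rather than a filepath (the file paths are placeholders):

```python
import json
import lapdog

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# Load a method configuration from disk (placeholder path); the method repo
# namespace and name are read from its methodRepoMethod field
with open('my_config.json') as f:
    config = json.load(f)

# Upload the configuration alone...
ws.update_configuration(config)

# ...or together with a new WDL (placeholder path), passed as the second argument
ws.update_configuration(config, 'my_method.wdl')
```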

Running Jobs

Prerequisite

Before you can run any jobs through the Lapdog Engine, the Engine must be initialized for your namespace and you must register with it. Initializing the namespace requires an administrator with permission to create new Google Cloud Projects. Read more here

Information Required:

  1. The FireCloud Namespace to initialize
  2. The Billing Account ID to link to the Lapdog Engine. All Lapdog costs associated with the namespace will be billed through that account

The administrator will need to run lapdog initialize-project. This command takes no arguments, but will prompt for the above information while running. It may take anywhere from 1 to 10 minutes to complete.

Afterwards, individual users will need to register with the Lapdog Engine for that namespace. Registration happens automatically in the UI. From within Python, run ws.gateway.register(ws.workspace, ws.bucket_id)
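
A minimal sketch of registering from Python, using the call above (the namespace and workspace names are placeholders):

```python
import lapdog

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# One-time registration with the namespace's Lapdog Engine
ws.gateway.register(ws.workspace, ws.bucket_id)
```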

UI

  1. From the homepage of your desired workspace, click the EXECUTE NEW JOB button
  2. Select the desired method configuration from the top dropdown menu
  3. Select the type of entity you wish to run from the second dropdown menu
  4. Enter the name of the entity in the Workflow Entity field
  5. If your selected entity type does not match the root entity type of your selected method, you must enter an expression. If this is required the Entity Expression field will be enabled
  6. After you enter this information, Lapdog will validate the configuration against the input entity.
    • If there are any errors with your inputs, the error will be displayed
    • If the input is valid, the RUN button will become enabled
  7. (Optional) Set Advanced Options
    • If the number of workflows to launch is greater than the default concurrency limit (250), you will receive a warning
    • In the Advanced Options Menu:
      • Use the Cromwell Memory slider to increase the memory available to the Cromwell Server
      • Use the Max Concurrent Workflows slider to increase the number of workflows that the Cromwell Server will be able to run at once. The maximum allowed value of this slider is set by the value of Cromwell Memory
      • Use the Workflow Dispatch Rate to set the number of workflows that get dispatched every cycle, up to the Max Concurrent Workflows limit. Do not modify this value unless you know what you are doing. If this is set too high, your submission could encounter an error if Cromwell falls behind the workflow queue.
  8. Click RUN
  9. Lapdog will prepare the input data for submission, including determining the value of your method inputs for each workflow
    • If any errors occur a window will pop up with the error message
    • If Lapdog successfully submits the job, you will be redirected to the monitoring page. It may take a few minutes before any data is received from the Cromwell Server

CLI

  1. lapdog exec {namespace}/{workspace} {configuration} {entity}
    • (Optional) If your entity is not the same type as the method configuration's root entity type, you must provide the argument: -x {entity type} {entity expression}
  2. If Lapdog successfully starts the job, you will be given a Global Submission ID, a Local Submission ID, and an Operation ID. These are useful for monitoring job progress

Python

  1. Execute the submission: ws.execute(config_name, entity_name)
    • (Optional) If your entity is not the same type as the method configuration's root entity type, you must provide the entity expression and entity type as the 3rd and 4th arguments (respectively)
    • (Optional) You can set the Cromwell memory, concurrency limit, and dispatch rate using the keyword arguments memory, batch_limit, and query_limit, respectively
  2. If Lapdog successfully starts the job, you will be given a Global Submission ID, a Local Submission ID, and an Operation ID. These are useful for monitoring job progress
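
A minimal sketch of both cases (the configuration, entity, and expression names are placeholders, as are the keyword values):

```python
import lapdog

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# Simple case: the entity matches the configuration's root entity type
ws.execute('my_config', 'sample_1')

# With an entity expression and type (3rd and 4th positional arguments),
# plus tuned Cromwell settings via the keyword arguments named above
ws.execute(
    'my_config', 'my_sample_set',
    'this.samples',    # entity expression (placeholder)
    'sample_set',      # entity type (placeholder)
    memory=5,          # Cromwell server memory (placeholder value)
    batch_limit=500,   # max concurrent workflows (default 250)
    query_limit=100,   # workflow dispatch rate (placeholder value)
)
```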

Tracking and completing submissions

The UI is by far the easiest way to monitor submission progress. In Python, you can monitor a submission through a SubmissionAdapter object, generated by calling ws.get_adapter(local_submission_id).

To monitor jobs in the UI, enter the Global submission ID in the input field at the top of every page, or navigate to the desired workspace and select the Local submission ID from the list of submissions.

UI

  1. While viewing a submission, refresh the page to display the most recent data received from the submission's Cromwell server
  2. Options
    • You can abort a submission by pressing the ABORT JOB button. If the job is currently running, this sends a "soft" abort: Cromwell will begin aborting all running workflows and then shut down after all workflows have stopped. If the job's current status is "Aborting", this sends a "hard" abort: Cromwell will shut down immediately, which may leave some workflows running until they naturally shut down.
    • You can upload the results of a finished submission by pressing the UPLOAD RESULTS button. This button appears if the submission either Succeeded or Failed, and at least one workflow Succeeded. The outputs from all successful workflows will be uploaded to FireCloud
    • You can re-run failed workflows by pressing the RERUN FAILURES button. This button appears if the submission either Failed or Errored, and there was at least one workflow which Failed or Errored. The entities from all Failed/Errored workflows will be added to a new entity set which you will then have the option to immediately re-run.

CLI

  1. lapdog finish {global submission id}
    • This will upload the results of any succeeded workflows if the submission has finished

Python

  1. Uploading results
    • lapdog.complete_execution(global_submission_id)
    • ws.complete_execution(global_or_local_submission_id)
  2. Abort Jobs
    • ws.get_adapter(local_submission_id).abort()
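
A minimal sketch tying these together (the submission IDs are placeholders):

```python
import lapdog

ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")

# Monitor a running submission via its Local Submission ID
adapter = ws.get_adapter('local-submission-id')

# Abort it if necessary
adapter.abort()

# Once a submission finishes, upload the results of any succeeded workflows
ws.complete_execution('local-submission-id')

# Or, without a WorkspaceManager, using the Global Submission ID
lapdog.complete_execution('global-submission-id')
```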
