# Using Lapdog

This page walks through how to use Lapdog, from creating a new workspace through to running jobs. Steps which can be taken multiple ways (i.e. using the UI, CLI, or Python) are described in each of those ways.
## Creating a Workspace

### UI
- Expand the workspace menu in the top left of every page
- Click the button at the top of the menu which reads `Create new Workspace`
- Fill in the `Namespace` and `Workspace` fields
  - Sometimes the fields won't recognize that you have clicked into them. If this occurs, click in the blank space immediately to the right of the labels. This issue is being investigated
- (Optional) Use the drop-down menu to select a parent workspace to clone from
- Press the `CREATE WORKSPACE` button to create the new workspace
### CLI
- `lapdog workspace {namespace} {workspace}`
  - (Optional) To clone from a parent workspace, provide the parent as an argument to `-c` in the form `namespace/workspace`
### Python
- Create a new `WorkspaceManager` with the desired namespace and workspace: `ws = lapdog.WorkspaceManager(namespace, workspace)`
- Call `ws.create_workspace()`
  - (Optional) To clone from a parent workspace, provide another `WorkspaceManager` as an argument to `ws.create_workspace` (see the sketch below)
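Putting those steps together, here is a minimal sketch of creating (or cloning) a workspace from Python; the namespace and workspace names are placeholders:

```python
import lapdog

# Placeholder namespace/workspace names
ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")
ws.create_workspace()

# Or, to clone from a parent workspace instead:
parent = lapdog.WorkspaceManager("my-namespace", "parent-workspace")
ws.create_workspace(parent)
```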
## Uploading Data

### CLI
- `lapdog upload {namespace}/{workspace} -f {sample data file}`
  - Your sample data file should be a TSV or JSON file containing the desired data
  - Any fields in your input data which contain valid local filepaths will be uploaded to the workspace bucket automatically
### Python
- Prepare a pandas DataFrame of your entity metadata. The Python module supports uploading all entity types, but each type must be stored in a separate DataFrame
- Upload the data: `ws.upload_{entity type}s(input_dataframe)`
  - For example, if uploading samples, use `ws.upload_samples(input_dataframe)` (see the sketch below)
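As an illustration, a sample upload might look like the following. The column names and filepaths are hypothetical, and indexing the table by `sample_id` is an assumption based on the FireCloud data model:

```python
import pandas as pd

# Hypothetical sample metadata; sample tables are assumed to be keyed by sample_id
samples = pd.DataFrame(
    {
        "participant_id": ["P1", "P2"],
        # Fields containing valid local filepaths are uploaded to the
        # workspace bucket automatically
        "bam_file": ["/data/P1.bam", "/data/P2.bam"],
    },
    index=pd.Index(["S1", "S2"], name="sample_id"),
)

# "ws" is the WorkspaceManager from the previous example
ws.upload_samples(samples)
```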
## Updating Workspace Attributes

### CLI
- Prepare a JSON file of attributes, in the form `{"attribute_name": "attribute_value", ...}`
- `lapdog attributes {namespace}/{workspace} -f {workspace data file}`
  - Any attribute values which reference valid local filepaths will be uploaded to the workspace bucket automatically
### Python
- Update attributes: `ws.update_attributes(attribute_dictionary)` (see the sketch below)
  - Any attribute values which reference valid local filepaths will be uploaded to the workspace bucket automatically
  - If you prefer, you may use keyword arguments instead of passing a dictionary: `ws.update_attributes(attribute_name="attribute_value", ...)`
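For example, both forms side by side; the attribute names and filepath are placeholders:

```python
# Dictionary form; a value that is a valid local filepath is uploaded to the
# workspace bucket automatically
ws.update_attributes({
    "reference_genome": "hg38",
    "reference_fasta": "/data/hg38.fasta",  # placeholder local path
})

# Equivalent keyword-argument form
ws.update_attributes(reference_genome="hg38")
```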
## Method Configurations

If you wish to import a published configuration of an existing method into your workspace, visit the FireCloud Method Repository and export your chosen configuration into your workspace.
### UI
- From the homepage of the desired workspace, click the `VIEW/EDIT METHOD CONFIGURATIONS` button
- Click `UPLOAD NEW METHOD CONFIGURATIONS`
- Select your method configuration file to upload
  - If your configuration has `methodRepoMethod.methodVersion` set to `"latest"`, Lapdog will automatically set the version to the latest version of the method
- (Optional) Select your method WDL file to upload
  - New uploads will be considered when inferring the latest version of a method
- The method repo namespace and name will be read from `methodRepoMethod` in the configuration
- Click `UPLOAD`
### CLI
- `lapdog method {namespace}/{workspace} {method config}`
  - (Optional) You can upload a new WDL by adding the argument `-w {method WDL}`
  - If your configuration has `methodRepoMethod.methodVersion` set to `"latest"`, Lapdog will automatically set the version to the latest version of the method
  - The method repo namespace and name will be read from `methodRepoMethod` in the configuration
### Python
- `ws.update_configuration(config_json)` (see the sketch below)
  - (Optional) You can upload a new WDL by providing the filepath as the second argument to `ws.update_configuration`
  - If your configuration has `methodRepoMethod.methodVersion` set to `"latest"`, Lapdog will automatically set the version to the latest version of the method
  - The method repo namespace and name will be read from `methodRepoMethod` in the configuration
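A minimal sketch, assuming the configuration lives in a local JSON file (the filenames are placeholders):

```python
import json

# Load a method configuration from disk; "my_config.json" is a placeholder
with open("my_config.json") as f:
    config = json.load(f)

# If methodRepoMethod.methodVersion is "latest", Lapdog resolves the newest
# version of the method at upload time
ws.update_configuration(config)

# Optionally also upload a new WDL by passing its path as the second argument
ws.update_configuration(config, "my_method.wdl")
```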
## Initializing a Namespace

Before you can run any jobs through the Lapdog Engine, the Engine must be initialized for your namespace, and you must register with it. You must find an administrator with permissions to create new Google Cloud Projects to initialize the namespace. Read more here.

Information required:
- The FireCloud namespace to initialize
- The Billing Account ID to link to the Lapdog Engine. All Lapdog costs associated with the namespace will be billed through that account
  - See this example of where to find the billing account ID

The administrator will need to run `lapdog initialize-project`. This command takes no arguments, but will prompt for the above information while running. It may take anywhere from 1 to 10 minutes to complete.

Afterwards, individual users will need to register with the Lapdog Engine for that namespace. Registration happens automatically in the UI. From Python, you will need to run `ws.gateway.register(ws.workspace, ws.bucket_id)`, as sketched below.
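A sketch of the one-time registration step from Python; the namespace and workspace names are placeholders:

```python
import lapdog

# Placeholder namespace/workspace; registration is per-user, per-namespace
ws = lapdog.WorkspaceManager("my-namespace", "my-workspace")
ws.gateway.register(ws.workspace, ws.bucket_id)
```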
## Running Jobs

### UI
- From the homepage of your desired workspace, click the `EXECUTE NEW JOB` button
- Select the desired method configuration from the top dropdown menu
- Select the type of entity you wish to run from the second dropdown menu
- Enter the name of the entity in the `Workflow Entity` field
  - If your selected entity type does not match the root entity type of your selected method, you must enter an expression. If this is required, the `Entity Expression` field will be enabled
- After entering the information, Lapdog will validate the configuration against the input entity
  - If there are any errors with your inputs, the error will be displayed
  - If the input is valid, the `RUN` button will become enabled
- (Optional) Set Advanced Options
  - If the number of workflows to launch is greater than the default concurrency limit (250), you will receive a warning
  - In the Advanced Options menu:
    - Use the `Cromwell Memory` slider to increase the memory available to the Cromwell server
    - Use the `Max Concurrent Workflows` slider to increase the number of workflows that the Cromwell server will be able to run at once. The maximum allowed value of this slider is set by the value of `Cromwell Memory`
    - Use the `Workflow Dispatch Rate` to set the number of workflows that get dispatched every cycle, up to the `Max Concurrent Workflows` limit. Do not modify this value unless you know what you are doing. If it is set too high, your submission could encounter an error if Cromwell falls behind the workflow queue
- Click `RUN`
  - Lapdog will prepare the input data for submission, including determining the value of your method inputs for each workflow
  - If any errors occur, a window will pop up with the error message
  - If Lapdog successfully submits the job, you will be redirected to the monitoring page. It may take a few minutes before any data is received from the Cromwell server
### CLI
- `lapdog exec {namespace}/{workspace} {configuration} {entity}`
  - (Optional) If your entity is not the same type as the method configuration's root entity type, you must provide the argument `-x {entity type} {entity expression}`
- If Lapdog is able to successfully start the job, you will be given a Global Submission ID, a Local Submission ID, and an Operation ID. These are useful for monitoring job progress
### Python
- Execute the submission: `ws.execute(config_name, entity_name)` (see the sketch below)
  - (Optional) If your entity is not the same type as the method configuration's root entity type, you must provide the entity expression and entity type as the 3rd and 4th arguments (respectively)
  - (Optional) You can set the Cromwell memory, concurrency limit, and dispatch rate using the keyword arguments `memory`, `batch_limit`, and `query_limit`, respectively
- If Lapdog is able to successfully start the job, you will be given a Global Submission ID, a Local Submission ID, and an Operation ID. These are useful for monitoring job progress
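A sketch of launching a submission with the advanced options set; the configuration and entity names are placeholders, the option values are illustrative, and treating the return value as the submission ID is an assumption:

```python
# Placeholder configuration and entity names
submission_id = ws.execute(
    "my_config",       # method configuration name
    "my_sample_set",   # entity name
    "this.samples",    # entity expression (only needed if types differ)
    "sample_set",      # entity type (only needed if types differ)
    memory=8,          # Cromwell memory (placeholder value)
    batch_limit=500,   # max concurrent workflows
    query_limit=100,   # workflows dispatched per cycle
)
```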
## Monitoring Jobs

The UI is by far the easiest way to monitor submission progress. You can monitor submission progress in Python using `SubmissionAdapter` objects, which can be generated by calling `ws.get_adapter(local_submission_id)`.

To monitor jobs in the UI, enter the Global Submission ID in the input field at the top of every page, or navigate to the desired workspace and select the Local Submission ID from the list of submissions.
### UI
- While viewing a submission, refresh the page to display the most recent data received from the submission's Cromwell server
- Options
  - You can abort a submission while it's running by pressing the `ABORT JOB` button. If the job is currently running, this sends a "soft" abort: Cromwell will begin aborting all running workflows, then shut down after all workflows have stopped. If the job's current status is "Aborting", this sends a "hard" abort: Cromwell will shut down immediately, which may leave some workflows running until they naturally shut down
  - You can upload a finished submission by pressing the `UPLOAD RESULTS` button. This button appears if the submission either Succeeded or Failed, and at least one workflow Succeeded. The outputs from all successful workflows will be uploaded to FireCloud
  - You can re-run failed workflows by pressing the `RERUN FAILURES` button. This button appears if the submission either Failed or Errored, and there was at least one workflow which Failed or Errored. The entities from all Failed/Errored workflows will be added to a new entity set, which you will then have the option to immediately re-run
### CLI
- `lapdog finish {global submission id}`
  - This will upload the results of any succeeded workflows if the submission has finished
### Python
- Upload results: `lapdog.complete_execution(global_submission_id)` or `ws.complete_execution(global_or_local_submission_id)`
- Abort jobs: `ws.get_adapter(local_submission_id).abort()` (see the sketch below)
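A sketch of the Python monitoring workflow, assuming `ws` is a registered `WorkspaceManager`; the submission ID is a placeholder:

```python
local_submission_id = "<local submission id>"  # placeholder

# SubmissionAdapter objects expose the state of a running submission
adapter = ws.get_adapter(local_submission_id)

# Abort the submission if needed
adapter.abort()

# Once the submission has finished, upload the outputs of any succeeded
# workflows back to FireCloud
ws.complete_execution(local_submission_id)
```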