Commit 9d94796

Merge pull request #17 from dannon/workflow-management-tools
Add workflow management tools and standardize error handling
2 parents 444406d + 8a696b8 commit 9d94796

8 files changed

+540
-203
lines changed

mcp-server-galaxy-py/README.md

Lines changed: 4 additions & 0 deletions
@@ -102,6 +102,10 @@ The Python implementation provides the following MCP tools:
102102
- `list_history_ids`: Get simplified list of history IDs and names
103103
- `get_history_details`: Get detailed information about a specific history
104104
- `upload_file`: Upload local files to Galaxy
105+
- `list_workflows`: List available workflows in Galaxy instance
106+
- `get_workflow_details`: Get detailed information about a specific workflow
107+
- `invoke_workflow`: Execute/run a workflow with specified inputs
108+
- `cancel_workflow_invocation`: Cancel a running workflow invocation
105109
- `get_invocations`: View workflow executions
106110
- `get_iwc_workflows`: Access Intergalactic Workflow Commission (IWC) workflows
107111
- `search_iwc_workflows`: Search IWC workflows by keywords

mcp-server-galaxy-py/USAGE_EXAMPLES.md

Lines changed: 130 additions & 179 deletions
@@ -6,243 +6,194 @@ This document provides common usage patterns and examples for the Galaxy MCP ser
66

77
### 1. Connect to Galaxy
88

9-
First, you need to establish a connection to your Galaxy instance:
10-
11-
```python
12-
# Option 1: Use environment variables (recommended)
13-
# Set GALAXY_URL and GALAXY_API_KEY in your environment or .env file
14-
connect()
15-
16-
# Option 2: Provide credentials directly
17-
connect(url="https://your-galaxy-instance.org", api_key="your-api-key")
18-
```
19-
20-
#### Get server information
21-
22-
Once connected, you can retrieve comprehensive information about the Galaxy server:
23-
24-
```python
25-
server_info = get_server_info()
26-
# Returns: {
27-
# "url": "https://your-galaxy-instance.org/",
28-
# "version": {"version_major": "23.1", "version_minor": "1", ...},
29-
# "config": {
30-
# "brand": "Galaxy",
31-
# "allow_user_creation": true,
32-
# "enable_quotas": false,
33-
# "ftp_upload_site": "ftp.galaxy.org",
34-
# "support_url": "https://help.galaxyproject.org/",
35-
# ...
36-
# }
37-
# }
38-
```
9+
**User:** "I want to analyze some genomics data. Can you help me connect to my Galaxy server?"
10+
11+
**AI Assistant:** Absolutely! I can help you connect to Galaxy. I'll need your Galaxy server URL and API key.
12+
*Uses `connect()` to establish connection to the specified Galaxy instance*
13+
14+
"Great! I've successfully connected to your Galaxy instance at https://your-galaxy-instance.org.
15+
16+
Let me get some basic information about your server to make sure everything is working properly."
17+
*Uses `get_server_info()` to retrieve server details*
18+
19+
"Perfect! Your Galaxy server is running version 23.1 and has the following features available:
20+
- User registration: Enabled
21+
- File upload via FTP: Available at ftp.galaxy.org
22+
- Support resources: https://help.galaxyproject.org/
23+
24+
You're all set to start your analysis! What would you like to do first?"
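
The connection step above depends on a server URL and an API key. As a minimal sketch, assuming the credentials live in the `GALAXY_URL` and `GALAXY_API_KEY` environment variables mentioned in the earlier version of this document, an agent host could validate them before calling `connect()`. The helper name `load_galaxy_credentials` is hypothetical and not part of the MCP server.

```python
import os
from typing import Mapping, Tuple


def load_galaxy_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read and validate Galaxy credentials from an environment mapping.

    Hypothetical helper: it only prepares values to pass to `connect()`.
    """
    url = env.get("GALAXY_URL", "").strip()
    api_key = env.get("GALAXY_API_KEY", "").strip()

    # Report every missing variable at once so the user can fix them together.
    missing = [
        name
        for name, value in (("GALAXY_URL", url), ("GALAXY_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Normalize the URL so it always ends with a slash.
    if not url.endswith("/"):
        url += "/"
    return url, api_key
```

The returned pair could then be passed on as `connect(url=url, api_key=api_key)`; failing early with a clear message lets the assistant ask the user for the missing value instead of surfacing a connection error.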
3925

4026
### 2. Working with Histories
4127

42-
#### List all histories
28+
**User:** "What histories do I have in my account?"
29+
30+
**AI Assistant:** Let me check your histories.
31+
*Uses `get_histories()` to retrieve a list of all user histories.*
32+
33+
"You have 5 histories. Here are the most recent three:
34+
- RNA-seq Analysis - 2023-10-27
35+
- Unnamed history
36+
- ChIP-seq inputs
4337

44-
```python
45-
histories = get_histories()
46-
# Returns: [{"id": "abc123", "name": "My Analysis", ...}, ...]
47-
```
38+
Would you like to see the contents of any of these?"
4839

49-
#### Get just IDs and names (simplified)
40+
**User:** "Yeah, show me what's in the 'RNA-seq Analysis' one."
5041

51-
```python
52-
history_list = list_history_ids()
53-
# Returns: [{"id": "abc123", "name": "My Analysis"}, ...]
54-
```
42+
**AI Assistant:** Okay, I'll get the details for that history.
43+
*Uses `get_history_details()` with the correct history ID.*
5544

56-
#### Get history details
45+
"The 'RNA-seq Analysis - 2023-10-27' history contains 8 datasets:
46+
- `sample1_R1.fastq.gz` (FASTQ file)
47+
- `sample1_R2.fastq.gz` (FASTQ file)
48+
- `FastQC on sample1_R1` (HTML report)
49+
- `FastQC on sample1_R2` (HTML report)
50+
- `HISAT2 on sample1` (BAM alignment file)
51+
- `featureCounts on sample1` (Counts file)
52+
- `DESeq2 results` (Tabular file)
53+
- `Volcano plot` (Image)
5754

58-
```python
59-
# IMPORTANT: Pass only the ID string, not the entire history object
60-
history_id = "abc123" # or history_list[0]["id"]
61-
details = get_history_details(history_id)
62-
# Returns: {"history": {...}, "contents": [...]}
63-
```
55+
You can ask me to download any of these files, or use them as inputs for a new analysis."
6456

65-
### 3. Working with Tools
57+
### 3. Tool Discovery and Execution
6658

67-
#### Search for tools
59+
**User:** "I need to check the quality of a FASTQ file I have on my computer. How can I do that?"
6860

69-
```python
70-
tools = search_tools("fastqc")
71-
# Returns: {"tools": [...]}
72-
```
61+
**AI Assistant:** I can help with that. The best tool for checking FASTQ quality is `FastQC`. I'll walk you through the process.
7362

74-
#### Get tool details
63+
First, I need you to upload your FASTQ file.
7564

76-
```python
77-
tool_details = get_tool_details("toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72")
78-
# Returns: Detailed tool information including parameters
79-
```
65+
*Uses `upload_file()` after the user provides the file.*
8066

81-
#### Run a tool
67+
"Okay, I've uploaded your file `my_reads.fastq.gz` to a new history.
8268

83-
```python
84-
# First, create or select a history
85-
history_id = "abc123"
69+
Now, I will run FastQC on it."
8670

87-
# Prepare tool inputs (depends on the specific tool)
88-
inputs = {
89-
"input_file": {"src": "hda", "id": "dataset_id"},
90-
"param1": "value1"
91-
}
71+
*Uses `search_tools('fastqc')` to find the tool ID, then `run_tool()` with the uploaded file as input.*
9272

93-
# Run the tool
94-
result = run_tool(history_id, tool_id, inputs)
95-
```
73+
"I've started the FastQC analysis. This should only take a minute. The job is running, and the results will appear in your current history.
9674

97-
### 4. File Operations
75+
Is there anything else you'd like to do?"
9876

99-
#### Upload a file
77+
### 4. Workflow Operations
10078

101-
```python
102-
# Upload to default history
103-
upload_result = upload_file("/path/to/your/file.txt")
79+
**User:** "I need a workflow for ChIP-seq analysis. Can you find one for me from the public workflow registries?"
10480

105-
# Upload to specific history
106-
upload_result = upload_file("/path/to/your/file.txt", history_id="abc123")
107-
```
81+
**AI Assistant:** Of course. I'll search the Intergalactic Workflow Commission (IWC) for ChIP-seq workflows.
82+
*Uses `search_iwc_workflows('ChIP-seq')` to find relevant workflows.*
10883

109-
### 5. Workflow Operations
84+
"I found a popular workflow called 'ChIP-seq analysis for paired-end data'. It seems to be a good match.
11085

111-
#### Browse IWC workflows
86+
Would you like me to import it into your Galaxy account so you can use it?"
11287

113-
```python
114-
# Get all workflows from the Intergalactic Workflow Commission (IWC)
115-
iwc_workflows = get_iwc_workflows()
88+
**User:** "Yes, please import it."
11689

117-
# Search for specific workflows
118-
matching_workflows = search_iwc_workflows("RNA-seq")
119-
```
90+
**AI Assistant:** Okay, I am importing the workflow now.
91+
*Uses `import_workflow_from_iwc()` with the appropriate workflow ID.*
12092

121-
#### Import a workflow
93+
"The workflow has been successfully imported into your account. You can now see it in your list of available workflows and run it on your data."
12294

123-
```python
124-
# Import from IWC using TRS ID
125-
imported = import_workflow_from_iwc("github.com/galaxyproject/iwc/tree/main/workflows/epigenetics/chipseq-pe")
126-
```
95+
#### Workflow Discovery and Management
12796

128-
## Common Patterns
97+
**User:** "What workflows are available in my Galaxy instance?"
12998

130-
### Pattern 1: Complete Analysis Pipeline
99+
**AI Assistant:** I'll check your available workflows.
100+
*Uses `list_workflows()` to retrieve user's workflows*
131101

132-
```python
133-
# 1. Connect to Galaxy
134-
connect()
102+
"You have 12 workflows available. Here are some highlights:
103+
- RNA-seq Analysis Pipeline v2.1
104+
- ChIP-seq Peak Calling Workflow
105+
- Genome Assembly with Quality Control
135106

136-
# 2. Create a new history for the analysis
137-
new_history = create_history("RNA-seq Analysis")
138-
history_id = new_history["id"]
107+
Would you like details about any specific workflow?"
139108

140-
# 3. Upload data files
141-
upload_file("/data/sample1_R1.fastq", history_id)
142-
upload_file("/data/sample1_R2.fastq", history_id)
109+
**User:** "Tell me about the RNA-seq workflow"
143110

144-
# 4. Search and run quality control
145-
qc_tools = search_tools("fastqc")
146-
tool_id = qc_tools["tools"][0]["id"]
111+
**AI Assistant:** Let me get the details of that workflow for you.
112+
*Uses `get_workflow_details()` with the RNA-seq workflow ID*
147113

148-
# 5. Get history contents to find dataset IDs
149-
history_details = get_history_details(history_id)
150-
datasets = history_details["contents"]
114+
"The RNA-seq Analysis Pipeline v2.1 has 8 steps:
115+
1. Input: Raw FASTQ files (paired-end reads)
116+
2. Quality control with FastQC
117+
3. Trimming with Trim Galore
118+
4. Alignment with HISAT2
119+
5. Quantification with featureCounts
120+
6. Differential expression with DESeq2
121+
7. Visualization plots
122+
8. Summary report
151123

152-
# 6. Run FastQC on uploaded files
153-
for dataset in datasets:
154-
if dataset["extension"] == "fastq":
155-
inputs = {"input_file": {"src": "hda", "id": dataset["id"]}}
156-
run_tool(history_id, tool_id, inputs)
157-
```
124+
This workflow requires paired-end RNA-seq FASTQ files as input. Would you like to run this workflow?"
158125

159-
### Pattern 2: Working with Existing Data
126+
#### Running Workflows
160127

161-
```python
162-
# 1. Connect and list histories
163-
connect()
164-
histories = list_history_ids()
128+
**User:** "I want to run the RNA-seq workflow on my uploaded FASTQ files"
165129

166-
# 2. Find a specific history
167-
target_history = None
168-
for h in histories:
169-
if "Project X" in h["name"]:
170-
target_history = h
171-
break
130+
**AI Assistant:** I'll help you run that workflow. Let me first check your uploaded files and set up the analysis.
131+
*Uses `get_history_contents()` to find FASTQ files, then `invoke_workflow()` with appropriate inputs*
172132

173-
if target_history:
174-
# 3. Get history details
175-
details = get_history_details(target_history["id"])
133+
"I found your paired FASTQ files:
134+
- sample1_R1.fastq.gz
135+
- sample1_R2.fastq.gz
176136

177-
# 4. Find specific datasets
178-
for item in details["contents"]:
179-
if item["name"] == "results.txt":
180-
print(f"Found results: {item['id']}")
181-
```
137+
I'm now starting the RNA-seq workflow with these inputs. The workflow has been submitted with invocation ID: inv_abc123.
182138

183-
## Error Handling
139+
The workflow is currently in 'scheduled' state. I'll monitor its progress for you."
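
Behind this exchange, the assistant has to turn history contents into a workflow input mapping. The sketch below shows one way to do that, under several assumptions: history items are dictionaries with `name` and `id` keys, paired reads follow the `_R1`/`_R2` naming used in this example, and the workflow's first two input steps (keyed `"0"` and `"1"`) take the forward and reverse reads. The helper names are hypothetical, and the real input keys depend on the workflow being invoked.

```python
import re
from typing import Dict, List

# Matches names such as "sample1_R1.fastq.gz" or "sample1_R2.fastq".
_READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq(\.gz)?$")


def pair_fastq_datasets(contents: List[dict]) -> Dict[str, Dict[str, str]]:
    """Group FASTQ datasets into {sample: {"R1": id, "R2": id}} pairs."""
    pairs: Dict[str, Dict[str, str]] = {}
    for item in contents:
        match = _READ_PATTERN.match(item.get("name", ""))
        if match:
            pairs.setdefault(match["sample"], {})["R" + match["read"]] = item["id"]
    # Keep only samples for which both mates were found.
    return {s: reads for s, reads in pairs.items() if set(reads) >= {"R1", "R2"}}


def build_workflow_inputs(pair: Dict[str, str]) -> Dict[str, dict]:
    """Map one read pair onto workflow input steps.

    Assumption: steps "0" and "1" are the forward and reverse read inputs.
    """
    return {
        "0": {"src": "hda", "id": pair["R1"]},
        "1": {"src": "hda", "id": pair["R2"]},
    }
```

The resulting dictionary is the kind of value that would be handed to `invoke_workflow()` as its inputs. Datasets without a mate are dropped rather than guessed at, so the assistant can ask the user about them.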
184140

185-
### Common Issues and Solutions
141+
**User:** "How is my workflow doing?"
186142

187-
1. **"History ID invalid" error**
143+
**AI Assistant:** Let me check the status of your RNA-seq workflow.
144+
*Uses `get_invocations()` to check workflow progress*
188145

189-
- Problem: Passing the entire history object instead of just the ID
190-
- Solution: Use `history["id"]` not `history`
146+
"Your workflow (invocation inv_abc123) is currently 'running':
147+
- Steps 1-3: Completed ✓
148+
- Step 4: Currently running (HISAT2 alignment)
149+
- Steps 5-8: Waiting
191150

192-
2. **"Not connected to Galaxy" error**
151+
The workflow is progressing normally. The alignment step typically takes 15-30 minutes depending on data size."
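
Monitoring an invocation, as in the exchange above, usually means polling its state until it stops changing or a time budget runs out. The following is a rough sketch: `fetch_state` is a hypothetical callable that wraps whatever the agent uses to read the invocation state (for example, a call to `get_invocations()`), and the default terminal state names are assumptions that should be checked against what your Galaxy server reports.

```python
import time
from typing import Callable, Iterable


def wait_for_invocation(
    fetch_state: Callable[[str], str],
    invocation_id: str,
    terminal_states: Iterable[str] = ("completed", "failed", "cancelled"),
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the invocation reaches a terminal state or the timeout passes.

    Returns the terminal state; raises TimeoutError if none is reached in time.
    """
    terminal = set(terminal_states)
    waited = 0.0
    while True:
        state = fetch_state(invocation_id)
        if state in terminal:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Invocation {invocation_id} still '{state}' after {waited:.0f}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

A `TimeoutError` here is a natural point for the assistant to report that the workflow looks stuck and offer to cancel it. Passing `sleep` as a parameter keeps the loop testable without real delays.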
193152

194-
- Problem: Trying to use tools before connecting
195-
- Solution: Always call `connect()` first
153+
#### Workflow Management and Troubleshooting
196154

197-
3. **"Tool not found" error**
198-
- Problem: Using incorrect tool ID format
199-
- Solution: Use the full tool ID from `search_tools()` or `get_tool_panel()`
155+
**User:** "Something seems wrong with my workflow, it's been running for hours. Can you cancel it?"
200156

201-
## Best Practices
157+
**AI Assistant:** Let me check the workflow status and cancel it if needed.
158+
*Uses `get_invocations()` to check status, then `cancel_workflow_invocation()` if appropriate*
202159

203-
1. **Always connect first**: Before using any other tools, establish a connection
204-
2. **Use IDs correctly**: When functions ask for an ID, pass just the ID string, not the entire object
205-
3. **Check return types**: Some functions return lists, others return dictionaries
206-
4. **Handle errors gracefully**: Wrap operations in try-except blocks
207-
5. **Use environment variables**: Store credentials in .env file for security
160+
"I can see your workflow has been stuck in the alignment step for 3 hours, which is unusual. I've cancelled the workflow invocation (inv_abc123).
208161

209-
## Advanced Usage
162+
The issue might be:
163+
- Input files are very large
164+
- Server resources are limited
165+
- There may be a parameter issue
210166

211-
### Custom Tool Parameters
167+
Would you like me to help you restart the workflow with adjusted parameters, or would you prefer to check the input data first?"
212168

213-
Different tools require different input formats. Here's how to determine the correct format:
169+
#### Complete Analysis Workflow
214170

215-
```python
216-
# 1. Get tool details to see required parameters
217-
tool_info = get_tool_details(tool_id, io_details=True)
171+
**User:** "I have some new RNA-seq data. Can you help me run a complete analysis from start to finish?"
218172

219-
# 2. Examine the inputs section
220-
for input_param in tool_info["inputs"]:
221-
print(f"Parameter: {input_param['name']}")
222-
print(f"Type: {input_param['type']}")
223-
print(f"Optional: {input_param.get('optional', False)}")
224-
```
173+
**AI Assistant:** Absolutely! I'll guide you through a complete RNA-seq analysis workflow. Let me help you set this up step by step.
225174

226-
### Working with Collections
175+
*Uses multiple MCP tools: `create_history()`, `upload_file()`, `search_iwc_workflows()`, `get_workflow_details()`, `invoke_workflow()`, `get_invocations()`*
227176

228-
Galaxy collections group related datasets. Here's how to work with them:
177+
"Here's what I'll do:
178+
1. ✓ Created new history: 'RNA-seq Analysis - [Date]'
179+
2. ✓ Uploaded your FASTQ files (2 files detected)
180+
3. ✓ Found optimal workflow: 'RNA-seq differential expression' from IWC
181+
4. ✓ Started workflow execution (invocation: inv_def456)
229182

230-
```python
231-
# Check if a history item is a collection
232-
history_details = get_history_details(history_id)
233-
for item in history_details["contents"]:
234-
if item["history_content_type"] == "dataset_collection":
235-
print(f"Collection: {item['name']}")
236-
print(f"Collection type: {item['collection_type']}")
237-
```
183+
Your analysis is now running! The workflow includes:
184+
- Quality control and trimming
185+
- Alignment to reference genome
186+
- Gene expression quantification
187+
- Differential expression analysis
188+
- Result visualization
238189

239-
## Troubleshooting
190+
I'll notify you when each major step completes. The full analysis typically takes 1-2 hours."
240191

241-
If you encounter issues:
192+
## General Advice and Troubleshooting
242193

243-
1. Check the logs for detailed error messages
244-
2. Verify your Galaxy URL ends with a slash (/)
245-
3. Ensure your API key has the necessary permissions
246-
4. Test with simple operations first (e.g., `get_user()`)
194+
Here are some general tips for building a robust agent using this MCP server:
247195

248-
For more help, consult the Galaxy API documentation or the MCP server logs.
196+
- **Always Connect First:** Ensure the `connect()` tool is called successfully before any other tools are used.
197+
- **Use IDs Correctly:** Many tools require an ID for a history, dataset, or workflow. Make sure to extract and use the specific ID string from the objects returned by other tools.
198+
- **Handle Errors Gracefully:** The tools will return error messages for common problems like invalid IDs, missing data, or connection issues. Your agent should be designed to catch these errors and report them to the user in a helpful way.
199+
- **Check Server Logs:** If you encounter unexpected issues, the MCP server logs are the best place to look for detailed error messages and clues about what went wrong.
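
The ID and error-handling tips above can be sketched in a few lines. This is an illustration only, assuming tool results arrive as plain dictionaries with an `id` key and that failures surface as Python exceptions; `extract_id`, `safe_call`, and `fake_get_history_details` are hypothetical names, not part of the MCP server.

```python
from typing import Any, Callable


def extract_id(obj: Any) -> str:
    """Return the ID string from a tool result, or pass a bare ID through."""
    if isinstance(obj, str):
        return obj
    if isinstance(obj, dict) and isinstance(obj.get("id"), str):
        return obj["id"]
    raise ValueError(f"Cannot find an ID string in {obj!r}")


def safe_call(tool: Callable[..., Any], *args: Any, **kwargs: Any) -> dict:
    """Call a tool and turn any exception into a message for the user."""
    try:
        return {"ok": True, "result": tool(*args, **kwargs)}
    except Exception as exc:  # assumption: tools signal failure by raising
        return {"ok": False, "error": f"{tool.__name__} failed: {exc}"}


# Hypothetical stand-in for a real tool, present only to make the sketch runnable.
def fake_get_history_details(history_id: str) -> dict:
    if history_id != "abc123":
        raise ValueError("History ID invalid")
    return {"history": {"id": history_id}, "contents": []}
```

For example, `safe_call(fake_get_history_details, extract_id({"id": "abc123", "name": "My Analysis"}))` passes only the ID string to the tool, and a bad ID comes back as an `error` message the agent can relay instead of an unhandled exception.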
