99 changes: 99 additions & 0 deletions memgraph-deep-path-traversal-troubleshooting/SKILL.md
@@ -0,0 +1,99 @@
---
name: memgraph-deep-path-traversal-troubleshooting
description: Troubleshoot Memgraph variable-length path queries and choose the right traversal algorithm. Use when user has slow path queries, uses [*] expansion, or asks about DFS, BFS, shortest path, or relationship expansion. Guides choosing *BFS over * when all paths are not needed.
metadata:
author: memgraph
version: "0.0.1"
---

# Memgraph Deep Path Traversal Troubleshooting

Guide path traversal choice in Memgraph to avoid unnecessarily expensive queries.

## When to Use

- User has slow queries with variable-length paths (`[*]`, `[*..5]`, etc.)
- User mentions DFS, BFS, shortest path, path finding, or relationship expansion
- User uses patterns like `(a)-[*]->(b)` or `(a)-[r *]->(b)`
- User asks how to speed up path traversal

## Critical Question: Do You Need All Paths?

**Always ask:** Does the user really need *all* paths between each (source, target) pair, or is one path per pair enough?

| Syntax | Algorithm | Returns | Cost |
|--------|-----------|---------|------|
| `(a)-[*]->(b)` | DFS | **All** paths between all (source, target) pairs | High; can time out on larger graphs |
| `(a)-[*BFS]->(b)` | BFS | **One** path per (source, target) pair | Low; one shortest path per pair, not all |

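`*` (DFS) enumerates every path between every matched (source, target) pair, so the work grows with the number of paths, not just the number of pairs. `*BFS` returns one path per pair, the shortest by hop count. If the user only needs one path per pair, recommend `*BFS` instead of `*`.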
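
One rough way to see the difference on real data is to count the paths each expansion produces for a single pair. The sketch below assumes nodes carry a `name` property with the values `"A"` and `"E"`, as in the BFS example in this skill. The 4-hop bound is an arbitrary choice that keeps the DFS variant cheap.

```cypher
// Assumption: nodes named "A" and "E" exist; adjust to your data.
// DFS expansion: one row per distinct path of up to 4 hops.
MATCH path=(a {name: "A"})-[*..4]->(b {name: "E"})
RETURN count(path) AS dfs_paths;

// BFS expansion: at most one (shortest) path for the same pair.
MATCH path=(a {name: "A"})-[*BFS ..4]->(b {name: "E"})
RETURN count(path) AS bfs_paths;
```

If `dfs_paths` is much larger than `bfs_paths` and the application only consumes one path per pair, `*BFS` is likely the better fit.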

## Quick Reference

| Need | Use | Example |
|------|-----|---------|
| All paths between all (source,target) pairs | `[*]` (DFS) | `MATCH path=(a)-[*]->(b) RETURN path` |
| One path per (source,target) pair (unweighted) | `[*BFS]` | `MATCH path=(a)-[*BFS]->(b) RETURN path` |
| Shortest path by edge weight | `[*WSHORTEST]` | `MATCH path=(a)-[:R *WSHORTEST (r,n\|r.weight)]-(b) RETURN path` |
| All shortest paths (same weight) | `[*ALLSHORTEST]` | `MATCH path=(a)-[*ALLSHORTEST (r,n\|r.weight)]-(b) RETURN path` |
| K shortest paths | `[*KSHORTEST]` | `MATCH path=(a)-[*KSHORTEST\|5]->(b) RETURN path` |

## DFS (`*`) — Use With Caution

Variable-length `[*]` uses depth-first search and returns **every** path between all (source, target) pairs. Without length constraints it can traverse the entire graph.

**When DFS is appropriate:**
- Path existence check (any path = success)
- Enumerating all possible routes (e.g. route planning alternatives)
- Small or constrained graphs

**Mitigation if DFS is required:**
- Add length constraint: `[*..5]` or `[*3..5]`
- Filter by relationship type: `[:CloseTo *]`
- Use `USING HOPS LIMIT x` directive
- Consider whether `*BFS` or `*WSHORTEST` can satisfy the use case
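
The first three mitigations can be sketched as follows. The `CloseTo` relationship type comes from the list above. The `name` values and the hop limit of 1000 are placeholders, not recommendations.

```cypher
// Assumption: hypothetical nodes named "A" and "E" connected by :CloseTo relationships.
// Bound the depth to 5 hops and expand only over one relationship type.
MATCH path=(a {name: "A"})-[:CloseTo *..5]->(b {name: "E"})
RETURN path;

// Cap the total number of hops the query may perform (1000 is arbitrary).
USING HOPS LIMIT 1000
MATCH path=(a {name: "A"})-[:CloseTo *]->(b {name: "E"})
RETURN path;
```

A hop limit caps the work done rather than guaranteeing a complete result, so check the Memgraph documentation for how results are reported when the limit is reached.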

## BFS (`*BFS`) — Preferred for One Path Per Pair

Use `*BFS` when the user needs one shortest path per (source, target) pair. For each pair, Memgraph returns one path and stops (does not enumerate all paths).

```cypher
MATCH path=(n {name: "A"})-[*BFS]->(m {name: "E"})
RETURN path;
```

## Constraining Expansion by Property Values

DFS, BFS, WSHORTEST, and ALLSHORTEST support a lambda filter to restrict which edges/nodes are traversed. **KSHORTEST does not support filter lambdas.**

### DFS and BFS

The filter is a lambda over `(r, n)`: the relationship being traversed and the node it expands to. Add `p` to access the current path: `(r, n, p | ...)`.

```cypher
// Filter: only expand over r.eu_border = false and n.drinks_USD < 15
MATCH path=(n {id: 0})-[* (r, n | r.eu_border = false AND n.drinks_USD < 15)]->(m {id: 8})
RETURN path;
```

```cypher
// BFS with same filter
MATCH path=(n {id: 0})-[*BFS (r, n | r.eu_border = false AND n.drinks_USD < 15)]-(m {id: 8})
RETURN path;
```

With path context (`p`): use `last(relationships(p))` to inspect how the current node was reached.

### WSHORTEST and ALLSHORTEST

The filter lambda comes *after* the weight lambda and the total-weight variable. It can take `(r, n)` or `(r, n, p, w)`, where `p` is the current path and `w` is the weight accumulated so far.

```cypher
// Weight from r.weight, filter: r.eu_border = false AND n.drinks_USD < 15
MATCH path=(n {id: 0})-[*WSHORTEST (r, n | r.weight) total_weight (r, n | r.eu_border = false AND n.drinks_USD < 15)]-(m {id: 46})
RETURN path, total_weight;
```
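
The same layout should apply to `*ALLSHORTEST`. The sketch below reuses the WSHORTEST example unchanged apart from the algorithm name, so the `weight`, `eu_border`, and `drinks_USD` properties and the node ids are the same assumed sample data.

```cypher
// Assumption: same sample properties and ids as the WSHORTEST example.
// Weight lambda first, then the total-weight variable, then the filter lambda.
MATCH path=(n {id: 0})-[*ALLSHORTEST (r, n | r.weight) total_weight (r, n | r.eu_border = false AND n.drinks_USD < 15)]-(m {id: 46})
RETURN path, total_weight;
```

Unlike WSHORTEST, this can return several rows for one pair, one for each path that shares the minimum total weight.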

## Documentation

For full syntax (filtering, length bounds, property filters), see [Memgraph deep path traversal](https://memgraph.com/docs/advanced-algorithms/deep-path-traversal) and [constraining by property values](https://memgraph.com/docs/advanced-algorithms/deep-path-traversal#constraining-the-expansion-based-on-property-values-1).
102 changes: 102 additions & 0 deletions memgraph-performance-troubleshooting/SKILL.md
@@ -0,0 +1,102 @@
---
name: memgraph-performance-troubleshooting
description: Guide using EXPLAIN and PROFILE clauses to troubleshoot slow Memgraph Cypher queries. Use when debugging query performance, identifying bottlenecks, or deciding which indexes to add. Covers query plans, operator analysis, and index recommendations.
metadata:
author: memgraph
version: "0.0.1"
---

# Memgraph Performance Troubleshooting

Use EXPLAIN and PROFILE to understand and improve Cypher query performance in Memgraph.

## When to Use

- User asks about slow queries, query performance, or optimization in Memgraph
- User mentions EXPLAIN, PROFILE, query plan, or bottlenecks
- User wants to reason about which operators are expensive
- User needs to decide whether to add indexes

## Workflow: EXPLAIN First, Then PROFILE

### Step 1: EXPLAIN (no execution)

**Always run EXPLAIN first** — it does not execute the query, so it is safe for expensive queries.

```cypher
EXPLAIN MATCH (n:Person {age: 42}) RETURN n;
```

EXPLAIN returns the planned operators. Read the plan from **bottom to top**; execution flows from bottom (Once) upward to Produce.

### Step 2: PROFILE (executes query, shows metrics)

Use PROFILE when you need to see **which operators take the most time** and where the bottleneck is.

```cypher
PROFILE MATCH (n:Person {age: 42}) RETURN n;
```

PROFILE executes the query and returns:
- **OPERATOR** — same as EXPLAIN
- **ACTUAL HITS** — how many times each operator was pulled; fewer is better
- **RELATIVE TIME** — % of time spent per operator (identifies bottleneck)
- **ABSOLUTE TIME** — wall time per operator

Focus on operators with high RELATIVE TIME or high ACTUAL HITS.

## ScanAll + Filter → Add an Index

If the plan shows **ScanAll** followed by **Filter**, the query likely scans all nodes and filters afterward. That is inefficient — add an index so Memgraph can use indexed lookups.

| Pattern | Fix |
|--------|-----|
| `ScanAll` + `Filter (n :Label)` | Label index: `CREATE INDEX ON :Label;` |
| `ScanAllByLabel` + `Filter {n.property}` | Label-property index: `CREATE INDEX ON :Label(property);` |
| `ScanAll` + `Filter {n.prop}` (no label) | Add a label to the query and create `CREATE INDEX ON :Label(property);` |
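
The last row, a property filter with no label, can be sketched like this. `Person` and `age` are taken from the EXPLAIN example earlier in this skill and stand in for your own label and property.

```cypher
// Without a label the planner has no label-property index to pick,
// so expect ScanAll followed by Filter.
EXPLAIN MATCH (n {age: 42}) RETURN n;

// Assumption: the matching nodes are all :Person. Index that label and property.
CREATE INDEX ON :Person(age);

// With the label in the pattern, the plan can use the label-property index.
EXPLAIN MATCH (n:Person {age: 42}) RETURN n;
```

Adding the label is only valid if every node the query should return carries that label. Otherwise the rewritten query returns fewer rows.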

### Example: Before vs After Index

**Before index:**
```
| * Produce {n} |
| * Filter (n :Person), {n.prop} |
| * ScanAllByLabel (n :Person) |
| * Once |
```

**Create a label-property index:**
```cypher
CREATE INDEX ON :Person(prop);
```

**After index:**

```
| * Produce {n} |
| * ScanAllByLabelProperties (n :Person {prop}) |
| * Once |
```

`ScanAllByLabelProperties` is more efficient than `ScanAllByLabel` + `Filter`.

## Quick Index Reference

| Need | Index |
|------|-------|
| Match by label only | `CREATE INDEX ON :Label;` |
| Match by label + property | `CREATE INDEX ON :Label(property);` |
| Multiple properties together | `CREATE INDEX ON :Label(prop1, prop2);` |

For other index types (edge-type, point, etc.) and details on composite indexes, see [Memgraph indexes docs](https://memgraph.com/docs/fundamentals/indexes).
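
As a minimal sketch of the multi-property row, the statements below create a composite index and check whether a query that filters on both properties picks it up. `Person`, `name`, `age`, and the literal values are hypothetical, and composite index support may depend on your Memgraph version.

```cypher
// Assumption: :Person nodes with name and age properties.
CREATE INDEX ON :Person(name, age);

// List the indexes that currently exist.
SHOW INDEX INFO;

// Check whether the plan uses the composite index for a two-property match.
EXPLAIN MATCH (n:Person {name: "Ana", age: 42}) RETURN n;
```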

## Other Useful Operators

| Operator | Meaning |
|----------|---------|
| `ScanAll` | Scans all nodes — expensive on large graphs |
| `ScanAllByLabel` | Uses label index — better than ScanAll |
| `ScanAllByLabelProperties` | Uses label-property index — best when filtering by property |
| `Filter` | Filters rows; if after a ScanAll/ScanAllByLabel, consider adding an index |

## After Adding Indexes

**After creating an index, re-run EXPLAIN or PROFILE first** to confirm that the plan actually uses it and that performance improved. Use `ANALYZE GRAPH;` only as a last resort, when Memgraph picks a suboptimal index among multiple label-property indexes. In that case, run it once after the indexes are created and the data is loaded, so that Memgraph can use value-distribution statistics for index selection.
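
The order of operations could look like the sketch below. It assumes a `:Person(age)` index was just created. The label and property are placeholders.

```cypher
// 1. Confirm the plan uses the index: expect ScanAllByLabelProperties
//    rather than ScanAllByLabel followed by Filter.
EXPLAIN MATCH (n:Person {age: 42}) RETURN n;

// 2. Only if the planner keeps choosing a poor index: collect statistics.
ANALYZE GRAPH;

// Optionally restrict the analysis to specific labels.
ANALYZE GRAPH ON LABELS :Person;

// 3. Re-check the plan after the statistics are in place.
EXPLAIN MATCH (n:Person {age: 42}) RETURN n;
```

Statistics describe the data at the time of the analysis, so they may need refreshing after large data changes.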