diff --git a/memgraph-deep-path-traversal-troubleshooting/SKILL.md b/memgraph-deep-path-traversal-troubleshooting/SKILL.md
new file mode 100644
index 0000000..e54100b
--- /dev/null
+++ b/memgraph-deep-path-traversal-troubleshooting/SKILL.md
@@ -0,0 +1,99 @@
+---
+name: memgraph-deep-path-traversal-troubleshooting
+description: Troubleshoot Memgraph variable-length path queries and choose the right traversal algorithm. Use when the user has slow path queries, uses [*] expansion, or asks about DFS, BFS, shortest path, or relationship expansion. Guides choosing *BFS over * when not all paths are needed.
+metadata:
+  author: memgraph
+  version: "0.0.1"
+---
+
+# Memgraph Deep Path Traversal Troubleshooting
+
+Guide the choice of path traversal algorithm in Memgraph to avoid unnecessarily expensive queries.
+
+## When to Use
+
+- User has slow queries with variable-length paths (`[*]`, `[*..5]`, etc.)
+- User mentions DFS, BFS, shortest path, path finding, or relationship expansion
+- User uses patterns like `(a)-[*]->(b)` or `(a)-[r *]->(b)`
+- User asks how to speed up path traversal
+
+## Critical Question: Do You Need All Paths?
+
+**Always ask:** Does the user really need *all* paths between each (source, target) pair, or is one path per pair enough?
+
+| Syntax | Algorithm | Returns | Cost |
+|--------|-----------|---------|------|
+| `(a)-[*]->(b)` | DFS | **All** paths between all (source, target) pairs | High: can time out on larger graphs |
+| `(a)-[*BFS]->(b)` | BFS | **One** path per (source, target) pair | Low: one shortest path per pair, not all |
+
+DFS enumerates and returns every path between every pair. BFS returns one shortest path per pair. If the user only needs one path per pair, recommend `*BFS` instead of `*`.
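+
+A minimal sketch of the contrast, assuming hypothetical `City` nodes with a `name` property (the label and values are illustrative, not taken from any particular dataset). The first query enumerates every path of up to five hops; the second asks for one shortest path for the same pair.
+
+```cypher
+// Assumed data model: (:City {name}) nodes; label and values are hypothetical.
+// DFS expansion: returns every path of at most 5 hops between the two nodes.
+MATCH path=(a:City {name: "A"})-[*..5]->(b:City {name: "E"})
+RETURN path;
+
+// BFS expansion: returns one shortest path (by hop count) for the same pair.
+MATCH path=(a:City {name: "A"})-[*BFS]->(b:City {name: "E"})
+RETURN path;
+```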
+
+## Quick Reference
+
+| Need | Use | Example |
+|------|-----|---------|
+| All paths between all (source, target) pairs | `[*]` (DFS) | `MATCH path=(a)-[*]->(b) RETURN path` |
+| One path per (source, target) pair (unweighted) | `[*BFS]` | `MATCH path=(a)-[*BFS]->(b) RETURN path` |
+| Shortest path by edge weight | `[*WSHORTEST]` | `MATCH path=(a)-[:R *WSHORTEST (r,n\|r.weight)]-(b) RETURN path` |
+| All shortest paths (same weight) | `[*ALLSHORTEST]` | `MATCH path=(a)-[*ALLSHORTEST (r,n\|r.weight)]-(b) RETURN path` |
+| K shortest paths | `[*KSHORTEST]` | `MATCH path=(a)-[*KSHORTEST\|5]->(b) RETURN path` |
+
+## DFS (`*`): Use With Caution
+
+Variable-length `[*]` expansion uses depth-first search and returns **every** path between all (source, target) pairs. Without a length constraint it can traverse the entire graph.
+
+**When DFS is appropriate:**
+- Path existence check (any path = success)
+- Enumerating all possible routes (e.g. route planning alternatives)
+- Small or constrained graphs
+
+**Mitigation if DFS is required:**
+- Add a length constraint: `[*..5]` or `[*3..5]`
+- Filter by relationship type: `[:CloseTo *]`
+- Use the `USING HOPS LIMIT x` directive
+- Consider whether `*BFS` or `*WSHORTEST` can satisfy the use case
+
+## BFS (`*BFS`): Preferred for One Path Per Pair
+
+Use `*BFS` when the user needs one shortest path per (source, target) pair. For each pair, Memgraph returns one path and stops instead of enumerating all paths.
+
+```cypher
+MATCH path=(n {name: "A"})-[*BFS]->(m {name: "E"})
+RETURN path;
+```
+
+## Constraining Expansion by Property Values
+
+DFS, BFS, WSHORTEST, and ALLSHORTEST support a lambda filter that restricts which edges and nodes are traversed. **KSHORTEST does not support filter lambdas.**
+
+### DFS and BFS
+
+The filter is a lambda over `(r, n)`: the relationship being expanded and the node it expands to. Add `p` to access the current path: `(r, n, p | ...)`.
+
+```cypher
+// Filter: only expand over r.eu_border = false and n.drinks_USD < 15
+MATCH path=(n {id: 0})-[* (r, n | r.eu_border = false AND n.drinks_USD < 15)]->(m {id: 8})
+RETURN path;
+```
+
+```cypher
+// BFS with same filter
+MATCH path=(n {id: 0})-[*BFS (r, n | r.eu_border = false AND n.drinks_USD < 15)]-(m {id: 8})
+RETURN path;
+```
+
+With path context (`p`): use `last(relationships(p))` to inspect how the current node was reached.
+
+### WSHORTEST and ALLSHORTEST
+
+The filter lambda comes *after* the weight expression. It can take `(r, n)`, or `(r, n, p, w)` to also access the current path and current weight.
+
+```cypher
+// Weight from r.weight, filter: r.eu_border = false AND n.drinks_USD < 15
+MATCH path=(n {id: 0})-[*WSHORTEST (r, n | r.weight) total_weight (r, n | r.eu_border = false AND n.drinks_USD < 15)]-(m {id: 46})
+RETURN path, total_weight;
+```
+
+## Documentation
+
+For full syntax (filtering, length bounds, property filters), see [Memgraph deep path traversal](https://memgraph.com/docs/advanced-algorithms/deep-path-traversal) and [constraining by property values](https://memgraph.com/docs/advanced-algorithms/deep-path-traversal#constraining-the-expansion-based-on-property-values-1).
diff --git a/memgraph-performance-troubleshooting/SKILL.md b/memgraph-performance-troubleshooting/SKILL.md
new file mode 100644
index 0000000..af918e9
--- /dev/null
+++ b/memgraph-performance-troubleshooting/SKILL.md
@@ -0,0 +1,102 @@
+---
+name: memgraph-performance-troubleshooting
+description: Guide the use of the EXPLAIN and PROFILE clauses to troubleshoot slow Memgraph Cypher queries. Use when debugging query performance, identifying bottlenecks, or deciding which indexes to add. Covers query plans, operator analysis, and index recommendations.
+metadata:
+  author: memgraph
+  version: "0.0.1"
+---
+
+# Memgraph Performance Troubleshooting
+
+Use EXPLAIN and PROFILE to understand and improve Cypher query performance in Memgraph.
+
+## When to Use
+
+- User asks about slow queries, query performance, or optimization in Memgraph
+- User mentions EXPLAIN, PROFILE, query plan, or bottlenecks
+- User wants to reason about which operators are expensive
+- User needs to decide whether to add indexes
+
+## Workflow: EXPLAIN First, Then PROFILE
+
+### Step 1: EXPLAIN (no execution)
+
+**Always run EXPLAIN first.** It does not execute the query, so it is safe even for expensive queries.
+
+```cypher
+EXPLAIN MATCH (n:Person {age: 42}) RETURN n;
+```
+
+EXPLAIN returns the planned operators. Read the plan from **bottom to top**; data flows from the bottom operator (`Once`) upward to `Produce`.
+
+### Step 2: PROFILE (executes query, shows metrics)
+
+Use PROFILE when you need to see **which operator is the most expensive** and to reason about bottlenecks.
+
+```cypher
+PROFILE MATCH (n:Person {age: 42}) RETURN n;
+```
+
+PROFILE executes the query and returns:
+- **OPERATOR**: same as EXPLAIN
+- **ACTUAL HITS**: how many times each operator was pulled; fewer is better
+- **RELATIVE TIME**: % of time spent per operator (identifies the bottleneck)
+- **ABSOLUTE TIME**: wall time per operator
+
+Focus on operators with high RELATIVE TIME or high ACTUAL HITS.
+
+## ScanAll + Filter → Add an Index
+
+If the plan shows **ScanAll** followed by **Filter**, the query likely scans all nodes and filters afterward. That is inefficient; add an index so Memgraph can use an indexed lookup instead.
+
+| Pattern | Fix |
+|--------|-----|
+| `ScanAll` + `Filter (n :Label)` | Label index: `CREATE INDEX ON :Label;` |
+| `ScanAllByLabel` + `Filter {n.property}` | Label-property index: `CREATE INDEX ON :Label(property);` |
+| `ScanAll` + `Filter {n.prop}` (no label) | Add a label to the query, then run `CREATE INDEX ON :Label(property);` |
+
+### Example: Before vs After Index
+
+**Before index:**
+```
+| * Produce {n} |
+| * Filter (n :Person), {n.prop} |
+| * ScanAllByLabel (n :Person) |
+| * Once |
+```
+
+**After label-property index:**
+```
+CREATE INDEX ON :Person(prop);
+```
+
+```
+| * Produce {n} |
+| * ScanAllByLabelProperties (n :Person {prop}) |
+| * Once |
+```
+
+`ScanAllByLabelProperties` is more efficient than `ScanAllByLabel` + `Filter`.
+
+## Quick Index Reference
+
+| Need | Index |
+|------|-------|
+| Match by label only | `CREATE INDEX ON :Label;` |
+| Match by label + property | `CREATE INDEX ON :Label(property);` |
+| Multiple properties together | `CREATE INDEX ON :Label(prop1, prop2);` |
+
+For more index types (composite, edge-type, point, etc.), see [Memgraph indexes docs](https://memgraph.com/docs/fundamentals/indexes).
+
+## Other Useful Operators
+
+| Operator | Meaning |
+|----------|---------|
+| `ScanAll` | Scans all nodes; expensive on large graphs |
+| `ScanAllByLabel` | Uses label index; better than ScanAll |
+| `ScanAllByLabelProperties` | Uses label-property index; best when filtering by property |
+| `Filter` | Filters rows; if after a ScanAll/ScanAllByLabel, consider adding an index |
+
+## After Adding Indexes
+
+**Always run EXPLAIN or PROFILE first**; that gives direct feedback on the plan and its performance. Use `ANALYZE GRAPH;` only as a last resort, when Memgraph picks a suboptimal index among multiple label-property indexes. In that case, run it once after creating indexes and loading data so that Memgraph can use value-distribution statistics for better index selection.
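+
+A sketch of that sequence, assuming the `:Person(age)` lookup from the EXPLAIN example is the slow query (the label and property are placeholders for your own): create the index, check that the plan changed, and refresh statistics only if the planner still chooses poorly.
+
+```cypher
+// Assumed label and property (Person, age); substitute your own.
+CREATE INDEX ON :Person(age);
+
+// Check whether the plan now uses an indexed scan instead of a scan plus Filter.
+EXPLAIN MATCH (n:Person {age: 42}) RETURN n;
+
+// Only if index selection still looks wrong: recompute index statistics.
+ANALYZE GRAPH;
+```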