Fix hierarchical delegation with language-agnostic tool matching #3926

devin-ai-integration · 2025-11-16T04:04:55Z

Fix hierarchical delegation with language-agnostic tool matching (Issue #3925)

Summary

Fixes issue #3925 where the hierarchical process behaves differently depending on the language used. When using non-English prompts, the manager agent would repeatedly delegate to itself instead of using specialized coworkers because tool names were hardcoded in English.

Root Cause: Tool name matching relied on exact English strings ("Delegate work to coworker", "Ask question to coworker"), which failed when LLMs generated tool calls in other languages or used variations of the names.

Solution: Added language-agnostic tool matching using stable short identifiers:

delegate_work / delegate_work_to_coworker → "Delegate work to coworker"
ask_question / ask_question_to_coworker → "Ask question to coworker"
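
As a rough illustration of the mapping above, the short identifiers can be treated as keys in a lookup table that resolves to the canonical English tool name. This is a minimal sketch, not the code in this PR: `SHORT_IDENTIFIERS` and `resolve_short_identifier` are hypothetical names, and the actual data structure in `tool_usage.py` may differ.

```python
from typing import Optional

# Hypothetical alias table built from the mapping listed in the summary.
# The names SHORT_IDENTIFIERS and resolve_short_identifier are illustrative only.
SHORT_IDENTIFIERS = {
    "delegate_work": "Delegate work to coworker",
    "delegate_work_to_coworker": "Delegate work to coworker",
    "ask_question": "Ask question to coworker",
    "ask_question_to_coworker": "Ask question to coworker",
}


def resolve_short_identifier(name: str) -> Optional[str]:
    """Return the canonical English tool name for a short identifier, or None."""
    return SHORT_IDENTIFIERS.get(name.strip().lower())
```

An unknown name returns `None`, so a caller can fall through to other matching strategies.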

Changes:

- Tool Selection (tool_usage.py): Added _normalize_tool_name() and _get_tool_aliases(), and enhanced _select_tool() to support:
  - Exact name matching (case-insensitive) - backward compatible
  - Normalized/slugified name matching (spaces/hyphens → underscores)
  - Stable short identifier matching
  - Fuzzy matching fallback (0.85 threshold) - existing behavior
- Delegation Detection (tool_usage.py): Updated delegation counting logic to recognize short identifiers in addition to English names
- Memory Filter (base_agent_executor_mixin.py): Updated short-term memory filter to exclude delegation actions using both English names and short identifiers
- Tests (test_tool_usage.py): Added 7 unit tests covering normalization, alias generation, tool selection with various inputs, and memory filtering
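
The four matching tiers described for tool selection could be sketched roughly as follows. This is an assumption-laden illustration rather than the PR's implementation: `normalize`, `short_aliases`, and `select_tool` are hypothetical stand-ins for `_normalize_tool_name()`, `_get_tool_aliases()`, and `_select_tool()`, the rule of deriving a short alias by dropping a trailing `_to_coworker` is a guess that happens to reproduce the identifiers listed in the summary, and `difflib.SequenceMatcher` is assumed for the fuzzy step (the description only states the 0.85 threshold).

```python
import re
from difflib import SequenceMatcher
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, trim, and turn runs of spaces/hyphens into underscores."""
    return re.sub(r"[\s\-]+", "_", name.strip().lower())


def short_aliases(tool_name: str) -> set[str]:
    """Hypothetical alias rule: drop a trailing '_to_coworker' from the slug."""
    slug = normalize(tool_name)
    suffix = "_to_coworker"
    return {slug[: -len(suffix)]} if slug.endswith(suffix) else set()


def select_tool(
    requested: str, tool_names: list[str], threshold: float = 0.85
) -> Optional[str]:
    """Return the first tool name matched by any tier, or None."""
    wanted = requested.strip().lower()
    slug = normalize(requested)

    # Tier 1: exact, case-insensitive match on the full name.
    for name in tool_names:
        if name.lower() == wanted:
            return name

    # Tier 2: match on the normalized (slugified) name.
    for name in tool_names:
        if normalize(name) == slug:
            return name

    # Tier 3: match on a stable short identifier.
    for name in tool_names:
        if slug in short_aliases(name):
            return name

    # Tier 4: fuzzy fallback; keep the best candidate at or above the threshold.
    best_name, best_ratio = None, 0.0
    for name in tool_names:
        ratio = SequenceMatcher(None, name.lower(), wanted).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name if best_ratio >= threshold else None
```

Ordering the tiers from strictest to loosest keeps existing English tool calls on the exact-match path, so the new tiers only come into play for inputs that previously relied on fuzzy matching or failed outright.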

Review & Testing Checklist for Human

CRITICAL: Test with actual non-English prompts - The issue was reported with Japanese, but I haven't tested with real non-English prompts. Please verify that hierarchical crews now work correctly when the LLM generates tool calls in Japanese, Spanish, or other languages.
Verify backward compatibility - Test existing English-based hierarchical crews to ensure they work exactly as before. The changes should be transparent to existing users.
Check delegation detection patterns - Review the patterns in _create_short_term_memory() to ensure they're not too broad (could exclude legitimate actions from memory) or too narrow (could include delegation actions in memory).
Edge case testing - Test with unusual tool name variations (mixed case, extra spaces, special characters) to verify the normalization logic handles them correctly.
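
To make the "too broad vs. too narrow" concern concrete, here is one possible shape for a delegation check, written as a sketch under stated assumptions. It assumes the agent output uses a ReAct-style `Action: <tool name>` line; `is_delegation_action` and `_DELEGATION_NAMES` are hypothetical names, and the actual patterns in `_create_short_term_memory()` should be reviewed directly in the diff.

```python
import re

# Hypothetical list: English names plus the short identifiers from the summary.
_DELEGATION_NAMES = (
    "Delegate work to coworker",
    "Ask question to coworker",
    "delegate_work",
    "delegate_work_to_coworker",
    "ask_question",
    "ask_question_to_coworker",
)

# Assumes a ReAct-style "Action: <tool name>" line in the agent output.
_ACTION_LINE = re.compile(
    r"^\s*Action:\s*(?P<name>.+?)\s*$", re.IGNORECASE | re.MULTILINE
)


def is_delegation_action(agent_output: str) -> bool:
    """True if any Action line names a delegation tool exactly (case-insensitive)."""
    names = {n.lower() for n in _DELEGATION_NAMES}
    return any(
        m.group("name").lower() in names
        for m in _ACTION_LINE.finditer(agent_output)
    )
```

Matching the whole `Action:` line rather than searching for a substring anywhere in the output avoids excluding legitimate actions whose input or final answer merely mentions delegation.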

Recommended Test Plan

Create a simple hierarchical crew with a manager and 2-3 specialized agents
Test with English prompts - should work as before
Test with Japanese prompts (or use an LLM configured to respond in Japanese) - manager should now properly delegate to coworkers instead of delegating to itself
Check the short-term memory to verify delegation actions are not being saved

Notes

All CI checks passing (lint, type-checker, tests across Python 3.10-3.13)
Maintained backward compatibility by keeping exact name matching and fuzzy matching as fallback
The fix uses stable identifiers rather than full i18n support, which would require translating all tool names and invalidating VCR cassettes

Link to Devin run: https://app.devin.ai/sessions/8edde7b0966342a09a28a2c9bb976301
Requested by: João (joao@crewai.com)

- Add language-agnostic tool name matching in ToolUsage._select_tool()
- Support stable identifiers (delegate_work, ask_question) in addition to English names
- Update delegation counting to recognize short identifiers
- Update short-term memory filter to skip delegation actions with any alias
- Add comprehensive unit tests for language-agnostic matching
- Fixes issue #3925 where hierarchical process fails with non-English prompts

Co-Authored-By: João <joao@crewai.com>

devin-ai-integration · 2025-11-16T04:04:58Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

Fix lint errors: remove trailing whitespace from docstrings

37b4252

Co-Authored-By: João <joao@crewai.com>

devin-ai-integration bot changed the title ~~Fix hierarchical delegation with language-agnostic tool matching (Issue #3925)~~ Fix hierarchical delegation with language-agnostic tool matching (#3925) Nov 16, 2025

devin-ai-integration bot and others added 2 commits November 16, 2025 04:11

Fix test bug: set cache return value to None for delegation counting …

922165d

…test Co-Authored-By: João <joao@crewai.com>

Fix test instantiation issues: use description parameter instead of i…

57f386d

…18n mock Co-Authored-By: João <joao@crewai.com>

devin-ai-integration bot changed the title ~~Fix hierarchical delegation with language-agnostic tool matching (#3925)~~ Fix hierarchical delegation with language-agnostic tool matching Nov 16, 2025
