-
Notifications
You must be signed in to change notification settings - Fork 167
Description
I asked OpenHands to review our codebase and search for possible classes that we could simplify. Moreover, I asked it to come up with design patterns that we should implement for that, and I gave a threshold of what constitutes a class that is too big. This was with the idea of reducing technical debt and making life easier in the future when adding new features.
Summary
A comprehensive code architecture analysis identified 6 classes that exceed complexity thresholds and should be refactored. These classes have:
- More than 300 lines of code
- More than 7-8 public methods
- Multiple mixed responsibilities
- High change frequency (hotspots)
📊 Prioritization Matrix
| Priority | Class | File | Lines | Methods | Changes (3mo) | Impact |
|---|---|---|---|---|---|---|
| 🔴 1 | LLM |
openhands-sdk/.../llm/llm.py |
1,472 | 37 | 22 | Critical |
| 🟠 2 | Skill |
openhands-sdk/.../skills/skill.py |
1,099 | 28 | 21 | High |
| 🟡 3 | RemoteConversation |
openhands-sdk/.../remote_conversation.py |
1,257 | 66 | 16 | Medium-High |
| 🟡 4 | LocalConversation |
openhands-sdk/.../local_conversation.py |
976 | 23 | 17 | Medium |
| 🟢 5 | EventService |
openhands-agent-server/.../event_service.py |
691 | 40 | 11 | Medium |
| 🟢 6 | ConversationService |
openhands-agent-server/.../conversation_service.py |
751 | 30 | 10 | Lower |
🔴 Priority 1: LLM Class
File: openhands-sdk/openhands/sdk/llm/llm.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 1,472 | 300 |
| Methods | 37 | 7-8 |
| Dependencies | 15+ | 5-6 |
| Git changes (3mo) | 22 | - |
Responsibilities (God Class)
- Configuration management (30+ config fields)
- API key/authentication handling (AWS, OpenRouter, etc.)
- Model capability detection (vision, caching, responses API)
- Message formatting for Chat API
- Message formatting for Responses API
- Completion execution with retry logic
- Responses API execution with streaming
- Metrics and telemetry tracking
- Serialization/deserialization
- Subscription-based authentication flow
Code Smells
- God Class: 10+ unrelated responsibilities
- Hardcoded Dependencies: Creates
Metrics,Telemetry,_tokenizerinternally - Long Methods:
completion()~160 lines,responses()~200 lines - Divergent Change: Auth, formatting, capabilities, metrics all modify this file
Proposed Solution
1. Extract LLMCapabilities class
# New file: openhands/sdk/llm/capabilities.py
class LLMCapabilities:
"""Detects and caches model capabilities."""
def __init__(self, model: str, base_url: str | None, model_info: dict | None): ...
def supports_vision(self) -> bool: ...
def supports_prompt_cache(self) -> bool: ...
def uses_responses_api(self) -> bool: ...
def validate_context_window(self) -> None: ...Methods to extract:
_init_model_info_and_caps()_validate_context_window_size()vision_is_active()_supports_vision()is_caching_prompt_active()uses_responses_api()_model_name_for_capabilities()
2. Extract MessageFormatter class
# New file: openhands/sdk/llm/formatter.py
class MessageFormatter:
"""Formats Message objects for different LLM APIs."""
def __init__(self, capabilities: LLMCapabilities, config: FormatterConfig): ...
def format_for_chat(self, messages: list[Message]) -> list[dict]: ...
def format_for_responses(self, messages: list[Message]) -> tuple[str | None, list[dict]]: ...
def apply_prompt_caching(self, messages: list[Message]) -> None: ...Methods to extract:
format_messages_for_llm()format_messages_for_responses()_apply_prompt_caching()
3. Dependency Injection for Metrics & Telemetry
# Before (hardcoded in validator):
self._metrics = Metrics(model_name=self.model)
self._telemetry = Telemetry(...)
# After (injectable):
class LLM(BaseModel):
metrics_factory: Callable[[], Metrics] | None = Field(default=None, exclude=True)
telemetry_factory: Callable[[], Telemetry] | None = Field(default=None, exclude=True)Target Structure
openhands/sdk/llm/
├── llm.py # ~400 lines (config + orchestration)
├── capabilities.py # ~150 lines (NEW)
├── formatter.py # ~200 lines (NEW)
├── utils/
│ ├── metrics.py # (existing)
│ ├── telemetry.py # (existing)
│ └── retry_mixin.py # (existing)
🟠 Priority 2: Skill Module
File: openhands-sdk/openhands/sdk/context/skills/skill.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 1,099 | 300 |
| Methods | 28 | 7-8 |
| Module functions | 10+ | - |
| Git changes (3mo) | 21 | - |
Responsibilities (God Class)
- Skill model definition with validation
- Legacy OpenHands skill loading (
_load_legacy_openhands_skill) - AgentSkills-format loading (
_load_agentskills_skill) - Third-party file handling (
.cursorrules,agents.md) - Skill resources management
- Trigger matching
- Variable extraction
- Prompt generation (
to_prompt()) - Directory discovery (
load_skills_from_dir(),load_project_skills()) - Marketplace integration (
load_marketplace_skill_names(),load_public_skills())
Code Smells
- God Class: Too many loading strategies in one class
- Divergent Change: AgentSkills format, legacy format, third-party files all modify this
- Long Method:
_create_skill_from_metadata()~95 lines
Proposed Solution
1. Strategy Pattern: Skill Loaders
# New: openhands/sdk/context/skills/loaders/__init__.py
from typing import Protocol
class SkillLoader(Protocol):
def can_load(self, path: Path) -> bool: ...
def load(self, path: Path, strict: bool) -> Skill: ...
# New: openhands/sdk/context/skills/loaders/agentskills.py
class AgentSkillsLoader(SkillLoader):
"""Loads SKILL.md files following AgentSkills standard."""
# New: openhands/sdk/context/skills/loaders/legacy.py
class LegacyOpenHandsLoader(SkillLoader):
"""Loads legacy .md files with frontmatter."""
# New: openhands/sdk/context/skills/loaders/third_party.py
class ThirdPartyLoader(SkillLoader):
"""Loads .cursorrules, agents.md, etc."""2. Extract SkillDiscovery class
# New: openhands/sdk/context/skills/discovery.py
class SkillDiscovery:
"""Discovers and loads skills from various sources."""
def __init__(self, loaders: list[SkillLoader]): ...
def load_from_directory(self, dir: Path) -> list[Skill]: ...
def load_project_skills(self, work_dir: Path) -> list[Skill]: ...
def load_user_skills(self) -> list[Skill]: ...3. Extract MarketplaceClient class
# New: openhands/sdk/context/skills/marketplace.py
class MarketplaceClient:
"""Interacts with public skill marketplace."""
def load_skill_names(self) -> list[str]: ...
def load_public_skills(self, skill_names: list[str] | None) -> list[Skill]: ...Target Structure
openhands/sdk/context/skills/
├── skill.py # ~200 lines (pure data model)
├── discovery.py # ~150 lines (NEW)
├── marketplace.py # ~100 lines (NEW)
├── loaders/ # (NEW)
│ ├── __init__.py
│ ├── agentskills.py
│ ├── legacy.py
│ └── third_party.py
├── utils.py # (existing)
└── trigger.py # (existing)
🟡 Priority 3: RemoteConversation
File: openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 1,257 | 300 |
| Methods | 66 (across 4 classes) | 7-8 |
| Git changes (3mo) | 16 | - |
Issue
Contains 4 classes in one file:
WebSocketCallbackClient(lines 97-220)RemoteEventsList(lines 221-380)RemoteState(lines 381-550)RemoteConversation(lines 551-1257)
Code Smells
- Shotgun Surgery: Protocol changes touch multiple inline classes
- Long Method:
__init__is 175+ lines
Proposed Solution
Extract to separate modules (pure reorganization, no behavior change):
openhands/sdk/conversation/impl/remote/
├── __init__.py # Re-exports RemoteConversation
├── conversation.py # RemoteConversation (~350 lines)
├── state.py # RemoteState (~170 lines)
├── events.py # RemoteEventsList (~160 lines)
└── websocket.py # WebSocketCallbackClient (~125 lines)
🟡 Priority 4: LocalConversation
File: openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 976 | 300 |
| Methods | 23 | 7-8 |
| Git changes (3mo) | 17 | - |
Code Smells
- Hardcoded Dependencies: Creates
ConversationState,StuckDetectorinternally - Long Method:
run()is ~135 lines - Divergent Change: Plugin logic, agent logic, run loop intertwined
Proposed Solution
1. Extract PluginManager class
# New: openhands/sdk/conversation/plugin_manager.py
class PluginManager:
"""Manages plugin loading and resolution."""
def __init__(self, plugin_specs: list[PluginSource] | None): ...
def ensure_loaded(self) -> list[ResolvedPluginSource]: ...
def get_combined_hook_config(self, base_config: HookConfig | None) -> HookConfig: ...2. Dependency Injection
# Add optional factories with backward-compatible defaults
class LocalConversation:
def __init__(
self,
...,
state_factory: Callable[..., ConversationState] | None = None,
stuck_detector_factory: Callable[[ConversationState], StuckDetector] | None = None,
): ...🟢 Priority 5: EventService
File: openhands-agent-server/openhands/agent_server/event_service.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 691 | 300 |
| Methods | 40 | 7-8 |
| Git changes (3mo) | 11 | - |
Code Smells
- God Class: Manages events AND conversation lifecycle
- Feature Envy:
generate_title(),ask_agent(),condense()delegate to conversation
Proposed Solution
Apply Facade Pattern - split into focused services:
EventService (keep):
get_event(),search_events(),count_events(),batch_get_events()subscribe_to_events(),unsubscribe_from_events()send_message()
ConversationLifecycleManager (extract):
start(),run(),pause(),close()generate_title(),ask_agent(),condense()set_confirmation_policy(),set_security_analyzer()
🟢 Priority 6: ConversationService
File: openhands-agent-server/openhands/agent_server/conversation_service.py
Current State
| Metric | Value | Threshold |
|---|---|---|
| Lines | 751 | 300 |
| Methods | 30 | 7-8 |
| Git changes (3mo) | 10 | - |
Proposed Solution
- Extract inline subscriber classes to separate module
- Extract webhook handling to
WebhookManager
Implementation Plan
Phase 1: Low-Risk Extractions
- Extract
LLM.MessageFormatter(isolated, high impact) - Split
remote_conversation.pyinto modules (reorganization only) - Extract Skill loader strategies
Phase 2: Dependency Injection
- Add factory parameters to
LLMfor Metrics/Telemetry - Add factory parameters to
LocalConversationfor State/Detector
Phase 3: Server-Side Refactoring
- Split
EventServiceresponsibilities - Extract
ConversationServicesubscribers
Acceptance Criteria
For each refactoring:
- No external behavior changes
- All existing tests pass
- New unit tests for extracted classes
- Line count of original class reduced by 30%+
- Each new class has single responsibility