
LocalEvalSetResultsManager writes double-encoded JSON files #3993

@ftnext

Description



Describe the bug
LocalEvalSetResultsManager.save_eval_set_result() writes JSON results as a JSON-encoded string (double-encoded), so the file contents look like "{\"key\": \"value\"}" instead of a JSON object.
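
For illustration, a self-contained sketch of the double-encoding effect with a toy payload (not ADK code):

import json

payload = {"key": "value"}
encoded_once = json.dumps(payload)  # '{"key": "value"}' -- a JSON object
encoded_twice = json.dumps(encoded_once)  # the object wrapped in a JSON string

print(encoded_twice)  # "{\"key\": \"value\"}"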

To Reproduce

Based on https://github.com/google/adk-python/blob/v1.21.0/tests/integration/test_single_agent.py#L19-L25:

from google.adk.evaluation.agent_evaluator import AgentEvaluator
from google.adk.evaluation.eval_config import get_eval_metrics_from_config
from google.adk.evaluation.local_eval_set_results_manager import (
    LocalEvalSetResultsManager,
)
from google.adk.evaluation.simulation.user_simulator_provider import (
    UserSimulatorProvider,
)
import pytest


@pytest.mark.asyncio
async def test_eval_agent(tmp_path):
  test_file = (
      "tests/integration/fixture/home_automation_agent/simple_test.test.json"
  )
  eval_config = AgentEvaluator.find_config_for_test_file(test_file)
  eval_set = AgentEvaluator._load_eval_set_from_file(
      test_file, eval_config, initial_session={}
  )
  eval_metrics = get_eval_metrics_from_config(eval_config)
  user_simulator_provider = UserSimulatorProvider(
      user_simulator_config=eval_config.user_simulator_config
  )

  agent_for_eval = await AgentEvaluator._get_agent_for_eval(
      module_name="tests.integration.fixture.home_automation_agent",
      agent_name=None,
  )

  eval_results_by_eval_id = (
      await AgentEvaluator._get_eval_results_by_eval_id(
          agent_for_eval=agent_for_eval,
          eval_set=eval_set,
          eval_metrics=eval_metrics,
          num_runs=4,
          user_simulator_provider=user_simulator_provider,
      )
  )

  results_manager = LocalEvalSetResultsManager(agents_dir=str(tmp_path))
  for eval_case_results in eval_results_by_eval_id.values():
    results_manager.save_eval_set_result(
        app_name="test_app",
        eval_set_id=eval_set.eval_set_id,
        eval_case_results=eval_case_results,
    )

  failures = []
  for eval_case_results in eval_results_by_eval_id.values():
    eval_metric_results = (
        AgentEvaluator._get_eval_metric_results_with_invocation(
            eval_case_results
        )
    )
    failures_per_eval_case = AgentEvaluator._process_metrics_and_get_failures(
        eval_metric_results=eval_metric_results,
        print_detailed_results=True,
        agent_module=None,
    )
    failures.extend(failures_per_eval_case)

  failure_message = "Following are all the test failures.\n" + "\n".join(
      failures
  )
  assert not failures, failure_message
"{\"eval_set_result_id\":\"test_app_b305bd06-38c5-4796-b9c7-d9c7454338b9_1766325534.0213041\",

Expected behavior
The saved file should contain a JSON object (e.g. {"eval_set_result_id": "...", ... }), not a JSON string.
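
For illustration, a quick check that would pass with the expected behavior (the file path here is hypothetical):

import json

# Hypothetical path to a saved eval set result file.
with open("eval_history/test_app_xxx.json", encoding="utf-8") as f:
  data = json.load(f)

# Expected: a dict (JSON object). Currently: a str, because the file
# holds a JSON-encoded string.
assert isinstance(data, dict), type(data)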

Screenshots
N/A

Desktop (please complete the following information):

  • OS: macOS
  • Python version (python -V): Python 3.13.8
  • ADK version (pip show google-adk): v1.21.0

Model Information:

  • Are you using LiteLLM: No
  • Which model is being used: gemini-2.0-flash-001

Additional context
Likely caused by double encoding: model_dump_json() already returns a JSON string, which is then passed through json.dumps() again. Using model_dump() with json.dump() (or writing the model_dump_json() output directly, as sketched after the snippet below) would avoid the double encoding.

# Convert to json and write to file.
eval_set_result_json = eval_set_result.model_dump_json()
eval_set_result_file_path = os.path.join(
    app_eval_history_dir,
    eval_set_result.eval_set_result_name + _EVAL_SET_RESULT_FILE_EXTENSION,
)
logger.info("Writing eval result to file: %s", eval_set_result_file_path)
with open(eval_set_result_file_path, "w", encoding="utf-8") as f:
  f.write(json.dumps(eval_set_result_json, indent=2))
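
For illustration, a minimal sketch of one possible fix under the same assumptions as the snippet above: since model_dump_json() already returns a JSON string, write it out directly instead of re-encoding it.

# Convert to json and write to file.
# model_dump_json() already returns a JSON string; write it as-is
# (pydantic v2 accepts indent= for pretty-printing).
eval_set_result_json = eval_set_result.model_dump_json(indent=2)
eval_set_result_file_path = os.path.join(
    app_eval_history_dir,
    eval_set_result.eval_set_result_name + _EVAL_SET_RESULT_FILE_EXTENSION,
)
logger.info("Writing eval result to file: %s", eval_set_result_file_path)
with open(eval_set_result_file_path, "w", encoding="utf-8") as f:
  f.write(eval_set_result_json)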

Metadata

Labels
eval [Component] This issue is related to evaluation