Commit 5f2960b

jefcodercforge42
authored and committed
fix: add some feats in validator
1 parent 8a05ffc commit 5f2960b

17 files changed, +185 -123 lines changed

README.md

Lines changed: 9 additions & 4 deletions
@@ -54,13 +54,20 @@ AutoCodeAgent allows you to handle complex tasks such as:

 - *"I want to review the picture on Wikipedia for three different actors. Use browser_navigation to visit each actor's Wikipedia page, please use your vision capability guess the actor's age in the picture. Your goal is to guess the actor's age in the picture. Then, create a summary when you compare the picture age with the actual actor's age. Once you have completed the report, send it by email to (your_email). The actors are: Brad Pitt Robert De Niro Marlon Rando. Good luck!"*

-- *"Visit 4 different electronics e-commerce sites to get the average price of the top 3 search results for the query: iPhone 13 Pro. The websites are: https://www.bestbuy.com/, https://www.croma.com/, https://www.mediaworld.it/, https://www.boulanger.com/. Then, provide me with a price comparison report. If you find a currency other than the euro, search Google for the latest exchange rate and convert the prices. Finally, save the report in the simple rag database and send me the same report via email to (your_email)"*
+- *"Navigate with browser different electronics e-commerce sites to get the average price of the top 3 search results for the query: iPhone 13 Pro. The websites are: https://www.bestbuy.com/, https://www.croma.com/, https://www.mediaworld.it/, https://www.boulanger.com/. Then, provide me with a price comparison report. If you find a currency other than the euro, search Google for the latest exchange rate and convert the prices. Finally, save the report in the llama index database and send me the same report via email to (your_email)"*

 - *"Go to LinkedIn Feed and log in using your email (your_email) and password (your_password). Scroll down to the first post and leave a comment that is both intelligent and contextually relevant, taking into account the text and image. Your comment must contain at least 40 words. Once you have posted your comment, email the execution result to (your_email)."*

 - *"Please visit Booking.com and search for a Hotel in Milan that is available from June 1st to June 10th. Extract the name and price of the first hotel in the result. Then save it on simple rag database, send an email to (your_email) with the hotel's name and price."*

-- *"Calculate the area of the triangle formed by Paris, Moscow, and Rome in square kilometers, and send me an email at samuele.giampieri1@gmail.com with the coordinates of the cities and the calculated area."*
+- *"Calculate the area of the triangle formed by Paris, Moscow, and Rome in square kilometers, and send me an email at your_email@gmail.com with the coordinates of the cities and the calculated area."*
+
+- *"Search for the latest news about Open AI, summarize it and send me an email at your_email@gmail.com with the summary."*
+
+- *"Search for the latest articles on cybersecurity, extract full-page content along with any notable images and captions using your web search and browser navigation tools, compile everything into an HTML report, and send it via email to my team at your_email@gmail.com with the subject 'Cybersecurity Trends Update'."*
+
+- *"Search for the latest news about the latest Ferrari model, summarize it, and save it in the LlamaIndex database. After that, make 3 different queries on the database to check if the information was stored correctly. Finally, send me a report by email to your_email@gmail.com"*
+

 AutoCodeAgent 2.0 introduces RAG (Retrieval-Augmented Generation) capabilities, empowering the system with multi RAG techniques, each having its own ingestion and retrieval tools.
 The system uses many persistent Database integrated in Docker, like Vector ChromaDB, Graph Neo4j, and Others.
@@ -634,5 +641,3 @@ We welcome contributions from the community! If you'd like to contribute, please
 By contributing, you agree that your changes will be licensed under the same license as the project.

 Thank you for helping improve this project! 🚀
-
-
Binary file not shown (33 Bytes).
Binary file not shown (2.15 KB).
Binary file not shown (0 Bytes).

code_agent/agent_subtask_executor.py

Lines changed: 92 additions & 81 deletions
@@ -1,5 +1,5 @@
 import json
-import traceback
+import re
 import inspect
 from .function_validator import FunctionValidator
 from models.models import call_model
@@ -19,43 +19,97 @@ def __init__(self, agent):
             lib_name for tool in self.agent.tools for lib_name in tool["lib_names"]
         ]

+
+
     def execute_subtasks(self):
-        """
-        Iterates over the subtasks in the JSON plan, validates each subtask’s code,
-        attempts to execute it (with regeneration on error), and then calls the subtask function.
+            """
+            Iterates over the subtasks in the JSON plan, validates each subtask’s code,
+            executes it (with regeneration on error based on in-memory log inspection),
+            and then calls the subtask function.

-        :return: A dictionary with the results of the executed subtasks.
-        """
-        results = {}
-        subtasks = self.agent.json_plan.get("subtasks", [])
-
-        for index, subtask in enumerate(subtasks):
-            # --- Step 1: Validate subtask code ---
-            output_validator, subtask = self._validate_subtask_code(subtask, index, results)
-            code_string = output_validator["code_string"]
-
-            # --- Step 2: Execute the subtask code (with retries) ---
-            temp_namespace, code_string = self._execute_subtask_code(subtask, code_string, index)
-
-            # --- Step 3: Run the subtask function from the namespace ---
-            subtask_name = subtask["subtask_name"]
-            input_tool_name = subtask.get("input_from_subtask", "")
-            if subtask_name in temp_namespace:
-                tool_func = temp_namespace[subtask_name]
-                sig = inspect.signature(tool_func)
-                if index > 0:
-                    previous_result = results.get(input_tool_name, {})
-                    result = tool_func(previous_result)
-                else:
-                    if "previous_output" in sig.parameters:
-                        result = tool_func({})
+            :return: A dictionary with the results of the executed subtasks.
+            """
+            results = {}
+            subtasks = self.agent.json_plan.get("subtasks", [])
+
+            error_pattern = re.compile(r"\[ERROR\]")
+
+            for index, subtask in enumerate(subtasks):
+                # --- Step 1: Validate subtask code ---
+                output_validator, subtask = self._validate_subtask_code(subtask, index, results)
+                code_string = output_validator["code_string"]
+
+                subtask_name = subtask["subtask_name"]
+                input_tool_name = subtask.get("input_from_subtask", "")
+                attempts = 0
+                success = False
+
+                # Regeneration loop for execution errors (based on log inspection)
+                while attempts < self.execution_max_regeneration_attempts and not success:
+                    # --- Step 2: Execute the subtask code ---
+                    temp_namespace, code_string = self._execute_subtask_code(subtask, code_string, index)
+
+                    if subtask_name not in temp_namespace:
+                        error_msg = f"Subtask '{subtask_name}' not found in the execution namespace."
+                        self.agent.logger.error(
+                            self.agent.enrich_log(error_msg, "add_red_divider"),
+                            extra={'no_memory': True}
+                        )
+                        raise Exception(error_msg)
+
+                    tool_func = temp_namespace[subtask_name]
+                    sig = inspect.signature(tool_func)
+
+                    log_start_index = len(self.agent.execution_logs)
+
+                    # --- Step 3: Call the subtask function ---
+                    if index > 0:
+                        previous_result = results.get(input_tool_name, {})
+                        result = tool_func(previous_result)
+                    else:
+                        if "previous_output" in sig.parameters:
+                            result = tool_func({})
+                        else:
+                            result = tool_func()
+                    results[subtask_name] = result
+
+                    # After calling the function, retrieve new log entries.
+                    new_logs = self.agent.execution_logs[log_start_index:]
+                    if any(error_pattern.search(log) for log in new_logs):
+                        error_message = "\n".join(new_logs)
+                        self.agent.logger.error(
+                            self.agent.enrich_log(
+                                f"❌ Errors found after executing subtask '{subtask_name}' "
+                                f"(attempt {attempts + 1}/{self.execution_max_regeneration_attempts}):\n{error_message}",
+                                "add_red_divider"
+                            ),
+                            extra={'no_memory': True}
+                        )
+                        # Regenerate the subtask code based on the error logs.
+                        regen_subtask = self.regenerate_subtask(error_message, subtask)
+                        self._update_subtask_in_plan(subtask_name, regen_subtask)
+                        subtask = regen_subtask
+                        code_string = subtask["code"]
+                        attempts += 1
                     else:
-                        result = tool_func()
-                results[subtask_name] = result
+                        success = True
+
+                if not success:
+                    error_msg = (
+                        f"❌❌❌ Subtask '{subtask_name}' still fails after {attempts} execution regeneration attempts."
+                    )
+                    self.agent.logger.error(
+                        self.agent.enrich_log(error_msg, "add_red_divider"),
+                        extra={'no_memory': True}
+                    )
+                    raise Exception(error_msg)

-                results_str = json.dumps(results, indent=4)
+                # --- Logging the successful execution ---
+                results_str = json.dumps(results, indent=4)
                 if index == len(subtasks) - 1:
-                    self.agent.logger.info(f"✅ Last subtask '{subtask_name}' executed successfully. this is the final result: {results_str}")
+                    self.agent.logger.info(
+                        f"✅ Last subtask '{subtask_name}' executed successfully. This is the final result: {results_str}"
+                    )
                 if len(results_str) > 500:
                     results_str = results_str[:500] + "... [truncated]"

@@ -68,15 +122,9 @@ def execute_subtasks(self):
                     ),
                     extra={'no_memory': True}
                 )
-            else:
-                error_msg = f"Subtask '{subtask_name}' not found in the execution namespace."
-                self.agent.logger.error(
-                    self.agent.enrich_log(error_msg, "add_red_divider"),
-                    extra={'no_memory': True}
-                )
-                raise Exception(error_msg)

-        return results
+            return results
+

     def _update_subtask_in_plan(self, subtask_name, new_subtask):
         """
@@ -170,18 +218,7 @@ def _validate_subtask_code(self, subtask, index, results):
         return output_validator, subtask

     def _execute_subtask_code(self, subtask, code_string, index):
-        """
-        Executes the code string within a dedicated namespace and attempts regeneration if an error occurs.
-
-        :param subtask: The current subtask (a dict).
-        :param code_string: The Python code (as a string) to execute.
-        :param index: The index of the current subtask.
-        :return: A tuple (temp_namespace, code_string) where temp_namespace is the dictionary in which the code was executed.
-        :raises Exception: If execution fails after the maximum regeneration attempts.
-        """
-        attempts = 0
         temp_namespace = {"logger": self.agent.logger}
-
         self.agent.logger.info(
             self.agent.enrich_log(
                 f"⌛ Executing subtask nr.{index + 1} of {len(self.agent.json_plan.get('subtasks', []))}: {subtask['subtask_name']}",
@@ -190,35 +227,9 @@ def _execute_subtask_code(self, subtask, code_string, index):
             extra={'no_memory': True}
         )

-        while attempts < self.execution_max_regeneration_attempts:
-            try:
-                exec(code_string, temp_namespace)
-                return temp_namespace, code_string
-            except Exception:
-                error_message = traceback.format_exc()
-                self.agent.logger.error(
-                    self.agent.enrich_log(
-                        f"❌ Error during execution of subtask '{subtask['subtask_name']}' "
-                        f"(attempt {attempts + 1}/{self.execution_max_regeneration_attempts}):\n{error_message}",
-                        "add_red_divider"
-                    ),
-                    extra={'no_memory': True}
-                )
-                regen_subtask = self.regenerate_subtask(error_message, subtask)
-                self._update_subtask_in_plan(subtask["subtask_name"], regen_subtask)
-                # In this case we assume that the regenerated subtask has an updated "code" field.
-                code_string = regen_subtask["code"]
-                attempts += 1
-
-        error_msg = (
-            f"❌❌❌ Subtask '{subtask['subtask_name']}' still fails after "
-            f"{attempts} execution regeneration attempts."
-        )
-        self.agent.logger.error(
-            self.agent.enrich_log(error_msg, "add_red_divider"),
-            extra={'no_memory': True}
-        )
-        raise Exception(error_msg)
+        exec(code_string, temp_namespace)
+        return temp_namespace, code_string
+

     def regenerate_subtask(self, subtask_errors, subtask):
         """
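
The call dispatch in `execute_subtasks` (run the generated code with `exec`, look the function up in the namespace, then check its signature before calling it) can be sketched in isolation. This is a simplified illustration, not the project's code: `load_and_call` and the two code strings are hypothetical, and the sketch always checks for a `previous_output` parameter, whereas the commit only does so for the first subtask. Note that `exec` runs arbitrary code, so this only makes sense for trusted or sandboxed input.

```python
import inspect


def load_and_call(code_string, func_name, previous_output=None):
    """Execute code_string in a fresh namespace and call func_name from it.

    Hypothetical helper: passes previous_output only when the generated
    function declares a parameter with that name.
    """
    namespace = {}
    exec(code_string, namespace)  # defines the function inside `namespace`

    if func_name not in namespace:
        raise KeyError(f"'{func_name}' not found in the execution namespace.")

    func = namespace[func_name]
    params = inspect.signature(func).parameters
    if "previous_output" in params:
        return func(previous_output if previous_output is not None else {})
    return func()


# Two made-up "generated" subtasks, the second consuming the first one's output.
first_code = "def get_city():\n    return {'city': 'Milan'}\n"
second_code = (
    "def describe(previous_output):\n"
    "    return 'Hotel search in ' + previous_output['city']\n"
)

first_result = load_and_call(first_code, "get_city")
second_result = load_and_call(second_code, "describe", first_result)
```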

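The log-inspection retry introduced in this commit (remember the log length before the call, slice out the new entries afterwards, and search them for `[ERROR]`) might look like the following minimal sketch. Everything here is an assumption for illustration: `ListHandler` stands in for whatever fills `agent.execution_logs`, the `[LEVEL] message` format is inferred from the regex in the diff, and a list of candidate functions replaces the model-driven `regenerate_subtask` step.

```python
import logging
import re

ERROR_PATTERN = re.compile(r"\[ERROR\]")


class ListHandler(logging.Handler):
    """Appends each formatted record to a plain list (stand-in for execution_logs)."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record):
        self.sink.append(self.format(record))


def run_with_log_inspection(candidates, logger, execution_logs, max_attempts=3):
    """Call candidates in order until one finishes without logging an [ERROR] line."""
    for attempt, func in enumerate(candidates[:max_attempts], start=1):
        log_start_index = len(execution_logs)        # position before the call
        result = func(logger)
        new_logs = execution_logs[log_start_index:]  # only this attempt's entries
        if not any(ERROR_PATTERN.search(line) for line in new_logs):
            return result, attempt
    raise RuntimeError(f"Still failing after {max_attempts} attempts.")


execution_logs = []
logger = logging.getLogger("subtask_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler(execution_logs)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
logger.addHandler(handler)


def failing(log):
    # Swallows its own problem and only logs it, so no exception reaches the caller.
    log.error("could not reach the site")
    return None


def working(log):
    log.info("fetched 3 prices")
    return {"average": 799.0}


result, attempts_used = run_with_log_inspection(
    [failing, working], logger, execution_logs
)
```

The point of slicing from `log_start_index` is that errors logged by earlier subtasks or earlier attempts do not trigger another retry. One trade-off of this approach is that a subtask which raises instead of logging is no longer caught by the loop, since the `try`/`except` around `exec` was removed in the same commit.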