The LLMInterface defines `generate_response -> Tuple[str, Dict[str, Any]]`, but none of the LLM clients actually return that type; they all just return a standalone `str`. I also noticed that callers of generate_response typically expect a single string value as well.
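A minimal sketch of the shape of the mismatch; the class and client names here are illustrative, not the actual ones in the codebase:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class LLMInterface(ABC):
    @abstractmethod
    def generate_response(self, prompt: str) -> Tuple[str, Dict[str, Any]]:
        """Declared contract: (response text, metadata dict)."""
        ...


# A concrete client (hypothetical name) that, like the real clients, returns a
# bare string -- this is the incompatible override the type checker flags:
class ExampleLLMClient(LLMInterface):
    def generate_response(self, prompt: str) -> str:
        return "generated text"


# Callers likewise treat the result as a plain string:
client = ExampleLLMClient()
text = client.generate_response("hello")
print(text.upper())
```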
As of this filing, this inconsistency accounts for all of the remaining type errors in the codebase. There is also a risk of sneaky runtime bugs if someone builds against the return type declared on LLMInterface and then gets surprise null values for the second element of the tuple.
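To make the runtime risk concrete, continuing the hypothetical names from the sketch above, a caller that trusts the interface's declared tuple fails as soon as it runs against one of the current clients:

```python
# Written against the LLMInterface annotation, but the client returns a bare
# str, so this unpack raises ValueError at runtime; a defensive fallback would
# instead leave `metadata` as None, which is the "surprise null value" case.
text, metadata = client.generate_response("hello")
```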