A deeper look revealed that the Kimi tokenizer's `apply_chat_template` function signature includes `**kwargs` to accept extra, model-specific parameters. One such parameter, `add_generation_prompt=True`, is essential for correctly formatting the prompt to signal the start of the assistant's turn, guiding it towards generating a tool call.
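As a quick illustration (the model ID and message here are just examples), rendering the template directly with Hugging Face's `AutoTokenizer` shows exactly what the flag changes:

```python
from transformers import AutoTokenizer

# Kimi's tokenizer ships custom code, hence trust_remote_code=True.
tok = AutoTokenizer.from_pretrained(
    "moonshotai/Kimi-K2-Instruct", trust_remote_code=True
)

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# Without the kwarg, the rendered prompt ends with the user's turn;
# with it, the template appends the tokens that open the assistant's turn.
for flag in (False, True):
    prompt = tok.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=flag
    )
    print(f"add_generation_prompt={flag} -> ...{prompt[-45:]!r}")
```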
A correct prompt should end with special tokens that prime the model to act as the assistant:
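```text
...<|im_end|><|im_assistant|>assistant<|im_middle|>
```

(The token names above follow the `<|im_*|>` role markers defined in Kimi K2's released chat template; treat the exact suffix as illustrative rather than verbatim.)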
Without `add_generation_prompt=True`, the rendered prompt stopped short of these assistant-turn tokens. This critical formatting error created a malformed prompt that was enough to confuse the model's generation logic.
**The Fix:**
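Conceptually, the fix is to make sure `add_generation_prompt=True` actually reaches `apply_chat_template`. A minimal sketch, assuming prompts are rendered by calling the tokenizer directly (`build_prompt` is a hypothetical helper, not vLLM's actual code path):

```python
def build_prompt(tokenizer, messages, tools=None):
    """Hypothetical helper: render a Kimi K2 prompt for a raw completion call."""
    return tokenizer.apply_chat_template(
        messages,
        tools=tools,                 # forward model-specific extras
        tokenize=False,
        add_generation_prompt=True,  # the essential kwarg from above
    )
```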
Finally, I noticed that even when the model generated a syntactically correct tool call, it was sometimes still rejected.
**The Investigation:**
By inspecting the raw `text_completion` output from vLLM, the culprit became obvious. I found that in certain edge cases, particularly when misled by a malformed conversation history, the model would generate tool-call IDs that didn't strictly conform to Kimi's official specification. For instance, consider this output:
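```text
# Hypothetical reconstruction, not the post's verbatim output:
<|tool_calls_section_begin|><|tool_call_begin|>functions.get_weather_1<|tool_call_argument_begin|>{"city": "Paris"}<|tool_call_end|><|tool_calls_section_end|>
```

Kimi's tool-calling guide specifies IDs of the form `functions.{name}:{index}` (e.g., `functions.get_weather:1`), so an ID like `functions.get_weather_1` fails a strict parser even though the surrounding call is well formed. A small validator makes the contract explicit (a sketch under that assumed format):

```python
import re

# Spec-conformant IDs look like "functions.get_weather:0".
TOOL_CALL_ID = re.compile(r"functions\.[\w.\-]+:\d+")

def is_valid_tool_call_id(tool_call_id: str) -> bool:
    """Return True iff the ID matches the functions.{name}:{index} format."""
    return TOOL_CALL_ID.fullmatch(tool_call_id) is not None

assert is_valid_tool_call_id("functions.get_weather:1")
assert not is_valid_tool_call_id("functions.get_weather_1")  # malformed variant
```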