v0.12.8: ToolChoice, Parallel Tool Calls, Strategy Pattern Parser, NPU Fix #205
Merged
These files should not be included in pub.dev package:
- test_reports/ directory with test output
- macos/Resources/litertlm-server.jar downloaded at build time
…rmats
Extract monolithic FunctionCallParser into separate format implementations:
- FunctionCallFormat interface + Factory
- JsonFunctionCallFormat (gemmaIt, hammer, default)
- QwenFunctionCallFormat (<tool_call> XML tags)
- DeepSeekFunctionCallFormat (Unicode special tokens)
- LlamaFunctionCallFormat (<|python_tag|> syntax)
- PhiFunctionCallFormat (<|tool_calls|> JSON arrays)
- FunctionGemmaCallFormat (<start_function_call> format)
- Shared JsonParsingUtils with parseMultipleJsonObjects() for parallel calls
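The split described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the `ToolCall` class, the method signatures, the `formatFor` helper, the `ModelType.qwen` value, and the `name`/`arguments` JSON keys are all assumptions made for the example. Only the class names `FunctionCallFormat`, `JsonFunctionCallFormat`, `QwenFunctionCallFormat` and the helper name `parseMultipleJsonObjects()` come from the commit message.

```dart
import 'dart:convert';

/// Hypothetical value type; the package's real response classes may differ.
class ToolCall {
  final String name;
  final Map<String, dynamic> args;
  ToolCall(this.name, this.args);
}

/// One strategy per model family (signatures are assumptions).
abstract class FunctionCallFormat {
  bool isFunctionCallStart(String text);
  List<ToolCall> parseAll(String text);
}

/// Collects every top-level `{...}` object by tracking brace depth,
/// ignoring braces that appear inside JSON string literals.
List<Map<String, dynamic>> parseMultipleJsonObjects(String text) {
  final objects = <Map<String, dynamic>>[];
  var depth = 0, start = -1;
  var inString = false, escaped = false;
  for (var i = 0; i < text.length; i++) {
    final ch = text[i];
    if (inString) {
      if (escaped) {
        escaped = false;
      } else if (ch == '\\') {
        escaped = true;
      } else if (ch == '"') {
        inString = false;
      }
      continue;
    }
    if (ch == '"' && depth > 0) {
      inString = true;
    } else if (ch == '{') {
      if (depth == 0) start = i;
      depth++;
    } else if (ch == '}' && depth > 0) {
      depth--;
      if (depth == 0) {
        try {
          final decoded = jsonDecode(text.substring(start, i + 1));
          if (decoded is Map<String, dynamic>) objects.add(decoded);
        } on FormatException {
          // Skip fragments that are not valid JSON.
        }
      }
    }
  }
  return objects;
}

/// Assumes each call looks like {"name": ..., "arguments": {...}}.
List<ToolCall> _toCalls(List<Map<String, dynamic>> objects) => [
      for (final o in objects)
        if (o['name'] is String)
          ToolCall(
            o['name'] as String,
            o['arguments'] is Map
                ? Map<String, dynamic>.from(o['arguments'] as Map)
                : <String, dynamic>{},
          ),
    ];

class JsonFunctionCallFormat implements FunctionCallFormat {
  @override
  bool isFunctionCallStart(String text) => text.trimLeft().startsWith('{');

  @override
  List<ToolCall> parseAll(String text) =>
      _toCalls(parseMultipleJsonObjects(text));
}

class QwenFunctionCallFormat implements FunctionCallFormat {
  static final _block = RegExp(r'<tool_call>([\s\S]*?)</tool_call>');

  @override
  bool isFunctionCallStart(String text) =>
      text.trimLeft().startsWith('<tool_call>');

  @override
  List<ToolCall> parseAll(String text) => [
        for (final m in _block.allMatches(text))
          ..._toCalls(parseMultipleJsonObjects(m.group(1)!)),
      ];
}

/// Hypothetical subset of model types, with JSON as the default format.
enum ModelType { gemmaIt, qwen }

FunctionCallFormat formatFor(ModelType type) => type == ModelType.qwen
    ? QwenFunctionCallFormat()
    : JsonFunctionCallFormat();
```

With this shape, adding a new model family means adding one class and one factory branch, and parallel calls fall out of `parseAll()` returning a list.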
…ration
- ToolChoice.auto/required/none controls system prompt injection
- ParallelFunctionCallResponse for multiple tool calls in one response
- chat.dart uses parseAll() to detect and return parallel calls
- Prompt wording changes per ToolChoice mode in createToolsPrompt()
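As a rough illustration of how the prompt can vary per mode, here is a minimal sketch. The function name `sketchToolsPrompt`, its parameters, and the instruction wording are hypothetical; the actual text produced by `createToolsPrompt()` is not shown in this PR description.

```dart
enum ToolChoice { auto, required, none }

/// Returns the tool section of the system prompt, or an empty string when
/// nothing should be injected. The wording is illustrative only.
String sketchToolsPrompt(List<String> toolSchemas, ToolChoice choice) {
  // ToolChoice.none: inject nothing, so the model never sees the tools.
  if (choice == ToolChoice.none || toolSchemas.isEmpty) return '';

  final instruction = choice == ToolChoice.required
      ? 'You must call one or more of the following tools.'
      : 'You may call the following tools when they help with the request.';
  final schemas = toolSchemas.join('\n');
  return '$instruction\n$schemas';
}
```

Calling `sketchToolsPrompt(schemas, ToolChoice.none)` yields an empty string, which matches the commit's note that `none` skips system prompt injection.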
- NPU: pass nativeLibraryDir to Backend.NPU() for Qualcomm/MediaTek/Tensor
- Integration tests for 5 models: FunctionGemma, Gemma 3 1B, Qwen 2.5, DeepSeek R1, Gemma 3n E2B
- Tests cover: install, auto, required, none, streaming, parallel calls
- Verified parallel function calls on Qwen, DeepSeek, Gemma 3n (2 calls each)
…nse in example app
- Version 0.12.8 in pubspec.yaml, podspecs, CLAUDE.md
- CHANGELOG.md: add 0.12.8 section
- README.md: ToolChoice table, parallel calls example, L2-norm note
- Example app: handle ParallelFunctionCallResponse in chat_screen and gemma_input_field
- chat.dart: use parseAll() for end-of-stream parallel calls
…ffer detection
- Add ModelType.phi enum value and wire PhiFunctionCallFormat in factory
- Add toolChoice != ToolChoice.none guard to sync and streaming parsing paths
- Use FunctionCallParser.isFunctionCallStart() instead of hardcoded { / ``` check
- Increase _maxFunctionBufferLength from 150 to 1024 for verbose formats
- Fix mutual exclusion of _pendingFunctionCall/_pendingParallelCall in example app
- Add _pendingParallelCall handling in onError callback
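The buffering behaviour these bullets describe (skip parsing for `ToolChoice.none`, ask the format whether a function call is starting, give up after 1024 characters) might look something like the sketch below. The `StreamBuffer` class and its members are hypothetical and only approximate what chat.dart does; the 1024 limit is the value named in the commit.

```dart
/// Limit taken from the commit message (previously 150).
const maxFunctionBufferLength = 1024;

/// Hypothetical helper: holds back streamed tokens while the output might
/// still turn out to be a function call.
class StreamBuffer {
  StreamBuffer({required this.toolsEnabled, required this.looksLikeCallStart});

  /// False when ToolChoice.none is in effect, so nothing is ever buffered.
  final bool toolsEnabled;

  /// Stand-in for a check such as FunctionCallParser.isFunctionCallStart().
  final bool Function(String) looksLikeCallStart;

  final _buffer = StringBuffer();
  bool _buffering = false;

  /// Text currently held back from the user.
  String get pending => _buffer.toString();

  /// Returns the text that can be shown now ('' while buffering).
  String add(String token) {
    if (!toolsEnabled) return token;
    if (!_buffering) {
      if (!looksLikeCallStart(token)) return token;
      _buffering = true;
    }
    _buffer.write(token);
    if (_buffer.length <= maxFunctionBufferLength) return '';
    // Too long to be a function call: release it as plain text.
    return flush();
  }

  /// Empties the buffer and returns what it held.
  String flush() {
    final text = _buffer.toString();
    _buffer.clear();
    _buffering = false;
    return text;
  }
}
```

A larger limit matters for verbose formats, where the opening tag and JSON payload can easily exceed 150 characters before the call is complete.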
…tions, history
- C1: Mid-stream buffer uses parseAll() instead of parse() for parallel calls
- C2: DeepSeek regex uses [\s\S]*? to cross newlines between tokens
- C3: Zero-argument functions return empty args map instead of null
- C4: Streaming history records Message.toolCall() for function calls
- C5: emittedFunctionCall flag moved before stream loop, set mid-stream
- I1: addQueryChunk() skips tool prompt injection for ToolChoice.none
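Fixes C2 and C3 can be illustrated with a small sketch. In Dart, `.` does not match line terminators unless `dotAll` is enabled, so a payload that spans several lines needs `[\s\S]*?` (or `dotAll: true`). The `<BEGIN>`/`<END>` delimiters below are placeholders rather than the real DeepSeek tokens, and `extractPayload` and `normalizeArgs` are hypothetical helper names.

```dart
// '.' stops at newlines, so this misses multi-line payloads.
final dotPattern = RegExp(r'<BEGIN>(.*?)<END>');

// [\s\S] matches any character, including line terminators.
final crossLinePattern = RegExp(r'<BEGIN>([\s\S]*?)<END>');

/// Returns the trimmed text between the delimiters, or null if no match.
String? extractPayload(String output, RegExp pattern) =>
    pattern.firstMatch(output)?.group(1)?.trim();

/// C3: a missing or non-map argument value becomes an empty map, not null.
Map<String, dynamic> normalizeArgs(Object? raw) =>
    raw is Map ? Map<String, dynamic>.from(raw) : <String, dynamic>{};
```

Returning an empty map keeps callers from needing a null check before iterating over arguments of a zero-argument function.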
… TFLite DLL copy on Windows/Linux
- Switch all embedding tests from fromNetwork to fromAsset (models already in assets)
- Switch inference/tool calling tests from fromNetwork to fromFile (models pushed via adb)
- Add prepare_test_models.sh script to push models to device via adb
- Fix Windows CMakeLists: add POST_BUILD copy for tensorflowlite_c.dll (#200)
- Fix Linux CMakeLists: add install rule for TFLite C library (#200)
- Remove networkUrl from tool_calling_test Gemma 3n E2B config
- Add .litertlm to example .gitignore
Summary
- ToolChoice (auto/required/none): control tool calling behavior in createChat()
- Parallel tool calls: ParallelFunctionCallResponse for multiple function calls in one response, parseAll() extraction
- Strategy pattern parser: FunctionCallFormat implementations (Gemma, Qwen, DeepSeek, Llama, Phi, FunctionGemma) with FunctionCallFormatFactory routing
- <tool_call> format: Qwen/Mistral-style function call parsing
- NPU fix: pass nativeLibraryDir to LiteRT-LM Backend.NPU()
- Example app: handle ParallelFunctionCallResponse in chat_screen and gemma_input_field