
Conversation

@yuerqiqi
Contributor

@yuerqiqi yuerqiqi commented Oct 29, 2025

This PR implements core functionality for mllm-cli, including Go bindings for the C/C++ backend and an OpenAI API-compatible client-server architecture for interacting with the Qwen3 model. Corresponding documentation for the CLI client and its services has also been added and updated.

@coderabbitai

coderabbitai bot commented Oct 29, 2025

Walkthrough

This pull request extends the MLLM framework with a complete Go-based CLI client-server infrastructure, new C API session management capabilities, Go bindings to the C SDK, and Android build configurations. It enables the C SDK binding by default and introduces HTTP-based multi-turn chat functionality with streaming responses.

Changes

Cohort / File(s) Summary
C API Extensions
mllm/c_api/Object.h, mllm/c_api/Runtime.h, mllm/c_api/Runtime.cpp
Introduces new C API surface for session management (startService, stopService, createQwen3Session, insertSession, freeSession) and request/response handling (sendRequest, pollResponse). Adds kCustomObject enum value and v_custom_ptr union member to support custom session objects. Implements session lifecycle wrapper and asynchronous response polling.
Build Configuration
CMakeLists.txt
Enables MLLM C SDK binding by default by changing MLLM_BUILD_SDK_C_BINDING option from OFF to ON.
Go SDK Bindings
mllm-cli/mllm/c.go
Exposes Go API wrapping C SDK functions with Session type, lifecycle management (StartService, StopService, NewSession), and request-response handling (SendRequest, PollResponse). Includes finalizers for automatic resource cleanup and C helper wrappers for custom pointer management.
Go Service Layer
mllm-cli/pkg/mllm/service.go
Introduces thread-safe Service component for session registration and retrieval with mutex synchronization. Provides NewService, RegisterSession, GetSession, and graceful Shutdown.
Go HTTP Server
mllm-cli/pkg/server/server.go, mllm-cli/pkg/server/handlers.go
Implements HTTP server wrapper with /v1/chat/completions endpoint. Includes chatCompletionsHandler that decodes OpenAI-like requests, manages sessions, polls responses, and streams server-sent events (SSE) with UUID request tracking.
Go API Types
mllm-cli/pkg/api/types.go
Defines OpenAI-compatible request/response types: RequestMessage, OpenAIRequest (with session ID support), ResponseDelta, ResponseChoice, and OpenAIResponseChunk for streaming SSE protocol.
Go CLI Client
mllm-cli/cmd/mllm-client/main.go
Interactive CLI client maintaining multi-turn chat history with streaming support. Communicates with server via HTTP, handles X-Session-ID header updates, supports /exit and /quit commands, and progressively prints streamed assistant responses.
Go CLI Server
mllm-cli/cmd/mllm-server/main.go
Server entrypoint accepting required --model-path flag. Initializes MLLM service, creates and registers session, instantiates HTTP server on port 8080, and performs graceful shutdown on SIGINT/SIGTERM.
Go Dependencies
mllm-cli/go.mod
Updates Go toolchain to 1.22.2, restructures dependencies under require block with bubbles v0.21.0 and gorilla/websocket v1.5.3, and adds indirect google/uuid v1.6.0.
Build Tasks & Configuration
task.py, tasks/build_android_debug.yaml, tasks/build_android_go_dialog_test.yaml, tasks/build_android_mllm_client.yaml, tasks/build_android_mllm_server.yaml
Introduces ShellCommandTask class for executing shell commands via task framework. Adds Android NDK cross-compilation configurations for debug builds and Go binaries (dialog test, WebSocket client, web server) targeting arm64 with mobile tags and CGO linking.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant Client as CLI Client
    participant Server as HTTP Server
    participant Service as MLLM Service
    participant CApi as C API Layer

    User->>Client: /exit or chat input
    Client->>Client: Validate input
    alt Client exit command
        Client->>Client: Clear & exit
    else User message
        Client->>Client: Add to history
        Client->>Server: POST /v1/chat/completions<br/>(messages, session_id, stream=true)
        activate Server
        Server->>Service: GetSession(session_id)
        activate Service
        Service-->>Server: *Session
        deactivate Service
        Server->>Server: Generate request ID<br/>(UUID if missing)
        Server->>CApi: SendRequest(session_id, json)
        activate CApi
        CApi-->>Server: Request ID
        deactivate CApi
        
        Server->>Client: Set SSE headers<br/>Start streaming
        
        loop Poll until completion
            Server->>CApi: PollResponse(session_id)
            CApi-->>Server: JSON response chunk
            Server->>Server: Parse chunk<br/>Check finish_reason
            Server->>Client: Stream data line<br/>(SSE format)
            Client->>Client: Accumulate content
            Client->>User: Print streamed text
            
            alt Finish reason = "stop"
                Server->>Server: Mark complete
            else Empty response
                Server->>Server: Idle, retry
            end
        end
        
        Server->>Client: [DONE]
        deactivate Server
        Client->>Client: Update history<br/>with full response
    end
sequenceDiagram
    participant main as main()
    participant Service as StartService()
    participant Session as NewSession()
    participant CApi as C API Layer
    participant HTTP as HTTP Server

    main->>main: Parse --model-path flag
    main->>main: Check required args
    
    main->>Service: StartService(workers)
    activate Service
    Service->>CApi: startService()
    CApi-->>Service: MllmCAny result
    Service-->>main: bool (success)
    deactivate Service
    
    main->>Session: NewSession(model_path)
    activate Session
    Session->>CApi: createQwen3Session(model_path)
    CApi-->>Session: MllmCAny session handle
    Session->>Session: Set finalizer for cleanup
    Session-->>main: *Session, error
    deactivate Session
    
    main->>main: Insert(session_id)
    main->>main: RegisterSession(id, session)
    
    main->>HTTP: NewServer(":8080", service)
    activate HTTP
    HTTP->>HTTP: Register /v1/chat/completions
    HTTP-->>main: *Server
    deactivate HTTP
    
    main->>HTTP: Start()
    HTTP->>HTTP: Listen & accept connections
    
    main->>main: Wait for SIGINT/SIGTERM
    
    main->>HTTP: Shutdown(ctx)
    HTTP->>HTTP: Graceful server close
    
    main->>Service: Shutdown()
    Service->>Session: Close()
    Session->>CApi: freeSession()
    Service->>Service: Clear sessions map
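The server lifecycle in this diagram (start the HTTP server in a goroutine, wait for a stop signal, shut down gracefully) follows a standard Go pattern. A self-contained sketch, with a plain channel standing in for the SIGINT/SIGTERM notification and an illustrative handler path; none of this is the repo's exact code:

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// run wires the lifecycle from the diagram: serve in a goroutine,
// block until stop fires, then shut down with a bounded timeout.
func run(addr string, stop <-chan struct{}) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok") // placeholder for the real streaming handler
	})
	srv := &http.Server{Addr: addr, Handler: mux}

	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case <-stop: // in main() this would come from signal.Notify(SIGINT/SIGTERM)
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		return srv.Shutdown(ctx) // drains in-flight requests, then closes
	case err := <-errCh:
		return err // ListenAndServe failed before a stop was requested
	}
}

func main() {
	stop := make(chan struct{})
	go func() { time.Sleep(100 * time.Millisecond); close(stop) }()
	if err := run(":0", stop); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("shut down cleanly")
}
```

The ordering matters: the HTTP server drains first, then the MLLM service (and its sessions) can be torn down safely, matching the diagram's final steps.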

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • C API layer additions require careful review of memory management, session wrapper lifecycle, and new public function signatures (startService, stopService, session management, request/response handling)
  • Go/C FFI bindings need validation of finalizer correctness, cHandle management, and error translation patterns
  • HTTP server and streaming handler demand attention to SSE protocol correctness, session resolution logic, and UUID generation/tracking
  • Multi-turn client state management should be verified for correct session persistence and error recovery (message rollback on failure)
  • Thread-safety in Service requires mutex usage validation
  • Build infrastructure (multiple Android configurations) involves verifying cross-compilation flags and CGO setup across different targets

Areas requiring extra attention:

  • Memory leaks in C session lifecycle (allocation via createQwen3Session vs. deallocation via freeSession/finalizer)
  • Go finalizer behavior with shared C pointers in concurrent scenarios
  • SSE streaming completion detection and [DONE] marker correctness
  • Session ID generation and header synchronization between client and server
  • Error paths in session lookup and request forwarding

Poem

🐰 Hop and bind, a bridge we weave,
From Go to C, we interleave,
Sessions stream on SSE's flow,
Multi-turn chats steal the show!
Android builds now cross compile,
MLLM's CLI with style!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 7.14%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed — The PR title 'feat: Add mllm-cli support for Qwen3 and update docs' clearly relates to the main changes: extensive MLLM CLI infrastructure (client/server), Qwen3 session support in the C API, and build configurations. It captures the primary intent of adding CLI support and Qwen3 integration.
  • Description check ✅ Passed — Check skipped because CodeRabbit's high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@chenghuaWang
Collaborator

@coderabbitai review

@coderabbitai

coderabbitai bot commented Oct 29, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (23)
tasks/build_android_mllm_server.yaml (1)

11-13: Consider parameterizing the NDK path.

The hardcoded NDK path /opt/ndk/android-ndk-r28b reduces portability. Since the script already exports ANDROID_NDK_HOME, let a value already set in the environment take precedence instead of overwriting it.

-      export ANDROID_NDK_HOME=/opt/ndk/android-ndk-r28b
+      export ANDROID_NDK_HOME=${ANDROID_NDK_HOME:-/opt/ndk/android-ndk-r28b}
tasks/build_android_mllm_client.yaml (1)

11-13: Consider parameterizing the NDK path.

Same recommendation as the server build: allow ANDROID_NDK_HOME to be overridden from the environment.

tasks/build_android_go_dialog_test.yaml (1)

12-14: Consider parameterizing the NDK path.

Allow ANDROID_NDK_HOME to be overridden from the environment for better portability.

mllm/c_api/Runtime.cpp (3)

35-41: Add braces around single-statement if blocks.

For consistency and safety, wrap single-statement if blocks in braces.

 int32_t isOk(MllmCAny ret) {
-  if (ret.type_id == kRetCode && ret.v_return_code == 0)
+  if (ret.type_id == kRetCode && ret.v_return_code == 0) {
       return true;
-  if (ret.type_id == kCustomObject && ret.v_custom_ptr != nullptr)
+  }
+  if (ret.type_id == kCustomObject && ret.v_custom_ptr != nullptr) {
       return true;
+  }
   return false;
 }

165-166: Consider safer string copy alternatives.

strncpy doesn't guarantee null-termination if the source is too long. Since response.length() + 1 bytes are allocated, consider using strcpy or memcpy followed by explicit null-termination.

-    char* c_response = new char[response.length() + 1];
-    strncpy(c_response, response.c_str(), response.length() + 1);
+    char* c_response = new char[response.length() + 1];
+    std::memcpy(c_response, response.c_str(), response.length());
+    c_response[response.length()] = '\0';

172-173: Simplify null pointer check before delete.

The null check is unnecessary since delete nullptr is safe in C++.

 void freeResponseString(const char* response_str) {
-    if (response_str != nullptr) {
-        delete[] response_str;
-    }
+    delete[] response_str;
 }
tasks/build_android_debug.yaml (1)

12-12: Consider using a relative or configurable install prefix.

The hard-coded /root/mllm-install-android-arm64-v8a path assumes a specific user directory and won't work in environments where builds run under different users or in CI systems with different home directories.

Consider using a relative path or an environment variable:

-        - "-DCMAKE_INSTALL_PREFIX=/root/mllm-install-android-arm64-v8a"
+        - "-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX:-./install-android-arm64-v8a}"
mllm-cli/pkg/server/handlers.go (4)

23-27: Consider logging the decode error for debugging.

When JSON decoding fails, the actual error is not logged, making it harder to diagnose malformed requests during troubleshooting.

Apply this diff:

 		var requestPayload map[string]interface{}
 		if err := json.NewDecoder(r.Body).Decode(&requestPayload); err != nil {
+			log.Printf("ERROR: Failed to decode request body: %v", err)
 			http.Error(w, "Invalid request body", http.StatusBadRequest)
 			return
 		}

52-55: Consider logging the reason for SendRequest failure.

When session.SendRequest returns false, no additional context is provided about why the request failed. This makes debugging difficult when requests are rejected by the underlying session.

If the underlying SendRequest implementation can provide error details, consider enhancing the API to return an error instead of just a boolean. Otherwise, add internal logging within SendRequest to capture failure reasons.


60-60: Add safety check for Flusher type assertion.

The type assertion w.(http.Flusher) doesn't check the ok return value. While most HTTP ResponseWriters support Flusher, it's safer to verify.

Apply this diff:

-		flusher, _ := w.(http.Flusher)
+		flusher, ok := w.(http.Flusher)
+		if !ok {
+			http.Error(w, "Streaming not supported", http.StatusInternalServerError)
+			return
+		}

63-90: LGTM with one consideration on polling interval.

The streaming logic correctly handles client disconnects, end markers, and empty responses. The implementation properly filters out empty deltas before sending to avoid unnecessary SSE messages.

One consideration: the 10ms polling interval (line 84) creates a relatively tight loop. Depending on the underlying model's response generation speed, you might consider making this configurable or slightly increasing it to reduce CPU usage during idle periods.

If desired, consider making the polling interval configurable:

+const responsePollingInterval = 50 * time.Millisecond
+
 		for {
 			// ... existing code ...
 			} else {
-				time.Sleep(10 * time.Millisecond)
+				time.Sleep(responsePollingInterval)
 			}
 		}
mllm-cli/cmd/mllm-client/main.go (3)

19-19: Consider making the server URL configurable.

The server URL is hard-coded to http://localhost:8080/v1/chat/completions, which limits flexibility for connecting to different servers or ports.

Make the URL configurable via a command-line flag:

+import "flag"
+
 func main() {
-	serverURL := "http://localhost:8080/v1/chat/completions"
+	serverURL := flag.String("server-url", "http://localhost:8080/v1/chat/completions", "Server URL for chat completions")
+	flag.Parse()
 	var history []api.RequestMessage

Then use *serverURL throughout the code.


30-30: Handle potential error from ReadString.

The error from reader.ReadString('\n') is silently ignored, which could mask I/O issues.

Apply this diff:

-		userInput, _ := reader.ReadString('\n')
+		userInput, err := reader.ReadString('\n')
+		if err != nil {
+			log.Printf("ERROR: Failed to read input: %v", err)
+			continue
+		}

76-82: Consider logging JSON unmarshal errors.

When json.Unmarshal fails to parse a chunk, the error is silently ignored. This could make debugging issues with malformed server responses more difficult.

Apply this diff:

 				var chunk api.OpenAIResponseChunk
-				if json.Unmarshal([]byte(jsonData), &chunk) == nil && len(chunk.Choices) > 0 {
+				if err := json.Unmarshal([]byte(jsonData), &chunk); err != nil {
+					log.Printf("WARN: Failed to unmarshal chunk: %v", err)
+				} else if len(chunk.Choices) > 0 {
 					content := chunk.Choices[0].Delta.Content
 					fmt.Print(content)
 					fullResponse.WriteString(content)
 				}
mllm-cli/pkg/server/server.go (1)

17-30: Consider adding nil check for mllmService parameter.

The NewServer constructor doesn't validate that mllmService is non-nil, which could lead to a panic when the handler tries to access it.

Add a nil check:

 func NewServer(addr string, mllmService *mllm.Service) *Server {
+	if mllmService == nil {
+		log.Fatal("FATAL: mllmService cannot be nil")
+	}
 	mux := http.NewServeMux()
mllm-cli/pkg/mllm/service.go (1)

23-27: Consider logging when overwriting an existing session.

The RegisterSession method allows overwriting an existing session without warning. This could mask bugs where the same session ID is registered multiple times.

Add logging when overwriting:

 func (s *Service) RegisterSession(id string, session *mllm.Session) {
 	s.mutex.Lock()
 	defer s.mutex.Unlock()
+	if _, exists := s.sessions[id]; exists {
+		log.Printf("WARN: Overwriting existing session with ID: %s", id)
+	}
 	s.sessions[id] = session
 }
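The registry described here is small enough to sketch in full. A minimal, hedged version (names follow the walkthrough, not necessarily the exact file; an RWMutex is used so lookups on the hot request path don't serialize behind writers):

```go
package main

import (
	"fmt"
	"sync"
)

// Session stands in for the cgo-backed session type.
type Session struct{ id string }

// Service guards the id -> session map; reads vastly outnumber
// writes here, so an RWMutex is a natural fit.
type Service struct {
	mu       sync.RWMutex
	sessions map[string]*Session
}

func NewService() *Service {
	return &Service{sessions: make(map[string]*Session)}
}

func (s *Service) RegisterSession(id string, sess *Session) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.sessions[id]; exists {
		fmt.Printf("WARN: overwriting existing session %q\n", id)
	}
	s.sessions[id] = sess
}

func (s *Service) GetSession(id string) (*Session, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	sess, ok := s.sessions[id]
	return sess, ok
}

func main() {
	svc := NewService()
	svc.RegisterSession("abc", &Session{id: "abc"})
	if sess, ok := svc.GetSession("abc"); ok {
		fmt.Println("found", sess.id)
	}
}
```

Returning the `(session, ok)` pair lets the handler distinguish "unknown session ID" (a 4xx) from other failures, which ties into the error-path concerns noted elsewhere in this review.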
mllm/c_api/Runtime.h (1)

51-53: Clarify memory management contract for pollResponse.

The pairing of pollResponse() returning a const char* with a separate freeResponseString() function creates potential for memory leaks if callers forget to free. Consider if the API could be made safer.

Options to consider:

  1. Document the memory contract very clearly (which the documentation comment above addresses).
  2. Consider returning a status code with an out-parameter for the string, making memory ownership more explicit.
  3. Use a buffer provided by the caller to avoid heap allocation.

For now, ensure that all Go wrapper code and examples demonstrate proper usage with defer to free responses immediately after use.
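The "copy into Go memory, free the C buffer before returning" discipline can be shown without cgo. A pure-Go stand-in, where `fakePoll`/`fakeFree` are hypothetical placeholders for `pollResponse`/`freeResponseString` so the pattern is runnable anywhere:

```go
package main

import "fmt"

// fakePoll/fakeFree stand in for the C allocation pair; freed
// tracks that every allocation is released exactly once.
var freed int

func fakePoll() *string  { s := "chunk"; return &s }
func fakeFree(p *string) { freed++ }

// pollOnce copies the response into Go memory and frees the
// C-side buffer before returning, so ownership never escapes.
func pollOnce() string {
	p := fakePoll()
	if p == nil {
		return "" // nothing pending
	}
	defer fakeFree(p) // mirrors: defer C.freeResponseString(ptr)
	return *p         // mirrors: C.GoString(ptr)
}

func main() {
	out := pollOnce()
	fmt.Println(out, freed)
}
```

The key property is that the `defer` sits immediately after the nil check, so every non-nil return path frees the buffer and callers only ever see a Go-owned string.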

mllm-cli/cmd/mllm-server/main.go (1)

30-32: Consider making log level and worker threads configurable.

The log level (2) and worker thread count (4) are hard-coded. Making these configurable via flags would improve flexibility for different deployment environments.

Add command-line flags:

 func main() {
 	modelPath := flag.String("model-path", "", "Path to the MLLM model directory.")
+	logLevel := flag.Int("log-level", 2, "Log level (0=OFF, 1=ERROR, 2=INFO, 3=DEBUG)")
+	workerThreads := flag.Int("workers", 4, "Number of worker threads")
 	flag.Parse()
 
 	if *modelPath == "" {
 		log.Fatal("FATAL: --model-path argument is required.")
 	}
 
 	if !mllm.InitializeContext() {
 		log.Fatal("FATAL: InitializeContext failed!")
 	}
-	mllm.SetLogLevel(2)
-	if !mllm.StartService(4) {
+	mllm.SetLogLevel(*logLevel)
+	if !mllm.StartService(*workerThreads) {
 		log.Fatal("FATAL: StartService failed!")
 	}
mllm-cli/pkg/api/types.go (1)

14-16: Consider English comments and clarify EnableThinking vs Thinking fields.

Two observations:

  1. The comment on line 15 is in Chinese, which may not be accessible to all contributors. Consider using English for consistency.
  2. The EnableThinking and Thinking fields appear to serve similar purposes. Having both could lead to confusion about which field clients should use.

Clarify the distinction or consolidate:

 type OpenAIRequest struct {
 	Model          string           `json:"model"`
 	Messages       []RequestMessage `json:"messages"`
 	Stream         bool             `json:"stream"`
-	EnableThinking bool             `json:"enable_thinking,omitempty"` 
-	Thinking       bool             `json:"thinking,omitempty"`       // <-- 新增此行,用于接收客户端可能发送的 "thinking": true
+	// EnableThinking enables the model's reasoning/thinking mode
+	EnableThinking bool             `json:"enable_thinking,omitempty"` 
 	SessionID      string           `json:"session_id,omitempty"`     
 }

If both fields are genuinely needed, document why they're distinct and when each should be used.

mllm-cli/mllm/c.go (4)

8-8: Link portability: consider rpath/build tags to avoid runtime loader issues.

Linking against -lMllmSdkC -lMllmRT -lMllmCPUBackend assumes system search paths are configured. For CLI distribution, consider:

  • Adding an rpath to the binary (e.g., -Wl,-rpath,$ORIGIN or a project-specific lib dir).
  • Using build tags/conditional LDFLAGS per OS/arch if backends differ.
  • Documenting LD_LIBRARY_PATH/DYLD_LIBRARY_PATH requirements.

59-74: Finalizer is a safety net, not a guarantee; consider reducing side effects and improving error context.

  • Avoid relying on the finalizer for correctness; explicit Close should be the primary path.
  • Printing in finalizers can surprise users; prefer a debug logger or remove the print.
  • The error message should include modelPath and be English/localized consistently; include C-side error if available.

Apply this diff:

-    handle := C.createQwen3Session(cModelPath)
-    if !isOk(handle) {
-        return nil, fmt.Errorf("底层C API createQwen3Session 失败")
-    }
+    handle := C.createQwen3Session(cModelPath)
+    if !isOk(handle) {
+        return nil, fmt.Errorf("createQwen3Session failed for model %q", modelPath)
+    }
@@
-    runtime.SetFinalizer(s, func(s *Session) {
-        fmt.Println("[Go Finalizer] Mllm Session automatically released.") 
-        C.freeSession(s.cHandle)
-    })
+    runtime.SetFinalizer(s, func(s *Session) {
+        // Finalizer is best-effort; prefer explicit Close().
+        C.freeSession(s.cHandle)
+    })

55-57: Surface SetLogLevel errors or document valid range.

If the C API validates levels, consider returning bool/error or documenting accepted range to help callers detect misconfiguration.

Would you like me to add a thin wrapper returning bool and adjust callers?


95-99: Optional: validate jsonRequest early.

If empty strings are invalid, reject early for clearer errors; if JSON must be well-formed, consider basic validation before crossing the cgo boundary.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f8eec9a and e4b2497.

📒 Files selected for processing (17)
  • CMakeLists.txt (1 hunks)
  • mllm-cli/cmd/mllm-client/main.go (1 hunks)
  • mllm-cli/cmd/mllm-server/main.go (1 hunks)
  • mllm-cli/go.mod (2 hunks)
  • mllm-cli/mllm/c.go (2 hunks)
  • mllm-cli/pkg/api/types.go (1 hunks)
  • mllm-cli/pkg/mllm/service.go (1 hunks)
  • mllm-cli/pkg/server/handlers.go (1 hunks)
  • mllm-cli/pkg/server/server.go (1 hunks)
  • mllm/c_api/Object.h (1 hunks)
  • mllm/c_api/Runtime.cpp (2 hunks)
  • mllm/c_api/Runtime.h (1 hunks)
  • task.py (2 hunks)
  • tasks/build_android_debug.yaml (1 hunks)
  • tasks/build_android_go_dialog_test.yaml (1 hunks)
  • tasks/build_android_mllm_client.yaml (1 hunks)
  • tasks/build_android_mllm_server.yaml (1 hunks)
🧰 Additional context used
🪛 Clang (14.0.6)
mllm/c_api/Runtime.cpp

[error] 11-11: inclusion of deprecated C++ header 'string.h'; consider using 'cstring' instead

(modernize-deprecated-headers,-warnings-as-errors)


[error] 36-36: statement should be inside braces

(google-readability-braces-around-statements,readability-braces-around-statements,-warnings-as-errors)


[error] 37-37: implicit conversion bool -> 'int'

(readability-implicit-bool-conversion,-warnings-as-errors)


[error] 38-38: statement should be inside braces

(google-readability-braces-around-statements,readability-braces-around-statements,-warnings-as-errors)


[error] 39-39: implicit conversion bool -> 'int'

(readability-implicit-bool-conversion,-warnings-as-errors)


[error] 40-40: implicit conversion bool -> 'int'

(readability-implicit-bool-conversion,-warnings-as-errors)


[error] 66-66: variable 'worker_threads' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 88-88: initializing non-owner 'MllmSessionWrapper *' with a newly created 'gsl::owner<>'

(cppcoreguidelines-owning-memory,-warnings-as-errors)


[error] 104-104: variable 'session_wrapper' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 115-115: variable 'session_wrapper' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 126-126: variable 'status' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 135-135: variable 'request_id' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 136-136: variable 'response' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 144-144: variable 'j' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 144-144: variable name 'j' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 165-165: initializing non-owner 'char *' with a newly created 'gsl::owner<>'

(cppcoreguidelines-owning-memory,-warnings-as-errors)


[error] 172-172: 'if' statement is unnecessary; deleting null pointer has no effect

(readability-delete-null-pointer,-warnings-as-errors)


[error] 173-173: deleting a pointer through a type that is not marked 'gsl::owner<>'; consider using a smart pointer instead

(cppcoreguidelines-owning-memory,-warnings-as-errors)

🪛 Ruff (0.14.2)
task.py

526-526: Starting a process with a shell, possible injection detected

(S605)

🔇 Additional comments (17)
mllm/c_api/Object.h (1)

40-41: LGTM!

The addition of kCustomObject enum value and corresponding v_custom_ptr union member appropriately extends the C API to support custom object pointers, aligning with the new session management functionality.

task.py (1)

515-526: Verify the security model for shell command execution.

The use of os.system(command_str) executes shell commands directly, which static analysis flags as a potential injection risk. While this appears acceptable since commands come from committed YAML config files rather than user input, consider whether subprocess.run() with shell=False would be more appropriate for better security posture.

If untrusted input could ever reach these YAML files, apply this refactor:

+import subprocess
+
 class ShellCommandTask(Task):
     def __init__(self, config):
         super().__init__(config)
 
     def run(self):
         logging.info("Generic Shell Command Task Start (re-enabled)...")
         command_str = self.config.get("command", "")
         if not command_str:
             logging.error("No command provided in ShellCommandTask.")
             return
 
-        throw_error_if_failed(os.system(command_str))
+        result = subprocess.run(command_str, shell=True, check=False)
+        throw_error_if_failed(result.returncode)

Note: This maintains current behavior while making the shell invocation explicit.

mllm/c_api/Runtime.cpp (3)

79-96: Verify memory management strategy for session wrappers.

The raw new allocation at line 88 creates an owning pointer returned via the C API. Ensure that:

  1. All code paths properly call freeSession to avoid leaks
  2. Double-free scenarios are prevented
  3. The Go bindings correctly manage the lifecycle

Consider documenting the ownership transfer in a comment:

// Allocate wrapper on heap; ownership transfers to caller
// Caller MUST call freeSession to deallocate
auto* handle = new MllmSessionWrapper();

130-169: Review response buffer management and completion detection.

The function allocates a C-string buffer that must be freed by the caller. Verify:

  1. All callers properly call freeResponseString
  2. The finish_reason == "stop" detection correctly handles all completion scenarios
  3. Consider returning an empty string or specific sentinel instead of nullptr for finished streams to distinguish from errors

65-77: LGTM!

The service lifecycle functions correctly wrap the underlying C++ service API and provide appropriate error codes.

mllm-cli/go.mod (2)

5-8: LGTM!

The addition of gorilla/websocket and google/uuid dependencies aligns with the new client-server functionality and streaming support introduced in this PR.


3-3: Ignore this review comment; it contains factually incorrect information.

Go 1.25 was released on August 12, 2025, so Go 1.25.0 does exist. The review comment's claim that "Go 1.25.0 doesn't exist" is incorrect. The go.mod file specifies version 1.22.2, which is a valid stable Go version, and go mod verify confirms all modules are valid. No action is required.

Likely an incorrect or invalid review comment.

CMakeLists.txt (1)

22-22: Verify enabling C SDK binding by default is appropriate for all build configurations.

Changing MLLM_BUILD_SDK_C_BINDING from OFF to ON enables the C API bindings by default for all builds. Ensure:

  1. All required dependencies for the C bindings are commonly available
  2. This doesn't break existing build workflows or CI pipelines
  3. The change is documented in release notes or migration guides

Users who want the previous behavior can override with:

cmake -DMLLM_BUILD_SDK_C_BINDING=OFF ...
mllm-cli/cmd/mllm-client/main.go (2)

47-60: LGTM on response handling.

The HTTP response handling correctly checks for errors before accessing the response, and properly defers the body close. The status code validation and history rollback on errors are well-implemented.


47-53: Fix potential nil pointer dereference on response.Body.

The defer resp.Body.Close() is called before checking if err != nil, which means resp could be nil and cause a panic.

Apply this diff:

 		resp, err := http.DefaultClient.Do(req)
 		if err != nil {
 			log.Printf("ERROR: Request failed: %v", err)
 			history = history[:len(history)-1]
 			continue
 		}
 		defer resp.Body.Close()

Move the defer statement after the error check:

 		resp, err := http.DefaultClient.Do(req)
 		if err != nil {
 			log.Printf("ERROR: Request failed: %v", err)
 			history = history[:len(history)-1]
 			continue
 		}
+		defer resp.Body.Close()
-		defer resp.Body.Close()


Likely an incorrect or invalid review comment.

mllm-cli/pkg/server/server.go (1)

32-42: LGTM!

The Start and Shutdown methods follow standard Go HTTP server patterns. Error handling properly distinguishes http.ErrServerClosed (expected during shutdown) from actual failures, and the graceful shutdown implementation correctly propagates the context.

mllm-cli/pkg/mllm/service.go (2)

12-21: LGTM!

The Service structure and constructor are clean and correctly initialized. The mutex is appropriately included to protect concurrent access to the sessions map.


41-52: LGTM!

The Shutdown method correctly locks the mutex, iterates through all sessions to close them, and reinitializes the map. This ensures clean resource cleanup during server shutdown.

mllm-cli/cmd/mllm-server/main.go (2)

53-55: LGTM!

Starting the HTTP server in a goroutine is the correct approach for graceful shutdown, allowing the main goroutine to wait for shutdown signals.


57-72: LGTM!

The graceful shutdown implementation correctly handles SIGINT and SIGTERM signals, shuts down the HTTP server with a reasonable timeout, and ensures the mllm service cleanup runs even if the HTTP shutdown fails. The shutdown ordering (HTTP server → mllm service) is appropriate.

mllm-cli/pkg/api/types.go (1)

19-33: LGTM!

The response types (ResponseDelta, ResponseChoice, OpenAIResponseChunk) are well-structured to represent streaming SSE chunks in the OpenAI format. The nesting and JSON tags are appropriate.

mllm-cli/mllm/c.go (1)

8-8: Double‑free safety recap.

You correctly clear the finalizer in Close and null the handle after free. However, because the finalizer path doesn’t null the handle (object is dying anyway), it’s crucial that no other goroutine can reach s.cHandle concurrently. The locking change above ensures that.

Please confirm no other packages hold aliases to s.cHandle or call C.freeSession directly.

Also applies to: 13-20, 68-71, 78-81

Comment on lines +42 to +44
requestBody, _ := json.Marshal(apiRequest)

req, _ := http.NewRequest("POST", serverURL, bytes.NewBuffer(requestBody))

⚠️ Potential issue | 🟠 Major

Handle errors from json.Marshal and http.NewRequest.

Both json.Marshal and http.NewRequest can return errors, but they are silently ignored. This could lead to nil pointer dereferences or invalid requests.

Apply this diff:

-		requestBody, _ := json.Marshal(apiRequest)
+		requestBody, err := json.Marshal(apiRequest)
+		if err != nil {
+			log.Printf("ERROR: Failed to marshal request: %v", err)
+			history = history[:len(history)-1]
+			continue
+		}
 
-		req, _ := http.NewRequest("POST", serverURL, bytes.NewBuffer(requestBody))
+		req, err := http.NewRequest("POST", serverURL, bytes.NewBuffer(requestBody))
+		if err != nil {
+			log.Printf("ERROR: Failed to create request: %v", err)
+			history = history[:len(history)-1]
+			continue
+		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-		requestBody, _ := json.Marshal(apiRequest)
-		req, _ := http.NewRequest("POST", serverURL, bytes.NewBuffer(requestBody))
+		requestBody, err := json.Marshal(apiRequest)
+		if err != nil {
+			log.Printf("ERROR: Failed to marshal request: %v", err)
+			history = history[:len(history)-1]
+			continue
+		}
+		req, err := http.NewRequest("POST", serverURL, bytes.NewBuffer(requestBody))
+		if err != nil {
+			log.Printf("ERROR: Failed to create request: %v", err)
+			history = history[:len(history)-1]
+			continue
+		}
🤖 Prompt for AI Agents
In mllm-cli/cmd/mllm-client/main.go around lines 42 to 44, both json.Marshal and
http.NewRequest ignore returned errors; change the code to capture each error
into a variable, check it immediately, and handle failures (log or print a clear
error message and exit/return). Specifically, replace the blank-ignored error
for json.Marshal with "requestBody, err := json.Marshal(apiRequest)" and if err
!= nil handle it, then create the request with "req, err :=
http.NewRequest(...)" and check err != nil and handle it before using req.

Comment on lines +45 to +51
sessionID := filepath.Base(*modelPath)
if !session.Insert(sessionID) {
session.Close()
log.Fatalf("FATAL: Failed to insert session with ID '%s'", sessionID)
}
mllmService.RegisterSession(sessionID, session)
log.Printf("Session created and registered successfully with ID: %s", sessionID)

⚠️ Potential issue | 🟡 Minor

Session ID derivation may cause collisions with multiple models.

Using filepath.Base(*modelPath) as the session ID could cause collisions if multiple models have the same directory name in different paths. For example, /path1/model and /path2/model would both get session ID "model".

Consider one of these approaches:

  1. Use the full model path or a hash:
-	sessionID := filepath.Base(*modelPath)
+	// Use a hash of the full path to avoid collisions
+	sessionID := fmt.Sprintf("%x", md5.Sum([]byte(*modelPath)))[:16]
  1. Allow explicit session ID via flag:
+	sessionID := flag.String("session-id", "", "Session ID (defaults to model directory name)")
 	flag.Parse()
 	// ...
-	sessionID := filepath.Base(*modelPath)
+	sid := *sessionID
+	if sid == "" {
+		sid = filepath.Base(*modelPath)
+	}

The second approach provides flexibility while maintaining the current default behavior.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In mllm-cli/cmd/mllm-server/main.go around lines 45 to 51, deriving the
sessionID with filepath.Base(*modelPath) can collide for models with identical
directory names; change to accept an explicit session ID via a CLI flag and fall
back to a deterministic unique ID when not provided (for example: use the model
path combined with a short hash or a hash-only of the full model path, or the
base name concatenated with the first 8 chars of a SHA256 hex of *modelPath) so
IDs are unique; implement: add a new flag (e.g., --session-id), if set use that
value (validate allowed chars/length), otherwise compute sessionID = baseName +
"-" + shortHash(*modelPath) (or full-hash), then use that sessionID for
session.Insert/registration and update log messages accordingly.

Comment on lines +13 to +20
static void* MllmCAny_get_v_custom_ptr(MllmCAny handle) {
return handle.v_custom_ptr;
}
static MllmCAny MllmCAny_set_v_custom_ptr_null(MllmCAny handle) {
handle.v_custom_ptr = NULL;
return handle;
}

⚠️ Potential issue | 🟠 Major

Avoid peeking into C struct internals; add stable C API instead.

Accessing MllmCAny.v_custom_ptr directly is ABI-fragile. If the C struct layout changes, this breaks at compile or (worse) runtime. Prefer an official C helper (e.g., mllmAnyIsValid, mllmAnyNull) exported by the C SDK, or provide a dedicated C API to query/free validity instead of field access from Go.

Comment on lines +28 to +31
type Session struct {
cHandle C.MllmCAny
sessionID string
}

⚠️ Potential issue | 🔴 Critical

Concurrent use-after-free risk; guard Session with locks and a closed flag.

Methods can race with Close(), leading to use-after-free of the C handle. Add an RWMutex and a closed flag; serialize Close vs. Insert/SendRequest/PollResponse/SessionID.

Apply this diff:

@@
-import "C"
-import "unsafe"
-import "fmt"
-import "runtime"
+import "C"
+import "unsafe"
+import "fmt"
+import "runtime"
+import "sync"
@@
 type Session struct {
     cHandle C.MllmCAny
     sessionID string
+    mu     sync.RWMutex
+    closed bool
 }
@@
-func (s *Session) Close() {
-    if C.MllmCAny_get_v_custom_ptr(s.cHandle) != nil {
-        fmt.Println("[Go Close] Mllm Session manually closed.") 
-        C.freeSession(s.cHandle)
-        s.cHandle = C.MllmCAny_set_v_custom_ptr_null(s.cHandle)
-        runtime.SetFinalizer(s, nil)
-    }
-}
+func (s *Session) Close() {
+    s.mu.Lock()
+    defer s.mu.Unlock()
+    if s.closed {
+        return
+    }
+    if C.MllmCAny_get_v_custom_ptr(s.cHandle) != nil {
+        fmt.Println("[Go Close] Mllm Session manually closed.")
+        C.freeSession(s.cHandle)
+        s.cHandle = C.MllmCAny_set_v_custom_ptr_null(s.cHandle)
+    }
+    s.closed = true
+    runtime.SetFinalizer(s, nil)
+}
@@
-func (s *Session) Insert(sessionID string) bool {
+func (s *Session) Insert(sessionID string) bool {
+    s.mu.Lock()
+    defer s.mu.Unlock()
+    if s.closed {
+        fmt.Println("[Go Insert] Error: session already closed.")
+        return false
+    }
     cSessionID := C.CString(sessionID)
     defer C.free(unsafe.Pointer(cSessionID))
     result := C.insertSession(cSessionID, s.cHandle)
     if isOk(result) {
         s.sessionID = sessionID 
     }
     return isOk(result)
 }
@@
-func (s *Session) SendRequest(jsonRequest string) bool {
-    if s.sessionID == "" {
+func (s *Session) SendRequest(jsonRequest string) bool {
+    s.mu.RLock()
+    defer s.mu.RUnlock()
+    if s.closed {
+        fmt.Println("[Go SendRequest] Error: session is closed.")
+        return false
+    }
+    if s.sessionID == "" {
         fmt.Println("[Go SendRequest] Error: sessionID is not set on this session.")
         return false 
     }
     cSessionID := C.CString(s.sessionID)
     cJsonRequest := C.CString(jsonRequest)
     defer C.free(unsafe.Pointer(cSessionID))
     defer C.free(unsafe.Pointer(cJsonRequest))
 
     result := C.sendRequest(cSessionID, cJsonRequest)
     return isOk(result)
 }
@@
-func (s *Session) PollResponse(requestID string) string { 
-    if requestID == "" {
+func (s *Session) PollResponse(requestID string) string { 
+    s.mu.RLock()
+    defer s.mu.RUnlock()
+    if s.closed {
+        fmt.Println("[Go PollResponse] Error: session is closed.")
+        return ""
+    }
+    if requestID == "" {
         fmt.Println("[Go PollResponse] Error: requestID cannot be empty.")
         return ""
     }
     cRequestID := C.CString(requestID)
     defer C.free(unsafe.Pointer(cRequestID))
 
     cResponse := C.pollResponse(cRequestID) 
     if cResponse == nil {
         return ""
     }
     defer C.freeResponseString(cResponse)
     
     return C.GoString(cResponse)
 }
@@
-func (s *Session) SessionID() string {
-    return s.sessionID
-}
+func (s *Session) SessionID() string {
+    s.mu.RLock()
+    defer s.mu.RUnlock()
+    return s.sessionID
+}

Also applies to: 76-83, 85-93, 95-107, 109-124, 126-128

🤖 Prompt for AI Agents
In mllm-cli/mllm/c.go around lines 28-31 (and similarly for ranges 76-83, 85-93,
95-107, 109-124, 126-128), the Session struct and its methods are vulnerable to
concurrent use-after-free when Close() runs while other methods access cHandle;
add a sync.RWMutex field (e.g., mu) and a boolean closed flag to Session;
acquire mu.RLock() at the start of Insert/SendRequest/PollResponse/SessionID and
return an error if closed, release RLock before returning; make Close() acquire
mu.Lock(), check and set closed=true, safely free the C handle and set it to nil
under the write lock, then release the lock; ensure all accesses to cHandle and
closed are done under the appropriate lock to serialize Close vs. other
operations.

Comment on lines +45 to +48
func StartService(workerThreads int) bool {
result := C.startService(C.size_t(workerThreads))
return isOk(result)
}

⚠️ Potential issue | 🟠 Major

Validate workerThreads to prevent size_t underflow on negative input.

Casting a negative int to C.size_t yields a huge value. Guard against workerThreads < 0.

Apply this diff:

 func StartService(workerThreads int) bool {
-    result := C.startService(C.size_t(workerThreads))
+    if workerThreads < 0 {
+        fmt.Println("[Go StartService] Error: workerThreads cannot be negative.")
+        return false
+    }
+    result := C.startService(C.size_t(workerThreads))
     return isOk(result)
 }
🤖 Prompt for AI Agents
In mllm-cli/mllm/c.go around lines 45 to 48, the StartService function casts
workerThreads directly to C.size_t which underflows for negative ints; validate
that workerThreads is non-negative before casting, returning false (or an error)
immediately if workerThreads < 0, otherwise cast to C.size_t and call
C.startService as before. Ensure the validation is explicit and documented with
a short comment so callers can see negative values are rejected.

#include <string>
#include <vector>
#include <map>
#include <string.h> // for strncpy

⚠️ Potential issue | 🟡 Minor

Replace deprecated C header with C++ equivalent.

Use <cstring> instead of <string.h> for C++ code.

-#include <string.h> // for strncpy
+#include <cstring> // for strncpy
📝 Committable suggestion


Suggested change
-#include <string.h> // for strncpy
+#include <cstring> // for strncpy
🧰 Tools
🪛 Clang (14.0.6)

[error] 11-11: inclusion of deprecated C++ header 'string.h'; consider using 'cstring' instead

(modernize-deprecated-headers,-warnings-as-errors)

🤖 Prompt for AI Agents
In mllm/c_api/Runtime.cpp around line 11, replace the deprecated C header
include with the C++ equivalent by changing #include <string.h> to #include
<cstring>, and update any direct calls to strncpy to use the std:: namespace
(std::strncpy) or ensure the symbol is available in the current scope.

Comment on lines +34 to +53
//===----------------------------------------------------------------------===//
// Mllm service functions
//===----------------------------------------------------------------------===//
MllmCAny startService(size_t worker_threads);

MllmCAny stopService();

void setLogLevel(int level);

MllmCAny createQwen3Session(const char* model_path);

MllmCAny insertSession(const char* session_id, MllmCAny handle);

MllmCAny freeSession(MllmCAny handle);

MllmCAny sendRequest(const char* session_id, const char* json_request);

const char* pollResponse(const char* session_id);

void freeResponseString(const char* response_str);

🛠️ Refactor suggestion | 🟠 Major

Consider adding documentation comments for the new API surface.

The new session management and service functions lack documentation comments explaining:

  • The lifecycle and ownership of sessions
  • Memory management rules (e.g., who owns the string returned by pollResponse and when to call freeResponseString)
  • Thread-safety guarantees
  • Expected behavior and return values

Add documentation comments in Doxygen or similar format:

 //===----------------------------------------------------------------------===//
 // Mllm service functions
 //===----------------------------------------------------------------------===//
+/**
+ * @brief Starts the MLLM service with the specified number of worker threads.
+ * @param worker_threads Number of worker threads to use.
+ * @return MllmCAny Result status. Check with isOk().
+ */
 MllmCAny startService(size_t worker_threads);
 
+/**
+ * @brief Stops the MLLM service and cleans up resources.
+ * @return MllmCAny Result status. Check with isOk().
+ */
 MllmCAny stopService();
 
+/**
+ * @brief Sets the log level for MLLM logging.
+ * @param level Log level (e.g., 0=OFF, 1=ERROR, 2=INFO, 3=DEBUG).
+ */
 void setLogLevel(int level);
 
+/**
+ * @brief Creates a new Qwen3 session from the specified model path.
+ * @param model_path Path to the model directory.
+ * @return MllmCAny Session handle. Check with isOk(). Must be freed with freeSession().
+ */
 MllmCAny createQwen3Session(const char* model_path);
 
+/**
+ * @brief Inserts a session into the service registry with the given ID.
+ * @param session_id Unique identifier for the session.
+ * @param handle Session handle from createQwen3Session().
+ * @return MllmCAny Result status. Check with isOk().
+ */
 MllmCAny insertSession(const char* session_id, MllmCAny handle);
 
+/**
+ * @brief Frees a session handle and its associated resources.
+ * @param handle Session handle to free.
+ * @return MllmCAny Result status. Check with isOk().
+ */
 MllmCAny freeSession(MllmCAny handle);
 
+/**
+ * @brief Sends a JSON request to the specified session.
+ * @param session_id Session identifier.
+ * @param json_request JSON-formatted request string.
+ * @return MllmCAny Result status. Check with isOk().
+ */
 MllmCAny sendRequest(const char* session_id, const char* json_request);
 
+/**
+ * @brief Polls for a response from the specified session.
+ * @param session_id Session identifier.
+ * @return const char* Response string or NULL if no response available.
+ *         Caller must free the returned string using freeResponseString().
+ */
 const char* pollResponse(const char* session_id);
 
+/**
+ * @brief Frees a response string returned by pollResponse().
+ * @param response_str Response string to free. Can be NULL.
+ */
 void freeResponseString(const char* response_str);
🤖 Prompt for AI Agents
In mllm/c_api/Runtime.h around lines 34 to 53, the new public C API declarations
lack documentation: add Doxygen-style comments above each function and a brief
header for the section that describe each function's purpose, parameter
semantics, ownership/lifetime rules (who allocates/frees session handles and
response strings), thread-safety guarantees, expected return values and error
conditions, and example usage (e.g., call startService before
createQwen3Session, call freeResponseString after consuming pollResponse). For
pointer-returning APIs (pollResponse, createQwen3Session,
insertSession/freeSession), explicitly state ownership and when to call
freeResponseString or freeSession; for functions returning MllmCAny explain what
the opaque type represents and how to check for failure; and document
concurrency expectations (reentrant, per-session single-threaded, or protected
by caller). Ensure comments are concise, follow Doxygen format (/** ... */) and
placed immediately above each declaration.

Comment on lines +18 to +19
export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"

⚠️ Potential issue | 🟠 Major

Replace hardcoded absolute paths with environment variables or relative paths.

Same portability issue present in the other Android build tasks. The hardcoded path /root/zty_workspace/mllm_zty should be parameterized.

-        export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
-        export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"
+        export CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin"
+        export CGO_CFLAGS="-I${MLLM_ROOT}"
🤖 Prompt for AI Agents
In tasks/build_android_go_dialog_test.yaml around lines 18 to 19, the
CGO_LDFLAGS and CGO_CFLAGS use a hardcoded absolute path
(/root/zty_workspace/mllm_zty); replace these with either a provided environment
variable (e.g., $WORKSPACE or $PROJECT_ROOT) or a relative path (e.g.,
${CI_PROJECT_DIR} or ./) so the build is portable across environments — update
the exports to reference that variable/relative path and document or ensure the
variable is set in the pipeline.

Comment on lines +16 to +17
export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"

⚠️ Potential issue | 🟠 Major

Replace hardcoded absolute paths with environment variables or relative paths.

Same issue as in build_android_mllm_server.yaml: the hardcoded path /root/zty_workspace/mllm_zty will break builds on other machines.

-      export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
-      export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"
+      export CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin"
+      export CGO_CFLAGS="-I${MLLM_ROOT}"
📝 Committable suggestion


Suggested change
-export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
-export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"
+export CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin"
+export CGO_CFLAGS="-I${MLLM_ROOT}"
🤖 Prompt for AI Agents
In tasks/build_android_mllm_client.yaml around lines 16-17, the CGO_LDFLAGS and
CGO_CFLAGS use a hardcoded absolute path (/root/zty_workspace/mllm_zty); replace
those with either a configurable environment variable (e.g. ${WORKSPACE_ROOT} or
${GITHUB_WORKSPACE}) or a relative path from the repo root (e.g.
${PWD}/build-android-arm64-v8a/bin and ${PWD}) so the CI/build works on other
machines; ensure the chosen env var is set earlier in the workflow or document
it, and update the two export lines to reference that variable instead of the
hardcoded path.

Comment on lines +16 to +17
export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"

⚠️ Potential issue | 🟠 Major

Replace hardcoded absolute paths with environment variables or relative paths.

The hardcoded absolute path /root/zty_workspace/mllm_zty will break builds on other machines. Consider using ${PROJECT_ROOT} or an environment variable like ${MLLM_ROOT}.

Apply this pattern:

-      export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
-      export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"
+      export CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin"
+      export CGO_CFLAGS="-I${MLLM_ROOT}"
📝 Committable suggestion


Suggested change
-export CGO_LDFLAGS="-L/root/zty_workspace/mllm_zty/build-android-arm64-v8a/bin"
-export CGO_CFLAGS="-I/root/zty_workspace/mllm_zty"
+export CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin"
+export CGO_CFLAGS="-I${MLLM_ROOT}"
🤖 Prompt for AI Agents
In tasks/build_android_mllm_server.yaml around lines 16-17, the CGO_LDFLAGS and
CGO_CFLAGS use a hardcoded absolute path (/root/zty_workspace/mllm_zty) which
will break on other machines; change them to reference an environment variable
(e.g. ${MLLM_ROOT} or ${PROJECT_ROOT}) or a relative path, e.g. export
MLLM_ROOT=${MLLM_ROOT:-$(pwd)} at the top of the script and then use export
CGO_LDFLAGS="-L${MLLM_ROOT}/build-android-arm64-v8a/bin" and export
CGO_CFLAGS="-I${MLLM_ROOT}" so builds work across environments and CI.

@yuerqiqi yuerqiqi changed the title feat: Complete the Qwen3 part of mllm-cli and the API wrapper feat: Add mllm-cli support for Qwen3 and update docs Nov 4, 2025
@yuerqiqi
Contributor Author

yuerqiqi commented Nov 4, 2025

I've just rebased this branch onto the latest upstream/v2. It's ready for review again.

@chenghuaWang
Collaborator

LGTM

@chenghuaWang chenghuaWang merged commit a926f14 into UbiquitousLearning:v2 Nov 4, 2025
4 checks passed