B4513 #92

Closed

idales wants to merge 9 commits into kherud:b3751 from idales:b4513

Conversation

@idales

@idales idales commented Jan 30, 2025

Support for the b4513 version of llama.cpp.

@idales idales closed this Jan 30, 2025
@idales idales reopened this Jan 30, 2025
@idales
Author

idales commented Jan 31, 2025

@kherud,

I've fixed a build issue from commit 4dc9dd1 by moving the external CMake option LLAMA_BUILD_COMMON=ON directly into CMakeLists.txt. However, I'm unsure if the pipeline has run with this update yet.
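The change idales describes could look roughly like this in CMakeLists.txt (a hedged sketch; LLAMA_BUILD_COMMON is a real llama.cpp CMake option, but the exact way this project integrates llama.cpp, e.g. add_subdirectory vs. FetchContent, is assumed here):

```cmake
# Force the option on inside the build script instead of requiring
# callers to pass -DLLAMA_BUILD_COMMON=ON on the command line.
# The add_subdirectory() call below assumes llama.cpp is vendored
# as a subdirectory; adjust to the project's actual integration.
set(LLAMA_BUILD_COMMON ON CACHE BOOL "Build llama.cpp common library" FORCE)
add_subdirectory(llama.cpp)
```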

@kherud
Owner

kherud commented Jan 31, 2025

Hi @idales thank you very much for this PR! I'll look at it over the weekend and merge it. Until then I'll try to approve running the pipeline.

@kherud
Owner

kherud commented Feb 3, 2025

Hi @idales, building the C++ code locally, I get the same error as the CI runners: GGML_ASSERT(n_threads > 0) failed. Unfortunately I didn't have the time to look more deeply into it over the weekend. Does it work for you (and if yes, on which platform)? I'll investigate it more closely over the week.

@vaiju1981
Contributor

I was able to get past it by setting modelParameters.setNthreads(8) instead of relying on the default value. But I have another error that I'm looking into.

Error: libc++abi: terminating due to uncaught exception of type nlohmann::json_abi_v3_11_3::detail::type_error: [json.exception.type_error.302] type must be number, but is array

@kherud
Owner

kherud commented Feb 3, 2025

I was able to get past it by setting modelParameters.setNthreads(8) instead of relying on the default value. But I have another error that I'm looking into.

Awesome, nice hint, thanks!

Error: libc++abi: terminating due to uncaught exception of type nlohmann::json_abi_v3_11_3::detail::type_error: [json.exception.type_error.302] type must be number, but is array

The Java library has the two classes ModelParameters and InferenceParameters to configure llama.cpp. Both classes create a JSON string under the hood, which is then passed to llama.cpp (since it's easier to re-use llama.cpp code that way). This string is parsed via nlohmann::json. My suspicion is that in earlier versions there was a parameter that used to be an array and still is in the two Java classes, but the newer llama.cpp version now expects a single number instead.
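To illustrate the mismatch described above, here is a minimal, self-contained Java sketch (not the actual java-llama.cpp code) of how a parameter class might serialize options into a JSON string. An array-valued field such as tensor_split is exactly what a parser expecting a number would reject with json.exception.type_error.302:

```java
// Illustrative sketch: the Java parameter classes build a JSON string
// that the C++ side parses with nlohmann::json. If a field like
// "tensor_split" is emitted as an array while the newer llama.cpp
// expects a plain number, parsing fails with type_error.302.
public class ParamJsonSketch {
    static String buildParams(float[] tensorSplit, int nThreads) {
        StringBuilder sb = new StringBuilder("{");
        sb.append("\"n_threads\":").append(nThreads).append(",");
        // Serialized as a JSON array -- the shape the new parser rejects
        // for fields it now expects to be a single number.
        sb.append("\"tensor_split\":[");
        for (int i = 0; i < tensorSplit.length; i++) {
            if (i > 0) sb.append(",");
            sb.append(tensorSplit[i]);
        }
        sb.append("]}");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildParams(new float[]{0.5f, 0.5f}, 8));
    }
}
```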

@vaiju1981
Contributor

Looks like it's tensor_split (abetlen/llama-cpp-python#1016), but I don't see it set anywhere, and I even removed (commented out) the [] brackets in the code.

Comment on lines -357 to 406
```diff
 {
-    gpt_params params;
+    common_params params;
 
     auto *ctx_server = new server_context();
 
-    std::string c_params = parse_jstring(env, jparams);
-    json json_params = json::parse(c_params);
-    server_params_parse(json_params, params);
-
-    if (json_value(json_params, "disable_log", false))
+    const jsize argc = env->GetArrayLength(jparams);
+    char **argv = parse_string_array(env, jparams, argc);
+    if (argv == nullptr)
     {
-        log_disable();
-    }
-    else
-    {
-        log_enable();
+        return;
     }
 
-    if (!params.system_prompt.empty())
+    const auto parsed_params = common_params_parse(argc, argv, params, LLAMA_EXAMPLE_SERVER);
+    free_string_array(argv, argc);
+    if (!parsed_params)
     {
-        ctx_server->system_prompt_set(params.system_prompt);
+        return;
     }
```
@kherud
Owner

@vaiju1981 I changed how the model parameters are passed from Java to C++. The code now assembles a CLI string (e.g. something like --model /path/to/file -c 2048). This way we can better re-use the existing llama.cpp code to parse the parameters and don't have to rely on custom logic. So far it seems to work great, but I'm now stuck after requesting a completion and getting no output. I'll investigate further over the next few days.
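The approach described above, assembling CLI-style arguments on the Java side so that llama.cpp's own common_params_parse can handle them, can be sketched as follows (illustrative names only, not the actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: instead of serializing parameters to JSON, the Java
// side builds argv-style arguments (e.g. "--model ... -c 2048") that
// llama.cpp's existing argument parser already understands. Flag names
// mirror llama.cpp's CLI; the helper itself is hypothetical.
public class CliArgsSketch {
    static String[] toArgs(String modelPath, int ctxSize, int nThreads) {
        List<String> args = new ArrayList<>();
        args.add("--model");   args.add(modelPath);
        args.add("-c");        args.add(Integer.toString(ctxSize));
        args.add("--threads"); args.add(Integer.toString(nThreads));
        return args.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] a = toArgs("/path/to/model.gguf", 2048, 8);
        System.out.println(String.join(" ", a));
    }
}
```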

@idales
Author

idales commented Feb 4, 2025

Hi @idales, building the C++ code locally, I get the same error as the CI runners: GGML_ASSERT(n_threads > 0) failed. Unfortunately I didn't have the time to look more deeply into it over the weekend. Does it work for you (and if yes, on which platform)? I'll investigate it more closely over the week.

Hi @kherud
I am building an Android project on Ubuntu. In the project I load a model and run completions. Indeed, the latest versions of llama.cpp require the number of threads to be set explicitly; I do this in my code.

@vaiju1981
Contributor

vaiju1981 commented Feb 4, 2025

I am able to pass all the tests except for embedding. I will keep you posted if I can fix it.

Right now the only issue is with the embedding call: libc++abi: terminating due to uncaught exception of type nlohmann::json_abi_v3_11_3::detail::type_error: [json.exception.type_error.302] type must be number, but is array

@vaiju1981
Contributor

How important are the testLogJSON and testLogText tests? Apart from these and embedding, I am able to run all the other tests.

@kherud
Owner

kherud commented Feb 7, 2025

I'd say if logging in general works, don't worry about the tests. Logging is tricky to test, so the tests were never really reliable anyway. Feel free to shoot a pull request 👍

@vaiju1981
Contributor

vaiju1981 commented Feb 12, 2025

Apart from logging, the rest is working fine. I also updated to the latest release tag of llama.cpp.

I created new PR: #93

It kept giving me a permission error when updating this branch.

@kherud
Owner

kherud commented Mar 9, 2025

Thank you again for this PR. We finally finished #93 and just released a new major version to upgrade to the newest available llama.cpp version. Thus I'll close this PR.

@kherud kherud closed this Mar 9, 2025
3 participants