Custom llama.cpp fork with a character intelligence engine: control vectors, attention bias, head rescaling, attention temperature, and fast weight memory
android c-plus-plus ndk jni quantization attention-mechanism arm-neon edge-ai mobile-ai llama-cpp character-ai ggml gguf on-device-inference control-vectors
Updated Apr 4, 2026 - C++