Segmentation fault during embedBatch on ARM64 Linux (aarch64) #590

@yeHHH1g

Description

node-llama-cpp segfaults during embedBatch calls on ARM64 Linux (aarch64). The crash occurs consistently when embedding documents using the embeddinggemma-300M-Q8_0.gguf model, even with as few as 2 documents (36 chunks).

The issue reproduces on both Bun 1.3.8 and 1.3.11.

Environment

  • OS: Ubuntu 24.04, Linux 6.17.0-1009-oracle aarch64
  • CPU: ARM Neoverse (Oracle Cloud ARM instance, neon fp aes crc32 atomics)
  • node-llama-cpp version: 3.15.1
  • Bun versions tested: 1.3.8, 1.3.11
  • Model: hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf
  • RAM: 24GB available, process uses ~700MB RSS before crash
  • No GPU — falls back to CPU ([node-llama-cpp] A prebuilt binary was not found, falling back to using no GPU)

Reproduction

Using qmd, which calls node-llama-cpp for embeddings:

qmd embed  # crashes after model loads and begins embedding computation

Minimal reproduction: any call to session.embedBatch(texts) with the embedding model loaded on ARM64 Linux. Crashes regardless of batch size (tested with 2 docs / 36 chunks up to 2353 docs / 4648 chunks).
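For a reproduction without qmd, something like the following should hit the same native code path (a sketch using node-llama-cpp's documented embedding API; `embedBatch` is qmd's wrapper, and `resolveModelFile` / the exact chunk texts here are assumptions, not taken from qmd's source):

```typescript
// Standalone repro sketch on ARM64 Linux (assumes node-llama-cpp ~3.15 is installed).
import {getLlama, resolveModelFile} from "node-llama-cpp";

// Download/resolve the same model used above.
const modelPath = await resolveModelFile(
    "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf"
);

const llama = await getLlama(); // logs the "falling back to using no GPU" message on this machine
const model = await llama.loadModel({modelPath});
const embeddingContext = await model.createEmbeddingContext();

// Any small batch of chunks is enough; the segfault occurs in native
// embedding/GGML code while these resolve.
const chunks = ["first chunk of text", "second chunk of text"];
const embeddings = await Promise.all(
    chunks.map((chunk) => embeddingContext.getEmbeddingFor(chunk))
);
console.log(embeddings.length);
```

On x86_64 Linux this completes normally; on the aarch64 instance above it crashes before the `console.log` is reached.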

Crash Output

Chunking 2 documents by token count...
[node-llama-cpp] A prebuilt binary was not found, falling back to using no GPU
Embedding 2 documents (36 chunks, 78.8 KB)
Model: embeddinggemma

============================================================
Bun v1.3.11 (af24e281) Linux arm64
Linux Kernel v6.17.0 | glibc v2.39
CPU: neon fp aes crc32 atomics

panic: Segmentation fault at address 0xE20BE9728980
oh no: Bun has crashed. This indicates a bug in Bun, not your code.

Bun crash report: https://bun.report/1.3.8/La1b64edcbijEugggCuzynqE+lF_2huF+y3F+95FmmjM291jBm13jBuhklB+2imBm8/Lur+L+t9t9C+zzvgD21shB2096BA2+43DgwkvzJ

Notes

  • The crash happens deep in the native embedding/GGML code, not in JS/TS
  • Same model and code works fine on x86_64 Linux
  • Process reaches ~700MB RSS / 74GB VSZ before the segfault
  • The prebuilt binary message suggests no optimized ARM64 build is available, so it falls back to a generic build
  • Workaround: Using Ollama with nomic-embed-text via HTTP API for embeddings instead of node-llama-cpp
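The Ollama workaround from the last note looks roughly like this (a sketch; it assumes a local Ollama server on the default port 11434 with `nomic-embed-text` already pulled):

```typescript
// Replace node-llama-cpp's embedBatch with a call to Ollama's embeddings endpoint.
const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({
        model: "nomic-embed-text",
        prompt: "some chunk of text",
    }),
});
const {embedding} = await res.json() as {embedding: number[]};
console.log(embedding.length); // 768-dimensional for nomic-embed-text
```

This sidesteps the native GGML code entirely, which is why it avoids the segfault.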

Expected Behavior

embedBatch should complete without crashing on ARM64 Linux.

Actual Behavior

Segmentation fault after loading the model and beginning embedding computation. Occurs consistently on every run.
