Skip to content

Conversation

@kylediaz
Copy link
Contributor

@kylediaz kylediaz commented Jan 8, 2026

This PR adds an alternative implementation of embeddingsToBase64Bytes for environments with access to Buffer, for a 6-7x speed increase.

I've been having trouble getting a lot of throughput when mass batch uploading documents using our JS client. I measured what was happening and one of the culprits was the embeddingsToBase64Bytes function.

Screenshot 2026-01-07 at 4.47.30 PM.png

I measured that when uploading batches of 300 docs w/ 1536 dim vectors, there was ~60ms of synchronous work being done before the API call. The largest culprit of which was embeddingsToBase64Bytes.

There's actually nothing wrong with the current implementation. Browsers do not have access to Buffer, so the existing implementation is fine in light of that. I still want to have more performance on node and bun.

I wrote a script to measure both implementations and this is what I get:

Batch Dims Original Buffer Speedup
10 384 0.273ms 0.044ms 6.2x
10 768 0.393ms 0.057ms 6.9x
10 1536 0.747ms 0.111ms 6.7x
10 3072 1.459ms 0.233ms 6.3x
100 384 2.016ms 0.295ms 6.8x
100 768 3.682ms 0.587ms 6.3x
100 1536 7.410ms 1.156ms 6.4x
100 3072 14.514ms 2.330ms 6.2x
300 384 6.210ms 0.916ms 6.8x
300 768 11.500ms 1.819ms 6.3x
300 1536 22.238ms 3.391ms 6.6x
300 3072 43.918ms 7.010ms 6.3x
1000 384 22.009ms 3.045ms 7.2x
1000 768 39.422ms 6.017ms 6.6x
1000 1536 76.193ms 11.729ms 6.5x
1000 3072 155.027ms 23.395ms 6.6x

With this change, I should expect to get up to 50% more throughput on node and bun.

Copy link
Contributor Author

kylediaz commented Jan 8, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions
Copy link

github-actions bot commented Jan 8, 2026

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@kylediaz kylediaz requested a review from itaismith January 8, 2026 01:10
@kylediaz kylediaz marked this pull request as ready for review January 8, 2026 01:10
@propel-code-bot
Copy link
Contributor

The change set also adds comprehensive Jest coverage to confirm the Buffer fast path and the existing portable implementation yield identical base64 payloads across a wide range of numerical edge cases.

Affected Areas

• clients/new-js/packages/chromadb/src/utils.ts
• clients/new-js/packages/chromadb/test/utils.test.ts

This summary was automatically generated by @propel-code-bot

@kylediaz
Copy link
Contributor Author

kylediaz commented Jan 8, 2026

I should note that there is an alternative implementation that is 2x faster than the one that's in this PR:

function embeddingsToBase64_prealloc(embeddings: number[][]): string[] {
    if (embeddings.length === 0) return [];
    
    const dim = embeddings[0]!.length;
    const buffer = Buffer.alloc(dim * 4);
    
    return embeddings.map((embedding) => {
        for (let i = 0; i < embedding.length; i++) {
            buffer.writeFloatLE(embedding[i]!, i * 4);
        }
        return buffer.toString("base64");
    });
}

But the size of each embedding must all be the same, but we don't actually validate that all embeddings are the same size before upload. This is currently enforced by the server.

@kylediaz
Copy link
Contributor Author

kylediaz commented Jan 8, 2026

Buffer is something specific to node, and doesn't extend to other environments I care about, particularly Cloudflare workers. I wonder if there is a different pattern we can use for the portable version.
I implemented a few alternate versions, but I can't break more than a 10% speed improvement which is basically nothing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants