Generate AI embedding vectors for AI expression topics #287

bobular · 2025-10-24T13:54:33Z

This PR integrates embedding API calls to generate semantic vector representations for each topic's headline and summary text. The embeddings are computed by concatenating the headline and summary with double newlines, then calling the embedding service to produce 512-dimensional vectors. A new embedding_vector field (number array) is optionally added to the topic response payload for use in client-side similarity calculations and visualization. The "Other" topic does not get a vector.
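As a rough illustration of the flow described above, the sketch below joins a headline and summary with a blank line and compares two vectors by cosine similarity, which is one common choice for client-side similarity (the PR does not say which measure clients use). The class and method names are hypothetical and not taken from this PR, and the actual call to the embedding service is left out.

```java
import java.util.Objects;

/** Hypothetical helper; names are illustrative and not taken from this PR. */
final class TopicEmbeddingSketch {

    private TopicEmbeddingSketch() {}

    /** Builds the text to embed: headline, a blank line (two newlines), then summary. */
    static String buildEmbeddingInput(String headline, String summary) {
        Objects.requireNonNull(headline, "headline");
        Objects.requireNonNull(summary, "summary");
        return headline + "\n\n" + summary;
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "vector lengths differ: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // assumption: treat a zero vector as dissimilar to everything
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

In this sketch a topic without an embedding_vector (such as "Other") would simply be skipped by the caller before any similarity is computed.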

Token usage handling was overhauled at the same time, so that embedding calls are not charged at the much higher chat model rates. The embedding cost of $0.02 per million input tokens is hardcoded in DailyCostMonitor.
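The arithmetic behind that rate is simple enough to sketch. The class, constant, and method names below are hypothetical and do not reflect how DailyCostMonitor is actually written; only the $0.02 per million input tokens figure comes from this PR.

```java
/** Hypothetical cost helper; not the actual DailyCostMonitor code. */
final class EmbeddingCostSketch {

    /** Rate stated in the PR description: $0.02 per million input tokens. */
    static final double EMBEDDING_USD_PER_MILLION_INPUT_TOKENS = 0.02;

    private EmbeddingCostSketch() {}

    /** Estimated cost in US dollars for the given number of embedding input tokens. */
    static double embeddingCostUsd(long inputTokens) {
        if (inputTokens < 0) {
            throw new IllegalArgumentException("inputTokens must be non-negative");
        }
        return inputTokens * EMBEDDING_USD_PER_MILLION_INPUT_TOKENS / 1_000_000.0;
    }
}
```

Under this sketch, 500,000 input tokens would cost about $0.01.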

TO DO:

Add config mechanism for enabling/disabling the embedding functionality - now via a makeTopicEmbeddings prop in the JSON request payload to the SingleGeneAiExpressionReporter - defaults to false of course.
Remove requirement for OPENAI_API_KEY to be set if makeTopicEmbeddings is false
Add hard-coded configurable support for extended reasoning in ClaudeSummarizer
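The first two items can be pictured roughly as follows. This is only a sketch under assumptions: the payload is modelled as a plain Map rather than whatever JSON type SingleGeneAiExpressionReporter really receives, and the class and method names are hypothetical. Only the makeTopicEmbeddings property, its default of false, and the OPENAI_API_KEY variable come from this PR.

```java
import java.util.Map;

/** Hypothetical config helper; not the actual reporter code. */
final class EmbeddingConfigSketch {

    private EmbeddingConfigSketch() {}

    /** Reads makeTopicEmbeddings from the request payload; absent or non-true means false. */
    static boolean makeTopicEmbeddings(Map<String, Object> payload) {
        return Boolean.TRUE.equals(payload.get("makeTopicEmbeddings"));
    }

    /** Requires the API key only when embeddings are actually requested. */
    static void requireApiKeyIfNeeded(boolean makeTopicEmbeddings, String openAiApiKey) {
        if (makeTopicEmbeddings && (openAiApiKey == null || openAiApiKey.isBlank())) {
            throw new IllegalStateException(
                "OPENAI_API_KEY must be set when makeTopicEmbeddings is true");
        }
    }
}
```

A caller might pass System.getenv("OPENAI_API_KEY") as the second argument, so that a request with embeddings switched off succeeds even when the variable is unset.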

bobular added 9 commits September 16, 2025 23:30

abstract Summarizer and concrete OpenAISummarizer

49f3f25

first attempt at ClaudeSummarizer

7cc64d3

switch to Claude and tweaks

6b2e379

deprecate OPENAI_-prefixed daily cost env vars

fa12327

add more retries for Claude

7d7ba6f

Anthropic SDK upgrade to 2.9.0; use Sonnet 4.5; tweak prompt for sentence case

98330e5

added embedding stuff; not tested; no cost monitoring

9e929e0

tidy up token usage

6c5818f

merge from master; fix import paths for latest OpenAI SDK

57617da

bobular changed the base branch from master to ai-expression-claude October 27, 2025 11:51

bobular added 3 commits October 27, 2025 12:08

better handling of AI API 500 responses and configurable MAX_CONCURRENT API calls

bf83d47

rewrite retry logic for Java 11 (was 12+)

410732a

merged in API robustness updates from ai-expression-claude branch

95757e6

bobular requested a review from ryanrdoherty October 29, 2025 17:49

Merge remote-tracking branch 'origin/master' into ai-expression-topic-embeddings

a84d34d

bobular changed the base branch from ai-expression-claude to master November 8, 2025 18:09

bobular mentioned this pull request Nov 8, 2025

ClaudeSummarizer and OpenAISummarizer subclasses #286

Draft

5 tasks

bobular added 3 commits November 8, 2025 18:46

make topic embedding configurable by hard-coding, default is off

69e8dc1

topic embeddings now configured by reporter request payload

77b6adc

add Claude extended thinking - off by default

f6ab3ba
