) => {
  const normalizedLetters = codeLetters ?? '??';
  const normalizedNumber = codeArtistNumber ?? 0;
  const artistKey = `${artistName.toLowerCase()}|${normalizedLetters}|${normalizedNumber}`;
ensureArtist cache key does not account for genre
The artist cache key at line 216 is:
    const artistKey = `${artistName.toLowerCase()}|${normalizedLetters}|${normalizedNumber}`;
But the DB query for non-various artists filters by genre_id (line 229):
    .where(isVarious ? and(...baseConditions) : and(...baseConditions, eq(artists.genre_id, genreId)))
If the same artist name+code appears in multiple genres, the cache returns the ID from the first genre without querying the DB for the second. There are 13 artists in the legacy database that would be affected:
norm_name | norm_letters | genre_ids
-------------------+--------------+----------
a.m. architect | AM | 6,15
boris kovac | KO | 4,9
brian harnetty | HA | 4,5
dan crary | CR | 9,14
das torpedoes | DA | 6,15
davenport | DA | 9,11
jpp | JP | 4,9
kwame | KW | 6,11
patrick o'hearn | OH | 7,11
randy greif | GR | 13,15
rhythm & sound | RH | 10,15
various | Z- | 7,10
wzt hearts | WZ | 4,11
For these artists, releases in the second genre encountered would be linked to an artist row with the wrong genre_id. The genre_artist_crossreference entry would still be created correctly, but the artist row's primary genre would be wrong.
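The collision described above can be reproduced with a small in-memory sketch. This is not the PR's code: `makeEnsureArtist`, `ArtistRow`, the array standing in for the artists table, and the artist number `1` are all assumptions for illustration, and the various-artists branch is left out. The `keyIncludesGenre` flag switches between the current key and a key that also carries the genre.

```typescript
// Sketch only: an in-memory stand-in for the artists table and the ensureArtist cache.
// makeEnsureArtist, ArtistRow, and the artist number 1 are hypothetical.
type ArtistRow = {
  id: number;
  name: string;
  letters: string;
  artistNumber: number;
  genreId: number;
};

function makeEnsureArtist(keyIncludesGenre: boolean) {
  const table: ArtistRow[] = []; // stands in for the artists table
  const cache = new Map<string, ArtistRow>(); // stands in for the artist cache

  return function ensureArtist(
    artistName: string,
    letters: string,
    artistNumber: number,
    genreId: number,
  ): ArtistRow {
    const name = artistName.toLowerCase();
    const baseKey = `${name}|${letters}|${artistNumber}`;
    const key = keyIncludesGenre ? `${baseKey}|${genreId}` : baseKey;

    // On a cache hit the genre-filtered lookup below is never reached.
    const cached = cache.get(key);
    if (cached !== undefined) {
      return cached;
    }

    // Mirrors the genre-filtered query for non-various artists.
    let row = table.find(
      (r) =>
        r.name === name &&
        r.letters === letters &&
        r.artistNumber === artistNumber &&
        r.genreId === genreId,
    );
    if (row === undefined) {
      row = { id: table.length + 1, name, letters, artistNumber, genreId };
      table.push(row);
    }
    cache.set(key, row);
    return row;
  };
}

// "boris kovac | KO" appears in genres 4 and 9 in the table above.
const withoutGenre = makeEnsureArtist(false);
const firstSeen = withoutGenre('Boris Kovac', 'KO', 1, 4);
const secondSeen = withoutGenre('Boris Kovac', 'KO', 1, 9);
// Both calls return the same row, and that row still carries genre 4.
console.log(firstSeen.id === secondSeen.id, secondSeen.genreId);
```

With `makeEnsureArtist(true)`, the second call misses the cache and creates a separate row for genre 9, so the artist ends up with one row per genre.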
Fix: include genreId in the cache key for non-various artists:
    const artistKey = isVarious
      ? `${artistName.toLowerCase()}|${normalizedLetters}|${normalizedNumber}`
      : `${artistName.toLowerCase()}|${normalizedLetters}|${normalizedNumber}|${genreId}`;
Oh, I meant to get rid of the genre column in the artists table entirely. The genre cross-reference table is all we need.
We don't want multiple rows in the artist table for the same artist in different genres.
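A rough sketch of the alternative described in these replies, assuming the artist row has no genre column and genre membership lives only in a cross-reference structure. `ArtistStore`, its in-memory maps, and the artist number `1` are hypothetical stand-ins rather than the PR's schema or code.

```typescript
// Sketch only: one artist row per name+code, with genres tracked in a separate
// cross-reference map. ArtistStore and its fields are hypothetical.
type Artist = { id: number; name: string; letters: string; artistNumber: number };

class ArtistStore {
  // Stands in for the artists table, keyed without genre.
  private artists = new Map<string, Artist>();
  // Stands in for genre_artist_crossreference: artist id -> genre ids.
  private genreLinks = new Map<number, Set<number>>();

  ensureArtist(
    artistName: string,
    letters: string,
    artistNumber: number,
    genreId: number,
  ): Artist {
    const name = artistName.toLowerCase();
    const key = `${name}|${letters}|${artistNumber}`;

    let artist = this.artists.get(key);
    if (artist === undefined) {
      artist = { id: this.artists.size + 1, name, letters, artistNumber };
      this.artists.set(key, artist);
    }

    // Always record the genre link, even when the artist already existed.
    let genres = this.genreLinks.get(artist.id);
    if (genres === undefined) {
      genres = new Set<number>();
      this.genreLinks.set(artist.id, genres);
    }
    genres.add(genreId);
    return artist;
  }

  genresFor(artistId: number): number[] {
    const genres: number[] = [];
    this.genreLinks.get(artistId)?.forEach((g) => genres.push(g));
    return genres.sort((a, b) => a - b);
  }
}

// "jpp | JP" appears in genres 4 and 9 in the table above.
const store = new ArtistStore();
const jppInGenre4 = store.ensureArtist('JPP', 'JP', 1, 4);
const jppInGenre9 = store.ensureArtist('jpp', 'JP', 1, 9);
// Same artist id both times; genres 4 and 9 are both linked to it.
console.log(jppInGenre4.id === jppInGenre9.id, store.genresFor(jppInGenre4.id));
```

Under this design the genre-less cache key is no longer a bug, because no genre is stored on the artist row for it to get wrong.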
Summary
jobs/* targets in deploy-base.yml, including cron schedule setup, validation, and verification on EC2.
Dockerfile.library-etl to build/run the library ETL job container and update workflow detection to include job targets.
library-etl job implementation with legacy ETL logic, format parsing, artist/genre handling, and cronjob run tracking.
MirrorSQL cleanup/timeout handling) and update schema/migrations to support new ETL fields (cronjob runs, code volume letters, etc.).