
Conversation

@samholmes (Collaborator) commented Feb 4, 2025

  • Fix typo in index
  • Use edge-server-tools utilities for db inits
  • Add clickhouseEngine

CHANGELOG

Does this branch warrant an entry to the CHANGELOG?

  • Yes
  • No

Dependencies

Description

none

const processors: {
[partnerId: string]: undefined | ((rawTx: unknown) => StandardTx)
} = {
banxa: processBanxaTx,
Member:

The processTx routines should only be used for a one-time migration. After that, the clickhouseEngine can read the StandardTx documents directly from the database.

Member:

This comment wasn't addressed

Collaborator Author:

The ask: don't run processor functions at all in the ClickHouse loop. Just use the StandardTx type to migrate from Couch to ClickHouse.

start_key: 0,
end_key: config.clickhouseIndexVersion,
inclusive_end: false,
skip: PAGE_SIZE * i++
Member:

Use CouchDB bookmarks instead; they're far more efficient.
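For context, a minimal sketch of bookmark-based pagination. CouchDB's Mango `_find` endpoint returns a `bookmark` token with each page; passing it back resumes the scan where it left off, avoiding the O(n) cost of a growing `skip`. The names here (`findPage`, `scanAll`) are illustrative, not from this PR:

```typescript
// Shape of one page from a Mango _find query.
interface FindResponse<T> {
  docs: T[]
  bookmark: string
}

// Walk the whole result set one page at a time, resuming each page
// from the bookmark returned by the previous one.
async function scanAll<T>(
  findPage: (bookmark?: string) => Promise<FindResponse<T>>,
  pageSize: number,
  onDoc: (doc: T) => void
): Promise<void> {
  let bookmark: string | undefined
  while (true) {
    const response = await findPage(bookmark)
    for (const doc of response.docs) onDoc(doc)
    // A short page means we've reached the end of the index.
    if (response.docs.length < pageSize) break
    bookmark = response.bookmark
  }
}
```

With a real client, `findPage` would wrap something like nano's `db.find({ selector, limit: pageSize, bookmark })`.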

let standardTx: StandardTx
try {
-  standardTx = processor(row.doc?.rawTx)
+  standardTx = processor(doc.rawTx)
Member:

This ignored my previous comment that we do NOT need to use each provider's processor during the migration engine.

#198 (comment)

The processTx only needs to happen whenever we wish to change the schema based on the rawTx. This should be a one-time migration. After that, just grab the standardTx from the database to populate ClickHouse. This prevents needing to update the engine code every time we add a new provider.
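A sketch of what the post-migration loop could look like under that suggestion: no per-partner processor map, just read the already-standardized documents and map them to rows. The `StandardTx` fields and the `toClickhouseRow` helper here are abbreviated and hypothetical:

```typescript
// Abbreviated stand-in for the real StandardTx document type.
interface StandardTx {
  orderId: string
  usdValue: number
  timestamp: number
}

// Map a stored StandardTx straight to a ClickHouse row tuple.
// No partner-specific logic is needed at this stage.
function toClickhouseRow(tx: StandardTx): [string, number, number] {
  return [tx.orderId, tx.usdValue, tx.timestamp]
}
```

New providers then only require a processor for the one-time rawTx migration, never a change to the engine itself.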

// We've reached the end of the view index, so we'll continue but with a
// delay so as not to thrash the couchdb unnecessarily.
-  if (response.rows.length !== PAGE_SIZE) {
+  if (response.docs.length !== PAGE_SIZE) {
Member:

Use CouchDB bookmarks, as mentioned above.

Math.round(standardTx.timestamp),
standardTx.usdValue,
-  standardTx.indexVersion
+  config.clickhouseIndexVersion
Member:

I thought this wouldn't be needed anymore now that we have the updateTime. Removing this prevents extra writes to CouchDB.

})
)

lastDocUpdateTime = standardTx.updateTime.toISOString()
Member:

For better efficiency:

const txUpdateTime = standardTx.updateTime.toISOString()
if (lastDocUpdateTime == null || lastDocUpdateTime < txUpdateTime) {
  lastDocUpdateTime = txUpdateTime
} 

Collaborator Author:

So, just max(last, curr).
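Exactly, and the string comparison above is safe because ISO-8601 UTC strings from Date.prototype.toISOString() share a fixed format, so lexicographic order matches chronological order. The comparison can be sketched as a small helper (name hypothetical):

```typescript
// Later of two ISO-8601 UTC timestamps; null means "no value seen yet".
// toISOString() output has fixed width and zone, so `<` on the strings
// agrees with comparing the underlying instants.
function maxUpdateTime(last: string | null, curr: string): string {
  return last == null || last < curr ? curr : last
}
```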

if (response.docs.length !== PAGE_SIZE) {
i = 0
if (lastDocUpdateTime != null) {
afterTime = lastDocUpdateTime
Member:

afterTime should be saved in Couch or ClickHouse somewhere. Otherwise we'll migrate all of Couch to ClickHouse on every reboot of the server.

reports_settings would be a good database in Couch to use.
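A sketch of persisting that cursor in a settings document. The doc id and function names are hypothetical, and `SettingsDb` stands in for a CouchDB client exposing get/insert (e.g. nano's document scope):

```typescript
// Minimal slice of a CouchDB client used by the cursor helpers.
interface SettingsDb {
  get: (id: string) => Promise<{ _id: string; _rev?: string; afterTime?: string }>
  insert: (doc: object) => Promise<unknown>
}

// Illustrative doc id within the reports_settings database.
const CURSOR_DOC_ID = 'clickhouseMigrationCursor'

// Read the saved cursor; undefined on first run (doc doesn't exist yet).
async function loadAfterTime(db: SettingsDb): Promise<string | undefined> {
  try {
    const doc = await db.get(CURSOR_DOC_ID)
    return doc.afterTime
  } catch {
    return undefined
  }
}

// Upsert the cursor, carrying the current _rev to avoid update conflicts.
async function saveAfterTime(db: SettingsDb, afterTime: string): Promise<void> {
  let _rev: string | undefined
  try {
    _rev = (await db.get(CURSOR_DOC_ID))._rev
  } catch {}
  await db.insert({ _id: CURSOR_DOC_ID, _rev, afterTime })
}
```

On boot, the engine would call loadAfterTime and fall back to a full scan only when it returns undefined.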

Collaborator Author:

Makes sense.

const processors: {
[partnerId: string]: undefined | ((rawTx: unknown) => StandardTx)
} = {
banxa: processBanxaTx,
Member:

This comment wasn't addressed

const { partnerId } = getDocumentIdentifiers(doc._id)
const processor = processors[partnerId]

if (processor == null) {
Member:

Why would a processor ever be null?

try {
standardTx = processor(doc.rawTx)
} catch (error) {
datelog(`Error processing ${doc._id}`, error)
Member:

This should throw; otherwise this document won't be migrated and we'd need to fix it before continuing.
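A sketch of the requested behavior, pulled out as a helper for self-containment (the name is hypothetical, and console.error stands in for datelog):

```typescript
// Log, then rethrow, so a document that fails to process halts the
// migration instead of being silently skipped and lost.
function processOrThrow(
  processor: (rawTx: unknown) => { orderId: string },
  docId: string,
  rawTx: unknown
): { orderId: string } {
  try {
    return processor(rawTx)
  } catch (error) {
    console.error(`Error processing ${docId}`, error)
    // Propagate: the document must be fixed before the migration continues.
    throw error
  }
}
```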

standardTx.payoutCurrency,
standardTx.payoutAmount,
standardTx.status,
Math.round(standardTx.timestamp),
Member:

Why round? Our timestamp is in seconds, so there should be a few decimals to account for ms.
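That is, the millisecond component survives as the fractional part of a seconds timestamp, and Math.round would discard it. A minimal sketch (helper name hypothetical):

```typescript
// Convert a Date to a seconds-based timestamp, keeping milliseconds
// as the fractional part rather than rounding them away.
function toSecondsTimestamp(date: Date): number {
  return date.getTime() / 1000
}
```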

timestamp: number

/** When the document was created. */
createTime?: Date
Member:

This should be called updateTime, since it's not when the doc was created but when it was last updated. The reports engine will continually update the doc as it changes status, and this would change this time value.

values: newRows
})
// Update all documents processed
await dbTransactions.bulk({ docs: newDocs })
Member:

Why are we updating the CouchDB docs? Nothing in the docs was changed.


3 participants