-
Notifications
You must be signed in to change notification settings - Fork 26
refactor: filter optional database fields #105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/migration-logging-refactor
Are you sure you want to change the base?
refactor: filter optional database fields #105
Conversation
| } | ||
|
|
||
| /** | ||
| * Validates that all required fields are present and that no unexpected fields exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and that no unexpected fields exist.
Are unexpected fields actually an issue? e.g. do they cause an exception on write? You could double check that by calling the admin API with data for an entity that includes "unexpected fields" 🤔
If it works fine I wouldn't report it, because imagining you migrate from SW6.7.5.0 to SW6.7.4.0 (for whatever reason), and because the source system is a minor version ahead, it might include an additional field that doesn't exists in the target system yet. Right now our converters for SW6 are rather dump and just copy the data over, which means it might produce an validation error that isn't an issue in practice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not specifically implemented it will not throw by default, but the error is also just a warning + unfixable
| $hasMapping = $this->mappingService->hasValidMappingByEntityId( | ||
| $validationContext->getMigrationContext()->getConnection()->getId(), | ||
| $referenceEntity, | ||
| $fkValue, | ||
| $validationContext->getContext() | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about this: we might ignore a edge case here:
Yes in the best case, e.g. an product is converted (and has a mapping) before an order is converted and then trying to look up that product id in the mappings DB should work. So far so good.
But in practice we also often convert an entity together with some associations. e.g. the product data contains it's manufacturer data:
| $converted['manufacturer'] = $this->getManufacturer($data['manufacturer']); |
which still creates a mapping entry:
SwagMigrationAssistant/src/Profile/Shopware/Converter/ProductConverter.php
Lines 696 to 701 in 5d4ec50
| $mapping = $this->mappingService->getOrCreateMapping( | |
| $this->connectionId, | |
| DefaultEntities::PRODUCT_MANUFACTURER, | |
| $data['id'], | |
| $this->context | |
| ); |
but that mapping entry isn't immediately flushed to the DB, only after a whole batch is processed:
| $converter->writeMapping($context); |
The mapping service correctly handles this with it's internal cache:
SwagMigrationAssistant/src/Migration/Mapping/MappingService.php
Lines 114 to 117 in 8f780ff
| $cacheKey = $entityName . $oldIdentifier; | |
| if (isset($this->mappings[$cacheKey])) { | |
| return $this->mappings[$cacheKey]; | |
| } |
So doing the same mapping lookup twice will return the same uuid instead of generating a new one.
BUT it looks like your MigrationValidationService is called once per entity before the mappings are flushed to the DB:
SwagMigrationAssistant/src/Migration/Service/MigrationDataConverter.php
Lines 114 to 120 in 5d4ec50
| $this->validationService->validate( | |
| $migrationContext, | |
| $context, | |
| $convertStruct->getConverted(), | |
| $dataSet::getEntity(), | |
| $item | |
| ); |
Which isn't a problem in itself, but here comes the
TLDR: the marked function call to $this->mappingService->hasValidMappingByEntityId is implemented like this:
SwagMigrationAssistant/src/Migration/Mapping/MappingService.php
Lines 217 to 231 in 8f780ff
| public function hasValidMappingByEntityId(string $connectionId, string $entityName, string $entityId, Context $context): bool | |
| { | |
| $criteria = new Criteria(); | |
| $criteria->addFilter( | |
| new EqualsFilter('connectionId', $connectionId), | |
| new EqualsFilter('entity', $entityName), | |
| new EqualsFilter('entityId', $entityId), | |
| new NotFilter(MultiFilter::CONNECTION_AND, [ | |
| new EqualsFilter('oldIdentifier', null), | |
| ]), | |
| ); | |
| $criteria->setLimit(1); | |
| return $this->migrationMappingRepo->searchIds($criteria, $context)->getTotal() > 0; | |
| } |
And thus always reaches for the DB, without respecting the local cache first. So these cases likely create validation errors, even though the mappings exists, just not yet in the DB 🤯
=> that hasValidMappingByEntityId method needs to be adjusted to search through the $this->writeArray first (unfortunately it can't use the $this-mappings cache, because that is indexed by old identifiers, not the new entityId)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also thinking further about it: Is this kind of validation really necessary in practice? Which issues in a migration does it detect?
Because it creates another N+1 Query problem by potentially having to look for multiple existing mappings in the DB for every processed entity, which can't really be preloaded with the whole entity batch.
And that adds more complexity and brings us further away from solving that "N+1 Query Problem" for the migration in general, thus slower migrations / worse performance
| ); | ||
| } | ||
|
|
||
| public function testShouldLogWhenEntityHasNowId(): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public function testShouldLogWhenEntityHasNowId(): void | |
| public function testShouldLogWhenEntityHasNoId(): void |
?
…r-optional-database-field
Before the validation would just compare against the EntityDefinition, but there where logs for fields that have a
Required:classflag but are nullable in the database or have a defined default value and so are not necessarily needed for writing the data.Changes
getRequiredDatabaseColumns()toMigrationValidationServiceto get access to nullable / default value informationfilterRequiredFields()to filter fields that: haveRequired::classflag + have no default value + are not nullable