Skip to content

Conversation

@larskemper
Copy link
Member

@larskemper larskemper commented Dec 19, 2025

Before the validation would just compare against the EntityDefinition, but there where logs for fields that have a Required:class flag but are nullable in the database or have a defined default value and so are not necessarily needed for writing the data.

Changes

  • added getRequiredDatabaseColumns() to MigrationValidationService to get access to nullable / default value information
  • required field are saved to don't fetch the schema every time a entity get validated
  • added filterRequiredFields() to filter fields that: have Required::class flag + have no default value + are not nullable

@larskemper larskemper self-assigned this Dec 19, 2025
@larskemper larskemper marked this pull request as ready for review December 19, 2025 10:49
@larskemper larskemper changed the title refactor: filter optional database field refactor: filter optional database fields Dec 19, 2025
}

/**
* Validates that all required fields are present and that no unexpected fields exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that no unexpected fields exist.

Are unexpected fields actually an issue? e.g. do they cause an exception on write? You could double check that by calling the admin API with data for an entity that includes "unexpected fields" 🤔

If it works fine I wouldn't report it, because imagining you migrate from SW6.7.5.0 to SW6.7.4.0 (for whatever reason), and because the source system is a minor version ahead, it might include an additional field that doesn't exists in the target system yet. Right now our converters for SW6 are rather dump and just copy the data over, which means it might produce an validation error that isn't an issue in practice

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not specifically implemented it will not throw by default, but the error is also just a warning + unfixable

Comment on lines 260 to 265
$hasMapping = $this->mappingService->hasValidMappingByEntityId(
$validationContext->getMigrationContext()->getConnection()->getId(),
$referenceEntity,
$fkValue,
$validationContext->getContext()
);
Copy link
Contributor

@MalteJanz MalteJanz Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about this: we might ignore a edge case here:

Yes in the best case, e.g. an product is converted (and has a mapping) before an order is converted and then trying to look up that product id in the mappings DB should work. So far so good.

But in practice we also often convert an entity together with some associations. e.g. the product data contains it's manufacturer data:

$converted['manufacturer'] = $this->getManufacturer($data['manufacturer']);

which still creates a mapping entry:

$mapping = $this->mappingService->getOrCreateMapping(
$this->connectionId,
DefaultEntities::PRODUCT_MANUFACTURER,
$data['id'],
$this->context
);

but that mapping entry isn't immediately flushed to the DB, only after a whole batch is processed:

The mapping service correctly handles this with it's internal cache:

$cacheKey = $entityName . $oldIdentifier;
if (isset($this->mappings[$cacheKey])) {
return $this->mappings[$cacheKey];
}

So doing the same mapping lookup twice will return the same uuid instead of generating a new one.

BUT it looks like your MigrationValidationService is called once per entity before the mappings are flushed to the DB:

$this->validationService->validate(
$migrationContext,
$context,
$convertStruct->getConverted(),
$dataSet::getEntity(),
$item
);

Which isn't a problem in itself, but here comes the

TLDR: the marked function call to $this->mappingService->hasValidMappingByEntityId is implemented like this:

public function hasValidMappingByEntityId(string $connectionId, string $entityName, string $entityId, Context $context): bool
{
$criteria = new Criteria();
$criteria->addFilter(
new EqualsFilter('connectionId', $connectionId),
new EqualsFilter('entity', $entityName),
new EqualsFilter('entityId', $entityId),
new NotFilter(MultiFilter::CONNECTION_AND, [
new EqualsFilter('oldIdentifier', null),
]),
);
$criteria->setLimit(1);
return $this->migrationMappingRepo->searchIds($criteria, $context)->getTotal() > 0;
}

And thus always reaches for the DB, without respecting the local cache first. So these cases likely create validation errors, even though the mappings exists, just not yet in the DB 🤯

=> that hasValidMappingByEntityId method needs to be adjusted to search through the $this->writeArray first (unfortunately it can't use the $this-mappings cache, because that is indexed by old identifiers, not the new entityId)

Copy link
Contributor

@MalteJanz MalteJanz Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also thinking further about it: Is this kind of validation really necessary in practice? Which issues in a migration does it detect?

Because it creates another N+1 Query problem by potentially having to look for multiple existing mappings in the DB for every processed entity, which can't really be preloaded with the whole entity batch.

And that adds more complexity and brings us further away from solving that "N+1 Query Problem" for the migration in general, thus slower migrations / worse performance

@MalteJanz MalteJanz self-requested a review December 19, 2025 14:07
);
}

public function testShouldLogWhenEntityHasNowId(): void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public function testShouldLogWhenEntityHasNowId(): void
public function testShouldLogWhenEntityHasNoId(): void

?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants