SWDEV-558359 - [Topaz][Gigapixel][Performance] The clc 512x512 models are running with overall lower performance for MIGraphX EP than for DirectML EP #188
Conversation
…pMode() function - prevent casting of int64 to int32 in IsUnsupportedOpMode() function for "Slice" operator
This will need to be cherry-picked into wml-main after merging.

What in the slice check is being unsupported? Why are you modifying code that's doing index checks?
The fix is in the toVector call: it should use a larger type. It's currently using int, which may default to 32 bits on some systems. If you're seeing rollover, that's likely the culprit.
Either make this a larger type to avoid casting values down to int32 (since int converts to unsigned), or use size_t/int64_t:
```cpp
std::vector<int> toVector(const ONNX_NAMESPACE::int64s& nums) {
  std::vector<int> result;
  int num = nums.size();
  for (int i = 0; i < num; ++i) {
    result.push_back(static_cast<int>(nums[i]));
  }
  return result;
}
```
Also, don't remove the .at() bounds checks here; these shouldn't be replaced with direct access.
Merged 8dfad21 into ROCm:rocm7.1_internal_testing
@Zhaeong good to cherry-pick this off the tip of rocm7.1_internal_testing. I'll upstream this to OnnxRT.
Upstreamed here - microsoft#26403

Merged to wml-main.
Description
Bug fix: large integers were incorrectly converted to -1 in the IsUnsupportedOpMode() function.
Prevent casting of int64 to int32 in IsUnsupportedOpMode() for the "Slice" operator.
Motivation and Context
Currently, the clc models in Topaz underperform DML by 52x, and the root cause is the "Slice" node being flagged as unsupported.
This change fixes an issue where start and end attributes for the slice node were being converted to -1 instead of their real int64 value. The -1 end or start value was causing migraphx/onnxruntime to incorrectly flag this node as unsupported.
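For illustration (this code is not from the PR), the -1 values can be reproduced directly: narrowing a 64-bit value to 32 bits keeps only the low word, and INT64_MAX, which ONNX Slice commonly uses as an "end of axis" sentinel, has a low word of 0xFFFFFFFF, which reads back as -1 on two's-complement targets:

```cpp
#include <cstdint>

// Illustrative helper (hypothetical name): narrowing keeps the low
// 32 bits, so INT64_MAX (0x7FFFFFFFFFFFFFFF) becomes 0xFFFFFFFF,
// i.e. -1 as a signed 32-bit integer.
int32_t narrowTo32(int64_t v) {
  return static_cast<int32_t>(v);
}
```

This is why the start/end attributes appeared as -1 downstream even though the model stored valid int64 values.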
Local testing on my machine suggests inference time drops from 90 seconds to 3 seconds, a massive performance boost that considerably closes the gap between DML and MIGraphX.