Added PLR capability by adityalj · Pull Request #6 · b-shi/hipBLASLt

adityalj · 2025-07-14T23:24:20Z

No description provided.

adityalj · 2025-07-22T18:35:09Z

tensilelite/Tensile/SolutionStructs/Solution.py

        reject(state, printRejectionReason, "MIWaveTile0(%u) should be multiple of VectorWidthB(%u)" % (state["MIWaveTile"][1], state["VectorWidthB"]))
        return

+    if ((state["DepthU"] ==state["MatrixInstK"] )and state["PrefetchGlobalRead"]):


Add the wave sizes that are not supported here.

Rename D_U_iseqMI_K to DuEqMIK.
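
One way to act on the rename is to compute the derived flag in a single helper and store it under the proposed name. This is a minimal sketch, not Tensile's actual code: the function name `compute_du_eq_mik` and the plain-dict `state` are assumptions for illustration. Only the keys `DepthU`, `MatrixInstK`, `PrefetchGlobalRead` and the name `DuEqMIK` come from the diff and the comment above.

```python
def compute_du_eq_mik(state):
    """Return True when DepthU equals MatrixInstK and global-read prefetch is enabled.

    `state` is assumed to be a dict-like solution state (hypothetical stand-in).
    """
    return state["DepthU"] == state["MatrixInstK"] and bool(state["PrefetchGlobalRead"])


# Example values are made up for illustration.
state = {"DepthU": 16, "MatrixInstK": 16, "PrefetchGlobalRead": 1}
state["DuEqMIK"] = compute_du_eq_mik(state)
```

Downstream checks such as `if kernel["D_U_iseqMI_K"]:` would then read the renamed key instead of repeating the comparison.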

adityalj · 2025-07-22T18:52:29Z

tensilelite/Tensile/Components/SIA.py

+        mfmaiter2 = math.ceil(kernel["MIWaveTile"][0]/2) * math.floor(kernel["MIWaveTile"][1]/2)
+        writer.states.syncPlrMfmaIndex = (mfmaiter0 + mfmaiter1 + mfmaiter2)
+        if ( kernel["UseF32XEmulation"]) :
+            writer.states.syncPlrMfmaIndex = writer.states.syncPlrMfmaIndex *3   # TF32


For complex types, multiply by 4.
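
The scaling discussed here could be factored into one helper so the TF32 and complex multipliers sit side by side. This is only a sketch under stated assumptions: the function name, the `is_complex` argument, and the decision to reject the combined case are hypothetical. The factor 3 comes from the diff and the factor 4 from the comment above; how `mfmaiter0` and `mfmaiter1` are computed is not shown in this excerpt, so the base index is passed in as-is.

```python
def scale_sync_plr_mfma_index(base_index, use_f32x_emulation=False, is_complex=False):
    """Scale the PLR sync MFMA index by a data-type dependent factor.

    base_index: the unscaled sum (mfmaiter0 + mfmaiter1 + mfmaiter2 in the diff).
    """
    if use_f32x_emulation and is_complex:
        # Not addressed in the review; refusing is an assumption of this sketch.
        raise NotImplementedError("combined TF32-emulation and complex case is unspecified")
    if use_f32x_emulation:
        return base_index * 3  # TF32 emulation, as in the diff
    if is_complex:
        return base_index * 4  # complex types, as suggested in the review
    return base_index
```

Keeping both factors in one place makes it harder for a later data type to be added in one branch and forgotten in another.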

adityalj · 2025-07-22T19:00:52Z

tensilelite/Tensile/Components/SIA.py

    if kernel["1LDSBuffer"] or kernel["DirectToLds"]:
        writer.states.sync1LdsMfmaIndex = max(writer.states.lwStartMfmaIndex - 1, 0)
    startIter = writer.states.lwStartMfmaIndex//numMfmaPerIter
+    if kernel["D_U_iseqMI_K"]:


Make this more flexible.

adityalj · 2025-07-22T19:03:54Z

tensilelite/Tensile/Components/LocalRead.py

        else:
-            for vIdx in range(0, numVectorsPerTile):
-                for eIdx in range(0, numReadsPerVector):
+            eIdxCnt = numReadsPerVector


Test on usual cases

Generate an example to demonstrate the issue

adityalj · 2025-07-22T19:16:56Z

tensilelite/Tensile/KernelWriter.py

      isBarrier = kernel["LoopIters"] - self.states.numItersPLR
      writeItems = list(localWriteCode.items())
      macIterItems = macIterCode.flatitems()
+      numMfmaPerIter = len(macIterItems)


Print subiter instead of iter at the appropriate place.

adityalj · 2025-07-22T19:19:55Z

tensilelite/Tensile/KernelWriter.py

      itemCounter = 0
      for i in range(numMfmaPerIter):
-        mfmaIndex = iteration * numMfmaPerIter + i
+        kernel["mfmaIndex"] = kernel["mfmaIndex"] + 1


Use state instead of kernel.
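
To illustrate the suggestion, the sketch below keeps the running MFMA counter on a separate state object rather than in the `kernel` dict, and compares it with the closed-form index the diff removes. Everything here is an assumption for illustration: the `SimpleNamespace` stands in for the writer's state object, the function names are made up, and the starting value of -1 is a guess, since the initial value is not shown in the diff.

```python
from types import SimpleNamespace


def closed_form_indices(num_iters, num_mfma_per_iter):
    """Indices from the removed line: iteration * numMfmaPerIter + i."""
    return [it * num_mfma_per_iter + i
            for it in range(num_iters)
            for i in range(num_mfma_per_iter)]


def running_counter_indices(num_iters, num_mfma_per_iter):
    """Indices from a counter that is incremented once per MFMA."""
    states = SimpleNamespace(mfmaIndex=-1)  # hypothetical state; -1 so the first MFMA gets 0
    seen = []
    for _ in range(num_iters):
        for _ in range(num_mfma_per_iter):
            states.mfmaIndex += 1
            seen.append(states.mfmaIndex)
    return seen
```

With a fixed number of MFMAs per iteration, both functions produce the same sequence. The running counter also stays contiguous if that number differs between iterations, and keeping it on the state object avoids mutating the kernel configuration during code generation.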

adityalj · 2025-07-22T19:23:08Z

tensilelite/Tensile/KernelWriter.py

            iterCode.add(SSetPrior(prior=3, comment="store optimization"))
        if (mfmaIndex >= self.states.lwStartMfmaIndex):
          numLoops, itemCounter = calculateRangeAndUpdateCounter(itemCounter, localWriteCodeCounts, self.states.numLocalWriteModPerMfma)
+          if kernel["D_U_iseqMI_K"]:


Consider the DTL (DirectToLds) scenario.

adityalj · 2025-07-22T19:29:50Z

tensilelite/Tensile/KernelWriter.py

          self.makeSchedule(kernel, tensorParametersA, tensorParametersB, localWriteEndIter, skipGlobalReadInc=False, lastLoop=NLLlast, isNGLL=isNGLL)
          module.add(self.codes.unrollLoopHeader)

+      if kernel["D_U_iseqMI_K"]:


This should depend on the quad-cycle count.

adityalj added 4 commits July 14, 2025 17:43

Added PLR capability

90541e8

Added 3 multiplier for tf32

74410a2

tf32 8x8 tile size works

684c23b

Most of the sizes fixed

474e91c

adityalj commented Jul 23, 2025

Changed iter to subiter

a3fd9d7

adityalj force-pushed the adijoshi_plr_tf32 branch from 580587b to a3fd9d7 Compare July 23, 2025 20:25

adityalj wants to merge 5 commits into b-shi:tf32_perf from adityalj:adijoshi_plr_tf32

1 participant
