Skip to content

Conversation

@MichalStrehovsky
Copy link
Member

Successor of #112406. That branch has way too many merge conflicts, so leaving the PR as-is for posterity and starting a new one with a clean slate (I basically manually ported everything).

This changes interface calls when CFG is enabled to use a resolve helper instead of the stub dispatcher. I've also extended the CFG testing a bit to cover more of interface calls in the smoke test.

static int Call(IFoo f, int x, int y) => f.Call(x, y);

Before this change:

00007FF605971D50  push        rbx
00007FF605971D51  sub         rsp,20h
00007FF605971D55  mov         qword ptr [rsp+30h],rcx
00007FF605971D5A  mov         r10d,edx
00007FF605971D5D  mov         ebx,r8d
00007FF605971D60  mov         rcx,qword ptr [__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]
00007FF605971D67  call        qword ptr [__guard_check_icall_fptr (07FF6059CD558h)]
00007FF605971D6D  mov         rax,rcx
00007FF605971D70  mov         rcx,qword ptr [rsp+30h]
00007FF605971D75  mov         r8d,ebx
00007FF605971D78  mov         edx,r10d
00007FF605971D7B  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]
00007FF605971D82  call        rax
00007FF605971D84  nop
00007FF605971D85  add         rsp,20h
00007FF605971D89  pop         rbx
00007FF605971D8A  ret

After this change:

00007FF704181CF0  sub         rsp,28h
00007FF704181CF4  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF7041CEBD0h)]
00007FF704181CFB  call        RhpResolveInterfaceMethodFast (07FF703FFF5A0h)
00007FF704181D00  call        qword ptr [__guard_dispatch_icall_fptr (07FF7041DD560h)]
00007FF704181D06  nop
00007FF704181D07  add         rsp,28h
00007FF704181D0B  ret

Cc @dotnet/ilc-contrib @jakobbotsch

Successor of dotnet#112406. That branch has way too many merge conflicts, so leaving the PR as-is for posterity and starting a new one with a clean slate (I basically manually ported everything).

This changes interface calls when CFG is enabled to use a resolve helper instead of the stub dispatcher. I've also extended the CFG testing a bit to cover more of interface calls in the smoke test.

```csharp
static int Call(IFoo f, int x, int y) => f.Call(x, y);
```

Before this change:

```
00007FF605971D50  push        rbx
00007FF605971D51  sub         rsp,20h
00007FF605971D55  mov         qword ptr [rsp+30h],rcx
00007FF605971D5A  mov         r10d,edx
00007FF605971D5D  mov         ebx,r8d
00007FF605971D60  mov         rcx,qword ptr [__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]
00007FF605971D67  call        qword ptr [__guard_check_icall_fptr (07FF6059CD558h)]
00007FF605971D6D  mov         rax,rcx
00007FF605971D70  mov         rcx,qword ptr [rsp+30h]
00007FF605971D75  mov         r8d,ebx
00007FF605971D78  mov         edx,r10d
00007FF605971D7B  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]
00007FF605971D82  call        rax
00007FF605971D84  nop
00007FF605971D85  add         rsp,20h
00007FF605971D89  pop         rbx
00007FF605971D8A  ret
```

After this change:

```
00007FF704181CF0  sub         rsp,28h
00007FF704181CF4  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF7041CEBD0h)]
00007FF704181CFB  call        RhpResolveInterfaceMethodFast (07FF703FFF5A0h)
00007FF704181D00  call        qword ptr [__guard_dispatch_icall_fptr (07FF7041DD560h)]
00007FF704181D06  nop
00007FF704181D07  add         rsp,28h
00007FF704181D0B  ret
```
Copilot AI review requested due to automatic review settings December 18, 2025 13:45

pSDFrame->SetCallSite(NULL, (TADDR)callSite.GetIndirectCell());

DispatchToken representativeToken = DispatchToken(VirtualCallStubManager::GetTokenFromStub(callSite.GetSiteTarget(), NULL));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the , NULL part because we now need to provide some context, but it doesn't seem necessary here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If cachedInterfaceDispatch is enabled for coreclr, then you definitely need to pass something here, as GetTokenFromStub will unconditionally read from the CONTEXT passed in to find the indirection cell. Possibly you could add a new method in parallel to GetTokenFromStub which called something like GetTokenFromStubAndIndirectionCell, which took the indirection cell pointer instead of pulling it out of a CONTEXT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I tried adding the new API and it does find a token that seems valid but under dotnet_jitforcecontrolflowguard=1/dotnet_usecachedinterfacedispatch=1 we then assert here because we can't find a VirtualCallStubManager:

VirtualCallStubManager *pMgr = VirtualCallStubManager::FindStubManager(callSiteTarget, &stubKind);
_ASSERTE(pMgr != NULL);

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes interface calls when Control Flow Guard (CFG) is enabled to use a resolve helper (CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT) instead of the stub dispatcher. The implementation adds a fast path for interface dispatch resolution on Windows for AMD64 and ARM64 architectures, improving performance while maintaining CFG security guarantees.

  • Introduces new helper RhpResolveInterfaceMethodFast for NativeAOT and VSD_ResolveWorkerForInterfaceLookupSlot for CoreCLR to resolve interface methods with inline cache lookup
  • Modifies JIT lowering to route virtual stub calls through the resolver when CFG is enabled, eliminating double indirection
  • Adds comprehensive test coverage for interface dispatch scenarios including exception handling and null reference checks

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/tests/nativeaot/SmokeTests/ControlFlowGuard/ControlFlowGuard.cs Adds test cases for interface dispatch with CFG, including generic and non-generic interfaces, multiple implementations, and exception scenarios
src/coreclr/vm/virtualcallstub.h Declares new VSD_ResolveWorkerForInterfaceLookupSlot function for interface method resolution
src/coreclr/vm/virtualcallstub.cpp Implements the new resolver worker that handles interface dispatch with null checking and cache resolution
src/coreclr/vm/jithelpers.cpp Adds external declaration for JIT_InterfaceLookupForSlot helper
src/coreclr/vm/contractimpl.h Removes INVALID_TOKEN consistency check from DispatchToken constructor
src/coreclr/vm/amd64/cgenamd64.cpp Adds UpdateRegDisplayFromArgumentRegisters to preserve argument registers for stack walking
src/coreclr/vm/amd64/VirtualCallStubAMD64.asm Implements JIT_InterfaceLookupForSlot assembly stub for AMD64
src/coreclr/vm/amd64/AsmHelpers.asm Removes stray TAILJMP_RAX instruction after epilog
src/coreclr/tools/aot/ILCompiler.RyuJit/JitInterface/CorInfoImpl.RyuJit.cs Maps helper to RhpResolveInterfaceMethodFast runtime function for NativeAOT
src/coreclr/tools/Common/JitInterface/CorInfoHelpFunc.cs Adds CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT helper enum value
src/coreclr/nativeaot/Runtime/inc/rhbinder.h Defines IDC_CACHE_POINTER_MASK constant for cache pointer masking
src/coreclr/nativeaot/Runtime/arm64/UniversalTransition.asm Adds parameterized return behavior to support both tail-call and return-result modes
src/coreclr/nativeaot/Runtime/arm64/DispatchResolve.asm Implements fast interface method resolution for ARM64 with cache lookup
src/coreclr/nativeaot/Runtime/amd64/UniversalTransition.asm Adds parameterized exit sequence to universal transition macro
src/coreclr/nativeaot/Runtime/amd64/DispatchResolve.asm Implements fast interface method resolution for AMD64 with polymorphic cache support
src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp Recognizes new universal transition return variants for stack unwinding
src/coreclr/nativeaot/Runtime/EHHelpers.cpp Adds RhpResolveInterfaceMethodFast to AV-translation locations
src/coreclr/nativeaot/Runtime/CMakeLists.txt Conditionally includes DispatchResolve.asm for Windows AMD64/ARM64 builds
src/coreclr/nativeaot/Runtime/AsmOffsets.h Adds assembly offsets for interface dispatch cache structures
src/coreclr/jit/targetarm64.h Defines register trash mask for the interface lookup helper on ARM64
src/coreclr/jit/targetamd64.h Defines register trash mask for the interface lookup helper on AMD64
src/coreclr/jit/morph.cpp Updates CFG handling to apply to all virtual calls, not just vtable calls
src/coreclr/jit/lower.cpp Implements lowering logic to transform VSD calls into resolver calls when CFG is enabled, removes VSD cell argument, and updates call structure
src/coreclr/jit/gentree.h Adds RemoveLate method declaration to CallArgs
src/coreclr/jit/gentree.cpp Implements RemoveLate to remove arguments from both early and late argument lists
src/coreclr/jit/emit.cpp Treats interface lookup helper specially to preserve argument registers during GC
src/coreclr/jit/codegencommon.cpp Defines helper kill set for interface lookup and marks it as needing label definition
src/coreclr/inc/jithelpers.h Maps CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT to JIT_InterfaceLookupForSlot for AMD64
src/coreclr/inc/jiteeversionguid.h Updates JIT/EE version GUID to reflect interface changes
src/coreclr/inc/corinfo.h Adds CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT helper function enum
Comments suppressed due to low confidence (1)

src/coreclr/jit/lower.cpp:3759

  • This code block duplicates the logic already implemented in the cloneUse lambda function defined at lines 3630-3646. The call to cloneUse at line 3749 already handles the exact same logic as lines 3751-3759, making this code redundant. The result from cloneUse should be used directly without the additional if-else block.
        GenTree* indirCellClone = cloneUse(indirCellArgUse, cloneConsts);

        if (indirCellArgUse.Def()->OperIs(GT_LCL_VAR) || (cloneConsts && indirCellArgUse.Def()->IsCnsIntOrI()))
        {
            indirCellClone = comp->gtClone(indirCellArgUse.Def());
        }
        else
        {
            unsigned newLcl = indirCellArgUse.ReplaceWithLclVar(comp);
            indirCellClone  = comp->gtNewLclvNode(newLcl, TYP_I_IMPL);
        }

}

explicit DispatchToken(UINT_PTR token)
{
Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CONSISTENCY_CHECK assertion for the INVALID_TOKEN was removed from the DispatchToken constructor. This could allow invalid tokens to be created without detection. If this removal is intentional (e.g., to support a new use case where tokens can legitimately be INVALID_TOKEN), it should be documented with a comment explaining when and why an invalid token is acceptable.

Suggested change
{
{
LIMITED_METHOD_CONTRACT;
CONSISTENCY_CHECK(token != INVALID_TOKEN);

Copilot uses AI. Check for mistakes.
@MichalStrehovsky
Copy link
Member Author

/azp run jit-cfg, runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jkotas
Copy link
Member

jkotas commented Dec 21, 2025

/azp run jit-cfg

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Copy link
Member

jkotas commented Dec 22, 2025

/azp run jit-cfg, runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/azp run jit-cfg

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/azp run jit-cfg

Had to retrigger this because if this is triggered "too fast" after pushing to a branch (up to 1 minute), devops will start running against old commit (I guess?) and then immediately cancel.


EPILOG_WITH_TRANSITION_BLOCK_TAILCALL

EPILOG_NOP mov x0, x12
Copy link
Member

@jkotas jkotas Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakobbotsch Would it make sense for the resolve helper to return the target address in x9 so that the JIT does not need to move the registers around before dispatching the call?

bl          JIT_InterfaceLookupForSlot
mov         x9,x0 <- Avoid these moves 
mov         x0,x4 <-
bl          JIT_DispatchIndirectCall

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to go further, we can even make JIT_InterfaceLookupForSlot tailcall the CFG validation or dispatch helpers.

@jkotas
Copy link
Member

jkotas commented Dec 22, 2025

/azp run jit-cfg

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Copy link
Member

jkotas commented Dec 22, 2025

/azp run jit-cfg

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/azp run jit-cfg

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Copy link
Member

jkotas commented Dec 23, 2025

  • This is passing all tests except the async one that looks like a JIT bug
  • A few comments above need resolving

@am11
Copy link
Member

am11 commented Dec 24, 2025

The implementation adds a fast path for interface dispatch resolution on Windows for AMD64 and ARM64 architectures, improving performance while maintaining CFG security guarantees.

Will a similar fast path be implemented for Unix platforms as a follow-up task? Should we create a tracking issue for it?

@jkotas
Copy link
Member

jkotas commented Dec 25, 2025

Will a similar fast path be implemented for Unix platforms as a follow-up task?

Control Flow Guard is Windows specific feature. It does not make sense to build it on non-Windows in .NET runtime. It would not have the OS support to plug into. PA and IBT makes more sense for non-Windows/Linux.

https://github.com/dotnet/designs/blob/main/accepted/2021/runtime-security-mitigations.md has details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants