kmWriteDef* (Injection)#54

Draft
RoadrunnerWMC wants to merge 5 commits into Treeki:master from RoadrunnerWMC:injection_3

@RoadrunnerWMC RoadrunnerWMC commented Mar 2, 2026

Status of this PR: Tested and working in both NSMBW-Updated and a linkage stress-test I made, which will be added to #52 once this PR is merged. I also have some optimizations in mind, but I plan to save those for follow-up PRs.


This PR adds a new family of macros that let you "inject" code directly into a specified address range:

C++ (kamek.h):

  • kmWriteDefAsm(startAddr[, endAddr[, flags[, pad]]]) { ... }
  • kmWriteDefCpp(startAddr, endAddr, returnType, ...) { ... }
  • kmWriteNops(startAddr, endAddr);

Assembly (kamek_asm.S):

  • kmWriteDefStart startAddr[, endAddr[, flags[, pad]]] + kmWriteDefEnd
  • kmWriteNops startAddr, endAddr

This can make certain types of patches cleaner, as well as help reduce patches' memory footprints.

For example, this patch from NSMBW-Updated:

kmWrite32(0x8013f41c, 0x28000002);  // cmplwi r0, 2

Can now be written as simply:

kmWriteDefAsm(0x8013f41c) { cmplwi r0, 2 }
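
For context, the word in the kmWrite32 version is what you get by packing the instruction's fields by hand, which is the step the new macro leaves to the assembler. A minimal standalone sketch of that packing (encodeCmplwi is a hypothetical helper written for this illustration, not part of Kamek):

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not part of Kamek).
// `cmplwi crfD, rA, uimm` is a simplified mnemonic for `cmpli crfD, 0, rA, uimm`.
// D-form layout: primary opcode 10 (6 bits) | crfD (3) | reserved (1) | L=0 (1) | rA (5) | UIMM (16).
constexpr std::uint32_t encodeCmplwi(unsigned crfD, unsigned rA, std::uint16_t uimm) {
    return (10u << 26) | ((crfD & 7u) << 23) | ((rA & 31u) << 16) | uimm;
}

// `cmplwi r0, 2` (crfD defaults to cr0) packs to the value used in the kmWrite32 patch.
static_assert(encodeCmplwi(0, 0, 2) == 0x28000002u, "cmplwi r0, 2");
```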

Or as a larger example, this patch:

kmWrite32(0x800508fc, 0x540007ff);  // clrlwi. r0, r0, 0x1f
kmWrite32(0x80050900, 0x41820014);  // beq NOT_DIRECT_PIPE_END
kmWrite32(0x80050904, 0xa0e30004);  // lhz r7, 4(r3)
kmWrite32(0x80050908, 0x3807fffe);  // subi r0, r7, 0x2
kmWrite32(0x8005090c, 0xb01f042c);  // sth r0, 0x42c(r31)
kmWrite32(0x80050910, 0x4800000c);  // b AFTER_DIRECT_PIPE_END_CHECK
                                    // NOT_DIRECT_PIPE_END:
kmWrite32(0x80050914, 0x38000001);  // li r0, 1
kmWrite32(0x80050918, 0xb01f042c);  // sth r0, 0x42c(r31)
                                    // AFTER_DIRECT_PIPE_END_CHECK:
kmWrite32(0x8005091c, 0xa0a30002);  // lhz r5, 2(r3)
kmWrite32(0x80050920, 0xa89f042c);  // lha r4, 0x42c(r31)
kmWrite32(0x80050924, 0x80c6003c);  // lwz r6, 0x3c(r6)
kmWrite32(0x80050928, 0x7c052214);  // add r0, r5, r4
kmWrite32(0x8005092c, 0x54002036);  // slwi r0, r0, 4
kmWrite32(0x80050930, 0x7ca60214);  // add r5, r6, r0

Can be rewritten as:

kmWriteDefAsm(0x800508fc, 0x80050930) {
    clrlwi. r0, r0, 0x1f
    beq NOT_DIRECT_PIPE_END
    lhz r7, 4(r3)
    subi r0, r7, 0x2
    sth r0, 0x42c(r31)
    b AFTER_DIRECT_PIPE_END_CHECK
NOT_DIRECT_PIPE_END:
    li r0, 1
    sth r0, 0x42c(r31)
AFTER_DIRECT_PIPE_END_CHECK:
    lhz r5, 2(r3)
    lha r4, 0x42c(r31)
    lwz r6, 0x3c(r6)
    add r0, r5, r4
    slwi r0, r0, 4
    add r5, r6, r0
}
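
One thing the hand-assembled version hides is that every branch word depends on the distance between the branch and its label, so inserting or removing an instruction means recomputing those words. A rough sketch of that arithmetic, using hypothetical helper names and checked against the two branches in the patch above:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only (not part of Kamek).

// `b target`: I-form, primary opcode 18, AA=0, LK=0, 26-bit signed byte displacement.
inline std::uint32_t encodeB(std::uint32_t from, std::uint32_t to) {
    const std::int32_t disp = static_cast<std::int32_t>(to - from);
    assert((disp & 3) == 0);                         // instructions are word-aligned
    assert(disp >= -0x2000000 && disp < 0x2000000);  // must fit the LI field
    return (18u << 26) | (static_cast<std::uint32_t>(disp) & 0x03FFFFFCu);
}

// `beq target` is `bc 12, 2, target`: B-form, primary opcode 16, BO=12, BI=2 (cr0 EQ).
inline std::uint32_t encodeBeq(std::uint32_t from, std::uint32_t to) {
    const std::int32_t disp = static_cast<std::int32_t>(to - from);
    assert((disp & 3) == 0);                 // instructions are word-aligned
    assert(disp >= -0x8000 && disp < 0x8000);  // must fit the BD field
    return (16u << 26) | (12u << 21) | (2u << 16)
         | (static_cast<std::uint32_t>(disp) & 0x0000FFFCu);
}
```

With the label-based form, the assembler does this bookkeeping, so NOT_DIRECT_PIPE_END and AFTER_DIRECT_PIPE_END_CHECK can move freely as the patch is edited.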

Compared to assembling instructions by hand, as in the kmWrite32 versions above, this is more convenient, and because it's properly integrated with Kamek, relocations are supported. That means your injected code can reference game addresses and other custom-code addresses, and Kamek will ensure everything is linked and translated appropriately, just like with other patch types.

For whole-function replacements that were previously done with kmBranch and related macros, using kmWriteDefCpp/kmWriteDefAsm saves memory: the replacement is written over the original function, so no additional memory is required at all.[1] The downside, of course, is that replacement functions cannot be larger than the originals when patched in this way.

Usage Details

C++ macros

kmWriteDefAsm(startAddr[, endAddr[, flags[, pad]]]) { ... }

Inject assembly code directly into the target executable, overwriting whatever's there.

  • startAddr and endAddr are the addresses of the first and last instructions to replace. If endAddr is omitted, it defaults to startAddr.
  • flags specifies options:
    • KM_INJECT_STRIP_BLR_PAST: If the specified address range has size N, and the compiled code has size exactly N+4, and the last four bytes are a blr instruction, delete the instruction instead of raising a link-time error.

      Rationale: This flag makes it possible to write simple patches like
      kmWriteDefAsm(0x8013f41c) { cmplwi r0, 2 }
      without needing to add an explicit `nofralloc` every time.

      This is toggleable because implicit blr stripping may be undesirable in some cases.

    • KM_INJECT_ADD_PADDING: If the specified address range has size N, and the compiled code has size < N, pad the remaining space with the provided pad value. (If this flag isn't set, the remaining space is left untouched.)

      Rationale: This flag makes it convenient to replace large chunks of code with a smaller amount of replacement code, without having to carefully count the number of instructions and add an appropriate amount of explicit nops to reach the right length.

      This is toggleable because if the user intends to (say) replace an entire function, padding is unnecessary and would only serve to bloat the compiled patch file size.

    • The default is for both of the above flags to be enabled.

  • pad is the value to fill extra space with if KM_INJECT_ADD_PADDING is set. Default is nop (0x60000000).

In most cases, startAddr and endAddr will be all you need. flags and pad are expected to be used only rarely.
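
To make the interaction between the two flags concrete, here is a simplified model of the size-fitting rules described above. It is a sketch only: the flag values, the function name fitToRange, and the use of an exception for the link-time error are assumptions made for this illustration, and Kamek's actual implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical flag values, chosen for this sketch only.
enum : std::uint32_t {
    KM_INJECT_STRIP_BLR_PAST = 1u << 0,
    KM_INJECT_ADD_PADDING    = 1u << 1,
};

constexpr std::uint32_t kBlr = 0x4E800020u;  // blr
constexpr std::uint32_t kNop = 0x60000000u;  // nop (the documented default pad)

// Returns the instruction words that would be written starting at startAddr.
std::vector<std::uint32_t> fitToRange(
        std::vector<std::uint32_t> code,
        std::uint32_t startAddr, std::uint32_t endAddr,
        std::uint32_t flags = KM_INJECT_STRIP_BLR_PAST | KM_INJECT_ADD_PADDING,
        std::uint32_t pad = kNop) {
    assert(startAddr <= endAddr && (endAddr - startAddr) % 4 == 0);
    // endAddr is inclusive, so a one-instruction range has startAddr == endAddr.
    const std::size_t capacity = (endAddr - startAddr) / 4 + 1;

    // Drop a trailing blr only if it is exactly the one word that overflows the range.
    if ((flags & KM_INJECT_STRIP_BLR_PAST)
            && code.size() == capacity + 1 && code.back() == kBlr) {
        code.pop_back();
    }

    // Stand-in for the link-time error.
    if (code.size() > capacity) {
        throw std::length_error("injected code does not fit in the target range");
    }

    // Fill any remaining space; without the flag, it is left untouched.
    if (flags & KM_INJECT_ADD_PADDING) {
        code.resize(capacity, pad);
    }
    return code;
}
```

Under this model, a body of `cmplwi r0, 2` followed by an implicit blr fits a single-instruction range, and an empty body over a three-instruction range becomes three nops.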

kmWriteDefCpp(startAddr, endAddr, returnType, ...) { ... }

Inject a C++ function defined directly underneath into the target executable, overwriting whatever function is already there.

startAddr and endAddr are the same as in kmWriteDefAsm, and the remaining argument(s) are the same as in kmBranchDefCpp/kmCallDefCpp.

Regarding flags and pad: These values are not configurable for this macro, and are both hardcoded to 0. This is because I can't think of any non-contrived use cases for them, and unlike in kmWriteDefAsm, they would always muddy up the macro signature: they can't be added as optional arguments, because the macro already uses optional arguments to collect function arguments.

If a use case is found in the future, we can add a kmWriteDefCppEx macro that includes the extra arguments, or something. But for now, I don't see the point.

kmWriteNops(startAddr, endAddr);

This is syntactic sugar for (conceptually) kmWriteDefAsm(startAddr, endAddr) { }: with the default flags, the empty body is padded out to fill the whole range with nops.

To write a single nop, use kmWriteNop(addr) (already in Kamek, not added by this PR).

Assembly macros

kmWriteDefStart startAddr[, endAddr[, flags[, pad]]] + kmWriteDefEnd

Equivalent to kmWriteDefAsm in C++.

// One-, two-, three-, and four-argument versions are all supported.
// Separate arguments with commas.
kmWriteDefStart 0x8013f41c
    cmplwi r0, 2
kmWriteDefEnd

The use of an "end" macro is unfortunately required for technical reasons.

kmWriteNops startAddr, endAddr

Equivalent to kmWriteNops in C++. kmWriteNop exists, too.

Kamekfile format and loader changes

To support this new feature, the loader is updated to use the new Kamekfile v3 format, which adds a new kWriteRange command type. The updated loader also now requires game harnesses (e.g. nsmbw.cpp) to supply a memcpy pointer in the loaderFunctions table.
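
As a rough picture of why the loader needs a memcpy pointer, a write-range command boils down to copying a block of bytes from the Kamekfile to a target address. The sketch below is not the actual loader code or the Kamekfile v3 layout: the struct names, field names, and the memcpyFn member are all hypothetical, chosen only to show the shape of the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical subset of the loader's function table; only the new entry is shown.
struct LoaderFunctions {
    void* (*memcpyFn)(void* dest, const void* src, std::size_t count);
};

// Hypothetical in-memory form of a decoded write-range command.
struct WriteRangeCommand {
    std::uint8_t* dest;        // target address (already translated for this game version)
    const std::uint8_t* data;  // payload bytes stored in the Kamekfile
    std::size_t size;          // payload length in bytes
};

// What a game harness might supply. On real hardware this would point at the
// game's own memcpy; here it simply forwards to the standard library.
inline void* harnessMemcpy(void* dest, const void* src, std::size_t count) {
    return std::memcpy(dest, src, count);
}

// Applying the command is a single copy through the harness-provided pointer.
inline void applyWriteRange(const LoaderFunctions& funcs, const WriteRangeCommand& cmd) {
    funcs.memcpyFn(cmd.dest, cmd.data, cmd.size);
}
```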

It's worth pointing out that #59 also increments the Kamekfile version field, currently to 3. Whichever PR is merged second will have its version number changed to 4 instead.

Implementation notes

Use of sections

Injected functions are placed in their own sections called .km_inject_<unique identifier>.

Unlike other hook types, which use structs in the .kamek section to declare their metadata, each injected function declares its metadata in another bespoke section, .km_inject_<unique identifier>_meta. This is done to simplify the implementation. Kamek needs to determine every section's base address extremely early in the linking process, and for injection sections, this is part of the hook metadata. On the other hand, hooks in the .kamek section are the last thing the Linker class processes: it requires relocations to have already been processed, which requires symbols to have already been processed, which requires all sections to already have their base addresses assigned.

In earlier versions of this PR, I experimented with techniques like using multiple processing "passes" or adding stripped-down versions of the symbol-, relocation-, and hook-processing functions so that injection metadata could be parsed earlier. These all ended up much more complicated and ugly than the final approach. Raw section data (apart from relocations, which we don't need for this) is available right at the start of linking, so using extra sections like this is a simple way to convey the information in a way that can be easily parsed when we need to. And just like the .kamek section, these extra metadata sections are discarded during the linking process, so this design choice doesn't affect the final output at all.
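
Because the naming scheme carries the pairing, telling the two kinds of injection sections apart only requires looking at section names, which are available before any symbols or relocations are processed. A minimal sketch of that classification follows. It is written in C++ for consistency with the rest of this description and is not Kamek's actual code; the enum and function names are hypothetical, and it assumes unique identifiers never themselves end in "_meta".

```cpp
#include <string>

// Hypothetical classification, for illustration only.
enum class SectionKind { Other, InjectCode, InjectMeta };

inline SectionKind classifySection(const std::string& name) {
    static const std::string prefix = ".km_inject_";
    static const std::string metaSuffix = "_meta";

    // Must start with the prefix and have a non-empty identifier after it.
    if (name.size() <= prefix.size() || name.compare(0, prefix.size(), prefix) != 0) {
        return SectionKind::Other;
    }

    // ".km_inject_<id>_meta" holds the metadata for ".km_inject_<id>".
    if (name.size() > prefix.size() + metaSuffix.size()
            && name.compare(name.size() - metaSuffix.size(),
                            metaSuffix.size(), metaSuffix) == 0) {
        return SectionKind::InjectMeta;
    }
    return SectionKind::InjectCode;
}
```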

Not planned

Using symbol names as targets instead of explicit addresses. For example:

// This does NOT work
kmWriteDefAsm(getLength__16dIggyWanKusari_cCFv) { /* ... */ }

Rationale: Not only would supporting this make the implementation more complex, but the only possible use case for it would be to replace a function with a blr...

// This (still) does NOT work
kmWriteDefAsm(getLength__16dIggyWanKusari_cCFv) { nofralloc; blr }  // (or just leave empty)

...since replacing more than one instruction requires specifying the start and end addresses, which would defeat the purpose of using a symbol address in the first place:

// Hopefully clearer now why this feature would be kind of pointless
kmWriteDefAsm(getLength__16dIggyWanKusari_cCFv, 0x800b95fc) {
    fsub f1, f1, f1
}

Discussion: End-address inclusivity/exclusivity

There's an argument to be made for making endAddr exclusive rather than inclusive. Essentially, instead of thinking of it as "the last instruction to be overwritten", it would instead refer to "the next instruction to be run". This would parallel very nicely with kmBranchDefAsm and its exitPoint parameter.

A much older version of this branch, from before this PR was made, implemented it that way. However, after thinking about it for a while, and discussing with others, it was decided that the exclusive-upper-bound semantics would be too unintuitive, even if it would match kmBranchDefAsm better. So it was changed.
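
The difference between the two conventions is a single off-by-one in how the range size is computed. A small sketch, with hypothetical helper names, using the address ranges from the examples earlier in this description:

```cpp
#include <cstdint>

constexpr std::uint32_t kInsnSize = 4;  // every PowerPC instruction is 4 bytes

// Inclusive convention (the one this PR uses): endAddr is the last instruction overwritten.
constexpr std::uint32_t countInclusive(std::uint32_t startAddr, std::uint32_t endAddr) {
    return (endAddr - startAddr) / kInsnSize + 1;
}

// Exclusive convention (considered and rejected): the end is the next instruction to run.
constexpr std::uint32_t countExclusive(std::uint32_t startAddr, std::uint32_t endExclusive) {
    return (endExclusive - startAddr) / kInsnSize;
}

// The 14-instruction example: same range, two spellings of the upper bound.
static_assert(countInclusive(0x800508fc, 0x80050930) == 14, "inclusive end");
static_assert(countExclusive(0x800508fc, 0x80050934) == 14, "exclusive end");

// A single-instruction patch: the inclusive form lets endAddr default to startAddr.
static_assert(countInclusive(0x8013f41c, 0x8013f41c) == 1, "one instruction");
```

With inclusive bounds, both addresses in a patch appear verbatim in the original kmWrite32 list, which is part of why that form reads more naturally.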

Footnotes

  1. After the loading process is completed during game boot, at least. Also, some memory might still be required for things like static local variables and string constants, if the new function uses any.

@RoadrunnerWMC RoadrunnerWMC force-pushed the injection_3 branch 6 times, most recently from 91420a8 to eaec91e Compare March 3, 2026 08:47
@RoadrunnerWMC RoadrunnerWMC force-pushed the injection_3 branch 5 times, most recently from 39ff202 to 7dd602d Compare March 10, 2026 04:59
@RoadrunnerWMC RoadrunnerWMC force-pushed the injection_3 branch 11 times, most recently from cad4e77 to 61724f0 Compare March 12, 2026 10:37
@RoadrunnerWMC RoadrunnerWMC changed the title kmWriteDefAsm (Injection) kmWriteDef* (Injection) Mar 12, 2026
@RoadrunnerWMC RoadrunnerWMC force-pushed the injection_3 branch 3 times, most recently from 2da1c77 to 5af0610 Compare March 13, 2026 08:57
@RoadrunnerWMC RoadrunnerWMC force-pushed the injection_3 branch 2 times, most recently from 7f3c528 to 1102795 Compare March 13, 2026 09:12
The previous implementation temporarily placed the .kamek (hooks) section into the simulated Wii memory just after the .bss section. Any relocations that applied to this address range were assumed to apply to the hook data, which was fine because there was no way to violate that assumption at the time.

However, with the new code-injection feature, it's now possible to inject code into arbitrary static address ranges. If building in -static mode, that means that .kamek is given a fixed, absolute address range, which can potentially conflict with one or more injected code sections. It's a rare edge case, but it can happen. When it does, Kamek can no longer distinguish between relocations meant to apply to the injected section(s) and ones meant to apply to the hook data, which can lead to exceptions or incorrect behavior.

To fix this, this commit makes Kamek keep track of the .kamek section data *separately*, outside of the Wii address space entirely. This somewhat complicates the logic, unfortunately, but it fixes the root cause of the issue.