49 changes: 47 additions & 2 deletions AGENTS.md
@@ -106,11 +106,54 @@ Each line shows **one** instruction. When the marker is ` ` or `~`, both target

Differing operands are wrapped in `{braces}` for easy identification.

Matching instruction runs are collapsed by default (shows 3 lines of context). Options:
By default, long runs of fully matching instructions are **collapsed** into a summary line like `... 10 matching instructions ...`.
This is only a display shortcut: those instructions still exist and still match; they are just hidden to keep the diff compact.

With `--no-collapse`, those summary lines disappear and every instruction in matching regions is printed explicitly.
This is useful when mapping source lines to exact instruction neighborhoods or when checking whether a mismatch cluster is truly contiguous.

Quick rule of thumb for agents:
- default (collapsed): better for scanning large functions and finding mismatch hotspots quickly.
- `--no-collapse`: better for deep analysis when you need complete sequential instruction flow.

Options:
- `-C 5` — show 5 context lines instead of 3
- `--no-collapse` — show every instruction
- `--no-collapse` — disable match-run collapsing and show every instruction
- `--range 100-300` — only show instructions at hex offsets 0x100–0x300

### Using `m2c` for raw draft decompilation

If a local clone of [`m2c`](https://github.com/matt-kempster/m2c) is available, it can be used to produce a **rough first-pass C draft** from an extracted assembly file.
For this repository, the relevant input files are the dtk-generated assembly files under `build/GMSJ01/asm/<path>.s`.

`m2c` works here as an asm-to-draft tool, not as a source-of-truth decompiler.
Treat its output as scaffolding for understanding control flow and rough data flow, then verify everything against objdiff and the binary.

Tested example:

```bash
python D:/Develop/m2c/m2c.py -t ppc -f __dt__22TNerveFireWanwanEscapeFv --globals=none build/GMSJ01/asm/Enemy/fireWanwan.s
```

That command decompiles the function named by the `.fn` symbol in the asm file and prints a raw draft to stdout.
For larger functions, the same pattern works with the mangled symbol name, for example:

```bash
python D:/Develop/m2c/m2c.py -t ppc -f "execute__22TNerveFireWanwanEscapeCFP24TSpineBase<10TLiveActor>" --globals=none build/GMSJ01/asm/Enemy/fireWanwan.s
```

Practical workflow:
- find the TU assembly file in `build/GMSJ01/asm/...`
- open it and copy the mangled function name from the `.fn` line
- run `m2c` with `-t ppc` (`ppc` is the right alias for this CodeWarrior PowerPC output)
- use `--globals=none` when you only want the function body draft
- redirect stdout to a scratch file if needed for longer functions

Important limitations:
- `m2c` output is often rough for this codebase: inferred placeholder structs, bad field names, missing types, and occasional broken expressions are normal.
- `m2c` does **not** support the C++ context flow needed for this repository's real types, so do not try to feed it SMS C++ context files and expect good typed output.
- use `m2c` for orientation only; never trust it over the asm or decomp-diff.

## Source Organization

Each `.o` file maps 1:1 to a `.cpp` file. The path is listed in `configure.py` under `config.libs`. Each object has a status:
@@ -302,4 +345,6 @@ UNUSED functions must still be reconstructed in the source because:
- **Don't try to process diff tool output with external utilities**. It is designed to be read by agents, and filtering out certain lines via regex will miss crucial context. The diff shows the disassembly of the original binary and current code simultaneously and should be read directly and sequentially in an attempt to understand what the current code does differently from the original.
- **Read the full diff before acting**. Use `--no-collapse` to see every instruction. Identify all problem clusters before making changes, then work on the largest cluster first.
- **Focus on one part of the function at a time**. Identify exactly which lines in the source code a non-matching part of the disassembly corresponds to. Use the `--range` argument of the diff tool to see only the asm for the part being worked on.
- **Use temporary marker calls to map source to asm when anchors are missing**. If there are no obvious anchors (for example, no calls to known functions nearby), temporarily add a fake external marker like `extern void marker__();` and call it at a candidate point in the function. The call will show up clearly in diff output and helps bracket surrounding instructions, and you can repeat this process to narrow correspondence precisely.
- **Always remove marker calls after mapping**. Any extra call can change register allocation/scheduling and inhibit matching, so markers are strictly temporary debugging aids.
- **Read [docs/AGENT_MATCHING_TIPS.md](docs/AGENT_MATCHING_TIPS.md)** for detailed MWCC codegen patterns that come up repeatedly.
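
The marker-call technique from the list above can be sketched as follows. This is only an illustration: `clampSpeed` and `sMarkerHits` are hypothetical names, and `marker__` is given a body here purely so the sketch builds standalone; in a real TU only the `extern` declaration from the tip would be added.

```cpp
// Hypothetical counter, used only so the standalone sketch has observable behavior.
static int sMarkerHits = 0;

// In the decomp TU you would add just this declaration.
extern void marker__();

// Standalone-only definition (an assumption of this sketch, not part of the tip).
void marker__() { ++sMarkerHits; }

// Hypothetical function under investigation.
int clampSpeed(int speed)
{
    if (speed < 0)
        speed = 0;

    marker__(); // TEMPORARY: appears as a `bl marker__` between the two clamps in the diff

    if (speed > 100)
        speed = 100;

    return speed;
}
```

Once the surrounding instructions are identified, delete both the call and the declaration before judging the match, since the call itself can change register allocation and scheduling.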
139 changes: 106 additions & 33 deletions docs/AGENT_MATCHING_TIPS.md
@@ -4,50 +4,78 @@ This document collects practical knowledge about how the Metrowerks CodeWarrior

Read this document before attempting to match functions. The patterns described here are recurring and will save significant trial-and-error time. Expand the document as needed. Ask for human review whenever the contents of the document disagree with observed reality.

## MWCC dislikes reordering

The compiler never reorders memory stores, loads and function calls relative to one another.
They should always be written in the source code in exactly the order in which they appear in the target assembly.
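
A minimal sketch of what this means in practice. The `TActor` struct, `notify` and `resetActor` are hypothetical names, and the instruction order in the comment is an assumed example rather than output from the real binary.

```cpp
// Hypothetical types and names, for illustration only.
struct TActor {
    int mState;
    int mTimer;
};

static int sNotifyCount = 0;

// Stand-in for any out-of-line call seen in the target assembly.
void notify() { ++sNotifyCount; }

// Assumed target order: stw (mState), bl notify, stw (mTimer).
// The source mirrors that order statement by statement.
void resetActor(TActor* actor)
{
    actor->mState = 0;  // first store
    notify();           // the call sits between the two stores
    actor->mTimer = 60; // second store
}
```

If the target instead showed both stores before the call, the two assignments would have to precede `notify()` in the source as well.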

## MWCC can eliminate redundant reads, but not writes

All writes to memory that happen in the source are reproduced in the assembly, and vice versa: when reconstructing code from assembly, you MUST write out repeated, redundant stores to memory, because that is how the original code was written.

Redundant reads, on the other hand, can be eliminated by MWCC, but not always, e.g. function inlining might inhibit this.
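
As a rough illustration (the `TCounter` struct and `initCounter` are hypothetical, and the store sequence is an assumed example), a target that stores to the same field twice is reproduced by writing both assignments:

```cpp
// Hypothetical struct, for illustration only.
struct TCounter {
    int mValue;
    int mFlags;
};

// Assumed target: stw (mFlags), stw (mValue), stw (mFlags) again.
void initCounter(TCounter* counter)
{
    counter->mFlags = 0; // looks redundant, but the target has this store
    counter->mValue = 1;
    counter->mFlags = 0; // second store to the same field, written out explicitly
}
```

Dropping the first assignment as "dead" would remove a store that, per the rule above, exists in the original.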

## MWCC is very reluctant to optimize anything that stores ints to memory

If a Vec struct that contains 3 floats is copied via its compiler-generated copy ctor or assignment operator -- it will be compiled to integer loads/stores. This makes the compiler unable to keep the values used in registers and forces it to spill them to stack, even if the surrounding code clearly allows for them to stay in floating point registers.

Same logic applies to a Color struct that contains four 8-bit ints: if the ints are initialized one by one, then the compiler will not be able to optimize it to simple bit manipulations in integral registers and will keep the struct on the stack.

## MWCC 1.2.5 stack padding bugs

Our version of MWCC has a bug where the backend allocates more stack than necessary.

Most commonly this happens when functions were inlined: inlined function calls often inflate the stack frame size. To correctly match a function with stack frame size issues, UNUSED inlines from the MAP need to be reconstructed based on their size, and sometimes new inlines need to be fabricated based on one's best judgement.

Another instance is the ternary operator, which sometimes takes up more stack than the equivalent ifs.

Next, local variables can expand the stack even if they are always stored in a register and never actually spilled to the stack.

When no obviously correct way to make stack frame size match exists, a trick should be used to correctly match the function's context: a temporary char array of required size to inflate the stack. Such hacks however should be removed or commented out after the function is matching to allow for a possible proper solution in the future.
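
A minimal sketch of the padding trick, assuming a hypothetical function `sumPair` and an example pad size of `0x10`; the real size is whatever closes the gap between your frame size and the target's.

```cpp
// Hypothetical function, for illustration only.
int sumPair(const int* values)
{
    // TEMPORARY HACK: inflates the stack frame to match the target.
    // Remove or comment out once the function matches.
    char stackPad[0x10]; // 0x10 is an assumed example size

    return values[0] + values[1];
}
```

A modern optimizing compiler will simply discard the unused array; the trick relies on the MWCC behavior described above.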

## Ifs

Ifs are always compiled to very simple code:
- compare (`cmpwi`/`cmplwi`/etc, or arithmetic instruction with a dot)
- conditional branch (`beq`/`bne`/`ble`/etc)
- the true block
- unconditional branch to end of false block (`b`)
- the false block

The compiler **NEVER** swaps the order of the true block and false block.
It is also very reluctant to change the control flow, so C++ control flow usually corresponds to the assembly one to one.

The ternary operator is compiled similarly, but it is the one exception to control flow being the same. In the following case MWCC might initialize the variable's register with the "otherwise" value (zero) instead of doing so in the false branch, which eliminates the false branch entirely.
```
int b = thing != nullptr ? thing->field : 0;
```

## Sequential integer comparisons in a disjunction

When MWCC sees code like `if (a == 8 || a == 9 || a == 10)`, it can sometimes optimize it into a single unsigned range check equivalent to `if ((unsigned)(a - 8) <= 2)`. When the latter pattern is encountered with enums, it should be reversed into a disjunction of equality comparisons.
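
The equivalence can be checked with a small sketch. The `EMode` enum and both function names are hypothetical; the cast to `unsigned int` is what makes values below 8 wrap around and fail the check.

```cpp
// Hypothetical enum with three consecutive values.
enum EMode {
    MODE_A = 8,
    MODE_B = 9,
    MODE_C = 10
};

// What to write in the source: a disjunction of equality comparisons.
bool isTrackedModeSource(int mode)
{
    return mode == MODE_A || mode == MODE_B || mode == MODE_C;
}

// What the optimized form computes: one subtraction and one unsigned compare.
bool isTrackedModeFolded(int mode)
{
    return (unsigned int)(mode - 8) <= 2;
}
```

Both functions agree for every input, so seeing the folded form in the assembly does not mean the original source contained a subtraction.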

## Switches

MWCC can compile switches in one of two ways: jump table or branching.
Jump tables are easily identifiable via `mtctr` and `bctr` instructions being used.
Switches that became branches usually have control flow that doesn't look like an if: multiple conditional branch instructions follow a single comparison instruction. E.g.
```
cmpwi r0, 0x1
beq ...
bge ...
cmpwi r0, 0x0
bge ...
b ...
... the code block inside the switch ...
```

## Nonsensical control flow

As MWCC inlines functions, sometimes nonsensical control flow will be encountered in the assembly, one that doesn't correspond to any structured control flow constructions like switches, ifs or loops. Such cases are usually explained by **function inlining** rather than gotos. The place where a goto was supposedly used would actually correspond to a return statement, and the place where it points to would be the boundary of the inlined call.
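
A sketch of how an early `return` inside an inline can look like a goto after inlining. `TItem`, `consume` and `update` are hypothetical names, not functions from the repository.

```cpp
// Hypothetical struct, for illustration only.
struct TItem {
    int mCount;
    bool mActive;
};

// Hypothetical inline with an early return.
inline void consume(TItem* item)
{
    if (!item->mActive)
        return; // once inlined, this becomes a forward branch out of the inlined body

    --item->mCount;
}

int update(TItem* item)
{
    consume(item); // the branch target is the first instruction after the inlined body
    return item->mCount;
}
```

Read in the assembly of `update`, that forward branch has no matching `if`/`else` in `update` itself, which is the hint that an inline boundary sits at its target.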

## Keep track of known relevant inlines

Reconstructing correct inline calls is crucial in matching code correctly. When a similar block of code reoccurs -- always consider the possibility that it's an inline, but never disregard the possibility that the original authors simply copy-pasted it. When starting on a new function, explore the inlines already available in the different classes that it uses, as well as in the current translation unit.

## Reference Locals Affect Register Allocation

@@ -115,3 +143,48 @@ When a TU is compiled with `-inline deferred` (see TU-specific flags in `configu

- In practice, function-definition order in the `.cpp` should be reversed for those TUs.
- If order-sensitive matching drifts for an `-inline deferred` TU, verify definition order before attempting smaller codegen tweaks.

## TVec3 / Vector Codegen Patterns

`JGeometry::TVec3<f32>` is a 12-byte struct with `x`, `y`, `z` float members. How you read/write it drastically affects code generation.

### Construction: component-by-component vs constructor

```cpp
// Constructor form — compiler batches all loads, then all stores:
// lfs f0, ...; lfs f1, ...; lfs f2, ...
// stfs f0, 0(rN); stfs f1, 4(rN); stfs f2, 8(rN)
JGeometry::TVec3<f32> pos(x, y, z);

// Component-by-component — compiler interleaves load/store pairs:
// lfs f0, ...; stfs f0, 0(rN)
// lfs f0, ...; stfs f0, 4(rN)
// lfs f0, ...; stfs f0, 8(rN)
JGeometry::TVec3<f32> pos;
pos.x = x;
pos.y = y;
pos.z = z;
```

Check the target assembly to see which pattern (batched vs interleaved) is used, and write the source accordingly.

### Assignment: `operator=` vs `.set()`

```cpp
// operator= (struct copy) — generates lwz/stw (word load/store):
// lwz r0, 0(rSrc); stw r0, 0(rDst)
// lwz r0, 4(rSrc); stw r0, 4(rDst)
// lwz r0, 8(rSrc); stw r0, 8(rDst)
node.mPos = param_1;

// .set(vec) (float copy) — generates lfs/stfs (float load/store):
// lfs f0, 0(rSrc); stfs f0, 0(rDst)
// lfs f0, 4(rSrc); stfs f0, 4(rDst)
// lfs f0, 8(rSrc); stfs f0, 8(rDst)
node.mPos.set(param_1);

// .set(x, y, z) (3-arg form) — generates lfs/stfs like component assignment
node.mPos.set(expr_x, expr_y, expr_z);
```

The target assembly will clearly show `lwz`/`stw` (integer move) vs `lfs`/`stfs` (float move). Choose the source pattern that matches.
138 changes: 126 additions & 12 deletions include/GC2D/CardSave.hpp
@@ -2,15 +2,22 @@
#define GC2D_CARD_SAVE_HPP

#include <JSystem/JDrama/JDRViewObj.hpp>
#include <System/CardManager.hpp>
#include <GC2D/Progress.hpp>

class TMarioGamePad;
class J2DScreen;
class J2DPane;
class J2DTextBox;
class J2DPicture;
class JUTTexture;
class JPABaseEmitter;
class TExPane;
class TPauseMenu2;

class TCardSave : public JDrama::TViewObj {
public:
void changeMode(long);
TEProgress changeMode(long);

TCardSave(const char* name = "<TCardSave>", bool = false);

@@ -20,25 +27,132 @@ class TCardSave : public JDrama::TViewObj {
void perform(unsigned long, JDrama::TGraphics*);
void makeBuffer(J2DTextBox*, int);
void setMessage(J2DTextBox*, long, unsigned long);
void waitForStop(TEProgress);
s8 waitForStop(TEProgress);
void endWaitForChoice();
void waitForChoice(TEProgress, TEProgress, signed char);
void waitForChoiceBM(TEProgress, TEProgress, signed char);
s8 waitForChoice(TEProgress, TEProgress, signed char);
s8 waitForChoiceBM(TEProgress, TEProgress, signed char);
void endDrawMessage();
void drawMessage(TEProgress);
void drawMessageBM(TEProgress);
void waitForAnyKey(TEProgress);
void waitForSelectOver();
void waitForSelect2(TEProgress, TEProgress);
void waitForSelect3(TEProgress, TEProgress, TEProgress);
void waitForAnyKeyBM(TEProgress);
s8 drawMessage(TEProgress);
s8 drawMessageBM(TEProgress);
s8 waitForAnyKey(TEProgress);
s8 waitForSelectOver();
s8 waitForSelect2(TEProgress, TEProgress);
s8 waitForSelect3(TEProgress, TEProgress, TEProgress);
s8 waitForAnyKeyBM(TEProgress);
void selectBookmarks(TEProgress, TEProgress, TEProgress, TEProgress);
void changePattern(J2DPicture*, short, unsigned long);
void execMovement_();
u8 getNextState();
void execIssueGX_(JDrama::TGraphics*);

static int cMessageID; // TODO: wrong type
static u32 cMessageID[];

// fabricated
u16 getCurMessageID() { return cMessageID[unk310]; }
TCardBookmarkInfo& getBookmarkInfo() { return unk278[unk2EA]; }

public:
/* 0x10 */ int unk10;
/* 0x14 */ J2DScreen* unk14;
/* 0x18 */ bool unk18;
/* 0x1C */ JUTTexture* unk1C[10];
/* 0x44 */ JPABaseEmitter* unk44;
/* 0x48 */ TExPane* unk48;
/* 0x4C */ JUTRect unk4C;
/* 0x5C */ u32 unk5C;
/* 0x60 */ u32 unk60;
/* 0x64 */ char unk64[0xA0 - 0x64];
/* 0xA0 */ J2DTextBox* unkA0;
/* 0xA4 */ J2DTextBox* unkA4;
/* 0xA8 */ TExPane* unkA8;
/* 0xAC */ JUTRect unkAC;
/* 0xBC */ u32 unkBC;
/* 0xC0 */ u32 unkC0;
/* 0xC4 */ TExPane* unkC4;
/* 0xC8 */ JUTRect unkC8;
/* 0xD8 */ J2DTextBox* unkD8;
/* 0xDC */ J2DTextBox* unkDC;
/* 0xE0 */ TExPane* unkE0;
/* 0xE4 */ JUTRect unkE4;
/* 0xF4 */ J2DTextBox* unkF4;
/* 0xF8 */ J2DTextBox* unkF8;
/* 0xFC */ TExPane* unkFC;
/* 0x100 */ TExPane* unk100;
/* 0x104 */ JUTRect unk104;
/* 0x114 */ JUTRect unk114;
/* 0x124 */ J2DTextBox* unk124;
/* 0x128 */ J2DTextBox* unk128;
/* 0x12C */ J2DTextBox* unk12C;
/* 0x130 */ J2DTextBox* unk130;
/* 0x134 */ J2DPane* unk134;
/* 0x138 */ J2DPane* unk138;
/* 0x13C */ J2DPicture* unk13C;
/* 0x140 */ J2DPicture* unk140;
/* 0x144 */ J2DPicture* unk144;
/* 0x148 */ J2DPicture* unk148;
/* 0x14C */ J2DPicture* unk14C;
/* 0x150 */ J2DPane* unk150;
/* 0x154 */ J2DPane* unk154[3];
/* 0x160 */ TExPane* unk160;
/* 0x164 */ JUTRect unk164;
/* 0x174 */ J2DTextBox* unk174;
/* 0x178 */ J2DTextBox* unk178;
/* 0x17C */ TExPane* unk17C;
/* 0x180 */ JUTRect unk180;
/* 0x190 */ J2DTextBox* unk190;
/* 0x194 */ J2DTextBox* unk194;
/* 0x198 */ J2DTextBox* unk198[2][2];
/* 0x1A8 */ J2DPane* unk1A8[2][2];
/* 0x1B8 */ u16 unk1B8;
/* 0x1BC */ TExPane* unk1BC;
/* 0x1C0 */ JUTRect unk1C0;
/* 0x1D0 */ J2DTextBox* unk1D0[3][2];
/* 0x1E8 */ J2DTextBox* unk1E8[3][2];
/* 0x200 */ J2DTextBox* unk200;
/* 0x204 */ J2DTextBox* unk204;
/* 0x208 */ J2DTextBox* unk208;
/* 0x20C */ J2DTextBox* unk20C;
/* 0x210 */ J2DPane* unk210;
/* 0x214 */ J2DPane* unk214;
/* 0x218 */ J2DPane* unk218;
/* 0x21C */ J2DPane* unk21C;
/* 0x220 */ J2DPane* unk220;
/* 0x224 */ J2DPane* unk224;
/* 0x228 */ J2DPane* unk228;
/* 0x22C */ J2DPane* unk22C;
/* 0x230 */ J2DPane* unk230[3];
/* 0x23C */ u16 unk23C;
/* 0x240 */ TExPane* unk240;
/* 0x244 */ JUTRect unk244;
/* 0x254 */ J2DTextBox* unk254[2][2];
/* 0x264 */ J2DTextBox* unk264[2];
/* 0x26C */ u16 unk26C;
/* 0x270 */ TMarioGamePad* unk270;
/* 0x274 */ char unk274[0x4];
/* 0x278 */ TCardBookmarkInfo unk278[3];
/* 0x2D8 */ TPauseMenu2* unk2D8;
/* 0x2DC */ u8 unk2DC;
/* 0x2DD */ u8 unk2DD;
/* 0x2DE */ u8 unk2DE;
/* 0x2DF */ u8 unk2DF;
/* 0x2E0 */ int unk2E0;
/* 0x2E4 */ void* unk2E4;
/* 0x2E8 */ u8 unk2E8;
/* 0x2E9 */ s8 unk2E9;
/* 0x2EA */ s8 unk2EA;
/* 0x2EB */ char unk2EB[0x2F8 - 0x2EB];
/* 0x2F8 */ u8 unk2F8;
/* 0x2F9 */ u8 unk2F9;
/* 0x2FC */ s32 unk2FC;
/* 0x300 */ s16 unk300;
/* 0x302 */ s16 unk302;
/* 0x304 */ s16 unk304;
/* 0x308 */ int unk308;
/* 0x30C */ u8 unk30C;
/* 0x310 */ TEProgress unk310;
/* 0x314 */ TEProgress unk314;
/* 0x318 */ u32 unk318;
/* 0x31C */ char unk31C[0x4];
};

#endif