Skip to content

feat: loop desugaring#1655

Open
volsa wants to merge 12 commits intomasterfrom
vosa/loop-desugaring
Open

feat: loop desugaring#1655
volsa wants to merge 12 commits intomasterfrom
vosa/loop-desugaring

Conversation

@volsa
Copy link
Copy Markdown
Member

@volsa volsa commented Apr 1, 2026

This commit canonicalizes all loops into a WHILE TRUE ... END_WHILE form, for example

WHILE <cond> DO
    <body>
END_WHILE

becomes

WHILE TRUE DO
  IF NOT <cond> THEN
    EXIT
  END_IF
  
  <body>
END_WHILE

This fixes some issues related to polymorphism and calls with aggregate return types in loop conditions. Specifically without the loop desugaring (in the context of polymorphic calls)

WHILE consumer(instances[i]) DO
    i := i + 1;
END_WHILE;

would become

alloca __fatpointer_0: __FATPOINTER
__fatpointer_0.data := ADR(instances[i])
__fatpointer_0.table := ADR(__itable_IA_FbA_instance)
WHILE consumer(__fatpointer_0) DO
    i := i + 1
END_WHILE

which is incorrect because the __fatpointer_0 is stale between iterations. With the loop desugaring we now get

WHILE TRUE DO
    alloca __fatpointer_0: __FATPOINTER
    __fatpointer_0.data := ADR(instances[i])
    __fatpointer_0.table := ADR(__itable_IA_FbA_instance)
    IF NOT consumer(__fatpointer_0) THEN
        EXIT;
    END_IF
    i := i + 1
END_WHILE

Same for use cases of aggregate return type lowering, which in fact already did loop desugaring, but only for specific cases. This commit does it globally now. As a nice "side effect" the codegen no longer needs to generate for loops.

Edit: fyi I tested the debugging DX using gdb, seems to be fine after the desugaring.

@volsa volsa marked this pull request as ready for review April 1, 2026 15:23
@volsa
Copy link
Copy Markdown
Member Author

volsa commented Apr 2, 2026

The failing metrics CI should be fixed by #1656

Copy link
Copy Markdown
Collaborator

@ghaith ghaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some minimal issues on the first pass.
One more issue the ai pointed out is that we dynamically evaluate the end condition but the standard says it's evaluated once. i did not check that, i think it's a halucination but it might be worth checking what codesys does on changing end conditions for FOR loops.
The AI also mentioned that the increment by number is computed only once.
For both cases i think we can open new issues if they are things we need to fix.

}
}

pub fn is_repeat(&self) -> bool {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems not to be used

//! It is rewritten to:
//!
//! ```st
//! WHILE TRUE DO
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when i saw this, and looked at other changes on the statement generate it got me thinking: we already did this (kind of) in codegen, but you did not remove it there. So i asked my clancker to write a test to show that issue.
I think codegen will now always have a redundant br true body else which causes an extra loop branch, i think it's minimal but worh mentioning. According to claude, this will be optimised out starting O1 but not on O0

Here's a test that reproduces this: tests/lit/ir_tests/while_loop_double_condition.st

// RUN: %COMPILE_IR %s | %CHECK_IR %s

// After loop desugaring, `WHILE x < 10` becomes `WHILE TRUE DO IF NOT (x < 10) THEN EXIT`.
// The codegen for WHILE still emits a `condition_check` block that evaluates the (now always-true)
// condition and branches on it.  This means we get TWO condition evaluations per iteration:
//
//   condition_check:          ;; codegen-emitted, always-true literal
//     br i1 true, label %while_body, label %continue
//   while_body:               ;; lowered guard: the real user condition
//     %cmp = icmp slt i32 %x, 10
//     ...
//     br i1 %not_cmp, label %exit, label %body
//
// The `condition_check` block is dead code — its false-edge to `continue` is unreachable.
// This test documents the redundancy so it can be tracked as a known issue.

FUNCTION main : DINT
VAR
    x : DINT;
END_VAR

WHILE x < 10 DO
    x := x + 1;
END_WHILE

main := x;
END_FUNCTION

// The codegen emits a `condition_check` block with the desugared `TRUE` literal,
// producing a redundant always-taken branch.
// CHECK: condition_check:
// CHECK-NEXT: br i1 true, label %while_body, label %continue

// The actual user condition (x < 10) is evaluated inside while_body as the lowered IF NOT guard.
// CHECK: while_body:
// CHECK: icmp slt i32 %{{.*}}, 10

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude then also identified a possible nit, a user's WHILE TRUE DO becomes a WHILE TRUE IF NOT TRUE .. effectively adding dead code but it thinks llvm will simply clean that up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants