@karlseguin karlseguin commented Oct 27, 2025

zigdom

lightpanda with a Zig-native DOM

Install

Needs cargo

install-html5ever-dev

Testing

Still a work in progress, but the test filter has been improved. TEST_FILTER="..." (or make test F="...") can now run specific HTML file cases by appending #partial_file_name:

make test F="Document"
make test F="Document#query_selector.html"
make test F="Document#query_selector"
make test F="#query_selector.html"
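A rough sketch of how a filter of this shape could be interpreted: split on the first #, then substring-match each half. The Filter type, its fields and the matching rules are assumptions made for illustration, not the branch's actual test runner.

```zig
const std = @import("std");

// Hypothetical parsed filter; names and semantics are illustrative only.
const Filter = struct {
    test_name: ?[]const u8, // part before '#', null when empty
    file_part: ?[]const u8, // part after '#', null when absent or empty

    fn parse(raw: []const u8) Filter {
        const pos = std.mem.indexOfScalar(u8, raw, '#') orelse
            return .{ .test_name = nonEmpty(raw), .file_part = null };
        return .{
            .test_name = nonEmpty(raw[0..pos]),
            .file_part = nonEmpty(raw[pos + 1 ..]),
        };
    }

    // Substring match on both halves; a missing half matches everything.
    fn matches(self: Filter, test_name: []const u8, file_name: []const u8) bool {
        if (self.test_name) |t| {
            if (std.mem.indexOf(u8, test_name, t) == null) return false;
        }
        if (self.file_part) |f| {
            if (std.mem.indexOf(u8, file_name, f) == null) return false;
        }
        return true;
    }
};

fn nonEmpty(s: []const u8) ?[]const u8 {
    return if (s.len == 0) null else s;
}

test "filter with and without a file part" {
    const both = Filter.parse("Document#query_selector");
    try std.testing.expect(both.matches("Document", "query_selector.html"));
    try std.testing.expect(!both.matches("Document", "create_element.html"));

    const file_only = Filter.parse("#query_selector.html");
    try std.testing.expect(file_only.matches("Element", "query_selector.html"));
}
```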

Prototype

The prototype system relies on two special fields. The _proto field references the parent of a type. For example:

    • Element has a _proto: *Node field,
    • Node has a _proto: *EventTarget field,
    • EventTarget has no _proto field

Going the other way, "parents" have a _type field which is a tagged union:

pub const Node = @This();

_type: Type,

pub const Type = union(enum) {
  element: *Element,
  document: *Document,
  // ....
};

As a convention, parents expose an as method:

const el = node.as(Element) orelse return;
// or
const input = node.as(Element.Html.Input) orelse return;

As a convenience, children expose as$Ancestor methods, e.g. input.asElement() or input.asNode(), although some code might simply access input._proto.
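To make the two directions concrete, here is a minimal, self-contained sketch of the _proto / _type linkage. The types are heavily simplified stand-ins (the tag field and the shape of EventTarget.Type are assumptions), and the real as also handles deeper targets such as Element.Html.Input, which this sketch does not attempt.

```zig
const std = @import("std");

const EventTarget = struct {
    _type: Type, // root of the chain: no _proto field
    const Type = union(enum) { node: *Node };
};

const Node = struct {
    _proto: *EventTarget,
    _type: Type,

    const Type = union(enum) {
        element: *Element,
        document: *Document,
    };

    // Downcast: null when this node is not a T. T is comptime-known, so
    // only the prong whose type matches T ever returns a value.
    fn as(self: *Node, comptime T: type) ?*T {
        switch (self._type) {
            .element => |el| if (T == Element) return el,
            .document => |doc| if (T == Document) return doc,
        }
        return null;
    }
};

const Element = struct {
    _proto: *Node,
    tag: []const u8, // illustrative payload, not a real field name

    // Upcasts just follow _proto.
    fn asNode(self: *Element) *Node {
        return self._proto;
    }
    fn asEventTarget(self: *Element) *EventTarget {
        return self._proto._proto;
    }
};

const Document = struct { _proto: *Node };

test "walk the chain both ways" {
    var target: EventTarget = undefined;
    var node: Node = undefined;
    var element: Element = .{ ._proto = &node, .tag = "div" };
    node = .{ ._proto = &target, ._type = .{ .element = &element } };
    target = .{ ._type = .{ .node = &node } };

    const el = node.as(Element) orelse return error.NotAnElement;
    try std.testing.expectEqualStrings("div", el.tag);
    try std.testing.expect(node.as(Document) == null);
    try std.testing.expect(el.asNode() == &node);
    try std.testing.expect(el.asEventTarget() == &target);
}
```

Each object stores one pointer up and one tagged union down, so an upcast is a field read and a downcast is a switch on the tag.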

Bonus: The special return union types, e.g. Node.Union or Element.Union, are no longer needed. You can return any part of the prototype chain. In other words, as far as the JS bridge is concerned, you can return input or input.asElement() or input.asNode().

Explicit JS Mapping

Naming conventions are no longer used to create the JS mapping. Every type that is mapped has a nested JsApi and JsApi.Meta struct:

const Window = @This();

...

pub const JsApi = struct {
    pub const bridge = js.Bridge(Window);

    pub const Meta = struct {
        pub const name = "Window";
        pub const prototype_chain = bridge.prototypeChain();
        pub var class_index: u16 = 0;
    };

    pub const self = bridge.accessor(Window.getWindow, null, .{ });
    pub const window = bridge.accessor(Window.getWindow, null, .{ });
    ...
};

A bit more tedious, but new APIs aren't added that often. It allows for per-definition configuration (e.g. enabling DOMExceptions on individual methods, rather than on the entire type). It can also result in more idiomatic Zig code. For example, Element.getInnerText is able to take an *Io.Writer, with the mapper providing a wrapper:

    pub const innerText = bridge.accessor(_innerText, null, .{});
    fn _innerText(self: *Element, page: *const Page) ![]const u8 {
        var buf = std.Io.Writer.Allocating.init(page.call_arena);
        try self.getInnerText(&buf.writer);
        return buf.written();
    }
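The same split, a streaming core function plus a thin allocating wrapper for the bridge, can be tried outside the browser. This is a sketch assuming Zig 0.15's std.Io.Writer; writeGreeting and greeting are hypothetical names, and the arena stands in for page.call_arena.

```zig
const std = @import("std");

// Hypothetical core function: streams into any writer, allocates nothing.
fn writeGreeting(name: []const u8, writer: *std.Io.Writer) !void {
    try writer.writeAll("hello ");
    try writer.writeAll(name);
}

// Hypothetical wrapper of the kind an accessor could point at: collects
// the streamed output into memory owned by `arena`.
fn greeting(name: []const u8, arena: std.mem.Allocator) ![]const u8 {
    var buf = std.Io.Writer.Allocating.init(arena);
    try writeGreeting(name, &buf.writer);
    return buf.written();
}

test "wrapper returns the written bytes" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();

    const out = try greeting("zigdom", arena.allocator());
    try std.testing.expectEqualStrings("hello zigdom", out);
}
```

Internal callers that already have a writer can call the core function directly and skip the intermediate buffer.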

Consistent DOM handling

Whether an element is created by the parser or via document.createElement, the same code path is taken (as much as possible). This creates more consistency, e.g. when setting select.selectedIndex. Individual DOM elements can opt into build callbacks. For example, Input.zig gets "created" events:

// Input.zig

pub const Build = struct {
    pub fn created(node: *Node, page: *Page) !void {
        // `as` returns an optional (see Prototype), so unwrap it.
        const self = node.as(Input) orelse return;
        const element = self.asElement();

        // Store initial values from attributes
        self._default_value = element.getAttributeSafe("value");
        // ....
    }
};
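One way an opt-in hook like this can be dispatched is with @hasDecl at comptime, so types without a Build declaration pay nothing. The sketch below is an assumption about the mechanism, not the branch's code: finishCreate, Div and the simplified created signature (no Node or Page parameters) are hypothetical.

```zig
const std = @import("std");

// Stand-in element types; only Input opts into the callback.
const Input = struct {
    value_attr: ?[]const u8 = null,
    default_value: ?[]const u8 = null,

    pub const Build = struct {
        pub fn created(self: *Input) void {
            // Capture the initial attribute value.
            self.default_value = self.value_attr;
        }
    };
};

const Div = struct {};

// Hypothetical shared step, called from both the parser path and the
// createElement path so that both run the same hook.
fn finishCreate(comptime T: type, el: *T) void {
    if (@hasDecl(T, "Build")) {
        if (@hasDecl(T.Build, "created")) {
            T.Build.created(el);
        }
    }
}

test "only opted-in types get the callback" {
    var input: Input = .{ .value_attr = "42" };
    finishCreate(Input, &input);
    try std.testing.expectEqualStrings("42", input.default_value.?);

    var div: Div = .{};
    finishCreate(Div, &div); // no Build declaration, so this is a no-op
}
```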

Naming Convention

Underscore-prefixed field names are used throughout the WebAPI, in large part to avoid naming conflicts. Struct-as-a-file is used extensively, and field names are more likely to cause conflicts in this setup.

In non-WebAPI classes (e.g. the Page), they are used as "private" markers. Within the project, there's now a clear "lightpanda" library, and I'm starting to think about what should and shouldn't be exposed from the library.

Memory

This branch was born from an experiment that used hybrid memory management: arenas for DOM objects and reference counting for other types (e.g. XHR objects). Some of this complexity is retained (page._factory), even though everything now uses either page.arena or page.call_arena. I'm hopeful that some explicit memory management can be re-added in the future (XHR objects can hold onto large amounts of memory), but right now it's unnecessary complexity.
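For readers less familiar with the two-arena pattern, a minimal sketch follows. It assumes page.arena lives as long as the page and page.call_arena is reset after each call; those lifetimes are inferred from the names, not stated above, and the local variables are stand-ins for the real Page fields.

```zig
const std = @import("std");

test "long-lived arena next to a per-call arena" {
    // Stand-ins for page.arena and page.call_arena (assumed lifetimes).
    var page_arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer page_arena.deinit();
    var call_arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer call_arena.deinit();

    // DOM-like data: allocated once, freed only when the page goes away.
    const tag = try page_arena.allocator().dupe(u8, "div");

    var i: usize = 0;
    while (i < 3) : (i += 1) {
        // Scratch data for a single call.
        const tmp = try std.fmt.allocPrint(call_arena.allocator(), "call {d}", .{i});
        try std.testing.expect(tmp.len > 0);
        // Drop everything from this call but keep the memory for the next.
        _ = call_arena.reset(.retain_capacity);
    }

    // The page-lifetime allocation is untouched by the resets.
    try std.testing.expectEqualStrings("div", tag);
}
```

Nothing is freed individually in either arena, which is why a long-lived object holding a large buffer (the XHR case) is the motivating example for bringing explicit management back.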

Note 1

In both main and zigdom, most types that are returned to JavaScript are placed on the heap. The JS bridge handles this. So zigdom isn't putting more objects on the heap than main, it's just being more explicit about it.

Note 2

Values of the _type union are almost always pointers. The only exceptions are 8-byte leaf nodes (i.e. leaf nodes that only have a _proto: *Parent field), which can be embedded directly into the union.
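The reason a pointer-sized leaf can be embedded for free is that the union's payload is already one pointer wide. A small sketch with made-up types (Parent, Big and Leaf are hypothetical) shows the size staying the same; it relies on the compiler's usual layout for a single-pointer struct rather than on a language guarantee.

```zig
const std = @import("std");

const Parent = struct { id: u32 };
const Big = struct { _proto: *Parent, data: [64]u8 };
const Leaf = struct { _proto: *Parent }; // nothing but the parent pointer

// Every variant behind a pointer.
const PointersOnly = union(enum) { big: *Big, leaf: *Leaf };
// The leaf stored inline: one allocation and one indirection fewer.
const LeafEmbedded = union(enum) { big: *Big, leaf: Leaf };

test "embedding a pointer-sized leaf does not grow the union" {
    try std.testing.expect(@sizeOf(Leaf) == @sizeOf(*Parent));
    try std.testing.expect(@sizeOf(LeafEmbedded) == @sizeOf(PointersOnly));
}
```

Embedding Big the same way would grow every value of the union to Big's size, which is why larger types stay behind pointers.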

There's a real tradeoff here. But in general, zigdom aims for memory efficiency at the cost of performance and, in this case, potential fragmentation (which I'm hoping can be solved by a smarter allocation strategy). This is clearly generating many more small allocations than libdom was.
