[WIP] Add BiSeqDict, MultiSeqDict, and MultiBiSeqDict modules #1

CharlonTank · 2025-11-04T09:11:44Z

Summary

This PR adds three new bidirectional and multi-value dictionary types built on top of SeqDict:

- BiSeqDict: Many-to-one bidirectional dictionary with reverse lookups
- MultiSeqDict: One-to-many dictionary (keys map to multiple values)
- MultiBiSeqDict: Many-to-many bidirectional dictionary

Key Features

✅ Built directly on SeqDict's HAMT implementation
✅ No comparable constraint (uses FNV hashing)
✅ Preserves insertion order
✅ Better performance than AssocList-based implementations

Changes

- Added 3 new modules (1,246 lines of code)
- Updated elm.json to expose new modules
- Added elm-community/list-extra dependency (required for MultiSeqDict.fromFlatList)

Status

⚠️ Work In Progress - Needs Testing

The modules compile successfully but comprehensive tests need to be added.

Example Usage

BiSeqDict (Many-to-One)

manyToOne : BiSeqDict String Int
manyToOne =
    BiSeqDict.empty
        |> BiSeqDict.insert "A" 1
        |> BiSeqDict.insert "B" 2
        |> BiSeqDict.insert "C" 1  -- Multiple keys can map to same value

BiSeqDict.getReverse 1 manyToOne
--> Set.fromList ["A", "C"]

MultiSeqDict (One-to-Many)

oneToMany : MultiSeqDict String Int
oneToMany =
    MultiSeqDict.empty
        |> MultiSeqDict.insert "A" 1
        |> MultiSeqDict.insert "A" 2  -- One key can map to multiple values

MultiSeqDict.get "A" oneToMany
--> Set.fromList [1, 2]

MultiBiSeqDict (Many-to-Many)

manyToMany : MultiBiSeqDict String Int
manyToMany =
    MultiBiSeqDict.empty
        |> MultiBiSeqDict.insert "A" 1
        |> MultiBiSeqDict.insert "B" 2
        |> MultiBiSeqDict.insert "A" 2  -- Both directions support multiple mappings

MultiBiSeqDict.get "A" manyToMany
--> Set.fromList [1, 2]

MultiBiSeqDict.getReverse 2 manyToMany
--> Set.fromList ["A", "B"]

TODO

- Add comprehensive test suite
- Add documentation examples
- Performance benchmarks
- Review API naming consistency with SeqDict

🤖 This PR is based on adapting the elm-bidict library to use SeqDict instead of Dict.

Add three new dictionary types built on top of SeqDict:

- BiSeqDict: Many-to-one bidirectional dictionary with reverse lookups
- MultiSeqDict: One-to-many dictionary (keys map to multiple values)
- MultiBiSeqDict: Many-to-many bidirectional dictionary

These modules preserve insertion order and work with any types (not just comparable), leveraging SeqDict's FNV hashing implementation.

MartinSStewart · 2025-11-04T09:17:07Z

Thanks for this PR! I checked the BiSeqDict module. It looks like you still have comparable for all of the type vars which will needlessly restrict the user.

CharlonTank · 2025-11-04T11:26:42Z

@MartinSStewart Thanks for the feedback! You're absolutely right.

I've completely fixed all three modules:

✅ Replaced Set with SeqSet everywhere (function calls, type signatures, doc examples)
✅ Renamed type variables from comparable1/comparable2 to k/v for clarity
✅ Fixed normalizeSet function signatures and implementations

All modules now work with any types, fully leveraging SeqDict and SeqSet's FNV hashing. No more comparable restrictions anywhere! 🎉

Latest commit: e94d10b


Fixes feedback from @MartinSStewart: The modules were unnecessarily restricting type variables to 'comparable'.

Changes:

- Replace Set with SeqSet (no comparable constraint needed)
- Rename type vars from 'comparable1/comparable2' to 'k/v' for clarity
- All three modules (BiSeqDict, MultiSeqDict, MultiBiSeqDict) now work with any types, not just comparables

This fully leverages SeqDict's FNV hashing implementation.

Add comprehensive tests proving the modules work with non-comparable types:

- BiSeqDict tests with custom types and records
- MultiSeqDict tests with custom types
- MultiBiSeqDict tests with custom types and bidirectional lookups

Note: Tests cannot be run due to pre-existing lamdera/codecs dependency issue in the repo.

supermario

This will need a very thorough review as it's a lot of presumably AI-genn'ed code addition to primitive concepts we want to be both correct and performant.

Can you update the readme as well please with another section for these additional types and their uses?

I'd also like to see the tests be a little more comprehensive as well to match the existing level of testing for SeqDict/SeqSet.

elm.json


…dency

As suggested by @supermario, inlined gatherEqualsBy and gatherWith into Internal.ListHelpers module to keep dependency tree minimal for a core package. This removes the elm-community/list-extra dependency.

- Add encodeBiSeqDict/decodeBiSeqDict to BiSeqDict.elm
- Add encodeMultiSeqDict/decodeMultiSeqDict to MultiSeqDict.elm
- Add encodeMultiBiSeqDict/decodeMultiBiSeqDict to MultiBiSeqDict.elm
- Enhance README with realistic opaque ID type examples
- Add many-to-many chat/documents example for MultiBiSeqDict
- Document known codec generation issue with Lamdera compiler

This adds Wire3 codec generation support for three new container types from lamdera/containers:

- BiSeqDict: Bidirectional many-to-one dictionary
- MultiSeqDict: One-to-many dictionary
- MultiBiSeqDict: Many-to-many bidirectional dictionary

Changes:

- Add DiffableType constructors for the three new types
- Add Wire3 encoder/decoder patterns following SeqDict pattern
- Add TypeHash support for type hashing and migrations
- Add Evergreen migration helpers for all three types

Related PR: lamdera/containers#1
Test repo: https://github.com/supermario/qwertytrewq

All three types use the same 2-parameter (key, value) structure as SeqDict and work with opaque types that aren't comparable.

CharlonTank · 2025-11-05T06:37:07Z

Related PRs and Testing

Compiler support PR: lamdera/compiler#69

This PR requires the compiler changes in lamdera/compiler#69 to properly generate Wire3 codecs for the three new container types when used in BackendModel.

Test repository: https://github.com/CharlonTank/qwertytrewq

The test repo demonstrates real-world usage with MultiBiSeqDict (Id ChatId) (Id DocumentId) and compiles successfully with both PRs applied.

Testing Instructions

# Clone test repo  
git clone https://github.com/CharlonTank/qwertytrewq
cd qwertytrewq

# Use override script
./override-dillon.sh /path/to/containers

# Compile (requires compiler from lamdera/compiler#69)
LDEBUG=1 EXPERIMENTAL=1 LOVR="$(pwd)/overrides" lamdera make src/Backend.elm

✅ With both PRs: Compilation succeeds
❌ Without compiler PR: Codec generation errors

- Add Id.elm with proper opaque ID types (ChatId, DocumentId)
- Update Backend.elm to use Id types instead of Never constructors
- Update Types.elm to use Id ChatId and Id DocumentId
- Update README with links to both PRs and test results
- Add COMPILER_CHANGES_NEEDED.md documentation
- Install UUID package dependency

Test now compiles successfully with modified compiler!

Related PRs:

- lamdera/containers#1
- lamdera/compiler#69

CharlonTank · 2025-11-05T07:15:31Z

> This will need a very thorough review as it's a lot of presumably AI-genn'ed code addition to primitive concepts we want to be both correct and performant.
>
> Can you update the readme as well please with another section for these additional types and their uses?
>
> I'd also like to see the tests be a little more comprehensive as well to match the existing level of testing for SeqDict/SeqSet.

I'm working on making extensive tests for these modules but how do you run the tests locally with the modified Lamdera compiler? I tried LDEBUG=1 elm-test --compiler and LDEBUG=1 elm-test-rs --compiler but both fail on the kernel module imports (Elm.Kernel.FNV, Elm.Kernel.JsArray2). What's the recommended approach for testing Lamdera packages with kernel code?

- BiSeqDictTests.elm: 93 → 424 lines (45+ tests covering all operations plus bidirectional lookups)
- MultiSeqDictTests.elm: 82 → 598 lines (51 tests for one-to-many relationships)
- MultiBiSeqDictTests.elm: 91 → 913 lines (70 tests for many-to-many with reverse index consistency)

All test suites now match SeqDict/SeqSet coverage level with:

- Build, query, transform, and combine operations
- Custom type tests (proving no comparable constraint)
- Comprehensive fuzz tests for property-based testing
- Reverse lookup tests specific to bidirectional types

Updated README to confirm Wire3 codec support is working.

Co-Authored-By: Claude <noreply@anthropic.com>

supermario · 2025-11-27T21:45:40Z

@CharlonTank sorry missed your questions. I think you'll need to make sure you compile the project with lamdera make successfully first, and then after that

LDEBUG=1 elm-test --compiler `which lamdera`

should work, assuming that lamdera --version-full you have on PATH is the correct compiled one with the additions you have.

supermario

A few comments to discuss, apologies I've not had the time to do some proper evaluation before this.

supermario · 2025-11-27T21:49:28Z

README.md

+--> Just workspace1
+
+-- Reverse lookup: Who are all members of workspace1?
+BiSeqDict.getReverse workspace1 userWorkspaces


Perhaps BiSeqDict.getKeys workspace1 userWorkspaces would be clearer?

Makes sense. Though still reads a bit awkward with the collection name - "get the keys for doc1 chat documents"... open to other ideas

supermario · 2025-11-27T21:52:39Z

README.md

+        |> MultiSeqDict.insert property2 unit201
+
+-- Get all units for property1
+MultiSeqDict.get property1 propertyUnits


Suggested change

MultiSeqDict.get property1 propertyUnits

MultiSeqDict.getAll property1 propertyUnits

A little torn on this one. .get might be what you reach for naturally when writing code.

But .getAll will be much more helpful and clear when reading code – all other .get are singular in other collections.

Agreed 👍

supermario · 2025-11-27T22:02:22Z

README.md

+--> SeqSet.fromList [doc1, doc2]
+
+-- Which chats contain doc1?
+MultiBiSeqDict.getReverse doc1 chatDocuments


Again, consideration for change to both .get -> .getAll and .getReverse -> .getKeys.

But when reading through the example code, the insertion/removal reads beautifully, but the retrieval reads clunkily, my brain is juggling which direction things are going.

MultiBiSeqDict.getAll chat1 chatDocuments

Reads to me as "get all chat1 documents" -> not bad.

MultiBiSeqDict.getReverse doc1 chatDocuments
MultiBiSeqDict.getKeys doc1 chatDocuments

Reads to me as "get the reverse document chat documents?"
Or "get the keys for doc1 chat documents...?"

Feel like there's an opportunity for something much clearer here. Will think a bit more but ideas welcome.

Same - open to ideas if something reads better

supermario · 2025-11-27T22:22:45Z

src/MultiSeqDict.elm

+remove from to (MultiSeqDict d) =
+    MultiSeqDict <|
+        SeqDict.update from (Maybe.andThen (SeqSet.remove to >> normalizeSet)) d
+


Seems like we're missing a removeValues version – i.e. for the x-to-many variants, we'd want to be able to remove all values of some given value on the right hand side.

propertyUnits |> OneToMany.removeValues unit102
chatDocuments |> ManyToMany.removeValues doc1

Good catch, will add
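
For illustration, here is a minimal sketch of what a `removeValues` could look like for the one-to-many case. It is not the code from this PR: it assumes a plain core `Dict` of `Set`s (so both type parameters must be `comparable`), and it assumes "remove all values" means dropping the given right-hand value from every key. The `OneToManySketch` module and `OneToMany` alias are hypothetical names.

```elm
module OneToManySketch exposing (OneToMany, removeValues)

import Dict exposing (Dict)
import Set exposing (Set)


{-| Hypothetical stand-in for the real type: core Dict/Set instead of SeqDict/SeqSet.
-}
type alias OneToMany k v =
    Dict k (Set v)


{-| Remove `value` from every key's set, then drop keys whose set became empty.
This visits every key, so it is O(n) in the number of keys.
-}
removeValues : comparable2 -> OneToMany comparable1 comparable2 -> OneToMany comparable1 comparable2
removeValues value dict =
    dict
        |> Dict.map (\_ values -> Set.remove value values)
        |> Dict.filter (\_ values -> not (Set.isEmpty values))
```

For the bidirectional many-to-many type, the reverse index could presumably be used to touch only the keys that actually hold the value, rather than scanning all of them.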

supermario · 2025-11-27T22:26:32Z

README.md

 <sup>*Non-equatable Elm values are currently: functions, `Bytes`, `Html`, `Json.Value`, `Task`, `Cmd`, `Sub`, `Never`, `Texture`, `Shader`, and any datastructures containing these types.</sup>


+## BiSeqDict, MultiSeqDict, and MultiBiSeqDict (bidirectional and multi-value dictionaries)


One general thing I felt reading through as a whole was that even after fully understanding what's going on, reading BiSeqDict, MultiBiSeqDict, and MultiSeqDict I kept thinking "wait hold on which one is that?", scrolling back to the titles below to see the BiSeqDict (Many-to-One) and going "Ah right yes, this is the many-to-one one."

That just kept happening over and over... 😅

So what if instead, we name them the thing they are?

ManyToOne

OneToMany

ManyToMany

I'm not 100% a fan of this naming in general, but after redoing the examples with this naming at least to me reads a lot nicer and helps keep clarity in my head of what's going on, so I think it is a step better naming than the SeqDict versions:

ManyToOne (formerly BiSeqDict)

import ManyToOne exposing (ManyToOne)

type UserId = UserId Never
type WorkspaceId = WorkspaceId Never

userWorkspaces : ManyToOne (Id UserId) (Id WorkspaceId)
userWorkspaces =
    ManyToOne.empty
        |> ManyToOne.insert aliceId workspace1
        |> ManyToOne.insert bobId workspace1
        |> ManyToOne.insert charlieId workspace2

-- Forward lookup
ManyToOne.get aliceId userWorkspaces
--> Just workspace1

-- Reverse lookup (renamed API)
ManyToOne.getKeys workspace1 userWorkspaces
--> SeqSet.fromList [ aliceId, bobId ]

OneToMany (formerly MultiSeqDict)

import OneToMany exposing (OneToMany)

type PropertyId = PropertyId Never
type UnitId = UnitId Never

propertyUnits : OneToMany (Id PropertyId) (Id UnitId)
propertyUnits =
    OneToMany.empty
        |> OneToMany.insert property1 unit101
        |> OneToMany.insert property1 unit102
        |> OneToMany.insert property2 unit201

-- Forward lookup
OneToMany.getAll property1 propertyUnits
--> SeqSet.fromList [ unit101, unit102 ]

-- Reverse lookup (renamed API)
OneToMany.getKeys unit101 propertyUnits
--> SeqSet.fromList [ property1 ]

-- Remove a specific association
OneToMany.remove property1 unit102 propertyUnits

ManyToMany (formerly MultiBiSeqDict)

import ManyToMany exposing (ManyToMany)

type ChatId = ChatId Never
type DocumentId = DocumentId Never

chatDocuments : ManyToMany (Id ChatId) (Id DocumentId)
chatDocuments =
    ManyToMany.empty
        |> ManyToMany.insert chat1 doc1
        |> ManyToMany.insert chat1 doc2
        |> ManyToMany.insert chat2 doc1

-- Forward lookup
ManyToMany.getAll chat1 chatDocuments
--> SeqSet.fromList [ doc1, doc2 ]

-- Reverse lookup (renamed API)
ManyToMany.getKeys doc1 chatDocuments
--> SeqSet.fromList [ chat1, chat2 ]

-- Transfer
chatDocuments
    |> ManyToMany.remove chat1 doc2
    |> ManyToMany.insert chat3 doc2

I like this naming because it naturally fits in with the OneToOne.elm module I'd like to add.

I like this. Will wait for team consensus on exact names

CharlonTank · 2025-11-28T15:10:44Z

Yes - will implement API changes (getAll, getKeys, removeValues) and wait for naming consensus before renaming

- get → getAll for x-to-many (MultiSeqDict, MultiBiSeqDict)
- getReverse → getKeys for bidirectional types (BiSeqDict, MultiBiSeqDict)
- Add removeValues to MultiSeqDict and MultiBiSeqDict

miniBill

First pass review. In general, I think these data structures could be very useful, but I wonder if putting them inside lamdera/containers is the best path forward, especially considering that versioning lamdera/* packages is currently... interesting.

Ideally I'd love to see a containers-extra package, but at the moment it would be unpublishable which would make it hard to use.

miniBill · 2025-12-03T17:41:45Z

src/Internal/ListHelpers.elm

+gatherEqualsBy : (a -> b) -> List a -> List ( a, List a )
+gatherEqualsBy extract list =
+    gatherWith (\a b -> extract a == extract b) list
+
+
+{-| Group equal elements together using a custom equality function. Elements will be
+grouped in the same order as they appear in the original list. The same applies to
+elements within each group.
+
+    gatherWith (==) [1,2,1,3,2]
+    --> [(1,[1]),(2,[2]),(3,[])]
+
+-}
+gatherWith : (a -> a -> Bool) -> List a -> List ( a, List a )
+gatherWith testFn list =
+    let
+        helper : List a -> List ( a, List a ) -> List ( a, List a )
+        helper scattered gathered =
+            case scattered of
+                [] ->
+                    List.reverse gathered
+
+                toGather :: population ->
+                    let
+                        ( gathering, remaining ) =
+                            List.partition (testFn toGather) population
+                    in
+                    helper remaining (( toGather, gathering ) :: gathered)
+    in
+    helper list []


This is O(n²). We can do better with hashing
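
A rough sketch of the idea, not the PR's code: group by looking each element's key up in a dictionary instead of partitioning the rest of the list for every element. This version uses core `Dict`, so the extracted key must be `comparable` and the cost is O(n log n); a hash-based dictionary such as SeqDict would presumably lift the `comparable` restriction. The `GatherSketch` module name is hypothetical.

```elm
module GatherSketch exposing (gatherEqualsBy)

import Dict exposing (Dict)


{-| Group elements whose extracted key is equal, in order of first appearance.
Sketch only: core `Dict` requires the key to be `comparable`.
-}
gatherEqualsBy : (a -> comparable) -> List a -> List ( a, List a )
gatherEqualsBy extract list =
    let
        -- Track first-seen key order (newest first) and each key's members (newest first).
        step : a -> ( List comparable, Dict comparable (List a) ) -> ( List comparable, Dict comparable (List a) )
        step item ( order, groups ) =
            let
                key =
                    extract item
            in
            case Dict.get key groups of
                Nothing ->
                    ( key :: order, Dict.insert key [ item ] groups )

                Just members ->
                    ( order, Dict.insert key (item :: members) groups )

        ( keysNewestFirst, grouped ) =
            List.foldl step ( [], Dict.empty ) list
    in
    keysNewestFirst
        |> List.reverse
        |> List.filterMap
            (\key ->
                -- Restore original order inside the group, then split off its head.
                case Dict.get key grouped |> Maybe.map List.reverse of
                    Just (first :: rest) ->
                        Just ( first, rest )

                    _ ->
                        Nothing
            )
```

With `identity` as the extractor, `[1,2,1,3,2]` groups to `[(1,[1]),(2,[2]),(3,[])]`, the same result as the `gatherWith (==)` doc example above.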

miniBill · 2025-12-03T17:44:46Z

src/BiSeqDict.elm

+                                        )
+                in
+                reverseWithoutOld
+                    |> SeqDict.update to (Maybe.withDefault SeqSet.empty >> SeqSet.insert from >> Just)


>> is inefficient, let's expand this to a single lambda with a pipeline inside. Actually, because update is just get + insert, we can ~inline them and avoid touching the dictionary at all if not needed
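
One way to read this suggestion, sketched under assumptions: replace the composed `update` callback with an explicit `get` followed by an `insert`, and return the reverse index unchanged when the key is already recorded. The sketch uses core `Dict` and `Set` (hence `comparable`) rather than SeqDict and SeqSet, and `addReverse` is a hypothetical helper name, not a function from the PR.

```elm
module ReverseIndexSketch exposing (addReverse)

import Dict exposing (Dict)
import Set exposing (Set)


{-| Record that key `from` maps to value `to` in the reverse index.
Returns the same dictionary when the entry already exists.
-}
addReverse : comparable1 -> comparable2 -> Dict comparable2 (Set comparable1) -> Dict comparable2 (Set comparable1)
addReverse from to reverse =
    case Dict.get to reverse of
        Nothing ->
            -- First key for this value: start a new set.
            Dict.insert to (Set.singleton from) reverse

        Just keys ->
            if Set.member from keys then
                -- Nothing to change, so avoid rebuilding the dictionary.
                reverse

            else
                Dict.insert to (Set.insert from keys) reverse
```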

miniBill · 2025-12-03T17:46:26Z

src/BiSeqDict.elm

+                                d.reverse
+                                    |> SeqDict.update oldTo_
+                                        (Maybe.map (SeqSet.remove from)
+                                            >> Maybe.andThen normalizeSet


Replace usage of >> with a pipeline

miniBill · 2025-12-03T17:48:13Z

src/BiSeqDict.elm

+    SeqDict.update from fn d.forward
+        |> fromDict


This is going to be much slower and have worse memory costs than implementing the full logic. To avoid duplication we could implement it in terms of insert and remove, but let's not call fromDict

miniBill · 2025-12-03T17:49:23Z

src/BiSeqDict.elm

+    BiSeqDict
+        { d
+            | forward = SeqDict.remove from d.forward
+            , reverse = SeqDict.filterMap (\_ set -> SeqSet.remove from set |> normalizeSet) d.reverse


This is O(n), we can do better
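
A possible shape for the faster version, as a sketch only: in a many-to-one dictionary each key has at most one value, so the forward lookup tells us exactly which reverse entry needs updating, and the rest of the reverse index can be left alone. The sketch assumes core `Dict` and `Set` (so `comparable` keys and values) and a plain record instead of the PR's opaque `BiSeqDict` type; the `BiDict` alias is hypothetical.

```elm
module BiDictRemoveSketch exposing (BiDict, remove)

import Dict exposing (Dict)
import Set exposing (Set)


{-| Hypothetical stand-in for the real type: forward map plus reverse index.
-}
type alias BiDict k v =
    { forward : Dict k v
    , reverse : Dict v (Set k)
    }


{-| Remove a key, updating only the single reverse entry it belongs to.
-}
remove : comparable1 -> BiDict comparable1 comparable2 -> BiDict comparable1 comparable2
remove from d =
    case Dict.get from d.forward of
        Nothing ->
            -- Key is absent: nothing to do.
            d

        Just to ->
            let
                remainingKeys =
                    Dict.get to d.reverse
                        |> Maybe.withDefault Set.empty
                        |> Set.remove from
            in
            { forward = Dict.remove from d.forward
            , reverse =
                if Set.isEmpty remainingKeys then
                    -- Drop the value entirely once no key points at it.
                    Dict.remove to d.reverse

                else
                    Dict.insert to remainingKeys d.reverse
            }
```

This costs a handful of dictionary operations per call instead of a pass over every reverse entry.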

miniBill · 2025-12-03T17:52:27Z

src/BiSeqDict.elm

+                        SeqDict.update value
+                            (\maybeKeys ->
+                                Just <|
+                                    case maybeKeys of
+                                        Nothing ->
+                                            SeqSet.singleton key
+
+                                        Just keys_ ->
+                                            SeqSet.insert key keys_
+                            )


We can probably ~inline update here for simplicity?


miniBill · 2025-12-03T17:54:14Z

src/BiSeqDict.elm

+-}
+filter : (k -> v -> Bool) -> BiSeqDict k v -> BiSeqDict k v
+filter fn (BiSeqDict d) =
+    -- TODO diff instead of throwing away and creating from scratch?


It really depends on how much the filter keeps, unfortunately, and we can't predict that. If filter only keeps a small portion of items, fromDict is faster, if it keeps most then doing a folded remove is going to be faster.

miniBill · 2025-12-03T17:55:48Z

src/BiSeqDict.elm

+union : BiSeqDict k v -> BiSeqDict k v -> BiSeqDict k v
+union (BiSeqDict left) (BiSeqDict right) =
+    -- TODO diff instead of throwing away and creating from scratch?
+    SeqDict.union left.forward right.forward
+        |> fromDict
+
+
+{-| Keep a key-value pair when its key appears in the second dictionary.
+Preference is given to values in the first dictionary.
+-}
+intersect : BiSeqDict k v -> BiSeqDict k v -> BiSeqDict k v
+intersect (BiSeqDict left) (BiSeqDict right) =
+    -- TODO diff instead of throwing away and creating from scratch?
+    SeqDict.intersect left.forward right.forward
+        |> fromDict
+
+
+{-| Keep a key-value pair when its key does not appear in the second dictionary.
+-}
+diff : BiSeqDict k v -> BiSeqDict k v -> BiSeqDict k v
+diff (BiSeqDict left) (BiSeqDict right) =
+    -- TODO diff instead of throwing away and creating from scratch?
+    SeqDict.diff left.forward right.forward
+        |> fromDict


For union/intersect/diff the algorithm for updating the reverse mapping might actually be nontrivial, will need to think about it

CharlonTank force-pushed the main branch from a296112 to 54d4864 Compare November 4, 2025 11:34

CharlonTank force-pushed the main branch from 54d4864 to e94d10b Compare November 4, 2025 11:40

supermario requested changes Nov 5, 2025


CharlonTank added 2 commits November 5, 2025 11:35

CharlonTank mentioned this pull request Nov 5, 2025

Add compiler support for BiSeqDict, MultiSeqDict, and MultiBiSeqDict lamdera/compiler#69

Open

CharlonTank marked this pull request as ready for review November 5, 2025 06:55

CharlonTank requested a review from supermario November 5, 2025 09:30

supermario reviewed Nov 27, 2025


Rename APIs per review feedback

449d686

- get → getAll for x-to-many (MultiSeqDict, MultiBiSeqDict)
- getReverse → getKeys for bidirectional types (BiSeqDict, MultiBiSeqDict)
- Add removeValues to MultiSeqDict and MultiBiSeqDict

CharlonTank requested a review from supermario November 29, 2025 10:46

miniBill reviewed Dec 3, 2025


Fix foldl/foldr docs: insertion order, not key order

eb5ad25
