👋 Hey folks! I saw the NUM project on Reddit and thought it was a great idea. Diving in, I found MODL, which made me even more excited, but I noticed it didn't have a ton of libraries yet, so I started hacking on one just to see if I could get something working. It's still very much a work in progress, and just something I'm hacking on in my free time, so no promises on quality.
https://github.com/bign8/modl.go
Anyway, I ran into an issue with my Unicode parsing logic. Based on the test added in d066849, it appears MODL supports Unicode escapes with something other than 4 hex digits, which doesn't seem to match the grammar defined below or the written specification: https://www.modl.uk/specification#hex-values.
Lines 73 to 78 in 3c78809
    fragment UNICODE
      : 'u' HEX HEX HEX HEX
      ;
    fragment HEX
      : [0-9a-fA-F]
      ;
But the Java library does appear to support this behavior, which is great; I just didn't see it documented anywhere besides the test case and the Java source.
Given the complexity of the UnicodeEscapeReplacer, I'm not really sure of the best way to represent those nuances in the grammar effectively. But having a note somewhere that non-4-digit code points are supported would be dope. Anyway, let me know what you think and I can get something in a PR for ya.
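As a rough illustration of what lenient escape handling could look like, here is a small Go sketch. It is not a port of UnicodeEscapeReplacer: the helper name `replaceUnicodeEscapes`, the limit of 1 to 6 hex digits, and the greedy matching are all assumptions made for the example, and the Java library's actual rules may well differ.

```go
package main

import (
	"strconv"
	"strings"
	"unicode/utf8"
)

// maxHexDigits is an assumption for this sketch; it is not taken from the
// MODL specification or from the Java library.
const maxHexDigits = 6

// isHex mirrors the HEX fragment in the grammar: [0-9a-fA-F].
func isHex(c byte) bool {
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
}

// replaceUnicodeEscapes is a hypothetical helper. It rewrites a backslash,
// a 'u', and 1..maxHexDigits hex digits (matched greedily) into the rune
// they encode. Anything that does not decode to a valid rune is left as-is.
func replaceUnicodeEscapes(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] == '\\' && i+1 < len(s) && s[i+1] == 'u' {
			start := i + 2
			end := start
			for end < len(s) && end-start < maxHexDigits && isHex(s[end]) {
				end++
			}
			if end > start {
				n, err := strconv.ParseUint(s[start:end], 16, 32)
				// ValidRune rejects surrogates and values above U+10FFFF.
				if err == nil && utf8.ValidRune(rune(n)) {
					b.WriteRune(rune(n))
					i = end
					continue
				}
			}
		}
		// Not a decodable escape: copy the byte through unchanged.
		b.WriteByte(s[i])
		i++
	}
	return b.String()
}
```

Under these assumptions, `\u41` and `\u0041` both decode to `A`, while `\u0041BC` is read greedily as the single code point U+41BC rather than `A` followed by `BC`. That ambiguity is the kind of thing a note in the spec would need to settle.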
Cheers 🍻