fix: Error when using paths with non-ASCII characters for file watching on Unix/MacOS #1498

Merged
rubenporras merged 1 commit into eclipse-lsp4e:main from FlorianKroiss:non-ascii-support
Feb 19, 2026

Conversation

@FlorianKroiss FlorianKroiss commented Feb 7, 2026

Make sure that the URI that we pass to the FileSystemWatcherManager contains only ASCII characters.

Fixes #1497

@FlorianKroiss FlorianKroiss force-pushed the non-ascii-support branch 2 times, most recently from 942d2bf to 18c011d Compare February 8, 2026 13:18
@FlorianKroiss FlorianKroiss changed the title from "Add a dummy test" to "Fix non-ASCII characters in URI" Feb 8, 2026

FlorianKroiss commented Feb 8, 2026

The problem seems to be that the java.nio.file.Path implementation for Unix/macOS expects non-ASCII characters in the path of a file URI to be percent-escaped, see JDK-8162518. For example, the URI file:///ß.txt contains the German Eszett, which is non-ASCII.
Trying to convert this to a Path leads to this exception on Linux (WSL):

jshell> Paths.get(URI.create("file:///ß.txt"))
|  Exception java.lang.IllegalArgumentException: Bad escape
|        at UnixUriUtils.fromUri (UnixUriUtils.java:88)
|        at UnixFileSystemProvider.getPath (UnixFileSystemProvider.java:123)
|        at Path.of (Path.java:201)
|        at Paths.get (Paths.java:95)
|        at (#3:1)

On the other hand, Windows does not seem to have a problem with non-ASCII characters in the URI:

jshell> Paths.get(URI.create("file:///ß.txt"))
$2 ==> \ß.txt

The problematic URI is created by us when processing resource changes here, so we can control how it is created.

My suggested fix is to do a round-trip: URI -> toASCIIString -> URI, which then leads to correct behavior, at least for my example above.
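
For illustration, the round-trip can be sketched as below. This is a minimal standalone sketch, not the code from this PR: the class name AsciiUriRoundTrip and the helper toAsciiUri are hypothetical, and only URI.create, URI.toASCIIString and Path.of from the JDK are used.

```java
import java.net.URI;
import java.nio.file.Path;

public class AsciiUriRoundTrip {
    /**
     * Hypothetical helper: re-parses the URI from its ASCII form, so that
     * non-ASCII characters end up as percent-escaped UTF-8 octets.
     */
    public static URI toAsciiUri(URI uri) {
        return URI.create(uri.toASCIIString());
    }

    public static void main(String[] args) {
        // \u00DF is the Eszett from the example above, i.e. file:///ß.txt
        URI original = URI.create("file:///\u00DF.txt");
        URI ascii = toAsciiUri(original);
        System.out.println(ascii); // file:///%C3%9F.txt

        // With the escaped form, the conversion that failed above should go through.
        Path path = Path.of(ascii);
        System.out.println(path.getNameCount());
    }
}
```

The helper only changes how the URI is spelled, not what it refers to: getPath() on the result still returns the decoded path containing the Eszett.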

TBH I don't think that this is an elegant fix, because the same problem may also occur in other locations.
However, we applied a similar fix previously: #368

Sidenote for the interested reader (as I spent way too much time on this): The JDK seems quite inconsistent when converting between URIs and Paths/Files. new File(URI.create("file:///ß.txt")).toPath() works just fine, because it uses a different code path.

Paths.get(URI.create("file:/ß.txt")) also works, even though it contains a non-ASCII char in the path. It only works because there is a hard-coded case that keeps the code compatible with File, since File.toURI().toString() produces URIs starting with file:/ :(
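
To compare the three conversions side by side on a given platform, something like the following could be used. It is only a sketch: the class UriToPathComparison and the helper describe are made-up names, and the output is deliberately not listed because it depends on the OS and JDK (per the observations above, only the first call is expected to fail on Unix/macOS).

```java
import java.io.File;
import java.net.URI;
import java.nio.file.Path;
import java.util.function.Function;

public class UriToPathComparison {
    /** Runs one conversion and reports either the resulting path or the exception it raised. */
    public static String describe(String label, URI uri, Function<URI, Path> conversion) {
        try {
            return label + " -> " + conversion.apply(uri);
        } catch (IllegalArgumentException e) {
            // Path.of(URI) rejects a URI it cannot handle with IllegalArgumentException
            return label + " -> " + e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        URI tripleSlash = URI.create("file:///\u00DF.txt"); // file:///ß.txt
        URI singleSlash = URI.create("file:/\u00DF.txt");   // file:/ß.txt

        // Direct conversion of the file:/// form
        System.out.println(describe("Path.of, file:///", tripleSlash, u -> Path.of(u)));
        // Detour via java.io.File, which takes a different code path
        System.out.println(describe("File.toPath, file:///", tripleSlash, u -> new File(u).toPath()));
        // The file:/ form, which is handled by the File compatibility case
        System.out.println(describe("Path.of, file:/", singleSlash, u -> Path.of(u)));
    }
}
```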

@FlorianKroiss FlorianKroiss marked this pull request as ready for review February 8, 2026 18:42
@FlorianKroiss FlorianKroiss changed the title from "Fix non-ASCII characters in URI" to "fix: Error when using paths with non-ASCII characters for file watching on Unix/MacOS" Feb 8, 2026
@FlorianKroiss FlorianKroiss marked this pull request as draft February 8, 2026 18:57

FlorianKroiss commented Feb 8, 2026

  • Add test case for file watching which triggers the problematic code path

@FlorianKroiss FlorianKroiss marked this pull request as ready for review February 11, 2026 19:21
@FlorianKroiss
Contributor Author

This partially rolls back #1360, though the tests added there still pass.

@rubenporras rubenporras merged commit 096ada5 into eclipse-lsp4e:main Feb 19, 2026
12 checks passed
@FlorianKroiss FlorianKroiss deleted the non-ascii-support branch February 19, 2026 07:40

Development

Successfully merging this pull request may close these issues.

At FileSystemWatcherManager, An error occurs when path of Eclipse workspace includes non-ASCII characters.