The Zombie driver fails when a URL contains "high bytes", that is, non-ASCII characters. The following example contains a valid Hungarian word with accented characters:
https://hu.wikipedia.org/wiki/Műemlék
Desktop browsers and the Mink Goutte driver translate the high bytes correctly:
https://hu.wikipedia.org/wiki/M%C5%B1eml%C3%A9k
The Zombie driver passes the string as-is to JavaScript, and bytes above 0x7f then get corrupted somewhere in Zombie:
https://hu.wikipedia.org/wiki/Mqeml\xe9k
The way the characters are mangled is a bit strange:
- the letter é becomes \xe9, which is its character code in ISO-8859-1
- the letter ű becomes q, because this character does not exist in that code page
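Characters that don't exist in ISO-8859-1 are replaced with regular letters such as q, so the damage is irreversible. Note that q is 0x71, the low byte of ű's Unicode code point U+0171, which suggests that code points are being truncated to 8 bits rather than transcoded.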
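One possible reading of the broken output, offered as an assumption rather than a confirmed diagnosis of Zombie's internals, is that each code point is cut down to its low 8 bits. The Python sketch below models that with a hypothetical helper, `truncate_to_low_byte`, and reproduces the mangled path segment from the example:

```python
def truncate_to_low_byte(text: str) -> str:
    """Hypothetical model of the corruption: keep only the low 8 bits
    of every Unicode code point. This is an assumption about what goes
    wrong, not code taken from Zombie or Mink."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)


mangled = truncate_to_low_byte("Műemlék")

# é is U+00E9, so it survives as the single byte 0xE9.
# ű is U+0171, so only 0x71 remains, which is the ASCII letter q.
print(mangled.encode("latin-1"))  # b'Mqeml\xe9k'
```

Under this model the original character cannot be recovered from q, which matches the irreversible damage described above.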
The example shows that desktop browsers translate non-ASCII characters to percent-encoded bytes using their UTF-8 encoding:
- the letter é becomes %C3%A9
- the letter ű becomes %C5%B1
That's correct; web servers expect URLs in this form.
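The expected translation can be reproduced outside a browser. The following is a minimal sketch in Python using the standard library's `urllib.parse.quote`. The helper name `percent_encode_url` and the choice of "safe" characters are assumptions made for illustration, not something the Zombie or Goutte drivers are known to use:

```python
from urllib.parse import quote


def percent_encode_url(url: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 bytes.

    The characters listed in `safe` are left untouched so that the URL
    structure is preserved. '%' is included so that an already encoded
    URL is not encoded a second time.
    """
    return quote(url, safe=":/?#&=%", encoding="utf-8")


encoded = percent_encode_url("https://hu.wikipedia.org/wiki/Műemlék")
print(encoded)  # https://hu.wikipedia.org/wiki/M%C5%B1eml%C3%A9k
```

Applying a step like this before the URL is handed to the driver would be one way to sidestep the problem, since the resulting string contains only ASCII characters.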