Bug: Error trying to decode index url from mirrors that include charset in Content-Type header #260

@MiaowFISH

Describe the bug

When micropip.install is called with a custom index_urls value pointing to a PyPI mirror, installation fails if the mirror's JSON API response includes a charset parameter in the Content-Type header (e.g., application/json; charset=utf-8).

The official pypi.org repository does not include this charset, but many popular third-party mirrors (such as mirrors.tuna.tsinghua.edu.cn) do, which makes them incompatible with micropip.

To Reproduce

The following code, when run in a Pyodide environment, demonstrates the failure:

async function installPackages() {
    await pyodide.loadPackage(["micropip"]);
    await pyodide.runPythonAsync(`
        import micropip
        await micropip.install(
            requirements=["commonx"],
            index_urls=["https://mirrors.tuna.tsinghua.edu.cn/pypi/{package_name}/json"],
            verbose=True,
        )
    `);
}

Expected behavior

The package commonx should be installed successfully from the mirror without any errors.

Traceback

PythonError: Traceback (most recent call last):
  File "/lib/python3.13/site-packages/micropip/package_index.py", line 333, in query_package
    parser = _select_parser(content_type, name, index_base_url=url)
  File "/lib/python3.13/site-packages/micropip/package_index.py", line 269, in _select_parser
    raise ValueError(f"Unsupported content type: {content_type}")
ValueError: Unsupported content type: application/json; charset=utf-8

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/lib/python3.13/site-packages/micropip/package_manager.py", line 205, in install
    await transaction.gather_requirements(requirements)
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 74, in gather_requirements
    await asyncio.gather(*requirement_promises)
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 96, in add_requirement
    return await self.add_requirement_from_url(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 103, in add_requirement_from_url
    return await self.add_wheel(wheel, extras=extras or set(), specifier="")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 356, in add_wheel
    await self.gather_requirements(wheel.requires(extras))
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 74, in gather_requirements
    await asyncio.gather(*requirement_promises)
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 78, in add_requirement
    return await self.add_requirement_inner(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 224, in add_requirement_inner
    await self._add_requirement_from_package_index(req)
  File "/lib/python3.13/site-packages/micropip/transaction.py", line 268, in _add_requirement_from_package_index
    metadata = await package_index.query_package(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/lib/python3.13/site-packages/micropip/package_index.py", line 335, in query_package
    raise ValueError(f"Error trying to decode url: {url}") from e
ValueError: Error trying to decode url: https://mirrors.tuna.tsinghua.edu.cn/pypi/commonx/json

Analysis

The root cause is that micropip compares the Content-Type header against a fixed set of exact strings, so a value carrying an extra parameter such as charset does not match.

As shown by the curl outputs below, pypi.org returns application/json, while the Tuna mirror returns application/json; charset=utf-8.

Official PyPI:

$ curl -I https://pypi.org/pypi/commonx/json | grep -i "content-type"
content-type: application/json

Tuna Mirror:

$ curl -I https://mirrors.tuna.tsinghua.edu.cn/pypi/commonx/json | grep -i "content-type"
content-type: application/json; charset=utf-8

The responsible code in micropip/package_index.py only allows for an exact match of application/json:

match content_type:
    case "application/vnd.pypi.simple.v1+json":
        return ProjectInfo.from_simple_json_api
    case "application/json":
        return ProjectInfo.from_json_api
    case (
        "application/vnd.pypi.simple.v1+html"
        | "text/html"
        | "text/html; charset=utf-8"
    ):
        return partial(
            ProjectInfo.from_simple_html_api,
            pkgname=pkgname,
            index_base_url=index_base_url,
        )
    case _:
        raise ValueError(f"Unsupported content type: {content_type}")

This strict check leads to the ValueError: Unsupported content type when a charset is present.

Suggested Solution

To improve compatibility with third-party mirrors, the content type check should be made more flexible. For instance, checking whether the content_type string starts with application/json would resolve the issue: it would accept both application/json and the common application/json; charset=utf-8 variant.
