Description
_unknown_error() in sdk/errors/parser.py crashes with TypeError: object of type '_io.BytesIO' has no len() when the API returns an unparseable error response on a request that had a data parameter.
The bug is self-contained within the SDK: _BaseClient.do() wraps all data in BytesIO for retry/seek support (lines 164-166 of _base_client.py), but _unknown_error() creates RoundTrip(raw=False), which calls _redacted_dump() → len(body) on the BytesIO request body. This masks the actual API error with a TypeError, making production issues impossible to diagnose.
The normal request logging path in _base_client.py:295 handles this correctly by passing raw=True when data is not None, but _unknown_error() doesn't.
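The underlying failure can be shown without the SDK at all. The snippet below is a minimal standard-library sketch, not SDK code: it only demonstrates that len() is undefined for a BytesIO object, and that the size is still available through getbuffer() without moving the stream position.

```python
import io

# Same kind of object that do() hands to the session after wrapping `data`.
body = io.BytesIO(b'{"key": "value"}')

# len() is not defined for BytesIO, so any code path that assumes a
# str/bytes body and calls len(body) raises TypeError.
try:
    len(body)
    message = None
except TypeError as exc:
    message = str(exc)

# The size can still be read without consuming the stream or moving
# its position.
size = body.getbuffer().nbytes
position_after = body.tell()

print(message)
print(size, position_after)
```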
Reproduction
from unittest.mock import MagicMock

import requests

from databricks.sdk._base_client import _BaseClient


def test_do_crashes_when_api_returns_unparseable_error():
    client = _BaseClient(retry_timeout_seconds=1)

    def mock_request(method, url, **kwargs):
        prep = requests.PreparedRequest()
        prep.method = method
        prep.prepare_url(url, kwargs.get("params"))
        prep.prepare_headers(kwargs.get("headers"))
        prep.prepare_body(data=kwargs.get("data"), files=kwargs.get("files"), json=kwargs.get("json"))
        response = requests.Response()
        response.status_code = 500
        response._content = b"\x00\x01 binary garbage that no parser can handle"
        response.request = prep
        return response

    client._session.request = MagicMock(side_effect=mock_request)
    # Pass plain bytes; the SDK wraps them in BytesIO (line 164-166), then crashes on it.
    client.do("PUT", "https://example.com/api/2.0/fs/files/test.json", data=b'{"key": "value"}')
    # TypeError: object of type '_io.BytesIO' has no len()


if __name__ == "__main__":
    test_do_crashes_when_api_returns_unparseable_error()
Expected behavior
When the API returns an unparseable error, the SDK should surface the actual HTTP error (status code, response body) instead of crashing with a TypeError about BytesIO.
The fix is in _unknown_error() (parser.py:39): RoundTrip should be created with raw=True when request.body is not a str, or _redacted_dump() should handle non-string body types gracefully.
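Both options could look roughly like the sketch below. The helper names should_log_raw and describe_body are hypothetical and do not exist in the SDK; the real RoundTrip and _redacted_dump() signatures are not reproduced here, so treat this as an illustration of the type check rather than a patch.

```python
import io


def should_log_raw(body) -> bool:
    """Option 1 (hypothetical helper): decide the `raw` flag from the body type.

    Anything that is not a plain str (bytes, BytesIO, other streams) is
    treated as raw so that text-only dumping is never attempted on it.
    """
    return body is not None and not isinstance(body, str)


def describe_body(body, max_chars: int = 1024) -> str:
    """Option 2 (hypothetical helper): tolerate non-string bodies when dumping."""
    if body is None:
        return ""
    if isinstance(body, str):
        # Only str bodies are safe to slice as text.
        return body[:max_chars]
    if isinstance(body, (bytes, bytearray)):
        return f"[raw stream: {len(body)} bytes]"
    if hasattr(body, "read"):
        # File-like object such as BytesIO: do not call len() or read() on it.
        return "[raw stream]"
    return f"[unsupported body type: {type(body).__name__}]"


wrapped = io.BytesIO(b'{"key": "value"}')
print(should_log_raw(wrapped), describe_body(wrapped))
print(should_log_raw('{"key": "value"}'), describe_body('{"key": "value"}'))
```

Either approach keeps the fallback path from raising, so the original HTTP status and response body can reach the caller.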
Is it a regression?
Unknown. The BytesIO wrapping in do() and the _unknown_error() fallback path appear to have been present for multiple versions. The bug surfaces only when the API returns a response that none of the standard error parsers can handle, which may be rare in practice.
Debug Logs
None. The error occurs before the SDK can produce a debug log for the failed request.
Other Information
- OS: Windows 11
- Version: 0.67.0
Additional context
Call chain:
- _BaseClient.do() wraps data in BytesIO (line 164-166 of _base_client.py)
- _perform() sends the request, calls _record_request_log() with raw=True (works fine)
- get_api_error() finds response.ok == False, tries all error parsers, none can parse it
- Falls back to _unknown_error() which creates RoundTrip(raw=False) (line 39 of parser.py)
- RoundTrip.generate() calls _redacted_dump("> ", request.body) (line 47 of round_trip_logger.py)
- _redacted_dump() calls len(body) on the BytesIO → TypeError
Also note: the error message in _unknown_error() points users to databricks-sdk-go/issues instead of databricks-sdk-py/issues.