Description of Bug:
It would be great if we could stream to s3:// through the data.source.coop endpoint using DuckDB, similar to what we can do with other S3-compatible interfaces (MinIO, Ceph, etc.). It looks like data.source.coop requires the configuration:

s3 =
  multipart_threshold = 44MB

which is understood by the AWS CLI client. Unfortunately, this option does not appear to be understood by other tools such as DuckDB (or, I think, GDAL) that otherwise implement much, but not all, of the S3 interface. This causes writes from these common geospatial utilities to source.coop to fail. (Note: I understand this is not the same thing as multipart_chunksize, which is configurable at least for GDAL.)
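For context, a sketch of where that setting lives for the AWS CLI (assuming the default profile; the `s3` nested settings are part of the standard AWS CLI config format):

```ini
# ~/.aws/config — AWS CLI per-profile S3 transfer settings
[default]
s3 =
  multipart_threshold = 44MB
```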
Steps to Reproduce:
import ibis

con = ibis.duckdb.connect()

# fill in creds
query = f'''
CREATE OR REPLACE SECRET source_coop (
    TYPE S3,
    KEY_ID '{key}',
    SECRET '{secret}',
    ENDPOINT 'data.source.coop',
    URL_STYLE 'path'
);
'''
con.raw_sql(query)

# Try to write a > 44 MB file:
(con
 .read_parquet("s3://cboettig/gbif/app/redlined_cities_gbif.parquet")
 .to_parquet("s3://cboettig/gbif/app/redlined_cities_gbif2.parquet")
)
Expected Behavior:
Writes a copy of the parquet file to the bucket.
Actual Behavior:
Error:
---------------------------------------------------------------------------
HTTPException Traceback (most recent call last)
Cell In[9], line 3
1 (con
2 .read_parquet("s3://cboettig/gbif/app/redlined_cities_gbif.parquet")
----> 3 .to_parquet( "s3://cboettig/gbif/app/redlined_cities_gbif2.parquet")
4 )
File /opt/conda/lib/python3.12/site-packages/ibis/expr/types/core.py:608, in Expr.to_parquet(self, path, params, **kwargs)
563 @experimental
564 def to_parquet(
565 self,
(...)
569 **kwargs: Any,
570 ) -> None:
571 """Write the results of executing the given expression to a parquet file.
572
573 This method is eager and will execute the associated expression
(...)
606 :::
607 """
--> 608 self._find_backend(use_default=True).to_parquet(self, path, **kwargs)
File /opt/conda/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py:1550, in Backend.to_parquet(self, expr, path, params, **kwargs)
1548 args = ["FORMAT 'parquet'", *(f"{k.upper()} {v!r}" for k, v in kwargs.items())]
1549 copy_cmd = f"COPY ({query}) TO {str(path)!r} ({', '.join(args)})"
-> 1550 with self._safe_raw_sql(copy_cmd):
1551 pass
File /opt/conda/lib/python3.12/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
135 del self.args, self.kwds, self.func
136 try:
--> 137 return next(self.gen)
138 except StopIteration:
139 raise RuntimeError("generator didn't yield") from None
File /opt/conda/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py:323, in Backend._safe_raw_sql(self, *args, **kwargs)
321 @contextlib.contextmanager
322 def _safe_raw_sql(self, *args, **kwargs):
--> 323 yield self.raw_sql(*args, **kwargs)
File /opt/conda/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py:97, in Backend.raw_sql(self, query, **kwargs)
95 with contextlib.suppress(AttributeError):
96 query = query.sql(dialect=self.name)
---> 97 return self.con.execute(query, **kwargs)
HTTPException: HTTP Error: Unable to connect to URL https://data.source.coop/cboettig/gbif/app/redlined_cities_gbif2.parquet?partNumber=1&uploadId=mUB0k9fxk6YvYYJbN7SgpWfrCU2lzfLZ7FjULA2l_IzcigHjY15G06DTuYfoI70tBE01h5A9o.WOf3gnX8ranlinYQNvfJ7N5EZhmAkJ_6bnC2mO3deAIZZCPVfFe8pRuHRGYRgE5I2xY_wWiDC_tZ3WdDRyvL7QAqHuv7j5GXs- Payload Too Large (HTTP code 413)