How do you write a single parquet file with a specified compression? #17578
Answered by Jefffrey · theelderbeever asked this question in Q&A
As the title says: how do you write a single Parquet file with a specific configuration? The two approaches I have tried are below. I am on DataFusion 49.0.2.

    let options = WriterProperties::builder()
        .set_compression(datafusion::parquet::basic::Compression::ZSTD(
            ZstdLevel::try_new(3)?,
        ))
        .build();

    let write_options = DataFrameWriteOptions::new().with_single_file_output(true);

    // This writes a single file but takes `TableParquetOptions` which can't configure compression. The docs say this is tied to `ParquetWriterOptions` but there is no way to convert between the two.
    df.repartition(Partitioning::RoundRobinBatch(1))?
        .write_parquet("data/data.zstd.parquet", write_options, None)
        .await?;

    // This accepts the `WriterProperties` but can't be configured to write a single file.
    ctx.write_parquet(
        df.repartition(Partitioning::RoundRobinBatch(1))?
            .create_physical_plan()
            .await?,
        "data/data.zstd.parquet",
        Some(options)
    )
    .await?;
Answered by Jefffrey · Dec 3, 2025
You can configure the options when using the `DataFrame::write_parquet` API like so:

    let mut options = TableParquetOptions::default();
    options.global.compression =
        Some(datafusion::parquet::basic::Compression::SNAPPY.to_string());
    parquet_df
        .write_parquet(
            "test_parquet1",
            DataFrameWriteOptions::default().with_single_file_output(true),
            Some(options),
        )
        .await?;

This should achieve what you need. Note that there is currently an issue with writing single files where a directory is actually produced instead; see #13323.
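The answer sets the codec through a string in `options.global.compression`. For a codec that takes a level, such as the ZSTD level 3 from the question, the string needs to carry the level as well. The sketch below builds such a string. The helper name `parquet_compression_setting` is hypothetical, and the `codec(level)` format (for example `zstd(3)`) is an assumption based on the values listed for the `datafusion.execution.parquet.compression` setting, so check it against the DataFusion version you use.

```rust
/// Hypothetical helper (not part of DataFusion): builds a compression
/// string in the `codec` or `codec(level)` form, e.g. "snappy" or "zstd(3)".
/// ASSUMPTION: this is the format `TableParquetOptions.global.compression`
/// expects; verify against the configuration docs for your version.
fn parquet_compression_setting(codec: &str, level: Option<u32>) -> String {
    // Lowercase only to normalize the output; this is a style choice here.
    let codec = codec.to_lowercase();
    match level {
        // Codecs with a level, such as zstd, gzip, or brotli.
        Some(level) => format!("{codec}({level})"),
        // Codecs without a level, such as snappy.
        None => codec,
    }
}
```

With this, the question's case would be written as `options.global.compression = Some(parquet_compression_setting("zstd", Some(3)));` before passing `Some(options)` to `DataFrame::write_parquet`. Writing the literal `Some("zstd(3)".to_string())` should be equivalent under the same assumption.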
Answer selected by
Jefffrey