Allow disabling the writing of Parquet Offset Index #6778

@etseidl

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, when writing Parquet files, the offset index structures are populated and written regardless of whether statistics and column indexes are written. It is unclear whether this behavior is intended.

Describe the solution you'd like
Add a writer option to disable the collection and writing of the offset index. The offset index is required whenever the column index is written, so this option would probably only be useful when not writing column indexes (i.e. when the statistics level is None or Chunk).

Describe alternatives you've considered
Alternatively, the writing of the offset index could be disabled whenever the column index is disabled (i.e. when the stats level is not Page). This solution assumes the current behavior is not intentional.
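
To make the difference between the two approaches concrete, here is a minimal sketch of the decision logic only. It does not use the parquet crate's API: `StatsLevel`, `WriterOptions`, `offset_index_disabled`, and both methods are hypothetical names invented for illustration, and the assumption that the disable flag is ignored at the Page level follows from the requirement stated above that the column index needs the offset index.

```rust
/// Hypothetical stand-in for the writer's statistics level (None, Chunk, Page).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StatsLevel {
    None,
    Chunk,
    Page,
}

/// Hypothetical writer options; `offset_index_disabled` models the proposed option.
#[derive(Clone, Copy, Debug)]
pub struct WriterOptions {
    pub stats_level: StatsLevel,
    pub offset_index_disabled: bool,
}

impl WriterOptions {
    /// The column index is only written at the Page statistics level.
    pub fn writes_column_index(&self) -> bool {
        self.stats_level == StatsLevel::Page
    }

    /// Proposed solution: an explicit opt-out. The flag is ignored when the
    /// column index is written, because the column index needs the offset index.
    pub fn writes_offset_index(&self) -> bool {
        self.writes_column_index() || !self.offset_index_disabled
    }

    /// Alternative: no new option; the offset index simply follows the column index.
    pub fn writes_offset_index_alternative(&self) -> bool {
        self.writes_column_index()
    }
}
```

Under this sketch, the two approaches differ only for the None and Chunk levels with the flag left unset: the proposed option keeps today's behavior by default, while the alternative stops writing the offset index without any opt-in.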

Additional context

Labels

enhancement: Any new improvement worthy of an entry in the changelog
parquet: Changes to the parquet crate