Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, when writing Parquet files, the offset index structures are populated and written regardless of whether statistics or column indexes are written. It is unclear whether this behavior is intentional.
Describe the solution you'd like
Add a writer option to disable the collection and writing of the offset index. Of course the offset index is required if the column index is written, so this option would probably only be useful when not writing column indexes (i.e. when the statistics level is None or Chunk).
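To make the intended semantics concrete, here is a minimal sketch of the decision rule such an option could follow. It is a standalone model, not the parquet crate's API: `StatsLevel`, `WriterOptions`, and the `offset_index_disabled` flag are hypothetical names chosen for illustration.

```rust
/// Models the three statistics levels mentioned in this issue.
/// Hypothetical stand-in, not the parquet crate's own type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StatsLevel {
    None,
    Chunk,
    Page,
}

/// Hypothetical writer options, reduced to the two settings that matter here.
#[derive(Clone, Copy, Debug)]
pub struct WriterOptions {
    pub stats_level: StatsLevel,
    /// Proposed flag (assumption): ask the writer to skip the offset index.
    pub offset_index_disabled: bool,
}

impl WriterOptions {
    /// The column index is only written at the Page statistics level.
    pub fn writes_column_index(&self) -> bool {
        self.stats_level == StatsLevel::Page
    }

    /// The offset index is always written when the column index is written,
    /// because the column index depends on it. Otherwise the flag decides.
    pub fn writes_offset_index(&self) -> bool {
        self.writes_column_index() || !self.offset_index_disabled
    }
}
```

Under this sketch the flag only has an effect at the None and Chunk levels. At the Page level it is ignored, so a file is never written with a column index but no offset index.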
Describe alternatives you've considered
Alternatively, writing the offset index could be disabled whenever the column index is disabled (i.e. when the statistics level is not Page). This solution assumes the current behavior is not intentional.
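For comparison, the alternative needs no new option and reduces to a single check on the statistics level. This is again only a sketch with hypothetical names (`StatsLevel`, `writes_offset_index`), not code from the parquet crate.

```rust
/// Models the three statistics levels mentioned in this issue.
/// Hypothetical stand-in, not the parquet crate's own type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StatsLevel {
    None,
    Chunk,
    Page,
}

/// Alternative policy (assumption): the offset index is written
/// if and only if the column index is written, i.e. at the Page level.
pub fn writes_offset_index(level: StatsLevel) -> bool {
    matches!(level, StatsLevel::Page)
}
```

The trade-off is that users writing at the None or Chunk level would no longer be able to get an offset index at all, whereas the explicit option leaves that choice to them.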
Additional context