Implement Bring Your Own DataFrame (BYOD) Strategy #1306

@MariusWirtz

Summary

Implement a Bring-Your-Own-DataFrame (BYOD) strategy in TM1py to support both pandas and polars as interchangeable DataFrame backends.

Description

Currently, TM1py functions that work with tabular data rely on pandas DataFrames. To increase flexibility and performance, these functions should be able to accept either pandas or polars DataFrames as input and return the same type as output.

Both pandas and polars should be optional dependencies to keep the core installation lightweight.
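
One way the optional dependencies could be wired up is sketched below. It is an illustration only: `EXTRAS_REQUIRE` and `require_backend` are hypothetical names, and the extras simply mirror the `tm1py[pandas]` / `tm1py[polars]` naming used in this issue. The idea is to declare the extras for setuptools and import a backend only when a function actually needs it.

```python
import importlib
from types import ModuleType

# Hypothetical extras table, to be passed to setuptools in setup.py as:
#     setup(..., extras_require=EXTRAS_REQUIRE)
EXTRAS_REQUIRE = {
    "pandas": ["pandas"],
    "polars": ["polars"],
}


def require_backend(name: str) -> ModuleType:
    """Import an optional DataFrame backend lazily (hypothetical helper).

    Raises ValueError for an unknown backend name and ImportError, with an
    install hint, when the backend is known but not installed.
    """
    if name not in EXTRAS_REQUIRE:
        raise ValueError(f"Unknown DataFrame backend: {name!r}")
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"The {name!r} backend is not installed. "
            f"Install it with: pip install 'tm1py[{name}]'"
        ) from exc
```

With this shape, importing the core package never touches pandas or polars; only DataFrame-related functions would call `require_backend`.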

Proposed Changes

  • Update all functions that handle DataFrames (e.g., write_dataframe and execute_view_dataframe) to:

    • Accept either pandas or polars DataFrames.
    • Preserve the user’s chosen DataFrame type in outputs.
  • Add lightweight detection logic to determine which backend is being used.

  • Introduce optional dependencies as extras in setup.py (e.g., tm1py[pandas], tm1py[polars]).
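
The lightweight detection logic mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not a settled design: `detect_backend` is a hypothetical name, and it relies on the top-level module of the object's class, so neither pandas nor polars has to be imported to perform the check.

```python
from typing import Any

SUPPORTED_BACKENDS = ("pandas", "polars")


def detect_backend(df: Any) -> str:
    """Return "pandas" or "polars" for a DataFrame-like object.

    The check inspects the top-level package in which the object's class is
    defined, so it works even when only one of the two backends is installed.
    """
    root_module = type(df).__module__.split(".", 1)[0]
    if root_module in SUPPORTED_BACKENDS:
        return root_module
    raise TypeError(
        f"Unsupported DataFrame type: {type(df).__module__}.{type(df).__qualname__}"
    )
```

A function that returns tabular data could then call `detect_backend` on its input (or honor an explicit user choice) and build its result with the same library.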

Motivation

Preliminary testing shows promising results with polars:

  • ~10% faster end-to-end write operations.

  • ~20% lower memory usage when handling large datasets.

This approach enables users to choose their preferred DataFrame engine without sacrificing TM1py’s ease of use.

Example

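import pandas
import polars

# tm1 is a connected TM1Service; cube_name is the name of the target cube

# Using pandas
df = pandas.DataFrame(...)
tm1.cubes.cells.write_dataframe(cube_name, df, use_blob=True)

# Using polars
df = polars.DataFrame(...)
tm1.cubes.cells.write_dataframe(cube_name, df, use_blob=True)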

Benefits

  • Improved performance and memory efficiency for large workloads.
  • Greater flexibility for developers using different DataFrame ecosystems.
  • Backward compatibility with existing pandas-based code.

Next Steps

  • Identify all functions currently requiring pandas DataFrames.
  • Abstract common DataFrame operations (indexing, melting, etc.) to backend-neutral utilities.
  • Add test coverage for both backends.
  • Update documentation accordingly.
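
As an illustration of the backend-neutral utilities step, the sketch below dispatches two simple operations (column names and row iteration) on the detected backend. All helper names are hypothetical; the only library calls assumed are `DataFrame.itertuples(index=False, name=None)` in pandas and `DataFrame.iter_rows()` in polars. More involved operations such as melting would need their own per-backend entries.

```python
from typing import Any, Callable, Dict, Iterator, List


def _backend_name(df: Any) -> str:
    # Top-level package of the object's class, e.g. "pandas" or "polars".
    return type(df).__module__.split(".", 1)[0]


# One small table per operation, keyed by backend name.
_COLUMN_NAMES: Dict[str, Callable[[Any], List[str]]] = {
    "pandas": lambda df: [str(c) for c in df.columns],
    "polars": lambda df: list(df.columns),
}

_ROW_ITERATORS: Dict[str, Callable[[Any], Any]] = {
    "pandas": lambda df: df.itertuples(index=False, name=None),
    "polars": lambda df: df.iter_rows(),
}


def _dispatch(table: Dict[str, Callable[[Any], Any]], df: Any) -> Any:
    backend = _backend_name(df)
    if backend not in table:
        raise TypeError(f"Unsupported DataFrame backend: {backend!r}")
    return table[backend](df)


def column_names(df: Any) -> List[str]:
    """Column names as a plain list of strings, regardless of backend."""
    return _dispatch(_COLUMN_NAMES, df)


def iter_rows(df: Any) -> Iterator[tuple]:
    """Iterate over rows as plain tuples, regardless of backend."""
    return iter(_dispatch(_ROW_ITERATORS, df))
```

Keeping the per-backend code in small lookup tables means the functions that build cellsets only ever see plain lists and tuples, which should also make it easier to run the same tests against both backends.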
