Refactor ingestion, save and load functions #2394

@miratepuffin

After the merge of #2391 we will have the following functions:

  • load_edges
  • load_nodes
  • load_edge_metadata
  • load_node_metadata

These will handle any dataframe input, including the following (tested):

  • Pandas dataframes
  • FireDucks(.pandas) dataframes
  • Polars dataframes
  • Arrow tables
  • DuckDB (e.g. a DuckDBPyRelation obtained from running an SQL query)

In addition to this, we should modify these functions to check internally whether the user has provided a dataframe or a filepath as an argument, and transparently handle loading from the filepath where possible. This lets us remove the following functions, as they would be rolled into the above:

  • load_edge_props_from_pandas
  • load_edge_props_from_parquet
  • load_edges_from_pandas
  • load_edges_from_parquet
  • load_node_props_from_pandas
  • load_node_props_from_parquet
  • load_nodes_from_pandas
  • load_nodes_from_parquet
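
The dataframe-or-filepath check described above could be sketched as follows. This is only an illustration of the dispatch idea, not the actual implementation: `resolve_source` is a hypothetical helper name, and it assumes that anything which is not a string or path-like object should be treated as a dataframe.

```python
import os
from pathlib import Path


def resolve_source(data):
    """Classify a load_* argument as a filepath or a dataframe-like object.

    Hypothetical helper: strings and os.PathLike objects are treated as
    paths; everything else is assumed to be a dataframe and passed through.
    """
    if isinstance(data, (str, os.PathLike)):
        return "path", Path(data)
    return "dataframe", data


# A string or Path is routed to the file-loading branch.
kind, source = resolve_source("edges.parquet")

# Any other object (here a plain dict standing in for a dataframe) is
# routed to the dataframe branch unchanged.
kind_df, source_df = resolve_source({"src": [1], "dst": [2]})
```

With a check like this at the top of each load_* function, a single entry point can cover both the former *_from_pandas and *_from_parquet variants.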

To reduce confusion we will change the following:

  • load_from_file -> load_graph
  • save_to_file -> save_graph
  • save_to_zip -> flag on save_graph (to be discussed)
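
If the old names need to keep working for a release while users migrate, one option is a thin deprecated alias. The sketch below is an assumption rather than part of the plan above: `deprecated_alias` is a hypothetical helper, and the `load_graph` body is a placeholder standing in for the real function.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that forwards to new_func and warns about old_name."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    # Report the old name on the alias itself.
    wrapper.__name__ = old_name
    return wrapper


def load_graph(path):
    # Placeholder body for illustration only.
    return f"graph loaded from {path}"


# The old name still works but emits a DeprecationWarning when called.
load_from_file = deprecated_alias(load_graph, "load_from_file")
```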

After the v4 merge, the following functions will be removed:

  • from_parquet/to_parquet -- we are removing the proto format, so Parquet will become the default format for save_graph/load_graph
  • load_cached/write_updates/cache -- these are superseded by the WAL
