Open data studio is an open initiative to put open-source machine learning and large-scale data processing software a click away for everyone.
Please visit open-datastudio.io
| Component | Project | Description | Integration Status |
|---|---|---|---|
| Notebook | jupyter | Jupyter Lab | Integrated |
| | zeppelin | Integrates with Apache Zeppelin and Apache Spark on Kubernetes mode | Integrated |
| Data Lake | hive-metastore | Provides Hive Metastore server with PostgreSQL database | Integrated |
| | spark-thriftserver | Spark cluster on Kubernetes for ODBC/JDBC connection | Integrated |
| Computing | ray-cluster | Ray cluster | Integrated |
| | spark-serverless | On-demand Spark cluster from everywhere | Integrated |
| Machine learning | mlflow-server | MLflow model remote tracking server and UI | Integrated |
| | mlflow-model-serving | Deploy models from mlflow-server and get endpoint | Integrated |
| Business Intelligence | metabase | Metabase Business Intelligence | Integrated |
| | superset | Apache Superset Business Intelligence | Integrated |
| Misc | spark | Not integrated with Staroid, but publishes a Docker image for other projects | - |
You can create issues or pull requests to contribute to individual repositories under open-datasicence.
If you'd like to create a new integration project here, please create an issue in this repository.
We need your help!
- Open data studio Slack channel - Join
Open data studio is an open source project. A LICENSE file is included in each repository.
