From ce06972742fbb2064e64c5e9002a03594c76a226 Mon Sep 17 00:00:00 2001 From: Samy OUBOUAZIZ Date: Thu, 4 Dec 2025 15:21:16 +0100 Subject: [PATCH 1/5] feat(dlb): add v2 doc MTA-6795 --- .../data-lab/how-to/use-private-networks.mdx | 39 +++++++++++++++++++ 1 file changed, 39 insertions(+) create mode 100644 pages/data-lab/how-to/use-private-networks.mdx diff --git a/pages/data-lab/how-to/use-private-networks.mdx b/pages/data-lab/how-to/use-private-networks.mdx new file mode 100644 index 0000000000..e547fa50ed --- /dev/null +++ b/pages/data-lab/how-to/use-private-networks.mdx @@ -0,0 +1,39 @@ +--- +title: How to use Private Networks with your Data Lab cluster +description: This page explains how to use Private Networks with Scaleway Data Lab for Apache Spark™ +tags: private-networks private networks data lab spark apache cluster vpc +dates: + validation: 2025-06-25 + posted: 2021-06-25 +--- +import Requirements from '@macros/iam/requirements.mdx' + + +[Private Networks](/vpc/concepts/#private-networks) allow your Data Lab for Apache Spark™ cluster to communicate in an isolated and secure network without needing to be connected to the public internet. + +For full information about Scaleway Private Networks and VPC, see our [dedicated documentation](/vpc/) and [best practices guide](/vpc/reference-content/getting-most-private-networks/). + + + +- A Scaleway account logged into the [console](https://console.scaleway.com) +- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization +- [Created a Private Network](/vpc/how-to/create-private-network/) + + +## How to create a Private Network + +This action must be carried out from the Private Networks section of the console. Follow the procedure detailed in our [dedicated Private Networks documentation](/vpc/how-to/create-private-network/). 
+ +## How to attach and detach a cluster to a Private Network + +Data Lab clusters can only be attached to a Private Network during their creation, and cannot be detached and reattached to another Private Network afterward. + +Refer to the [dedicated documentation](/data-lab/how-to/create-data-lab/) for comprehensive information on how to create a Data Lab for Apache Spark™ cluster. + +## How to delete a Private Network + + + Before deleting a Private Network, you must [detach](/vpc/how-to/attach-resources-to-pn/#how-to-detach-a-resource-from-a-private-network) all resources attached to it. + + +This must be carried out from the Private Networks section of the console. Follow the procedure detailed in our [dedicated Private Networks documentation](/vpc/how-to/delete-private-network/). \ No newline at end of file From 813b49004c1618ee112c7e52c4330587796e92a0 Mon Sep 17 00:00:00 2001 From: Samy OUBOUAZIZ Date: Thu, 4 Dec 2025 15:24:45 +0100 Subject: [PATCH 2/5] feat(dlb): update --- pages/data-lab/how-to/use-private-networks.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pages/data-lab/how-to/use-private-networks.mdx b/pages/data-lab/how-to/use-private-networks.mdx index e547fa50ed..bd8ae1e7eb 100644 --- a/pages/data-lab/how-to/use-private-networks.mdx +++ b/pages/data-lab/how-to/use-private-networks.mdx @@ -26,7 +26,7 @@ This action must be carried out from the Private Networks section of the console ## How to attach and detach a cluster to a Private Network -Data Lab clusters can only be attached to a Private Network during their creation, and cannot be detached and reattached to another Private Network afterward. +At the moment, Data Lab clusters can only be attached to a Private Network during their creation, and cannot be detached and reattached to another Private Network afterward. Refer to the [dedicated documentation](/data-lab/how-to/create-data-lab/) for comprehensive information on how to create a Data Lab for Apache Spark™ cluster. 
From 82ae49458306aa4996d21673b8e8091fbb54746f Mon Sep 17 00:00:00 2001 From: Samy OUBOUAZIZ Date: Thu, 4 Dec 2025 16:09:41 +0100 Subject: [PATCH 3/5] feat(dlb): update --- pages/data-lab/concepts.mdx | 4 ++- pages/data-lab/how-to/access-spark-ui.mdx | 31 +++++++++++++++++++++++ pages/data-lab/how-to/create-data-lab.mdx | 8 +++--- 3 files changed, 39 insertions(+), 4 deletions(-) create mode 100644 pages/data-lab/how-to/access-spark-ui.mdx diff --git a/pages/data-lab/concepts.mdx b/pages/data-lab/concepts.mdx index 6e8d9269b7..d3a732f77c 100644 --- a/pages/data-lab/concepts.mdx +++ b/pages/data-lab/concepts.mdx @@ -38,13 +38,15 @@ Lighter is a technology that enables SparkMagic commands to be readable and exec A notebook for an Apache Spark cluster is an interactive, web-based tool that allows users to write and execute code, visualize data, and share results in a collaborative environment. It connects to an Apache Spark cluster to run large-scale data processing tasks directly from the notebook interface, making it easier to develop and test data workflows. +Adding a notebook to your cluster requires 1 GB of storage. + ## Persistent volume A Persistent Volume (PV) is a cluster-wide storage resource that ensures data persistence beyond the lifecycle of individual Pods. Persistent volumes abstract the underlying storage details, allowing administrators to use various storage solutions. Apache Spark® executors require storage space for various operations, particularly to shuffle data during wide operations such as sorting, grouping, and aggregation. Wide operations are transformations that require data from different partitions to be combined, often resulting in data movement across the cluster. During the map phase, executors write data to shuffle storage, which is then read by reducers. -A PV sized properly ensures a smooth execution of your workload. +A properly sized persistent volume ensures smooth execution of your workload. 
## SparkMagic diff --git a/pages/data-lab/how-to/access-spark-ui.mdx b/pages/data-lab/how-to/access-spark-ui.mdx new file mode 100644 index 0000000000..89b6394357 --- /dev/null +++ b/pages/data-lab/how-to/access-spark-ui.mdx @@ -0,0 +1,31 @@ +--- +title: How to access the Apache Spark™ UI +description: Step-by-step guide to access and use the Apache Spark™ UI in a Data Lab for Apache Spark™ on Scaleway. +tags: data lab apache spark ui gui console +dates: + validation: 2025-12-04 + posted: 2025-12-04 +--- + +import Requirements from '@macros/iam/requirements.mdx' + +This page explains how to access the Apache Spark™ UI of your Data Lab for Apache Spark™ cluster. + + + +- A Scaleway account logged into the [console](https://console.scaleway.com) +- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization +- Created a [Data Lab for Apache Spark™ cluster](/data-lab/how-to/create-data-lab/) +- Created an [IAM API key](/iam/how-to/create-api-keys/) + +1. Click **Data Lab** under **Data & Analytics** on the side menu. The Data Lab for Apache Spark™ page displays. + +2. Click the name of the desired Data Lab cluster. The overview tab of the cluster displays. + +3. Click the **Open Apache Spark™ UI** button. A login page displays. + +4. Enter the **secret key** of your API key, then click **Authenticate**. The Apache Spark™ UI dashboard displays. + +From this dashboard, you can view and monitor worker nodes, executors, and applications. + +Refer to the [official Apache Spark™ documentation](https://spark.apache.org/docs/latest/web-ui.html) for comprehensive information on how to use the web UI. 
\ No newline at end of file diff --git a/pages/data-lab/how-to/create-data-lab.mdx b/pages/data-lab/how-to/create-data-lab.mdx index 335efadc4d..1f42fe894a 100644 --- a/pages/data-lab/how-to/create-data-lab.mdx +++ b/pages/data-lab/how-to/create-data-lab.mdx @@ -21,9 +21,11 @@ Data Lab for Apache Spark™ is a product designed to assist data scientists and 2. Click **Create Data Lab cluster**. The creation wizard displays. -3. Complete the following steps in the wizard: - - Choose an Apache Spark version from the drop-down menu. - - Select a worker node configuration. +3. Choose an Apache Spark version from the drop-down menu. + +4. Choose a main node type. If you plan to add a notebook to your cluster, select the **DDL-PLAY2-MICRO** configuration to provision sufficient resources for it. + +5. Select a worker node configuration. - Enter the desired number of worker nodes. Provisioning zero worker nodes lets you retain and access you cluster and notebook configurations, but will not allow you to run calculations. From fb2992184263e32f70255515948760bc87533e0e Mon Sep 17 00:00:00 2001 From: Samy OUBOUAZIZ Date: Thu, 4 Dec 2025 17:38:40 +0100 Subject: [PATCH 4/5] feat(dlb): update --- pages/data-lab/how-to/access-notebook.mdx | 32 +++++++++++++++++++++++ 1 file changed, 32 insertions(+) create mode 100644 pages/data-lab/how-to/access-notebook.mdx diff --git a/pages/data-lab/how-to/access-notebook.mdx b/pages/data-lab/how-to/access-notebook.mdx new file mode 100644 index 0000000000..bf7351fdce --- /dev/null +++ b/pages/data-lab/how-to/access-notebook.mdx @@ -0,0 +1,32 @@ +--- +title: How to access and use the notebook of a Data Lab cluster +description: Step-by-step guide to access and use the notebook environment in a Data Lab for Apache Spark™ on Scaleway. 
+tags: data lab apache spark notebook environment jupyterlab +dates: + validation: 2025-12-04 + posted: 2025-12-04 +--- + +import Requirements from '@macros/iam/requirements.mdx' + +This page explains how to access and use the notebook environment of your Data Lab for Apache Spark™ cluster. + + + +- A Scaleway account logged into the [console](https://console.scaleway.com) +- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization +- Created a [Data Lab for Apache Spark™ cluster](/data-lab/how-to/create-data-lab/) with a notebook +- Created an [IAM API key](/iam/how-to/create-api-keys/) + +## How to access the notebook of your cluster + +1. Click **Data Lab** under **Data & Analytics** on the side menu. The Data Lab for Apache Spark™ page displays. + +2. Click the name of the desired Data Lab cluster. The overview tab of the cluster displays. + +3. Click the **Open notebook** button. A login page displays. + +4. Enter the **secret key** of your API key, then click **Authenticate**. The notebook dashboard displays. + + + From 40bef224c3381360f2dc6d5a86936c6739217fa7 Mon Sep 17 00:00:00 2001 From: Samy OUBOUAZIZ Date: Thu, 4 Dec 2025 17:45:10 +0100 Subject: [PATCH 5/5] feat(dlb): update --- pages/data-lab/how-to/use-private-networks.mdx | 2 ++ 1 file changed, 2 insertions(+) diff --git a/pages/data-lab/how-to/use-private-networks.mdx b/pages/data-lab/how-to/use-private-networks.mdx index bd8ae1e7eb..9e1caeaf39 100644 --- a/pages/data-lab/how-to/use-private-networks.mdx +++ b/pages/data-lab/how-to/use-private-networks.mdx @@ -24,6 +24,8 @@ For full information about Scaleway Private Networks and VPC, see our [dedicated This action must be carried out from the Private Networks section of the console. Follow the procedure detailed in our [dedicated Private Networks documentation](/vpc/how-to/create-private-network/). 
+## How to use + ## How to attach and detach a cluster to a Private Network At the moment, Data Lab clusters can only be attached to a Private Network during their creation, and cannot be detached and reattached to another Private Network afterward.