AutoMLPipelineBuilder error #138

@numericx

I get errors when trying to run the MMSA AutoML example with the OJ dataset.

Create the experiment

from azureml.core import Experiment

experiment = Experiment(workspace=ws, name='mmsa-automl-training')

Connect to the dataset

from azureml.core.dataset import Dataset

oj_data_small_train_ds = Dataset.get_by_name(workspace=ws, name='oj_data_small_train')
oj_data_small_train_input = oj_data_small_train_ds.as_named_input(name='oj_data_small_train')

Choose a compute target

from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget

# Choose a name for your cluster.

amlcompute_cluster_name = "cpu-cluster"

found = False

# Check if this compute target already exists in the workspace.

cts = ws.compute_targets
if amlcompute_cluster_name in cts and cts[amlcompute_cluster_name].type == 'AmlCompute':
    found = True
    print('Found existing compute target.')
    compute = cts[amlcompute_cluster_name]
[...]

Select AutoML settings

import logging

partition_column_names = ['Store', 'Brand']

automl_settings = {
    "task": 'forecasting',
    "primary_metric": 'normalized_root_mean_squared_error',
    "iteration_timeout_minutes": 20,
    "iterations": 15,
    "experiment_timeout_hours": 1,
    "label_column_name": 'Quantity',
    "n_cross_validations": 3,
    # "verbosity": logging.INFO,
    "debug_log": 'automl_oj_sales_debug.txt',
    "time_column_name": 'WeekStarting',
    "max_horizon": 20,
    "track_child_runs": False,
    "partition_column_names": partition_column_names,
    "grain_column_names": ['Store', 'Brand'],
    "pipeline_fetch_max_batch_size": 15
}
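A quick consistency check on these settings can be run before building the pipeline. The sketch below is plain Python and does not call the Azure ML SDK; `check_settings` is a hypothetical helper, and the time comparison assumes iterations run one after another, each using its full timeout. It is not claimed to explain the error reported here.

```python
# Hypothetical helper for sanity-checking the settings dict; not part of azureml.
def check_settings(settings, partition_columns):
    """Return a list of human-readable warnings about the settings."""
    warnings = []

    # The grain columns and the partition columns are expected to match here.
    grain = settings.get("grain_column_names", [])
    if sorted(grain) != sorted(partition_columns):
        warnings.append(
            f"grain_column_names {grain} differ from partition columns {partition_columns}"
        )

    # Worst case (assumption: sequential iterations, each hitting its timeout).
    worst_case_minutes = settings["iterations"] * settings["iteration_timeout_minutes"]
    budget_minutes = settings["experiment_timeout_hours"] * 60
    if worst_case_minutes > budget_minutes:
        warnings.append(
            f"iterations could need up to {worst_case_minutes} minutes, "
            f"but the experiment timeout is {budget_minutes} minutes"
        )
    return warnings


# Values copied from the settings in this issue.
example = {
    "iterations": 15,
    "iteration_timeout_minutes": 20,
    "experiment_timeout_hours": 1,
    "grain_column_names": ["Store", "Brand"],
}
for message in check_settings(example, ["Store", "Brand"]):
    print("warning:", message)
```

With the values above, the column check passes and the time check reports 300 minutes of worst-case iteration time against a 60-minute experiment budget.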

Create the AutoML pipeline

from azureml.contrib.automl.pipeline.steps import AutoMLPipelineBuilder

train_steps = AutoMLPipelineBuilder.get_many_models_train_steps(
    experiment=experiment,
    automl_settings=automl_settings,
    train_data=oj_data_small_train_ds,
    compute_target=compute,
    partition_column_names=partition_column_names,
    node_count=5,
    process_count_per_node=20,
    run_invocation_timeout=3700,
    output_datastore=default_store)

The full error message is in the attached text file:
MMSA-AutoML.txt

I am running this in the following Conda environment:

name: azureml-env
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.7
  - numpy
  - pandas
  - pyarrow
  - seaborn
  - nb_conda
  - ipykernel
  - jupyterlab
  - matplotlib
  - scikit-learn
  - pip
  - pip:
    - websocket
    - azureml-sdk
    - azureml-mlflow
    - azureml-widgets
    - azureml-defaults
    - azureml-train-automl
    - azureml-opendatasets
    - azureml-pipeline-steps
    - azureml-contrib-automl-pipeline-steps
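
None of the packages above are pinned, so the versions that actually got installed may be useful to anyone trying to reproduce this. A minimal sketch for listing them, assuming it is run inside the same environment; `installed_versions` and the `AZUREML_PACKAGES` selection are illustrative names, not part of any SDK:

```python
def _lookup(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Fallback for Python 3.7 (as in this environment); ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def installed_versions(packages):
    """Map each package name to its installed version (None if missing)."""
    return {name: _lookup(name) for name in packages}


# A selection of the pip packages listed in the environment file.
AZUREML_PACKAGES = [
    "azureml-sdk",
    "azureml-defaults",
    "azureml-train-automl",
    "azureml-pipeline-steps",
    "azureml-contrib-automl-pipeline-steps",
]

for name, ver in installed_versions(AZUREML_PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Differing versions across the `azureml-*` packages would be worth including in the report alongside the error log.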
