- A repo to expose some examples of 'data drift' in Azure Machine Learning
- First ensure you have a file called `sub.env` in the './scripts' folder with the following line: `SUB_ID=<your subscription id>`
- Then run the `create-workspace-sprbac.sh` shell script to create the AML workspace
  - This will also create two environment files: `config.json` and `variables.env`, which will help with service principal authentication and be used in the `authentication.py` script.
  - As part of the `create-workspace-sprbac.sh` script, names are derived based upon a random choice combining the `nouns.txt` and the `adjectives.txt` file, implemented in the `random_name.py` script.
- In the './data-drift' folder, run the `clusters.py` script to create a cluster.
- The 'get-data' folder contains some scripts for pulling data from Azure Open Datasets.
- The 'seattle-weather-data' folder contains pre-downloaded files for the 2018-2020 Seattle NOAA Weather Data.
- To trigger the data drift monitor for the Seattle Weather data, run `seattle_weather_drift.py`.
- To trigger the data drift monitor for the US County data, run `us_county_drift.py`.
- If on Mac, run `brew install libomp` (refer article).
- While `clusters.py` is more of a setup script, it is stored under the `data-drift` folder so that it can interact with the `variables.env` file.
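
For orientation, here is a minimal sketch of how a script such as `clusters.py` could read values from an env file like `variables.env` or `sub.env`. It assumes the file holds plain `KEY=VALUE` lines (as in the `SUB_ID=<your subscription id>` line above). The helper name `load_env_file` is hypothetical, and the scripts in this repo may well load these files differently.

```python
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines into a dict.

    Assumption: one KEY=VALUE pair per line; blank lines and lines
    starting with '#' are ignored. This is an illustrative helper,
    not the repo's actual loader.
    """
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever variables.env was written.
    env_path = Path("variables.env")
    if env_path.exists():
        settings = load_env_file(env_path)
        # Print key names only, to avoid echoing secrets to the console.
        print(sorted(settings))
```

Keeping the parsing in one small function means the setup script and the drift scripts can share it when they sit in the same folder as the env file.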