{ "cells": [ { "cell_type": "markdown", "id": "9271e4af", "metadata": {}, "source": [ "# Fault Detection Experiment\n", "\n", "- The main goal here is to show the simple interface for running a wide range of experiments in forecasting consumer energy utilization.\n", "- We aim to show modularity with respect to:\n", " 1. Dataset: the underlying data being forecasted\n", " 2. Models: take sensor measurements (mostly time series data) and output forecast of these results\n", " 3. Tasks: here we define some of the basic configurations of time series experiments such as _horizon_ and _history_.\n", "- Here we use synthetic data from [Smart-DS](https://www.nrel.gov/grid/smart-ds.html) and downloaded from [BetterGrids.org](https://db.bettergrids.org/bettergrids/handle/1001/94)\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "8ec72603", "metadata": {}, "outputs": [], "source": [ "import sys\n", "from gridds.experimenter import Experimenter\n", "from gridds.data import SmartDS \n", "from gridds.tools.viz import visualize_output\n", "import os\n", "import matplotlib.pyplot as plt\n", "from gridds.methods import VRAE, LSTM\n", "# from gridds.tools.utils import *\n", "import gridds.tools.tasks as tasks\n", "import shutil\n", "import numpy as np" ] }, { "cell_type": "markdown", "id": "c63f18e0", "metadata": {}, "source": [ "- Run all experiments from root directory" ] }, { "cell_type": "code", "execution_count": 2, "id": "cf958e46", "metadata": {}, "outputs": [], "source": [ "# run experiments from root dir ( twp up)\n", "os.chdir('../../')" ] }, { "cell_type": "markdown", "id": "af92f932", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "73257a20", "metadata": {}, "source": [ "## Instantiate Dataset\n", "- The first thing we do is build the dataset class.\n", "- We choose a train/test percentage and the the \"size\" or number of total points we want to use.\n", "- Since this is an example we use a fairly small number of data points.\n", "- We provide the dataset reader class instructions about what part of this dataset we would like to fetch and how we would like it ordered. This is applicable across many types of data. Ie; `sources` might be \"transformers\", `modalities` would be \"[phase angle, voltage]\" and `replicates` might be \"sites'.\n", " - Each of these entries needs to be a folder.\n", "- Prepare data converts these into X and y for machine learning." ] }, { "cell_type": "code", "execution_count": 5, "id": "52b6c134", "metadata": {}, "outputs": [], "source": [ "dataset = SmartDS('univariate_nrel', sites=1, test_pct=.5, normalize=False, size=300)\n", "\n", "reader_instructions = {\n", " 'sources': ['P1U'],\n", " 'modalities': ['load_data'],\n", " 'target': '', # NREL synthetic data doesn't have faults\n", " 'replicates': ['customers']\n", "}\n", "dataset.prepare_data(reader_instructions)\n", "\n", "# add faults\n", "dataset.pull_data_during_shuffle = False\n", "dataset.add_faults(n_faults=30, fault_duration=1)\n" ] }, { "cell_type": "markdown", "id": "bddac1a3", "metadata": {}, "source": [ "## Instantiate Methods\n", "- Here the modularity and instantiation procedure for methods is very clear since they are all just getting stored in a list.\n", "- We can set some parameters for each method." ] }, { "cell_type": "code", "execution_count": 6, "id": "da55ff02", "metadata": {}, "outputs": [], "source": [ "methods = []\n", "methods += [VRAE('VRAE',train_iters=50, batch_size=5, learning_rate=.001)]\n", "methods += [LSTM('LSTM', train_iters=50, batch_size=5, learning_rate=.003, layer_dim=1, hidden_size=32, dropout=0)]\n", " " ] }, { "cell_type": "markdown", "id": "2c418525", "metadata": {}, "source": [ "## Instantiate Task and Run Experiment\n", "- `Experimenter` handles the bulk of running these experiments.\n", "- we choose a task when we run `experimenter.run_experiment`. Here we chose `tasks.default_autoregression`" ] }, { "cell_type": "code", "execution_count": 7, "id": "2fae27d3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'name': 'fault_pred',\n", " 'procedure': ['fit', 'predict'],\n", " 'metrics': [,\n", " ,\n", " ,\n", " ]}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tasks.default_fault_pred" ] }, { "cell_type": "code", "execution_count": 9, "id": "5e3e41ed", "metadata": {}, "outputs": [], "source": [ "exp = Experimenter('basic_impute', runs=1)\n", "exp.run_experiment(dataset,methods,task=tasks.default_fault_pred, clean=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "0ecb6fa6", "metadata": {}, "outputs": [], "source": [ "visualize_output(os.path.join('outputs', exp.name))" ] }, { "cell_type": "code", "execution_count": null, "id": "ca098879", "metadata": {}, "outputs": [], "source": [ "# copy this file into output folder for archive \n", "curr_filepath = os.path.join(os.getcwd(), 'experiments', __file__)\n", "shutil.copy(curr_filepath, os.path.join('outputs', exp.name))" ] } ], "metadata": { "kernelspec": { "display_name": "gmlc", "language": "python", "name": "gmlc" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 5 }