* initial core for forecasting library

* syncing with new structure

* __init__ files in modules

* renamed lib directory

* Added legal headers and some formatting of py files

* restructured benchmarking directory in lib

* fixed imports, warnings, legal headers

* more import fixes and legal headers

* updated instructions with package installation

* barebones library README

* moved energy benchmark to contrib

* formatting changes plus more legal headers

* Added license to the setup

* moved .swp file to contrib, not sure we need to keep it at all

* added missing headers and a brief snippet to README file

* minor wording change in readme


Former-commit-id: a5e3b26ca8
vapaunic 2020-01-15 20:55:54 +00:00 committed by GitHub
Parent f291a051b4
Commit a11d74efc8
182 changed files with 2666 additions and 2638 deletions

View file

@ -16,4 +16,6 @@ ignore =
# line too long
E501
# trailing white spaces
W291
W291
# missing white space after ,
E231

.gitignore vendored
View file

@ -1,6 +1,7 @@
**/__pycache__
**/.ipynb_checkpoints
*.egg-info/
.vscode/
data/*
energy_load/GEFCom2017-D_Prob_MT_hourly/data/*

View file

@ -35,6 +35,22 @@ This will create the appropriate conda environment to run experiments. Next acti
conda activate forecast
```
During development, in case you need to update the environment due to a conda env file change, you can run
```
conda env update --file tools/environment.yaml
```
from the root of the Forecasting repo.
#### Package Installation
Next you will need to install the common package for forecasting:
```bash
pip install -e forecasting_lib
```
The library is installed in developer mode with the `-e` flag. This means that all changes made to the library locally are immediately available.
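As a quick sanity check, you can confirm that the editable install resolves to your local checkout. A minimal sketch, assuming the library exposes a top-level package named `forecasting_lib` (the actual module name may differ; use whatever the library's `setup.py` declares):
```python
# Editable-install sanity check. The module name "forecasting_lib" is an
# assumption; substitute the package actually exposed by the library's setup.py.
import importlib

pkg = importlib.import_module("forecasting_lib")
# With `pip install -e`, the package resolves into the local source tree,
# so edits to the library take effect without reinstalling.
print(pkg.__file__)
```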
## Steps to Contributing
Here are the basic steps to get started with your first contribution. Please reach out with any questions.

View file

@ -1,73 +1,3 @@
# TSPerf
TSPerf is a collection of implementations of time-series forecasting algorithms in the Azure cloud and a comparison of their performance over benchmark datasets. Algorithm implementations are compared by model accuracy, training and scoring time, and cost. Each implementation includes all the necessary instructions and tools that ensure its reproducibility.
The following table summarizes benchmarks that are currently included in TSPerf.
Benchmark | Dataset | Benchmark directory
--------------------------------------------|------------------------|---------------------------------------------
Probabilistic electricity load forecasting | GEFCom2017 | `energy_load/GEFCom2017-D_Prob_MT_Hourly`
Retail sales forecasting | Orange Juice dataset | `retail_sales/OrangeJuice_Pt_3Weeks_Weekly`
Complete documentation of TSPerf, along with instructions for submitting and reviewing implementations, can be found [here](./docs/tsperf_rules.md). The tables below show the performance of the implementations developed so far. Source code and instructions for reproducing each implementation's performance can be found in the submission folders, which are linked in the first column.
## Probabilistic energy forecasting performance board
The following table lists the current submissions for energy forecasting and their respective performance.
Submission Name | Pinball Loss | Training and Scoring Time (sec) | Training and Scoring Cost($) | Architecture | Framework | Algorithm | Uni/Multivariate | External Feature Support
-----------------------------------------------------------------|----------------|-----------------------------------|--------------------------------|-----------------------------------------------|------------------------------------|---------------------------------------|--------------------|--------------------------
[Baseline](benchmarks%2FGEFCom2017_D_Prob_MT_hourly%2Fbaseline) | 84.12 | 188 | 0.0201 | Linux DSVM (Standard D8s v3 - Premium SSD) | quantreg package of R | Linear Quantile Regression | Multivariate | Yes
[GBM](benchmarks%2FGEFCom2017_D_Prob_MT_hourly%2FGBM) | 78.84 | 269 | 0.0287 | Linux DSVM (Standard D8s v3 - Premium SSD) | gbm package of R | Gradient Boosting Decision Tree | Multivariate | Yes
[QRF](benchmarks%2FGEFCom2017_D_Prob_MT_hourly%2Fqrf) | 76.29 | 20322 | 17.19 | Linux DSVM (F72s v2 - Premium SSD) | scikit-garden package of Python | Quantile Regression Forest | Multivariate | Yes
[FNN](benchmarks%2FGEFCom2017_D_Prob_MT_hourly%2Ffnn) | 80.06 | 1085 | 0.1157 | Linux DSVM (Standard D8s v3 - Premium SSD) | qrnn package of R | Quantile Regression Neural Network | Multivariate | Yes
The following chart compares the submissions' accuracy in Pinball Loss vs. training and scoring cost in $:
![EnergyPBLvsTime](./docs/images/Energy-Cost.png)
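Pinball loss is the accuracy metric reported above. A minimal sketch of how it is computed for a single quantile level (the reported numbers aggregate the loss over multiple quantiles and forecast rounds; the values below are illustrative, not benchmark data):
```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss at quantile level q, with 0 < q < 1."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Illustrative hourly demand values and a 0.9-quantile forecast:
y_true = np.array([100.0, 120.0, 130.0])
q90_forecast = np.array([110.0, 125.0, 150.0])
print(pinball_loss(y_true, q90_forecast, 0.9))
```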
## Retail sales forecasting performance board
The following table lists the current submissions for retail forecasting and their respective performance.
Submission Name | MAPE (%) | Training and Scoring Time (sec) | Training and Scoring Cost ($) | Architecture | Framework | Algorithm | Uni/Multivariate | External Feature Support
----------------------------------------------------------------------------|------------|-----------------------------------|---------------------------------|----------------------------------------------|------------------------------|---------------------------------------------------------------------|--------------------|--------------------------
[Baseline](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2Fbaseline) | 109.67 | 114.06 | 0.003 | Linux DSVM(Standard D2s v3 - Premium SSD) | forecast package of R | Naive Forecast | Univariate | No
[AutoARIMA](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FARIMA) | 70.80 | 265.94 | 0.0071 | Linux DSVM(Standard D2s v3 - Premium SSD) | forecast package of R | Auto ARIMA | Multivariate | Yes
[ETS](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FETS) | 70.99 | 277 | 0.01 | Linux DSVM(Standard D2s v3 - Premium SSD) | forecast package of R | ETS | Multivariate | No
[MeanForecast](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FMeanForecast) | 70.74 | 69.88 | 0.002 | Linux DSVM(Standard D2s v3 - Premium SSD) | forecast package of R | Mean forecast | Univariate | No
[SeasonalNaive](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FSeasonalNaive) | 165.06 | 160.45 | 0.004 | Linux DSVM(Standard D2s v3 - Premium SSD) | forecast package of R | Seasonal Naive | Univariate | No
[LightGBM](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FLightGBM) | 36.28 | 625.10 | 0.0167 | Linux DSVM (Standard D2s v3 - Premium SSD) | lightGBM package of Python | Gradient Boosting Decision Tree | Multivariate | Yes
[DilatedCNN](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FDilatedCNN) | 37.09 | 413 | 0.1032 | Ubuntu VM(NC6 - Standard HDD) | Keras and Tensorflow | Python + Dilated convolutional neural network | Multivariate | Yes
[RNN Encoder-Decoder](benchmarks%2FOrangeJuice_Pt_3Weeks_Weekly%2FRNN) | 37.68 | 669 | 0.2 | Ubuntu VM(NC6 - Standard HDD) | Tensorflow | Python + Encoder-decoder architecture of recurrent neural network | Multivariate | Yes
The following chart compares the submissions' accuracy in MAPE (%) vs. training and scoring cost in $:
![RetailMAPEvsCost](./docs/images/Retail-Cost.png)
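MAPE is the accuracy metric in the retail table above. A minimal self-contained sketch (illustrative numbers only):
```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

print(mape([100, 200, 50], [110, 180, 55]))  # 10.0
```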
## Build Status
| Build Type | Branch | Status | | Branch | Status |
| --- | --- | --- | --- | --- | --- |
| **Python Linux CPU** | master | [![Build Status](https://dev.azure.com/best-practices/forecasting/_apis/build/status/python_unit_tests_base?branchName=master)](https://dev.azure.com/best-practices/forecasting/_build/latest?definitionId=12&branchName=master) | | staging | [![Build Status](https://dev.azure.com/best-practices/forecasting/_apis/build/status/python_unit_tests_base?branchName=chenhui/python_test_pipeline)](https://dev.azure.com/best-practices/forecasting/_build/latest?definitionId=12&branchName=chenhui/python_test_pipeline) |
| **R Linux CPU** | master | [![Build Status](https://dev.azure.com/best-practices/forecasting/_apis/build/status/Forecasting/r_unit_tests_prototype?branchName=master)](https://dev.azure.com/best-practices/forecasting/_build/latest?definitionId=9&branchName=master) | | staging | [![Build Status](https://dev.azure.com/best-practices/forecasting/_apis/build/status/Forecasting/r_unit_tests_prototype?branchName=zhouf/r_test_pipeline)](https://dev.azure.com/best-practices/forecasting/_build/latest?definitionId=9&branchName=zhouf/r_test_pipeline) |
# Forecasting Best Practices
This repository contains examples and best practices for building Forecasting solutions and systems, provided as [Jupyter notebooks](examples) and [a library of utility functions](forecasting_lib). The focus of the repository is on state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on forecasting problems.

View file

@ -1,59 +0,0 @@
"""
This script passes the input arguments of AzureML job to the R script train_validate_aml.R,
and then passes the output of train_validate_aml.R back to AzureML.
"""
import subprocess
import os
import sys
import getopt
import pandas as pd
from datetime import datetime
from azureml.core import Run
import time
start_time = time.time()
run = Run.get_submitted_run()
base_command = "Rscript train_validate_aml.R"
if __name__ == '__main__':
opts, args = getopt.getopt(
sys.argv[1:], '', ['path=', 'cv_path=', 'n_hidden_1=', 'n_hidden_2=',
'iter_max=', 'penalty='])
for opt, arg in opts:
if opt == '--path':
path = arg
elif opt == '--cv_path':
cv_path = arg
elif opt == '--n_hidden_1':
n_hidden_1 = arg
elif opt == '--n_hidden_2':
n_hidden_2 = arg
elif opt == '--iter_max':
iter_max = arg
elif opt == '--penalty':
penalty = arg
time_stamp = datetime.now().strftime('%Y%m%d%H%M%S')
task = " ".join([base_command,
'--path', path,
'--cv_path', cv_path,
'--n_hidden_1', n_hidden_1,
'--n_hidden_2', n_hidden_2,
'--iter_max', iter_max,
'--penalty', penalty,
'--time_stamp', time_stamp])
process = subprocess.call(task, shell=True)
# process.communicate()
# process.wait()
output_file_name = 'cv_output_' + time_stamp + '.csv'
result = pd.read_csv(os.path.join(cv_path, output_file_name))
APL = result['loss'].mean()
print(APL)
print("--- %s seconds ---" % (time.time() - start_time))
run.log('average pinball loss', APL)
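For illustration, the command string assembled and run by this wrapper has roughly the following shape (all argument values here are hypothetical, not taken from the diff):
# Illustrative shape of the assembled command; the values are made up:
# Rscript train_validate_aml.R --path ./data --cv_path ./cv_output \
#     --n_hidden_1 32 --n_hidden_2 16 --iter_max 500 --penalty 0.001 \
#     --time_stamp 20200115203000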

View file

@ -1,77 +0,0 @@
# coding: utf-8
# Create input features for the Dilated Convolutional Neural Network (CNN) model.
import os
import sys
import math
import datetime
import numpy as np
import pandas as pd
# Append TSPerf path to sys.path
tsperf_dir = '.'
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
# Import TSPerf components
from utils import *
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
def make_features(pred_round, train_dir, pred_steps, offset, store_list, brand_list):
"""Create a dataframe of the input features.
Args:
pred_round (Integer): Prediction round
train_dir (String): Path of the training data directory
pred_steps (Integer): Number of prediction steps
offset (Integer): Length of training data skipped in the retraining
store_list (Numpy Array): List of all the store IDs
brand_list (Numpy Array): List of all the brand IDs
Returns:
data_filled (Dataframe): Dataframe including the input features
data_scaled (Dataframe): Dataframe including the normalized features
"""
# Load training data
train_df = pd.read_csv(os.path.join(train_dir, 'train_round_'+str(pred_round+1)+'.csv'))
train_df['move'] = train_df['logmove'].apply(lambda x: round(math.exp(x)))
train_df = train_df[['store', 'brand', 'week', 'move']]
# Create a dataframe to hold all necessary data
week_list = range(bs.TRAIN_START_WEEK+offset, bs.TEST_END_WEEK_LIST[pred_round]+1)
d = {'store': store_list,
'brand': brand_list,
'week': week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how='left',
on=['store', 'brand', 'week'])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(train_dir, 'aux_round_'+str(pred_round+1)+'.csv'))
data_filled = pd.merge(data_filled, aux_df, how='left',
on=['store', 'brand', 'week'])
# Create relative price feature
price_cols = ['price1', 'price2', 'price3', 'price4', 'price5', 'price6', 'price7', 'price8', \
'price9', 'price10', 'price11']
data_filled['price'] = data_filled.apply(lambda x: x.loc['price' + str(int(x.loc['brand']))], axis=1)
data_filled['avg_price'] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled['price_ratio'] = data_filled['price'] / data_filled['avg_price']
data_filled.drop(price_cols, axis=1, inplace=True)
# Fill missing values
data_filled = data_filled.groupby(['store', 'brand']). \
apply(lambda x: x.fillna(method='ffill').fillna(method='bfill'))
# Create datetime features
data_filled['week_start'] = data_filled['week'].apply(lambda x: bs.FIRST_WEEK_START + datetime.timedelta(days=(x-1)*7))
data_filled['month'] = data_filled['week_start'].apply(lambda x: x.month)
data_filled['week_of_month'] = data_filled['week_start'].apply(lambda x: week_of_month(x))
data_filled.drop('week_start', axis=1, inplace=True)
# Normalize the dataframe of features
cols_normalize = data_filled.columns.difference(['store', 'brand', 'week'])
data_scaled, min_max_scaler = normalize_dataframe(data_filled, cols_normalize)
return data_filled, data_scaled
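The relative price feature above divides each brand's own price by the average price across all brands for the same store and week. A self-contained toy illustration (made-up data, not part of the benchmark):
# Toy illustration of the price_ratio feature; all values are made up.
import pandas as pd

toy = pd.DataFrame({"brand": [1, 2], "price1": [2.0, 2.0], "price2": [3.0, 3.0]})
price_cols = ["price1", "price2"]
toy["price"] = toy.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
toy["avg_price"] = toy[price_cols].mean(axis=1)
toy["price_ratio"] = toy["price"] / toy["avg_price"]
print(toy[["brand", "price", "avg_price", "price_ratio"]])
# Brand 1 pays 2.0 against an average of 2.5 (ratio 0.8); brand 2 pays 3.0 (ratio 1.2).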

View file

@ -1,199 +0,0 @@
# coding: utf-8
# Train and score a Dilated Convolutional Neural Network (CNN) model using Keras package with TensorFlow backend.
import os
import sys
import keras
import random
import argparse
import numpy as np
import pandas as pd
import tensorflow as tf
from keras import optimizers
from keras.layers import *
from keras.models import Model, load_model
from keras.callbacks import ModelCheckpoint
# Append TSPerf path to sys.path (assume we run the script from TSPerf directory)
tsperf_dir = '.'
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
# Import TSPerf components
from utils import *
from make_features import make_features
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
# Model definition
def create_dcnn_model(seq_len, kernel_size=2, n_filters=3, n_input_series=1, n_outputs=1):
"""Create a Dilated CNN model.
Args:
seq_len (Integer): Input sequence length
kernel_size (Integer): Kernel size of each convolutional layer
n_filters (Integer): Number of filters in each convolutional layer
n_outputs (Integer): Number of outputs in the last layer
Returns:
Keras Model object
"""
# Sequential input
seq_in = Input(shape=(seq_len, n_input_series))
# Categorical input
cat_fea_in = Input(shape=(2,), dtype='uint8')
store_id = Lambda(lambda x: x[:, 0, None])(cat_fea_in)
brand_id = Lambda(lambda x: x[:, 1, None])(cat_fea_in)
store_embed = Embedding(MAX_STORE_ID+1, 7, input_length=1)(store_id)
brand_embed = Embedding(MAX_BRAND_ID+1, 4, input_length=1)(brand_id)
# Dilated convolutional layers
c1 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=1,
padding='causal', activation='relu')(seq_in)
c2 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=2,
padding='causal', activation='relu')(c1)
c3 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=4,
padding='causal', activation='relu')(c2)
# Skip connections
c4 = concatenate([c1, c3])
# Output of convolutional layers
conv_out = Conv1D(8, 1, activation='relu')(c4)
conv_out = Dropout(args.dropout_rate)(conv_out)
conv_out = Flatten()(conv_out)
# Concatenate with categorical features
x = concatenate([conv_out, Flatten()(store_embed), Flatten()(brand_embed)])
x = Dense(16, activation='relu')(x)
output = Dense(n_outputs, activation='linear')(x)
# Define model interface, loss function, and optimizer
model = Model(inputs=[seq_in, cat_fea_in], outputs=output)
return model
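# With kernel_size=2 and dilation rates 1, 2 and 4, the stacked causal convolutions
# above cover a receptive field of 1 + 1 + 2 + 4 = 8 timesteps (weeks) of the input
# sequence before the skip connection and the 1x1 convolution.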
if __name__ == '__main__':
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--seed', type=int, dest='seed', default=1, help='random seed')
parser.add_argument('--seq-len', type=int, dest='seq_len', default=15, help='length of the input sequence')
parser.add_argument('--dropout-rate', type=float, dest='dropout_rate', default=0.01, help='dropout ratio')
parser.add_argument('--batch-size', type=int, dest='batch_size', default=64, help='mini batch size for training')
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.015, help='learning rate')
parser.add_argument('--epochs', type=int, dest='epochs', default=25, help='# of epochs')
args = parser.parse_args()
# Fix random seeds
np.random.seed(args.seed)
random.seed(args.seed)
tf.set_random_seed(args.seed)
# Data paths
DATA_DIR = os.path.join(tsperf_dir, 'retail_sales', 'OrangeJuice_Pt_3Weeks_Weekly', 'data')
SUBMISSION_DIR = os.path.join(tsperf_dir, 'retail_sales', 'OrangeJuice_Pt_3Weeks_Weekly', 'submissions', 'DilatedCNN')
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
# Dataset parameters
MAX_STORE_ID = 137
MAX_BRAND_ID = 11
# Parameters of the model
PRED_HORIZON = 3
PRED_STEPS = 2
SEQ_LEN = args.seq_len
DYNAMIC_FEATURES = ['deal', 'feat', 'month', 'week_of_month', 'price', 'price_ratio']
STATIC_FEATURES = ['store', 'brand']
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, 'train_round_1.csv'))
store_list = train_df['store'].unique()
brand_list = train_df['brand'].unique()
store_brand = [(x,y) for x in store_list for y in brand_list]
# Train and predict for all forecast rounds
pred_all = []
file_name = os.path.join(SUBMISSION_DIR, 'dcnn_model.h5')
for r in range(bs.NUM_ROUNDS):
print('---- Round ' + str(r+1) + ' ----')
offset = 0 if r==0 else 40+r*PRED_STEPS
# Create features
data_filled, data_scaled = make_features(r, TRAIN_DIR, PRED_STEPS, offset, store_list, brand_list)
# Create sequence array for 'move'
start_timestep = 0
end_timestep = bs.TRAIN_END_WEEK_LIST[r]-bs.TRAIN_START_WEEK-PRED_HORIZON
train_input1 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, ['move'], start_timestep, end_timestep-offset)
# Create sequence array for other dynamic features
start_timestep = PRED_HORIZON
end_timestep = bs.TRAIN_END_WEEK_LIST[r]-bs.TRAIN_START_WEEK
train_input2 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep, end_timestep-offset)
seq_in = np.concatenate([train_input1, train_input2], axis=2)
# Create array of static features
total_timesteps = bs.TRAIN_END_WEEK_LIST[r]-bs.TRAIN_START_WEEK-SEQ_LEN-PRED_HORIZON+2
cat_fea_in = static_feature_array(data_filled, total_timesteps-offset, STATIC_FEATURES)
# Create training output
start_timestep = SEQ_LEN+PRED_HORIZON-PRED_STEPS
end_timestep = bs.TRAIN_END_WEEK_LIST[r]-bs.TRAIN_START_WEEK
train_output = gen_sequence_array(data_filled, store_brand, PRED_STEPS, ['move'], start_timestep, end_timestep-offset)
train_output = np.squeeze(train_output)
# Create and train model
if r == 0:
model = create_dcnn_model(seq_len=SEQ_LEN, n_filters=2, n_input_series=1+len(DYNAMIC_FEATURES), n_outputs=PRED_STEPS)
adam = optimizers.Adam(lr=args.learning_rate)
model.compile(loss='mape', optimizer=adam, metrics=['mape'])
# Define checkpoint and fit model
checkpoint = ModelCheckpoint(file_name, monitor='loss', save_best_only=True, mode='min', verbose=0)
callbacks_list = [checkpoint]
history = model.fit([seq_in, cat_fea_in], train_output, epochs=args.epochs, batch_size=args.batch_size, callbacks=callbacks_list, verbose=0)
else:
model = load_model(file_name)
checkpoint = ModelCheckpoint(file_name, monitor='loss', save_best_only=True, mode='min', verbose=0)
callbacks_list = [checkpoint]
history = model.fit([seq_in, cat_fea_in], train_output, epochs=1, batch_size=args.batch_size, callbacks=callbacks_list, verbose=0)
# Get inputs for prediction
start_timestep = bs.TEST_START_WEEK_LIST[r] - bs.TRAIN_START_WEEK - SEQ_LEN - PRED_HORIZON + PRED_STEPS
end_timestep = bs.TEST_START_WEEK_LIST[r] - bs.TRAIN_START_WEEK + PRED_STEPS - 1 - PRED_HORIZON
test_input1 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, ['move'], start_timestep-offset, end_timestep-offset)
start_timestep = bs.TEST_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK - SEQ_LEN + 1
end_timestep = bs.TEST_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK
test_input2 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep-offset, end_timestep-offset)
seq_in = np.concatenate([test_input1, test_input2], axis=2)
total_timesteps = 1
cat_fea_in = static_feature_array(data_filled, total_timesteps, STATIC_FEATURES)
# Make prediction
pred = np.round(model.predict([seq_in, cat_fea_in]))
# Create dataframe for submission
exp_output = data_filled[data_filled.week >= bs.TEST_START_WEEK_LIST[r]].reset_index(drop=True)
exp_output = exp_output[['store', 'brand', 'week']]
pred_df = exp_output.sort_values(['store', 'brand', 'week']).\
loc[:,['store', 'brand', 'week']].\
reset_index(drop=True)
pred_df['weeks_ahead'] = pred_df['week'] - bs.TRAIN_END_WEEK_LIST[r]
pred_df['round'] = r+1
pred_df['prediction'] = np.reshape(pred, (pred.size, 1))
pred_all.append(pred_df)
# Generate submission
submission = pd.concat(pred_all, axis=0).reset_index(drop=True)
submission = submission[['round', 'store', 'brand', 'week', 'weeks_ahead', 'prediction']]
filename = 'submission_seed_' + str(args.seed) + '.csv'
submission.to_csv(os.path.join(SUBMISSION_DIR, filename), index=False)
print('Done')

View file

@ -1,200 +0,0 @@
# coding: utf-8
# Perform cross validation of a Dilated Convolutional Neural Network (CNN) model on the training data of the 1st forecast round.
import os
import sys
import math
import keras
import argparse
import datetime
import numpy as np
import pandas as pd
from utils import *
from keras.layers import *
from keras.models import Model
from keras import optimizers
from keras.utils import multi_gpu_model
from azureml.core import Run
# Model definition
def create_dcnn_model(seq_len, kernel_size=2, n_filters=3, n_input_series=1, n_outputs=1):
"""Create a Dilated CNN model.
Args:
seq_len (Integer): Input sequence length
kernel_size (Integer): Kernel size of each convolutional layer
n_filters (Integer): Number of filters in each convolutional layer
n_outputs (Integer): Number of outputs in the last layer
Returns:
Keras Model object
"""
# Sequential input
seq_in = Input(shape=(seq_len, n_input_series))
# Categorical input
cat_fea_in = Input(shape=(2,), dtype='uint8')
store_id = Lambda(lambda x: x[:, 0, None])(cat_fea_in)
brand_id = Lambda(lambda x: x[:, 1, None])(cat_fea_in)
store_embed = Embedding(MAX_STORE_ID+1, 7, input_length=1)(store_id)
brand_embed = Embedding(MAX_BRAND_ID+1, 4, input_length=1)(brand_id)
# Dilated convolutional layers
c1 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=1,
padding='causal', activation='relu')(seq_in)
c2 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=2,
padding='causal', activation='relu')(c1)
c3 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=4,
padding='causal', activation='relu')(c2)
# Skip connections
c4 = concatenate([c1, c3])
# Output of convolutional layers
conv_out = Conv1D(8, 1, activation='relu')(c4)
conv_out = Dropout(args.dropout_rate)(conv_out)
conv_out = Flatten()(conv_out)
# Concatenate with categorical features
x = concatenate([conv_out, Flatten()(store_embed), Flatten()(brand_embed)])
x = Dense(16, activation='relu')(x)
output = Dense(n_outputs, activation='linear')(x)
# Define model interface, loss function, and optimizer
model = Model(inputs=[seq_in, cat_fea_in], outputs=output)
return model
if __name__ == '__main__':
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
parser.add_argument('--seq-len', type=int, dest='seq_len', default=20, help='length of the input sequence')
parser.add_argument('--batch-size', type=int, dest='batch_size', default=64, help='mini batch size for training')
parser.add_argument('--dropout-rate', type=float, dest='dropout_rate', default=0.10, help='dropout ratio')
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.01, help='learning rate')
parser.add_argument('--epochs', type=int, dest='epochs', default=30, help='# of epochs')
args = parser.parse_args()
args.dropout_rate = round(args.dropout_rate, 2)
print(args)
# Start an Azure ML run
run = Run.get_context()
# Data paths
DATA_DIR = args.data_folder
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
# Data and forecast problem parameters
MAX_STORE_ID = 137
MAX_BRAND_ID = 11
PRED_HORIZON = 3
PRED_STEPS = 2
TRAIN_START_WEEK = 40
TRAIN_END_WEEK_LIST = list(range(135,159,2))
TEST_START_WEEK_LIST = list(range(137,161,2))
TEST_END_WEEK_LIST = list(range(138,162,2))
# The start datetime of the first week in the record
FIRST_WEEK_START = pd.to_datetime('1989-09-14 00:00:00')
# Input sequence length and feature names
SEQ_LEN = args.seq_len
DYNAMIC_FEATURES = ['deal', 'feat', 'month', 'week_of_month', 'price', 'price_ratio']
STATIC_FEATURES = ['store', 'brand']
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, 'train_round_1.csv'))
store_list = train_df['store'].unique()
brand_list = train_df['brand'].unique()
store_brand = [(x,y) for x in store_list for y in brand_list]
# Train and validate the model using only the first round data
r = 0
print('---- Round ' + str(r+1) + ' ----')
# Load training data
train_df = pd.read_csv(os.path.join(TRAIN_DIR, 'train_round_'+str(r+1)+'.csv'))
train_df['move'] = train_df['logmove'].apply(lambda x: round(math.exp(x)))
train_df = train_df[['store', 'brand', 'week', 'move']]
# Create a dataframe to hold all necessary data
week_list = range(TRAIN_START_WEEK, TEST_END_WEEK_LIST[r]+1)
d = {'store': store_list,
'brand': brand_list,
'week': week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how='left',
on=['store', 'brand', 'week'])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(TRAIN_DIR, 'aux_round_'+str(r+1)+'.csv'))
data_filled = pd.merge(data_filled, aux_df, how='left',
on=['store', 'brand', 'week'])
# Create relative price feature
price_cols = ['price1', 'price2', 'price3', 'price4', 'price5', 'price6', 'price7', 'price8', \
'price9', 'price10', 'price11']
data_filled['price'] = data_filled.apply(lambda x: x.loc['price' + str(int(x.loc['brand']))], axis=1)
data_filled['avg_price'] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled['price_ratio'] = data_filled.apply(lambda x: x['price'] / x['avg_price'], axis=1)
# Fill missing values
data_filled = data_filled.groupby(['store', 'brand']). \
apply(lambda x: x.fillna(method='ffill').fillna(method='bfill'))
# Create datetime features
data_filled['week_start'] = data_filled['week'].apply(lambda x: FIRST_WEEK_START + datetime.timedelta(days=(x-1)*7))
data_filled['day'] = data_filled['week_start'].apply(lambda x: x.day)
data_filled['week_of_month'] = data_filled['week_start'].apply(lambda x: week_of_month(x))
data_filled['month'] = data_filled['week_start'].apply(lambda x: x.month)
data_filled.drop('week_start', axis=1, inplace=True)
# Normalize the dataframe of features
cols_normalize = data_filled.columns.difference(['store', 'brand', 'week'])
data_scaled, min_max_scaler = normalize_dataframe(data_filled, cols_normalize)
# Create sequence array for 'move'
start_timestep = 0
end_timestep = TRAIN_END_WEEK_LIST[r]-TRAIN_START_WEEK-PRED_HORIZON
train_input1 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, ['move'], start_timestep, end_timestep)
# Create sequence array for other dynamic features
start_timestep = PRED_HORIZON
end_timestep = TRAIN_END_WEEK_LIST[r]-TRAIN_START_WEEK
train_input2 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep, end_timestep)
seq_in = np.concatenate((train_input1, train_input2), axis=2)
# Create array of static features
total_timesteps = TRAIN_END_WEEK_LIST[r]-TRAIN_START_WEEK-SEQ_LEN-PRED_HORIZON+2
cat_fea_in = static_feature_array(data_filled, total_timesteps, STATIC_FEATURES)
# Create training output
start_timestep = SEQ_LEN+PRED_HORIZON-PRED_STEPS
end_timestep = TRAIN_END_WEEK_LIST[r]-TRAIN_START_WEEK
train_output = gen_sequence_array(data_filled, store_brand, PRED_STEPS, ['move'], start_timestep, end_timestep)
train_output = np.squeeze(train_output)
# Create model
model = create_dcnn_model(seq_len=SEQ_LEN, n_filters=2, n_input_series=1+len(DYNAMIC_FEATURES), n_outputs=PRED_STEPS)
# Convert to GPU model
try:
model = multi_gpu_model(model)
print('Training using multiple GPUs...')
except:
print('Training using single GPU or CPU...')
adam = optimizers.Adam(lr=args.learning_rate)
model.compile(loss='mape', optimizer=adam, metrics=['mape', 'mae'])
# Model training and validation
history = model.fit([seq_in, cat_fea_in], train_output, epochs=args.epochs, batch_size=args.batch_size, validation_split=0.05)
val_loss = history.history['val_loss'][-1]
print('Validation loss is {}'.format(val_loss))
# Log the validation loss/MAPE
run.log('MAPE', np.float(val_loss))

View file

@ -1,128 +0,0 @@
# coding: utf-8
# Train and score a boosted decision tree model using [LightGBM Python package](https://github.com/Microsoft/LightGBM) from Microsoft,
# which is a fast, distributed, high performance gradient boosting framework based on decision tree algorithms.
import os
import sys
import argparse
import numpy as np
import pandas as pd
import lightgbm as lgb
import warnings
warnings.filterwarnings('ignore')
# Append TSPerf path to sys.path
tsperf_dir = os.getcwd()
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
from make_features import make_features
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
def make_predictions(df, model):
"""Predict sales with the trained GBM model.
Args:
df (Dataframe): Dataframe including all needed features
model (Model): Trained GBM model
Returns:
Dataframe including the predicted sales of every store-brand
"""
predictions = pd.DataFrame({'move': model.predict(df.drop('move', axis=1))})
predictions['move'] = predictions['move'].apply(lambda x: round(x))
return pd.concat([df[['brand', 'store', 'week']].reset_index(drop=True), predictions], axis=1)
if __name__ == '__main__':
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--seed', type=int, dest='seed', default=1, help='Random seed of GBM model')
parser.add_argument('--num-leaves', type=int, dest='num_leaves', default=124, help='# of leaves of the tree')
parser.add_argument('--min-data-in-leaf', type=int, dest='min_data_in_leaf', default=340, help='minimum # of samples in each leaf')
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.1, help='learning rate')
parser.add_argument('--feature-fraction', type=float, dest='feature_fraction', default=0.65, help='ratio of features used in each iteration')
parser.add_argument('--bagging-fraction', type=float, dest='bagging_fraction', default=0.87, help='ratio of samples used in each iteration')
parser.add_argument('--bagging-freq', type=int, dest='bagging_freq', default=19, help='bagging frequency')
parser.add_argument('--max-rounds', type=int, dest='max_rounds', default=940, help='# of boosting iterations')
parser.add_argument('--max-lag', type=int, dest='max_lag', default=19, help='max lag of unit sales')
parser.add_argument('--window-size', type=int, dest='window_size', default=40, help='window size of moving average of unit sales')
args = parser.parse_args()
print(args)
# Data paths
DATA_DIR = os.path.join(tsperf_dir, 'retail_sales', 'OrangeJuice_Pt_3Weeks_Weekly', 'data')
SUBMISSION_DIR = os.path.join(tsperf_dir, 'retail_sales', 'OrangeJuice_Pt_3Weeks_Weekly', 'submissions', 'LightGBM')
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
# Parameters of GBM model
params = {
'objective': 'mape',
'num_leaves': args.num_leaves,
'min_data_in_leaf': args.min_data_in_leaf,
'learning_rate': args.learning_rate,
'feature_fraction': args.feature_fraction,
'bagging_fraction': args.bagging_fraction,
'bagging_freq': args.bagging_freq,
'num_rounds': args.max_rounds,
'early_stopping_rounds': 125,
'num_threads': 4,
'seed': args.seed
}
# Lags and categorical features
lags = np.arange(2, args.max_lag+1)
used_columns = ['store', 'brand', 'week', 'week_of_month', 'month', 'deal', 'feat', 'move', 'price', 'price_ratio']
categ_fea = ['store', 'brand', 'deal']
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, 'train_round_1.csv'))
store_list = train_df['store'].unique()
brand_list = train_df['brand'].unique()
# Train and predict for all forecast rounds
pred_all = []
metric_all = []
for r in range(bs.NUM_ROUNDS):
print('---- Round ' + str(r+1) + ' ----')
# Create features
features = make_features(r, TRAIN_DIR, lags, args.window_size, 0, used_columns, store_list, brand_list)
train_fea = features[features.week <= bs.TRAIN_END_WEEK_LIST[r]].reset_index(drop=True)
# Drop rows with NaN values
train_fea.dropna(inplace=True)
# Create training set
dtrain = lgb.Dataset(train_fea.drop('move', axis=1, inplace=False),
label = train_fea['move'])
if r %3 == 0:
# Train GBM model
print('Training model...')
bst = lgb.train(
params,
dtrain,
valid_sets = [dtrain],
categorical_feature = categ_fea,
verbose_eval = False
)
# Generate forecasts
print('Making predictions...')
test_fea = features[features.week >= bs.TEST_START_WEEK_LIST[r]].reset_index(drop=True)
pred = make_predictions(test_fea, bst).sort_values(by=['store','brand', 'week']).reset_index(drop=True)
# Additional columns required by the submission format
pred['round'] = r+1
pred['weeks_ahead'] = pred['week'] - bs.TRAIN_END_WEEK_LIST[r]
# Keep the predictions
pred_all.append(pred)
# Generate submission
submission = pd.concat(pred_all, axis=0)
submission.rename(columns={'move': 'prediction'}, inplace=True)
submission = submission[['round', 'store', 'brand', 'week', 'weeks_ahead', 'prediction']]
filename = 'submission_seed_' + str(args.seed) + '.csv'
submission.to_csv(os.path.join(SUBMISSION_DIR, filename), index=False)

View file

@ -1,219 +0,0 @@
# coding: utf-8
# Perform cross validation of a boosted decision tree model on the training data of the 1st forecast round.
import os
import sys
import math
import argparse
import datetime
import itertools
import numpy as np
import pandas as pd
import lightgbm as lgb
from azureml.core import Run
from sklearn.model_selection import train_test_split
from utils import week_of_month, df_from_cartesian_product
def lagged_features(df, lags):
"""Create lagged features based on time series data.
Args:
df (Dataframe): Input time series data sorted by time
lags (List): Lag lengths
Returns:
fea (Dataframe): Lagged features
"""
df_list = []
for lag in lags:
df_shifted = df.shift(lag)
df_shifted.columns = [x + '_lag' + str(lag) for x in df_shifted.columns]
df_list.append(df_shifted)
fea = pd.concat(df_list, axis=1)
return fea
def moving_averages(df, start_step, window_size=None):
"""Compute averages of every feature over moving time windows.
Args:
df (Dataframe): Input features as a dataframe
start_step (Integer): Starting time step of rolling mean
window_size (Integer): Window size of the rolling mean
Returns:
fea (Dataframe): Dataframe consisting of the moving averages
"""
if window_size == None: # Use a large window to compute average over all historical data
window_size = df.shape[0]
fea = df.shift(start_step).rolling(min_periods=1, center=False, window=window_size).mean()
fea.columns = fea.columns + '_mean'
return fea
def combine_features(df, lag_fea, lags, window_size, used_columns):
"""Combine different features for a certain store-brand.
Args:
df (Dataframe): Time series data of a certain store-brand
lag_fea (List): A list of column names for creating lagged features
lags (Numpy Array): Numpy array including all the lags
window_size (Integer): Window size of the rolling mean
used_columns (List): A list of names of columns used in model training (including target variable)
Returns:
fea_all (Dataframe): Dataframe including all features for the specific store-brand
"""
lagged_fea = lagged_features(df[lag_fea], lags)
moving_avg = moving_averages(df[lag_fea], 2, window_size)
fea_all = pd.concat([df[used_columns], lagged_fea, moving_avg], axis=1)
return fea_all
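# Toy illustration of the helpers above on a single series (made-up data):
# given df = pd.DataFrame({"move": [10.0, 12.0, 9.0, 15.0]}),
# lagged_features(df, lags=[1, 2]) yields move_lag1 = [NaN, 10, 12, 9] and
# move_lag2 = [NaN, NaN, 10, 12] (the series shifted down by each lag), and
# moving_averages(df, start_step=2, window_size=2) yields
# move_mean = [NaN, NaN, 10.0, 11.0] (trailing 2-step mean of the series shifted back 2 steps).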
def make_predictions(df, model):
"""Predict sales with the trained GBM model.
Args:
df (Dataframe): Dataframe including all needed features
model (Model): Trained GBM model
Returns:
Dataframe including the predicted sales of a certain store-brand
"""
predictions = pd.DataFrame({'move': model.predict(df.drop('move', axis=1))})
predictions['move'] = predictions['move'].apply(lambda x: round(x))
return pd.concat([df[['brand', 'store', 'week']].reset_index(drop=True), predictions], axis=1)
if __name__ == '__main__':
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', type=str, dest='data_folder', default='.', help='data folder mounting point')
parser.add_argument('--num-leaves', type=int, dest='num_leaves', default=64, help='# of leaves of the tree')
parser.add_argument('--min-data-in-leaf', type=int, dest='min_data_in_leaf', default=50, help='minimum # of samples in each leaf')
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.001, help='learning rate')
parser.add_argument('--feature-fraction', type=float, dest='feature_fraction', default=1.0, help='ratio of features used in each iteration')
parser.add_argument('--bagging-fraction', type=float, dest='bagging_fraction', default=1.0, help='ratio of samples used in each iteration')
parser.add_argument('--bagging-freq', type=int, dest='bagging_freq', default=1, help='bagging frequency')
parser.add_argument('--max-rounds', type=int, dest='max_rounds', default=400, help='# of boosting iterations')
parser.add_argument('--max-lag', type=int, dest='max_lag', default=10, help='max lag of unit sales')
parser.add_argument('--window-size', type=int, dest='window_size', default=10, help='window size of moving average of unit sales')
args = parser.parse_args()
args.feature_fraction = round(args.feature_fraction, 2)
args.bagging_fraction = round(args.bagging_fraction, 2)
print(args)
# Start an Azure ML run
run = Run.get_context()
# Data paths
DATA_DIR = args.data_folder
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
# Data and forecast problem parameters
TRAIN_START_WEEK = 40
TRAIN_END_WEEK_LIST = list(range(135,159,2))
TEST_START_WEEK_LIST = list(range(137,161,2))
TEST_END_WEEK_LIST = list(range(138,162,2))
# The start datetime of the first week in the record
FIRST_WEEK_START = pd.to_datetime('1989-09-14 00:00:00')
# Parameters of GBM model
params = {
'objective': 'mape',
'num_leaves': args.num_leaves,
'min_data_in_leaf': args.min_data_in_leaf,
'learning_rate': args.learning_rate,
'feature_fraction': args.feature_fraction,
'bagging_fraction': args.bagging_fraction,
'bagging_freq': args.bagging_freq,
'num_rounds': args.max_rounds,
'early_stopping_rounds': 125,
'num_threads': 16
}
# Lags and used column names
lags = np.arange(2, args.max_lag+1)
used_columns = ['store', 'brand', 'week', 'week_of_month', 'month', 'deal', 'feat', 'move', 'price', 'price_ratio']
categ_fea = ['store', 'brand', 'deal']
# Train and validate the model using only the first round data
r = 0
print('---- Round ' + str(r+1) + ' ----')
# Load training data
train_df = pd.read_csv(os.path.join(TRAIN_DIR, 'train_round_'+str(r+1)+'.csv'))
train_df['move'] = train_df['logmove'].apply(lambda x: round(math.exp(x)))
train_df = train_df[['store', 'brand', 'week', 'move']]
# Create a dataframe to hold all necessary data
store_list = train_df['store'].unique()
brand_list = train_df['brand'].unique()
week_list = range(TRAIN_START_WEEK, TEST_END_WEEK_LIST[r]+1)
d = {'store': store_list,
'brand': brand_list,
'week': week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how='left',
on=['store', 'brand', 'week'])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(TRAIN_DIR, 'aux_round_'+str(r+1)+'.csv'))
data_filled = pd.merge(data_filled, aux_df, how='left',
on=['store', 'brand', 'week'])
# Create relative price feature
price_cols = ['price1', 'price2', 'price3', 'price4', 'price5', 'price6', 'price7', 'price8', \
'price9', 'price10', 'price11']
data_filled['price'] = data_filled.apply(lambda x: x.loc['price' + str(int(x.loc['brand']))], axis=1)
data_filled['avg_price'] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled['price_ratio'] = data_filled['price'] / data_filled['avg_price']
data_filled.drop(price_cols, axis=1, inplace=True)
# Fill missing values
data_filled = data_filled.groupby(['store', 'brand']).apply(lambda x: x.fillna(method='ffill').fillna(method='bfill'))
# Create datetime features
data_filled['week_start'] = data_filled['week'].apply(lambda x: FIRST_WEEK_START + datetime.timedelta(days=(x-1)*7))
data_filled['year'] = data_filled['week_start'].apply(lambda x: x.year)
data_filled['month'] = data_filled['week_start'].apply(lambda x: x.month)
data_filled['week_of_month'] = data_filled['week_start'].apply(lambda x: week_of_month(x))
data_filled['day'] = data_filled['week_start'].apply(lambda x: x.day)
data_filled.drop('week_start', axis=1, inplace=True)
# Create other features (lagged features, moving averages, etc.)
features = data_filled.groupby(['store','brand']).apply(lambda x: combine_features(x, ['move'], lags, args.window_size, used_columns))
train_fea = features[features.week <= TRAIN_END_WEEK_LIST[r]].reset_index(drop=True)
# Drop rows with NaN values
train_fea.dropna(inplace=True)
# Model training and validation
# Create a training/validation split
train_fea, valid_fea, train_label, valid_label = train_test_split(train_fea.drop('move', axis=1, inplace=False), \
train_fea['move'], test_size=0.05, random_state=1)
dtrain = lgb.Dataset(train_fea, train_label)
dvalid = lgb.Dataset(valid_fea, valid_label)
# A dictionary to record training results
evals_result = {}
# Train GBM model
bst = lgb.train(
params,
dtrain,
valid_sets = [dtrain, dvalid],
categorical_feature = categ_fea,
evals_result = evals_result
)
# Get final training loss & validation loss
train_loss = evals_result['training']['mape'][-1]
valid_loss = evals_result['valid_1']['mape'][-1]
print('Final training loss is {}'.format(train_loss))
print('Final validation loss is {}'.format(valid_loss))
# Log the validation loss/MAPE
run.log('MAPE', np.float(valid_loss)*100)

View file

@ -1,5 +1,7 @@
### Benchmarks
TODO: Modify below based on the benchmarks we include.
Each entry below describes a forecasting benchmark.
| **Benchmark** | **Description** |

View file

View file

@ -9,11 +9,10 @@ import getopt
import localpath
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering \
import compute_features
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering import compute_features
SUBMISSIONS_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(SUBMISSIONS_DIR,'data')
DATA_DIR = os.path.join(SUBMISSIONS_DIR, "data")
print("Data directory used: {}".format(DATA_DIR))
OUTPUT_DIR = os.path.join(DATA_DIR, "features")
@ -40,14 +39,8 @@ feature_config_list = [
("temporal", {"feature_list": ["hour_of_day", "month_of_year"]}),
("annual_fourier", {"n_harmonics": 3}),
("weekly_fourier", {"n_harmonics": 3}),
(
"previous_year_load_lag",
{"input_col_names": "DEMAND", "round_agg_result": True},
),
(
"previous_year_temp_lag",
{"input_col_names": "DryBulb", "round_agg_result": True},
),
("previous_year_load_lag", {"input_col_names": "DEMAND", "round_agg_result": True},),
("previous_year_temp_lag", {"input_col_names": "DryBulb", "round_agg_result": True},),
]
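# Each entry above is a (feature_name, parameter_dict) pair; the list is passed to
# compute_features (imported above) to build the feature matrix for this benchmark.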
if __name__ == "__main__":
@ -55,9 +48,7 @@ if __name__ == "__main__":
for opt, arg in opts:
if opt == "--submission":
submission_folder = arg
output_data_dir = os.path.join(
SUBMISSIONS_DIR, submission_folder, "data"
)
output_data_dir = os.path.join(SUBMISSIONS_DIR, submission_folder, "data")
if not os.path.isdir(output_data_dir):
os.mkdir(output_data_dir)
OUTPUT_DIR = os.path.join(output_data_dir, "features")

View file

@ -3,6 +3,7 @@ This script inserts the TSPerf directory into sys.path, so that scripts can impo
"""
import os, sys
_CURR_DIR = os.path.dirname(os.path.abspath(__file__))
TSPERF_DIR = os.path.dirname(os.path.dirname(os.path.dirname(_CURR_DIR)))

View file

@ -9,11 +9,10 @@ import getopt
import localpath
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering \
import compute_features
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering import compute_features
SUBMISSIONS_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(SUBMISSIONS_DIR,'data')
DATA_DIR = os.path.join(SUBMISSIONS_DIR, "data")
print("Data directory used: {}".format(DATA_DIR))
OUTPUT_DIR = os.path.join(DATA_DIR, "features")
@ -40,14 +39,8 @@ feature_config_list = [
("temporal", {"feature_list": ["hour_of_day", "month_of_year"]}),
("annual_fourier", {"n_harmonics": 3}),
("weekly_fourier", {"n_harmonics": 3}),
(
"previous_year_load_lag",
{"input_col_names": "DEMAND", "round_agg_result": True},
),
(
"previous_year_temp_lag",
{"input_col_names": "DryBulb", "round_agg_result": True},
),
("previous_year_load_lag", {"input_col_names": "DEMAND", "round_agg_result": True},),
("previous_year_temp_lag", {"input_col_names": "DryBulb", "round_agg_result": True},),
]
@ -56,9 +49,7 @@ if __name__ == "__main__":
for opt, arg in opts:
if opt == "--submission":
submission_folder = arg
output_data_dir = os.path.join(
SUBMISSIONS_DIR, submission_folder, "data"
)
output_data_dir = os.path.join(SUBMISSIONS_DIR, submission_folder, "data")
if not os.path.isdir(output_data_dir):
os.mkdir(output_data_dir)
OUTPUT_DIR = os.path.join(output_data_dir, "features")

View file

@ -5,6 +5,7 @@ localpath.py file.
"""
import os, sys
_CURR_DIR = os.path.dirname(os.path.abspath(__file__))
TSPERF_DIR = os.path.dirname(os.path.dirname(os.path.dirname(_CURR_DIR)))

View file

@ -0,0 +1,70 @@
"""
This script passes the input arguments of AzureML job to the R script train_validate_aml.R,
and then passes the output of train_validate_aml.R back to AzureML.
"""
import subprocess
import os
import sys
import getopt
import pandas as pd
from datetime import datetime
from azureml.core import Run
import time
start_time = time.time()
run = Run.get_submitted_run()
base_command = "Rscript train_validate_aml.R"
if __name__ == "__main__":
opts, args = getopt.getopt(
sys.argv[1:], "", ["path=", "cv_path=", "n_hidden_1=", "n_hidden_2=", "iter_max=", "penalty="]
)
for opt, arg in opts:
if opt == "--path":
path = arg
elif opt == "--cv_path":
cv_path = arg
elif opt == "--n_hidden_1":
n_hidden_1 = arg
elif opt == "--n_hidden_2":
n_hidden_2 = arg
elif opt == "--iter_max":
iter_max = arg
elif opt == "--penalty":
penalty = arg
time_stamp = datetime.now().strftime("%Y%m%d%H%M%S")
task = " ".join(
[
base_command,
"--path",
path,
"--cv_path",
cv_path,
"--n_hidden_1",
n_hidden_1,
"--n_hidden_2",
n_hidden_2,
"--iter_max",
iter_max,
"--penalty",
penalty,
"--time_stamp",
time_stamp,
]
)
process = subprocess.call(task, shell=True)
# process.communicate()
# process.wait()
output_file_name = "cv_output_" + time_stamp + ".csv"
result = pd.read_csv(os.path.join(cv_path, output_file_name))
APL = result["loss"].mean()
print(APL)
print("--- %s seconds ---" % (time.time() - start_time))
run.log("average pinball loss", APL)

View file

@ -10,11 +10,10 @@ import getopt
import localpath
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering \
import compute_features
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering import compute_features
SUBMISSIONS_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(SUBMISSIONS_DIR,'data')
DATA_DIR = os.path.join(SUBMISSIONS_DIR, "data")
print("Data directory used: {}".format(DATA_DIR))
OUTPUT_DIR = os.path.join(DATA_DIR, "features")
@ -41,14 +40,8 @@ feature_config_list = [
("temporal", {"feature_list": ["hour_of_day", "month_of_year"]}),
("annual_fourier", {"n_harmonics": 3}),
("weekly_fourier", {"n_harmonics": 3}),
(
"previous_year_load_lag",
{"input_col_names": "DEMAND", "round_agg_result": True},
),
(
"previous_year_temp_lag",
{"input_col_names": "DryBulb", "round_agg_result": True},
),
("previous_year_load_lag", {"input_col_names": "DEMAND", "round_agg_result": True},),
("previous_year_temp_lag", {"input_col_names": "DryBulb", "round_agg_result": True},),
]
@ -57,9 +50,7 @@ if __name__ == "__main__":
for opt, arg in opts:
if opt == "--submission":
submission_folder = arg
output_data_dir = os.path.join(
SUBMISSIONS_DIR, submission_folder, "data"
)
output_data_dir = os.path.join(SUBMISSIONS_DIR, submission_folder, "data")
if not os.path.isdir(output_data_dir):
os.mkdir(output_data_dir)
OUTPUT_DIR = os.path.join(output_data_dir, "features")

View file

@ -4,6 +4,7 @@ all the modules in TSPerf. Each submission folder needs its own localpath.py fil
"""
import os, sys
_CURR_DIR = os.path.dirname(os.path.abspath(__file__))
TSPERF_DIR = os.path.dirname(os.path.dirname(os.path.dirname(_CURR_DIR)))

View file

@ -9,11 +9,10 @@ import getopt
import localpath
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering \
import compute_features
from tsperf.benchmarking.GEFCom2017_D_Prob_MT_hourly.feature_engineering import compute_features
SUBMISSIONS_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(SUBMISSIONS_DIR,'data')
DATA_DIR = os.path.join(SUBMISSIONS_DIR, "data")
print("Data directory used: {}".format(DATA_DIR))
OUTPUT_DIR = os.path.join(DATA_DIR, "features")
@ -59,23 +58,11 @@ feature_config_list = [
("normalized_datehour", {}),
("normalized_year", {}),
("day_type", {"holiday_col_name": HOLIDAY_COLNAME}),
(
"previous_year_load_lag",
{"input_col_names": "DEMAND", "round_agg_result": True},
),
(
"previous_year_temp_lag",
{"input_col_names": ["DryBulb", "DewPnt"], "round_agg_result": True},
),
("previous_year_load_lag", {"input_col_names": "DEMAND", "round_agg_result": True},),
("previous_year_temp_lag", {"input_col_names": ["DryBulb", "DewPnt"], "round_agg_result": True},),
(
"recent_load_lag",
{
"input_col_names": "DEMAND",
"start_week": 10,
"window_size": 4,
"agg_count": 8,
"round_agg_result": True,
},
{"input_col_names": "DEMAND", "start_week": 10, "window_size": 4, "agg_count": 8, "round_agg_result": True,},
),
(
"recent_temp_lag",
@ -95,9 +82,7 @@ if __name__ == "__main__":
for opt, arg in opts:
if opt == "--submission":
submission_folder = arg
output_data_dir = os.path.join(
SUBMISSIONS_DIR, submission_folder, "data"
)
output_data_dir = os.path.join(SUBMISSIONS_DIR, submission_folder, "data")
if not os.path.isdir(output_data_dir):
os.mkdir(output_data_dir)
OUTPUT_DIR = os.path.join(output_data_dir, "features")
@ -105,10 +90,5 @@ if __name__ == "__main__":
os.mkdir(OUTPUT_DIR)
compute_features(
TRAIN_DATA_DIR,
TEST_DATA_DIR,
OUTPUT_DIR,
DF_CONFIG,
feature_config_list,
filter_by_month=False,
TRAIN_DATA_DIR, TEST_DATA_DIR, OUTPUT_DIR, DF_CONFIG, feature_config_list, filter_by_month=False,
)

View file

@ -13,6 +13,7 @@ from skgarden.quantile.tree import DecisionTreeQuantileRegressor
from skgarden.quantile.ensemble import generate_sample_indices
from ensemble_parallel_utils import weighted_percentile_vectorized
class BaseForestQuantileRegressor(ForestRegressor):
"""Training and scoring of Quantile Regression Random Forest
@ -34,6 +35,7 @@ class BaseForestQuantileRegressor(ForestRegressor):
a weight of zero when estimator j is fit, then the value is -1.
"""
def fit(self, X, y):
"""Builds a forest from the training set (X, y).
@ -68,8 +70,7 @@ class BaseForestQuantileRegressor(ForestRegressor):
Returns self.
"""
# apply method requires X to be of dtype np.float32
X, y = check_X_y(
X, y, accept_sparse="csc", dtype=np.float32, multi_output=False)
X, y = check_X_y(X, y, accept_sparse="csc", dtype=np.float32, multi_output=False)
super(BaseForestQuantileRegressor, self).fit(X, y)
self.y_train_ = y
@ -78,8 +79,7 @@ class BaseForestQuantileRegressor(ForestRegressor):
for i, est in enumerate(self.estimators_):
if self.bootstrap:
bootstrap_indices = generate_sample_indices(
est.random_state, len(y))
bootstrap_indices = generate_sample_indices(est.random_state, len(y))
else:
bootstrap_indices = np.arange(len(y))
@ -87,8 +87,7 @@ class BaseForestQuantileRegressor(ForestRegressor):
y_train_leaves = est.y_train_leaves_
for curr_leaf in np.unique(y_train_leaves):
y_ind = y_train_leaves == curr_leaf
self.y_weights_[i, y_ind] = (
est_weights[y_ind] / np.sum(est_weights[y_ind]))
self.y_weights_[i, y_ind] = est_weights[y_ind] / np.sum(est_weights[y_ind])
self.y_train_leaves_[i, bootstrap_indices] = y_train_leaves[bootstrap_indices]
@ -167,21 +166,24 @@ class RandomForestQuantileRegressor(BaseForestQuantileRegressor):
oob_prediction_ : array of shape = [n_samples]
Prediction computed with out-of-bag estimate on the training set.
"""
def __init__(self,
n_estimators=10,
criterion='mse',
max_depth=None,
min_samples_split=2,
min_samples_leaf=1,
min_weight_fraction_leaf=0.0,
max_features='auto',
max_leaf_nodes=None,
bootstrap=True,
oob_score=False,
n_jobs=1,
random_state=None,
verbose=0,
warm_start=False):
def __init__(
self,
n_estimators=10,
criterion="mse",
max_depth=None,
min_samples_split=2,
min_samples_leaf=1,
min_weight_fraction_leaf=0.0,
max_features="auto",
max_leaf_nodes=None,
bootstrap=True,
oob_score=False,
n_jobs=1,
random_state=None,
verbose=0,
warm_start=False,
):
"""Initialize RandomForestQuantileRegressor class
Args:
@ -271,16 +273,23 @@ class RandomForestQuantileRegressor(BaseForestQuantileRegressor):
super(RandomForestQuantileRegressor, self).__init__(
base_estimator=DecisionTreeQuantileRegressor(),
n_estimators=n_estimators,
estimator_params=("criterion", "max_depth", "min_samples_split",
"min_samples_leaf", "min_weight_fraction_leaf",
"max_features", "max_leaf_nodes",
"random_state"),
estimator_params=(
"criterion",
"max_depth",
"min_samples_split",
"min_samples_leaf",
"min_weight_fraction_leaf",
"max_features",
"max_leaf_nodes",
"random_state",
),
bootstrap=bootstrap,
oob_score=oob_score,
n_jobs=n_jobs,
random_state=random_state,
verbose=verbose,
warm_start=warm_start)
warm_start=warm_start,
)
self.criterion = criterion
self.max_depth = max_depth
@ -289,5 +298,3 @@ class RandomForestQuantileRegressor(BaseForestQuantileRegressor):
self.min_weight_fraction_leaf = min_weight_fraction_leaf
self.max_features = max_features
self.max_leaf_nodes = max_leaf_nodes

View file

@ -3,6 +3,7 @@
import numpy as np
def weighted_percentile_vectorized(a, quantiles, weights=None, sorter=None):
"""Returns the weighted percentile of a at q given weights.
@ -69,8 +70,7 @@ def weighted_percentile_vectorized(a, quantiles, weights=None, sorter=None):
percentiles = np.zeros_like(quantiles)
for i, q in enumerate(quantiles):
if q > 100 or q < 0:
raise ValueError("q should be in-between 0 and 100, "
"got %d" % q)
raise ValueError("q should be in-between 0 and 100, " "got %d" % q)
start = np.searchsorted(partial_sum, q) - 1
if start == len(sorted_cum_weights) - 1:
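For intuition, a minimal self-contained sketch of one common way to compute a weighted percentile, interpolating over midpoint cumulative weights. This is an assumption about the general technique, not the implementation above, and may differ from it at the boundaries:
# Hedged sketch of a weighted percentile; not the implementation above.
import numpy as np

def weighted_percentile_sketch(a, q, weights):
    """Percentile q (0-100) of a, weighting each observation by weights."""
    order = np.argsort(a)
    a = np.asarray(a, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum = (np.cumsum(w) - 0.5 * w) / np.sum(w)  # midpoint cumulative weights
    return float(np.interp(q / 100.0, cum, a))

print(weighted_percentile_sketch([1, 2, 3, 4], 50, [1, 1, 1, 1]))  # 2.5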

View file

@ -4,9 +4,9 @@ all the modules in TSPerf. Each submission folder needs its own localpath.py fil
"""
import os, sys
_CURR_DIR = os.path.dirname(os.path.abspath(__file__))
TSPERF_DIR = os.path.dirname(os.path.dirname(os.path.dirname(_CURR_DIR)))
if TSPERF_DIR not in sys.path:
sys.path.insert(0, TSPERF_DIR)

View file

@ -9,16 +9,10 @@ from ensemble_parallel import RandomForestQuantileRegressor
# get seed value
parser = argparse.ArgumentParser()
parser.add_argument(
"--data-folder",
type=str,
dest="data_folder",
help="data folder mounting point",
"--data-folder", type=str, dest="data_folder", help="data folder mounting point",
)
parser.add_argument(
"--output-folder",
type=str,
dest="output_folder",
help="output folder mounting point",
"--output-folder", type=str, dest="output_folder", help="output folder mounting point",
)
parser.add_argument("--seed", type=int, dest="seed", help="random seed")
args = parser.parse_args()
@ -27,9 +21,7 @@ args = parser.parse_args()
data_dir = join(args.data_folder, "features")
train_dir = join(data_dir, "train")
test_dir = join(data_dir, "test")
output_file = join(
args.output_folder, "submission_seed_{}.csv".format(args.seed)
)
output_file = join(args.output_folder, "submission_seed_{}.csv".format(args.seed))
# do 6 rounds of forecasting, at each round output 9 quantiles
n_rounds = 6
@ -59,19 +51,13 @@ for i in range(1, n_rounds + 1):
train_df_hour = pd.get_dummies(train_df_hour, columns=["Zone"])
# remove columns that are not useful (Datetime) or are not
# available in the test set (DEMAND, DryBulb, DewPnt)
X_train = train_df_hour.drop(
columns=["Datetime", "DEMAND", "DryBulb", "DewPnt"]
).values
X_train = train_df_hour.drop(columns=["Datetime", "DEMAND", "DryBulb", "DewPnt"]).values
y_train = train_df_hour["DEMAND"].values
# train a model
rfqr = RandomForestQuantileRegressor(
random_state=args.seed,
n_jobs=-1,
n_estimators=1000,
max_features="sqrt",
max_depth=12,
random_state=args.seed, n_jobs=-1, n_estimators=1000, max_features="sqrt", max_depth=12,
)
rfqr.fit(X_train, y_train)
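Since each round of this script produces nine quantile forecasts, a natural way to check a submission offline is the pinball (quantile) loss. Below is a minimal NumPy sketch of that metric; the variable names and fake forecasts are illustrative only, and this is not the benchmark's evaluation code:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Average pinball loss of a single quantile forecast (q in (0, 1))."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y_true = np.array([100.0, 120.0, 90.0])
# fake forecasts: scale the truth so low quantiles under-predict and high ones over-predict
scores = [pinball_loss(y_true, y_true * (0.9 + 0.2 * q), q) for q in quantiles]
print(np.mean(scores))  # average pinball loss over the nine quantiles
```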


@ -0,0 +1,88 @@
# coding: utf-8
# Create input features for the Dilated Convolutional Neural Network (CNN) model.
import os
import sys
import math
import datetime
import numpy as np
import pandas as pd
# Append TSPerf path to sys.path
tsperf_dir = "."
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
# Import TSPerf components
from utils import *
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
def make_features(pred_round, train_dir, pred_steps, offset, store_list, brand_list):
"""Create a dataframe of the input features.
Args:
pred_round (Integer): Prediction round
train_dir (String): Path of the training data directory
pred_steps (Integer): Number of prediction steps
offset (Integer): Length of training data skipped in the retraining
store_list (Numpy Array): List of all the store IDs
brand_list (Numpy Array): List of all the brand IDs
Returns:
data_filled (Dataframe): Dataframe including the input features
data_scaled (Dataframe): Dataframe including the normalized features
"""
# Load training data
train_df = pd.read_csv(os.path.join(train_dir, "train_round_" + str(pred_round + 1) + ".csv"))
train_df["move"] = train_df["logmove"].apply(lambda x: round(math.exp(x)))
train_df = train_df[["store", "brand", "week", "move"]]
# Create a dataframe to hold all necessary data
week_list = range(bs.TRAIN_START_WEEK + offset, bs.TEST_END_WEEK_LIST[pred_round] + 1)
d = {"store": store_list, "brand": brand_list, "week": week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how="left", on=["store", "brand", "week"])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(train_dir, "aux_round_" + str(pred_round + 1) + ".csv"))
data_filled = pd.merge(data_filled, aux_df, how="left", on=["store", "brand", "week"])
# Create relative price feature
price_cols = [
"price1",
"price2",
"price3",
"price4",
"price5",
"price6",
"price7",
"price8",
"price9",
"price10",
"price11",
]
data_filled["price"] = data_filled.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
data_filled["avg_price"] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled["price_ratio"] = data_filled["price"] / data_filled["avg_price"]
data_filled.drop(price_cols, axis=1, inplace=True)
# Fill missing values
data_filled = data_filled.groupby(["store", "brand"]).apply(
lambda x: x.fillna(method="ffill").fillna(method="bfill")
)
# Create datetime features
data_filled["week_start"] = data_filled["week"].apply(
lambda x: bs.FIRST_WEEK_START + datetime.timedelta(days=(x - 1) * 7)
)
data_filled["month"] = data_filled["week_start"].apply(lambda x: x.month)
data_filled["week_of_month"] = data_filled["week_start"].apply(lambda x: week_of_month(x))
data_filled.drop("week_start", axis=1, inplace=True)
# Normalize the dataframe of features
cols_normalize = data_filled.columns.difference(["store", "brand", "week"])
data_scaled, min_max_scaler = normalize_dataframe(data_filled, cols_normalize)
return data_filled, data_scaled
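The relative-price feature built above is simply each brand's own price divided by the average of all brand prices in that store-week. A tiny worked example with hypothetical numbers (two brands only, for brevity):

```python
import pandas as pd

df = pd.DataFrame({
    "brand": [1, 2],
    "price1": [2.0, 2.0],   # price of brand 1 in this store-week
    "price2": [4.0, 4.0],   # price of brand 2 in this store-week
})
price_cols = ["price1", "price2"]
df["price"] = df.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
df["avg_price"] = df[price_cols].sum(axis=1) / len(price_cols)
df["price_ratio"] = df["price"] / df["avg_price"]
print(df[["brand", "price", "avg_price", "price_ratio"]])
# brand 1: 2.0 / 3.0 ~ 0.67 (cheaper than average); brand 2: 4.0 / 3.0 ~ 1.33
```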


@ -0,0 +1,223 @@
# coding: utf-8
# Train and score a Dilated Convolutional Neural Network (CNN) model using Keras package with TensorFlow backend.
import os
import sys
import keras
import random
import argparse
import numpy as np
import pandas as pd
import tensorflow as tf
from keras import optimizers
from keras.layers import *
from keras.models import Model, load_model
from keras.callbacks import ModelCheckpoint
# Append TSPerf path to sys.path (assume we run the script from TSPerf directory)
tsperf_dir = "."
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
# Import TSPerf components
from utils import *
from make_features import make_features
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
# Model definition
def create_dcnn_model(seq_len, kernel_size=2, n_filters=3, n_input_series=1, n_outputs=1):
"""Create a Dilated CNN model.
Args:
seq_len (Integer): Input sequence length
kernel_size (Integer): Kernel size of each convolutional layer
n_filters (Integer): Number of filters in each convolutional layer
n_outputs (Integer): Number of outputs in the last layer
Returns:
Keras Model object
"""
# Sequential input
seq_in = Input(shape=(seq_len, n_input_series))
# Categorical input
cat_fea_in = Input(shape=(2,), dtype="uint8")
store_id = Lambda(lambda x: x[:, 0, None])(cat_fea_in)
brand_id = Lambda(lambda x: x[:, 1, None])(cat_fea_in)
store_embed = Embedding(MAX_STORE_ID + 1, 7, input_length=1)(store_id)
brand_embed = Embedding(MAX_BRAND_ID + 1, 4, input_length=1)(brand_id)
# Dilated convolutional layers
c1 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=1, padding="causal", activation="relu")(
seq_in
)
c2 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=2, padding="causal", activation="relu")(c1)
c3 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=4, padding="causal", activation="relu")(c2)
# Skip connections
c4 = concatenate([c1, c3])
# Output of convolutional layers
conv_out = Conv1D(8, 1, activation="relu")(c4)
conv_out = Dropout(args.dropout_rate)(conv_out)
conv_out = Flatten()(conv_out)
# Concatenate with categorical features
x = concatenate([conv_out, Flatten()(store_embed), Flatten()(brand_embed)])
x = Dense(16, activation="relu")(x)
output = Dense(n_outputs, activation="linear")(x)
# Define model interface, loss function, and optimizer
model = Model(inputs=[seq_in, cat_fea_in], outputs=output)
return model
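As an aside (not part of the training script): with causal convolutions, each dilated layer widens the receptive field by (kernel_size - 1) * dilation timesteps, so the stack above with its default kernel_size=2 and dilation rates 1, 2, 4 sees at most 8 past weeks per output timestep:

```python
kernel_size = 2
dilation_rates = [1, 2, 4]
receptive_field = 1 + sum((kernel_size - 1) * d for d in dilation_rates)
print(receptive_field)  # 8: each output timestep depends on at most 8 input weeks
```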
if __name__ == "__main__":
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument("--seed", type=int, dest="seed", default=1, help="random seed")
parser.add_argument("--seq-len", type=int, dest="seq_len", default=15, help="length of the input sequence")
parser.add_argument("--dropout-rate", type=float, dest="dropout_rate", default=0.01, help="dropout ratio")
parser.add_argument("--batch-size", type=int, dest="batch_size", default=64, help="mini batch size for training")
parser.add_argument("--learning-rate", type=float, dest="learning_rate", default=0.015, help="learning rate")
parser.add_argument("--epochs", type=int, dest="epochs", default=25, help="# of epochs")
args = parser.parse_args()
# Fix random seeds
np.random.seed(args.seed)
random.seed(args.seed)
tf.set_random_seed(args.seed)
# Data paths
DATA_DIR = os.path.join(tsperf_dir, "retail_sales", "OrangeJuice_Pt_3Weeks_Weekly", "data")
SUBMISSION_DIR = os.path.join(
tsperf_dir, "retail_sales", "OrangeJuice_Pt_3Weeks_Weekly", "submissions", "DilatedCNN"
)
TRAIN_DIR = os.path.join(DATA_DIR, "train")
# Dataset parameters
MAX_STORE_ID = 137
MAX_BRAND_ID = 11
# Parameters of the model
PRED_HORIZON = 3
PRED_STEPS = 2
SEQ_LEN = args.seq_len
DYNAMIC_FEATURES = ["deal", "feat", "month", "week_of_month", "price", "price_ratio"]
STATIC_FEATURES = ["store", "brand"]
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, "train_round_1.csv"))
store_list = train_df["store"].unique()
brand_list = train_df["brand"].unique()
store_brand = [(x, y) for x in store_list for y in brand_list]
# Train and predict for all forecast rounds
pred_all = []
file_name = os.path.join(SUBMISSION_DIR, "dcnn_model.h5")
for r in range(bs.NUM_ROUNDS):
print("---- Round " + str(r + 1) + " ----")
offset = 0 if r == 0 else 40 + r * PRED_STEPS
# Create features
data_filled, data_scaled = make_features(r, TRAIN_DIR, PRED_STEPS, offset, store_list, brand_list)
# Create sequence array for 'move'
start_timestep = 0
end_timestep = bs.TRAIN_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK - PRED_HORIZON
train_input1 = gen_sequence_array(
data_scaled, store_brand, SEQ_LEN, ["move"], start_timestep, end_timestep - offset
)
# Create sequence array for other dynamic features
start_timestep = PRED_HORIZON
end_timestep = bs.TRAIN_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK
train_input2 = gen_sequence_array(
data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep, end_timestep - offset
)
seq_in = np.concatenate([train_input1, train_input2], axis=2)
# Create array of static features
total_timesteps = bs.TRAIN_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK - SEQ_LEN - PRED_HORIZON + 2
cat_fea_in = static_feature_array(data_filled, total_timesteps - offset, STATIC_FEATURES)
# Create training output
start_timestep = SEQ_LEN + PRED_HORIZON - PRED_STEPS
end_timestep = bs.TRAIN_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK
train_output = gen_sequence_array(
data_filled, store_brand, PRED_STEPS, ["move"], start_timestep, end_timestep - offset
)
train_output = np.squeeze(train_output)
# Create and train model
if r == 0:
model = create_dcnn_model(
seq_len=SEQ_LEN, n_filters=2, n_input_series=1 + len(DYNAMIC_FEATURES), n_outputs=PRED_STEPS
)
adam = optimizers.Adam(lr=args.learning_rate)
model.compile(loss="mape", optimizer=adam, metrics=["mape"])
# Define checkpoint and fit model
checkpoint = ModelCheckpoint(file_name, monitor="loss", save_best_only=True, mode="min", verbose=0)
callbacks_list = [checkpoint]
history = model.fit(
[seq_in, cat_fea_in],
train_output,
epochs=args.epochs,
batch_size=args.batch_size,
callbacks=callbacks_list,
verbose=0,
)
else:
model = load_model(file_name)
checkpoint = ModelCheckpoint(file_name, monitor="loss", save_best_only=True, mode="min", verbose=0)
callbacks_list = [checkpoint]
history = model.fit(
[seq_in, cat_fea_in],
train_output,
epochs=1,
batch_size=args.batch_size,
callbacks=callbacks_list,
verbose=0,
)
# Get inputs for prediction
start_timestep = bs.TEST_START_WEEK_LIST[r] - bs.TRAIN_START_WEEK - SEQ_LEN - PRED_HORIZON + PRED_STEPS
end_timestep = bs.TEST_START_WEEK_LIST[r] - bs.TRAIN_START_WEEK + PRED_STEPS - 1 - PRED_HORIZON
test_input1 = gen_sequence_array(
data_scaled, store_brand, SEQ_LEN, ["move"], start_timestep - offset, end_timestep - offset
)
start_timestep = bs.TEST_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK - SEQ_LEN + 1
end_timestep = bs.TEST_END_WEEK_LIST[r] - bs.TRAIN_START_WEEK
test_input2 = gen_sequence_array(
data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep - offset, end_timestep - offset
)
seq_in = np.concatenate([test_input1, test_input2], axis=2)
total_timesteps = 1
cat_fea_in = static_feature_array(data_filled, total_timesteps, STATIC_FEATURES)
# Make prediction
pred = np.round(model.predict([seq_in, cat_fea_in]))
# Create dataframe for submission
exp_output = data_filled[data_filled.week >= bs.TEST_START_WEEK_LIST[r]].reset_index(drop=True)
exp_output = exp_output[["store", "brand", "week"]]
pred_df = (
exp_output.sort_values(["store", "brand", "week"]).loc[:, ["store", "brand", "week"]].reset_index(drop=True)
)
pred_df["weeks_ahead"] = pred_df["week"] - bs.TRAIN_END_WEEK_LIST[r]
pred_df["round"] = r + 1
pred_df["prediction"] = np.reshape(pred, (pred.size, 1))
pred_all.append(pred_df)
# Generate submission
submission = pd.concat(pred_all, axis=0).reset_index(drop=True)
submission = submission[["round", "store", "brand", "week", "weeks_ahead", "prediction"]]
filename = "submission_seed_" + str(args.seed) + ".csv"
submission.to_csv(os.path.join(SUBMISSION_DIR, filename), index=False)
print("Done")


@ -0,0 +1,212 @@
# coding: utf-8
# Perform cross validation of a Dilated Convolutional Neural Network (CNN) model on the training data of the 1st forecast round.
import os
import sys
import math
import keras
import argparse
import datetime
import numpy as np
import pandas as pd
from utils import *
from keras.layers import *
from keras.models import Model
from keras import optimizers
from keras.utils import multi_gpu_model
from azureml.core import Run
# Model definition
def create_dcnn_model(seq_len, kernel_size=2, n_filters=3, n_input_series=1, n_outputs=1):
"""Create a Dilated CNN model.
Args:
seq_len (Integer): Input sequence length
kernel_size (Integer): Kernel size of each convolutional layer
n_filters (Integer): Number of filters in each convolutional layer
n_outputs (Integer): Number of outputs in the last layer
Returns:
Keras Model object
"""
# Sequential input
seq_in = Input(shape=(seq_len, n_input_series))
# Categorical input
cat_fea_in = Input(shape=(2,), dtype="uint8")
store_id = Lambda(lambda x: x[:, 0, None])(cat_fea_in)
brand_id = Lambda(lambda x: x[:, 1, None])(cat_fea_in)
store_embed = Embedding(MAX_STORE_ID + 1, 7, input_length=1)(store_id)
brand_embed = Embedding(MAX_BRAND_ID + 1, 4, input_length=1)(brand_id)
# Dilated convolutional layers
c1 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=1, padding="causal", activation="relu")(
seq_in
)
c2 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=2, padding="causal", activation="relu")(c1)
c3 = Conv1D(filters=n_filters, kernel_size=kernel_size, dilation_rate=4, padding="causal", activation="relu")(c2)
# Skip connections
c4 = concatenate([c1, c3])
# Output of convolutional layers
conv_out = Conv1D(8, 1, activation="relu")(c4)
conv_out = Dropout(args.dropout_rate)(conv_out)
conv_out = Flatten()(conv_out)
# Concatenate with categorical features
x = concatenate([conv_out, Flatten()(store_embed), Flatten()(brand_embed)])
x = Dense(16, activation="relu")(x)
output = Dense(n_outputs, activation="linear")(x)
# Define model interface, loss function, and optimizer
model = Model(inputs=[seq_in, cat_fea_in], outputs=output)
return model
if __name__ == "__main__":
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument("--data-folder", type=str, dest="data_folder", help="data folder mounting point")
parser.add_argument("--seq-len", type=int, dest="seq_len", default=20, help="length of the input sequence")
parser.add_argument("--batch-size", type=int, dest="batch_size", default=64, help="mini batch size for training")
parser.add_argument("--dropout-rate", type=float, dest="dropout_rate", default=0.10, help="dropout ratio")
parser.add_argument("--learning-rate", type=float, dest="learning_rate", default=0.01, help="learning rate")
parser.add_argument("--epochs", type=int, dest="epochs", default=30, help="# of epochs")
args = parser.parse_args()
args.dropout_rate = round(args.dropout_rate, 2)
print(args)
# Start an Azure ML run
run = Run.get_context()
# Data paths
DATA_DIR = args.data_folder
TRAIN_DIR = os.path.join(DATA_DIR, "train")
# Data and forecast problem parameters
MAX_STORE_ID = 137
MAX_BRAND_ID = 11
PRED_HORIZON = 3
PRED_STEPS = 2
TRAIN_START_WEEK = 40
TRAIN_END_WEEK_LIST = list(range(135, 159, 2))
TEST_START_WEEK_LIST = list(range(137, 161, 2))
TEST_END_WEEK_LIST = list(range(138, 162, 2))
# The start datetime of the first week in the record
FIRST_WEEK_START = pd.to_datetime("1989-09-14 00:00:00")
# Input sequence length and feature names
SEQ_LEN = args.seq_len
DYNAMIC_FEATURES = ["deal", "feat", "month", "week_of_month", "price", "price_ratio"]
STATIC_FEATURES = ["store", "brand"]
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, "train_round_1.csv"))
store_list = train_df["store"].unique()
brand_list = train_df["brand"].unique()
store_brand = [(x, y) for x in store_list for y in brand_list]
# Train and validate the model using only the first round data
r = 0
print("---- Round " + str(r + 1) + " ----")
# Load training data
train_df = pd.read_csv(os.path.join(TRAIN_DIR, "train_round_" + str(r + 1) + ".csv"))
train_df["move"] = train_df["logmove"].apply(lambda x: round(math.exp(x)))
train_df = train_df[["store", "brand", "week", "move"]]
# Create a dataframe to hold all necessary data
week_list = range(TRAIN_START_WEEK, TEST_END_WEEK_LIST[r] + 1)
d = {"store": store_list, "brand": brand_list, "week": week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how="left", on=["store", "brand", "week"])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(TRAIN_DIR, "aux_round_" + str(r + 1) + ".csv"))
data_filled = pd.merge(data_filled, aux_df, how="left", on=["store", "brand", "week"])
# Create relative price feature
price_cols = [
"price1",
"price2",
"price3",
"price4",
"price5",
"price6",
"price7",
"price8",
"price9",
"price10",
"price11",
]
data_filled["price"] = data_filled.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
data_filled["avg_price"] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled["price_ratio"] = data_filled.apply(lambda x: x["price"] / x["avg_price"], axis=1)
# Fill missing values
data_filled = data_filled.groupby(["store", "brand"]).apply(
lambda x: x.fillna(method="ffill").fillna(method="bfill")
)
# Create datetime features
data_filled["week_start"] = data_filled["week"].apply(
lambda x: FIRST_WEEK_START + datetime.timedelta(days=(x - 1) * 7)
)
data_filled["day"] = data_filled["week_start"].apply(lambda x: x.day)
data_filled["week_of_month"] = data_filled["week_start"].apply(lambda x: week_of_month(x))
data_filled["month"] = data_filled["week_start"].apply(lambda x: x.month)
data_filled.drop("week_start", axis=1, inplace=True)
# Normalize the dataframe of features
cols_normalize = data_filled.columns.difference(["store", "brand", "week"])
data_scaled, min_max_scaler = normalize_dataframe(data_filled, cols_normalize)
# Create sequence array for 'move'
start_timestep = 0
end_timestep = TRAIN_END_WEEK_LIST[r] - TRAIN_START_WEEK - PRED_HORIZON
train_input1 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, ["move"], start_timestep, end_timestep)
# Create sequence array for other dynamic features
start_timestep = PRED_HORIZON
end_timestep = TRAIN_END_WEEK_LIST[r] - TRAIN_START_WEEK
train_input2 = gen_sequence_array(data_scaled, store_brand, SEQ_LEN, DYNAMIC_FEATURES, start_timestep, end_timestep)
seq_in = np.concatenate((train_input1, train_input2), axis=2)
# Create array of static features
total_timesteps = TRAIN_END_WEEK_LIST[r] - TRAIN_START_WEEK - SEQ_LEN - PRED_HORIZON + 2
cat_fea_in = static_feature_array(data_filled, total_timesteps, STATIC_FEATURES)
# Create training output
start_timestep = SEQ_LEN + PRED_HORIZON - PRED_STEPS
end_timestep = TRAIN_END_WEEK_LIST[r] - TRAIN_START_WEEK
train_output = gen_sequence_array(data_filled, store_brand, PRED_STEPS, ["move"], start_timestep, end_timestep)
train_output = np.squeeze(train_output)
# Create model
model = create_dcnn_model(
seq_len=SEQ_LEN, n_filters=2, n_input_series=1 + len(DYNAMIC_FEATURES), n_outputs=PRED_STEPS
)
# Convert to GPU model
try:
model = multi_gpu_model(model)
print("Training using multiple GPUs...")
except:
print("Training using single GPU or CPU...")
adam = optimizers.Adam(lr=args.learning_rate)
model.compile(loss="mape", optimizer=adam, metrics=["mape", "mae"])
# Model training and validation
history = model.fit(
[seq_in, cat_fea_in], train_output, epochs=args.epochs, batch_size=args.batch_size, validation_split=0.05
)
val_loss = history.history["val_loss"][-1]
print("Validation loss is {}".format(val_loss))
# Log the validation loss/MAPE
run.log("MAPE", np.float(val_loss))


@ -1,11 +1,12 @@
# coding: utf-8
# Utility functions for building the Dilated Convolutional Neural Network (CNN) model.
# Utility functions for building the Dilated Convolutional Neural Network (CNN) model.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
def week_of_month(dt):
"""Get the week of the month for the specified date.
@ -14,14 +15,16 @@ def week_of_month(dt):
Returns:
wom (Integer): Week of the month of the input date
"""
"""
from math import ceil
first_day = dt.replace(day=1)
dom = dt.day
adjusted_dom = dom + first_day.weekday()
wom = int(ceil(adjusted_dom/7.0))
wom = int(ceil(adjusted_dom / 7.0))
return wom
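A quick sanity check of the helper above, written inline so it runs on its own (1989-09-14 is the first week start used in this benchmark):

```python
import datetime
from math import ceil

dt = datetime.date(1989, 9, 14)          # a Thursday
first_day = dt.replace(day=1)            # 1989-09-01 is a Friday, weekday() == 4
adjusted_dom = dt.day + first_day.weekday()
print(int(ceil(adjusted_dom / 7.0)))     # 3 -> the date falls in the 3rd week of September
```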
def df_from_cartesian_product(dict_in):
"""Generate a Pandas dataframe from Cartesian product of lists.
@ -33,11 +36,13 @@ def df_from_cartesian_product(dict_in):
"""
from collections import OrderedDict
from itertools import product
od = OrderedDict(sorted(dict_in.items()))
cart = list(product(*od.values()))
df = pd.DataFrame(cart, columns=od.keys())
return df
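For reference, the Cartesian-product helper above behaves like this tiny self-contained sketch (toy store/brand/week values); every combination becomes one row, which is what the feature pipelines rely on to fill datetime gaps:

```python
import pandas as pd
from collections import OrderedDict
from itertools import product

d = {"store": [2, 5], "brand": [1, 2, 3], "week": [40, 41]}
od = OrderedDict(sorted(d.items()))          # keys sorted: brand, store, week
grid = pd.DataFrame(list(product(*od.values())), columns=od.keys())
print(grid.shape)  # (12, 3): one row per (brand, store, week) combination
```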
def gen_sequence(df, seq_len, seq_cols, start_timestep=0, end_timestep=None):
"""Reshape features into an array of dimension (time steps, features).
@ -54,9 +59,12 @@ def gen_sequence(df, seq_len, seq_cols, start_timestep=0, end_timestep=None):
data_array = df[seq_cols].values
if end_timestep is None:
end_timestep = df.shape[0]
for start, stop in zip(range(start_timestep, end_timestep-seq_len+2), range(start_timestep+seq_len, end_timestep+2)):
for start, stop in zip(
range(start_timestep, end_timestep - seq_len + 2), range(start_timestep + seq_len, end_timestep + 2)
):
yield data_array[start:stop, :]
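The generator above yields overlapping windows of length seq_len. A self-contained toy version of the same sliding-window idea (hypothetical data, not the repository's code):

```python
import numpy as np

data = np.arange(10).reshape(-1, 1)   # one feature, 10 timesteps
seq_len = 3
windows = np.stack(
    [data[start:start + seq_len, :] for start in range(len(data) - seq_len + 1)]
)
print(windows.shape)  # (8, 3, 1): 8 overlapping sequences of length 3
```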
def gen_sequence_array(df_all, store_brand, seq_len, seq_cols, start_timestep=0, end_timestep=None):
"""Combine feature sequences for all the combinations of (store, brand) into an 3d array.
@ -70,11 +78,22 @@ def gen_sequence_array(df_all, store_brand, seq_len, seq_cols, start_timestep=0,
Returns:
seq_array (Numpy Array): An array of the feature sequences of all stores and brands
"""
seq_gen = (list(gen_sequence(df_all[(df_all['store']==cur_store) & (df_all['brand']==cur_brand)], seq_len, seq_cols, start_timestep, end_timestep)) \
for cur_store, cur_brand in store_brand)
seq_gen = (
list(
gen_sequence(
df_all[(df_all["store"] == cur_store) & (df_all["brand"] == cur_brand)],
seq_len,
seq_cols,
start_timestep,
end_timestep,
)
)
for cur_store, cur_brand in store_brand
)
seq_array = np.concatenate(list(seq_gen)).astype(np.float32)
return seq_array
def static_feature_array(df_all, total_timesteps, seq_cols):
"""Generate an array which encodes all the static features.
@ -86,10 +105,11 @@ def static_feature_array(df_all, total_timesteps, seq_cols):
Return:
fea_array (Numpy Array): An array of static features of all stores and brands
"""
fea_df = df_all.groupby(['store', 'brand']).apply(lambda x: x.iloc[:total_timesteps,:]).reset_index(drop=True)
fea_df = df_all.groupby(["store", "brand"]).apply(lambda x: x.iloc[:total_timesteps, :]).reset_index(drop=True)
fea_array = fea_df[seq_cols].values
return fea_array
def normalize_dataframe(df, seq_cols, scaler=MinMaxScaler()):
"""Normalize a subset of columns of a dataframe.
@ -102,7 +122,6 @@ def normalize_dataframe(df, seq_cols, scaler=MinMaxScaler()):
df_scaled (Dataframe): Normalized dataframe
"""
cols_fixed = df.columns.difference(seq_cols)
df_scaled = pd.DataFrame(scaler.fit_transform(df[seq_cols]),
columns=seq_cols, index=df.index)
df_scaled = pd.DataFrame(scaler.fit_transform(df[seq_cols]), columns=seq_cols, index=df.index)
df_scaled = pd.concat([df[cols_fixed], df_scaled], axis=1)
return df_scaled, scaler
return df_scaled, scaler


@ -1,6 +1,6 @@
# coding: utf-8
# Create input features for the boosted decision tree model.
# Create input features for the boosted decision tree model.
import os
import sys
@ -9,9 +9,9 @@ import itertools
import datetime
import numpy as np
import pandas as pd
import lightgbm as lgb
import lightgbm as lgb
# Append TSPerf path to sys.path
# Append TSPerf path to sys.path
tsperf_dir = os.getcwd()
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
@ -20,6 +20,7 @@ if tsperf_dir not in sys.path:
from utils import *
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
def lagged_features(df, lags):
"""Create lagged features based on time series data.
@ -33,11 +34,12 @@ def lagged_features(df, lags):
df_list = []
for lag in lags:
df_shifted = df.shift(lag)
df_shifted.columns = [x + '_lag' + str(lag) for x in df_shifted.columns]
df_shifted.columns = [x + "_lag" + str(lag) for x in df_shifted.columns]
df_list.append(df_shifted)
fea = pd.concat(df_list, axis=1)
return fea
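Usage sketch of the shift-based lag construction above, on a toy series (illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"move": [10, 12, 9, 11]})
lags = [2, 3]
lagged = pd.concat(
    [df.shift(lag).rename(columns=lambda c: c + "_lag" + str(lag)) for lag in lags], axis=1
)
print(lagged)
#    move_lag2  move_lag3
# 0        NaN        NaN
# 1        NaN        NaN
# 2       10.0        NaN
# 3       12.0       10.0
```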
def moving_averages(df, start_step, window_size=None):
"""Compute averages of every feature over moving time windows.
@ -49,12 +51,13 @@ def moving_averages(df, start_step, window_size=None):
Returns:
fea (Dataframe): Dataframe consisting of the moving averages
"""
if window_size == None: # Use a large window to compute average over all historical data
if window_size == None: # Use a large window to compute average over all historical data
window_size = df.shape[0]
fea = df.shift(start_step).rolling(min_periods=1, center=False, window=window_size).mean()
fea.columns = fea.columns + '_mean'
fea.columns = fea.columns + "_mean"
return fea
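And the rolling-mean feature above, shown on the same toy series; the initial shift by start_step keeps future values out of the average (values here are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"move": [10, 12, 9, 11]})
start_step, window_size = 2, 2
fea = df.shift(start_step).rolling(min_periods=1, center=False, window=window_size).mean()
fea.columns = fea.columns + "_mean"
print(fea)
#    move_mean
# 0        NaN
# 1        NaN
# 2       10.0
# 3       11.0
```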
def combine_features(df, lag_fea, lags, window_size, used_columns):
"""Combine different features for a certain store-brand.
@ -73,6 +76,7 @@ def combine_features(df, lag_fea, lags, window_size, used_columns):
fea_all = pd.concat([df[used_columns], lagged_fea, moving_avg], axis=1)
return fea_all
def make_features(pred_round, train_dir, lags, window_size, offset, used_columns, store_list, brand_list):
"""Create a dataframe of the input features.
@ -88,46 +92,59 @@ def make_features(pred_round, train_dir, lags, window_size, offset, used_columns
Returns:
features (Dataframe): Dataframe including all the input features and target variable
"""
"""
# Load training data
train_df = pd.read_csv(os.path.join(train_dir, 'train_round_'+str(pred_round+1)+'.csv'))
train_df['move'] = train_df['logmove'].apply(lambda x: round(math.exp(x)))
train_df = train_df[['store', 'brand', 'week', 'move']]
train_df = pd.read_csv(os.path.join(train_dir, "train_round_" + str(pred_round + 1) + ".csv"))
train_df["move"] = train_df["logmove"].apply(lambda x: round(math.exp(x)))
train_df = train_df[["store", "brand", "week", "move"]]
# Create a dataframe to hold all necessary data
week_list = range(bs.TRAIN_START_WEEK + offset, bs.TEST_END_WEEK_LIST[pred_round]+1)
d = {'store': store_list,
'brand': brand_list,
'week': week_list}
week_list = range(bs.TRAIN_START_WEEK + offset, bs.TEST_END_WEEK_LIST[pred_round] + 1)
d = {"store": store_list, "brand": brand_list, "week": week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how='left',
on=['store', 'brand', 'week'])
data_filled = pd.merge(data_grid, train_df, how="left", on=["store", "brand", "week"])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(train_dir, 'aux_round_'+str(pred_round+1)+'.csv'))
data_filled = pd.merge(data_filled, aux_df, how='left',
on=['store', 'brand', 'week'])
aux_df = pd.read_csv(os.path.join(train_dir, "aux_round_" + str(pred_round + 1) + ".csv"))
data_filled = pd.merge(data_filled, aux_df, how="left", on=["store", "brand", "week"])
# Create relative price feature
price_cols = ['price1', 'price2', 'price3', 'price4', 'price5', 'price6', 'price7', 'price8', \
'price9', 'price10', 'price11']
data_filled['price'] = data_filled.apply(lambda x: x.loc['price' + str(int(x.loc['brand']))], axis=1)
data_filled['avg_price'] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled['price_ratio'] = data_filled['price'] / data_filled['avg_price']
data_filled.drop(price_cols, axis=1, inplace=True)
price_cols = [
"price1",
"price2",
"price3",
"price4",
"price5",
"price6",
"price7",
"price8",
"price9",
"price10",
"price11",
]
data_filled["price"] = data_filled.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
data_filled["avg_price"] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled["price_ratio"] = data_filled["price"] / data_filled["avg_price"]
data_filled.drop(price_cols, axis=1, inplace=True)
# Fill missing values
data_filled = data_filled.groupby(['store', 'brand']).apply(lambda x: x.fillna(method='ffill').fillna(method='bfill'))
data_filled = data_filled.groupby(["store", "brand"]).apply(
lambda x: x.fillna(method="ffill").fillna(method="bfill")
)
# Create datetime features
data_filled['week_start'] = data_filled['week'].apply(lambda x: bs.FIRST_WEEK_START + datetime.timedelta(days=(x-1)*7))
data_filled['year'] = data_filled['week_start'].apply(lambda x: x.year)
data_filled['month'] = data_filled['week_start'].apply(lambda x: x.month)
data_filled['week_of_month'] = data_filled['week_start'].apply(lambda x: week_of_month(x))
data_filled['day'] = data_filled['week_start'].apply(lambda x: x.day)
data_filled.drop('week_start', axis=1, inplace=True)
data_filled["week_start"] = data_filled["week"].apply(
lambda x: bs.FIRST_WEEK_START + datetime.timedelta(days=(x - 1) * 7)
)
data_filled["year"] = data_filled["week_start"].apply(lambda x: x.year)
data_filled["month"] = data_filled["week_start"].apply(lambda x: x.month)
data_filled["week_of_month"] = data_filled["week_start"].apply(lambda x: week_of_month(x))
data_filled["day"] = data_filled["week_start"].apply(lambda x: x.day)
data_filled.drop("week_start", axis=1, inplace=True)
# Create other features (lagged features, moving averages, etc.)
features = data_filled.groupby(['store','brand']).apply(lambda x: combine_features(x, ['move'], lags, window_size, used_columns))
features = data_filled.groupby(["store", "brand"]).apply(
lambda x: combine_features(x, ["move"], lags, window_size, used_columns)
)
return features
return features


@ -21,15 +21,12 @@ if tsperf_dir not in sys.path:
# Import TSPerf components
from utils import df_from_cartesian_product
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings \
as bs
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
pd.set_option("display.max_columns", None)
def oj_preprocess(
df, aux_df, week_list, store_list, brand_list, train_df=None
):
def oj_preprocess(df, aux_df, week_list, store_list, brand_list, train_df=None):
df["move"] = df["logmove"].apply(lambda x: round(math.exp(x)))
df = df[["store", "brand", "week", "move"]].copy()
@ -37,14 +34,10 @@ def oj_preprocess(
# Create a dataframe to hold all necessary data
d = {"store": store_list, "brand": brand_list, "week": week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(
data_grid, df, how="left", on=["store", "brand", "week"]
)
data_filled = pd.merge(data_grid, df, how="left", on=["store", "brand", "week"])
# Get future price, deal, and advertisement info
data_filled = pd.merge(
data_filled, aux_df, how="left", on=["store", "brand", "week"]
)
data_filled = pd.merge(data_filled, aux_df, how="left", on=["store", "brand", "week"])
# Fill missing values
if train_df is not None:
@ -60,22 +53,13 @@ def oj_preprocess(
)
if train_df is not None:
data_filled = data_filled.loc[
data_filled["week_start"] > forecast_creation_time
].copy()
data_filled = data_filled.loc[data_filled["week_start"] > forecast_creation_time].copy()
return data_filled
def make_features(
pred_round,
train_dir,
lags,
window_size,
offset,
used_columns,
store_list,
brand_list,
pred_round, train_dir, lags, window_size, offset, used_columns, store_list, brand_list,
):
"""Create a dataframe of the input features.
@ -95,19 +79,11 @@ def make_features(
target variable
"""
# Load training data
train_df = pd.read_csv(
os.path.join(train_dir, "train_round_" + str(pred_round + 1) + ".csv")
)
aux_df = pd.read_csv(
os.path.join(train_dir, "aux_round_" + str(pred_round + 1) + ".csv")
)
week_list = range(
bs.TRAIN_START_WEEK + offset, bs.TEST_END_WEEK_LIST[pred_round] + 1
)
train_df = pd.read_csv(os.path.join(train_dir, "train_round_" + str(pred_round + 1) + ".csv"))
aux_df = pd.read_csv(os.path.join(train_dir, "aux_round_" + str(pred_round + 1) + ".csv"))
week_list = range(bs.TRAIN_START_WEEK + offset, bs.TEST_END_WEEK_LIST[pred_round] + 1)
train_df_preprocessed = oj_preprocess(
train_df, aux_df, week_list, store_list, brand_list
)
train_df_preprocessed = oj_preprocess(train_df, aux_df, week_list, store_list, brand_list)
df_config = {
"time_col_name": "week_start",
@ -117,9 +93,7 @@ def make_features(
"time_format": "%Y-%m-%d",
}
temporal_featurizer = TemporalFeaturizer(
df_config=df_config, feature_list=["month_of_year", "week_of_month"]
)
temporal_featurizer = TemporalFeaturizer(df_config=df_config, feature_list=["month_of_year", "week_of_month"])
popularity_featurizer = PopularityFeaturizer(
df_config=df_config,
@ -143,12 +117,7 @@ def make_features(
return_feature_col=True,
)
lag_featurizer = LagFeaturizer(
df_config=df_config,
input_col_names="move",
lags=lags,
future_value_available=True,
)
lag_featurizer = LagFeaturizer(df_config=df_config, input_col_names="move", lags=lags, future_value_available=True,)
moving_average_featurizer = RollingWindowFeaturizer(
df_config=df_config,
input_col_names="move",


@ -0,0 +1,137 @@
# coding: utf-8
# Train and score a boosted decision tree model using [LightGBM Python package](https://github.com/Microsoft/LightGBM) from Microsoft,
# which is a fast, distributed, high performance gradient boosting framework based on decision tree algorithms.
import os
import sys
import argparse
import numpy as np
import pandas as pd
import lightgbm as lgb
import warnings
warnings.filterwarnings("ignore")
# Append TSPerf path to sys.path
tsperf_dir = os.getcwd()
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
from make_features import make_features
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
def make_predictions(df, model):
"""Predict sales with the trained GBM model.
Args:
df (Dataframe): Dataframe including all needed features
model (Model): Trained GBM model
Returns:
Dataframe including the predicted sales of every store-brand
"""
predictions = pd.DataFrame({"move": model.predict(df.drop("move", axis=1))})
predictions["move"] = predictions["move"].apply(lambda x: round(x))
return pd.concat([df[["brand", "store", "week"]].reset_index(drop=True), predictions], axis=1)
if __name__ == "__main__":
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument("--seed", type=int, dest="seed", default=1, help="Random seed of GBM model")
parser.add_argument("--num-leaves", type=int, dest="num_leaves", default=124, help="# of leaves of the tree")
parser.add_argument(
"--min-data-in-leaf", type=int, dest="min_data_in_leaf", default=340, help="minimum # of samples in each leaf"
)
parser.add_argument("--learning-rate", type=float, dest="learning_rate", default=0.1, help="learning rate")
parser.add_argument(
"--feature-fraction",
type=float,
dest="feature_fraction",
default=0.65,
help="ratio of features used in each iteration",
)
parser.add_argument(
"--bagging-fraction",
type=float,
dest="bagging_fraction",
default=0.87,
help="ratio of samples used in each iteration",
)
parser.add_argument("--bagging-freq", type=int, dest="bagging_freq", default=19, help="bagging frequency")
parser.add_argument("--max-rounds", type=int, dest="max_rounds", default=940, help="# of boosting iterations")
parser.add_argument("--max-lag", type=int, dest="max_lag", default=19, help="max lag of unit sales")
parser.add_argument(
"--window-size", type=int, dest="window_size", default=40, help="window size of moving average of unit sales"
)
args = parser.parse_args()
print(args)
# Data paths
DATA_DIR = os.path.join(tsperf_dir, "retail_sales", "OrangeJuice_Pt_3Weeks_Weekly", "data")
SUBMISSION_DIR = os.path.join(tsperf_dir, "retail_sales", "OrangeJuice_Pt_3Weeks_Weekly", "submissions", "LightGBM")
TRAIN_DIR = os.path.join(DATA_DIR, "train")
# Parameters of GBM model
params = {
"objective": "mape",
"num_leaves": args.num_leaves,
"min_data_in_leaf": args.min_data_in_leaf,
"learning_rate": args.learning_rate,
"feature_fraction": args.feature_fraction,
"bagging_fraction": args.bagging_fraction,
"bagging_freq": args.bagging_freq,
"num_rounds": args.max_rounds,
"early_stopping_rounds": 125,
"num_threads": 4,
"seed": args.seed,
}
# Lags and categorical features
lags = np.arange(2, args.max_lag + 1)
used_columns = ["store", "brand", "week", "week_of_month", "month", "deal", "feat", "move", "price", "price_ratio"]
categ_fea = ["store", "brand", "deal"]
# Get unique stores and brands
train_df = pd.read_csv(os.path.join(TRAIN_DIR, "train_round_1.csv"))
store_list = train_df["store"].unique()
brand_list = train_df["brand"].unique()
# Train and predict for all forecast rounds
pred_all = []
metric_all = []
for r in range(bs.NUM_ROUNDS):
print("---- Round " + str(r + 1) + " ----")
# Create features
features = make_features(r, TRAIN_DIR, lags, args.window_size, 0, used_columns, store_list, brand_list)
train_fea = features[features.week <= bs.TRAIN_END_WEEK_LIST[r]].reset_index(drop=True)
# Drop rows with NaN values
train_fea.dropna(inplace=True)
# Create training set
dtrain = lgb.Dataset(train_fea.drop("move", axis=1, inplace=False), label=train_fea["move"])
if r % 3 == 0:
# Train GBM model
print("Training model...")
bst = lgb.train(params, dtrain, valid_sets=[dtrain], categorical_feature=categ_fea, verbose_eval=False)
# Generate forecasts
print("Making predictions...")
test_fea = features[features.week >= bs.TEST_START_WEEK_LIST[r]].reset_index(drop=True)
pred = make_predictions(test_fea, bst).sort_values(by=["store", "brand", "week"]).reset_index(drop=True)
# Additional columns required by the submission format
pred["round"] = r + 1
pred["weeks_ahead"] = pred["week"] - bs.TRAIN_END_WEEK_LIST[r]
# Keep the predictions
pred_all.append(pred)
# Generate submission
submission = pd.concat(pred_all, axis=0)
submission.rename(columns={"move": "prediction"}, inplace=True)
submission = submission[["round", "store", "brand", "week", "weeks_ahead", "prediction"]]
filename = "submission_seed_" + str(args.seed) + ".csv"
submission.to_csv(os.path.join(SUBMISSION_DIR, filename), index=False)


@ -0,0 +1,241 @@
# coding: utf-8
# Perform cross validation of a boosted decision tree model on the training data of the 1st forecast round.
import os
import sys
import math
import argparse
import datetime
import itertools
import numpy as np
import pandas as pd
import lightgbm as lgb
from azureml.core import Run
from sklearn.model_selection import train_test_split
from utils import week_of_month, df_from_cartesian_product
def lagged_features(df, lags):
"""Create lagged features based on time series data.
Args:
df (Dataframe): Input time series data sorted by time
lags (List): Lag lengths
Returns:
fea (Dataframe): Lagged features
"""
df_list = []
for lag in lags:
df_shifted = df.shift(lag)
df_shifted.columns = [x + "_lag" + str(lag) for x in df_shifted.columns]
df_list.append(df_shifted)
fea = pd.concat(df_list, axis=1)
return fea
def moving_averages(df, start_step, window_size=None):
"""Compute averages of every feature over moving time windows.
Args:
df (Dataframe): Input features as a dataframe
start_step (Integer): Starting time step of rolling mean
window_size (Integer): Windows size of rolling mean
Returns:
fea (Dataframe): Dataframe consisting of the moving averages
"""
if window_size == None: # Use a large window to compute average over all historical data
window_size = df.shape[0]
fea = df.shift(start_step).rolling(min_periods=1, center=False, window=window_size).mean()
fea.columns = fea.columns + "_mean"
return fea
def combine_features(df, lag_fea, lags, window_size, used_columns):
"""Combine different features for a certain store-brand.
Args:
df (Dataframe): Time series data of a certain store-brand
lag_fea (List): A list of column names for creating lagged features
lags (Numpy Array): Numpy array including all the lags
window_size (Integer): Windows size of rolling mean
used_columns (List): A list of names of columns used in model training (including target variable)
Returns:
fea_all (Dataframe): Dataframe including all features for the specific store-brand
"""
lagged_fea = lagged_features(df[lag_fea], lags)
moving_avg = moving_averages(df[lag_fea], 2, window_size)
fea_all = pd.concat([df[used_columns], lagged_fea, moving_avg], axis=1)
return fea_all
def make_predictions(df, model):
"""Predict sales with the trained GBM model.
Args:
df (Dataframe): Dataframe including all needed features
model (Model): Trained GBM model
Returns:
Dataframe including the predicted sales of a certain store-brand
"""
predictions = pd.DataFrame({"move": model.predict(df.drop("move", axis=1))})
predictions["move"] = predictions["move"].apply(lambda x: round(x))
return pd.concat([df[["brand", "store", "week"]].reset_index(drop=True), predictions], axis=1)
if __name__ == "__main__":
# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument("--data-folder", type=str, dest="data_folder", default=".", help="data folder mounting point")
parser.add_argument("--num-leaves", type=int, dest="num_leaves", default=64, help="# of leaves of the tree")
parser.add_argument(
"--min-data-in-leaf", type=int, dest="min_data_in_leaf", default=50, help="minimum # of samples in each leaf"
)
parser.add_argument("--learning-rate", type=float, dest="learning_rate", default=0.001, help="learning rate")
parser.add_argument(
"--feature-fraction",
type=float,
dest="feature_fraction",
default=1.0,
help="ratio of features used in each iteration",
)
parser.add_argument(
"--bagging-fraction",
type=float,
dest="bagging_fraction",
default=1.0,
help="ratio of samples used in each iteration",
)
parser.add_argument("--bagging-freq", type=int, dest="bagging_freq", default=1, help="bagging frequency")
parser.add_argument("--max-rounds", type=int, dest="max_rounds", default=400, help="# of boosting iterations")
parser.add_argument("--max-lag", type=int, dest="max_lag", default=10, help="max lag of unit sales")
parser.add_argument(
"--window-size", type=int, dest="window_size", default=10, help="window size of moving average of unit sales"
)
args = parser.parse_args()
args.feature_fraction = round(args.feature_fraction, 2)
args.bagging_fraction = round(args.bagging_fraction, 2)
print(args)
# Start an Azure ML run
run = Run.get_context()
# Data paths
DATA_DIR = args.data_folder
TRAIN_DIR = os.path.join(DATA_DIR, "train")
# Data and forecast problem parameters
TRAIN_START_WEEK = 40
TRAIN_END_WEEK_LIST = list(range(135, 159, 2))
TEST_START_WEEK_LIST = list(range(137, 161, 2))
TEST_END_WEEK_LIST = list(range(138, 162, 2))
# The start datetime of the first week in the record
FIRST_WEEK_START = pd.to_datetime("1989-09-14 00:00:00")
# Parameters of GBM model
params = {
"objective": "mape",
"num_leaves": args.num_leaves,
"min_data_in_leaf": args.min_data_in_leaf,
"learning_rate": args.learning_rate,
"feature_fraction": args.feature_fraction,
"bagging_fraction": args.bagging_fraction,
"bagging_freq": args.bagging_freq,
"num_rounds": args.max_rounds,
"early_stopping_rounds": 125,
"num_threads": 16,
}
# Lags and used column names
lags = np.arange(2, args.max_lag + 1)
used_columns = ["store", "brand", "week", "week_of_month", "month", "deal", "feat", "move", "price", "price_ratio"]
categ_fea = ["store", "brand", "deal"]
# Train and validate the model using only the first round data
r = 0
print("---- Round " + str(r + 1) + " ----")
# Load training data
train_df = pd.read_csv(os.path.join(TRAIN_DIR, "train_round_" + str(r + 1) + ".csv"))
train_df["move"] = train_df["logmove"].apply(lambda x: round(math.exp(x)))
train_df = train_df[["store", "brand", "week", "move"]]
# Create a dataframe to hold all necessary data
store_list = train_df["store"].unique()
brand_list = train_df["brand"].unique()
week_list = range(TRAIN_START_WEEK, TEST_END_WEEK_LIST[r] + 1)
d = {"store": store_list, "brand": brand_list, "week": week_list}
data_grid = df_from_cartesian_product(d)
data_filled = pd.merge(data_grid, train_df, how="left", on=["store", "brand", "week"])
# Get future price, deal, and advertisement info
aux_df = pd.read_csv(os.path.join(TRAIN_DIR, "aux_round_" + str(r + 1) + ".csv"))
data_filled = pd.merge(data_filled, aux_df, how="left", on=["store", "brand", "week"])
# Create relative price feature
price_cols = [
"price1",
"price2",
"price3",
"price4",
"price5",
"price6",
"price7",
"price8",
"price9",
"price10",
"price11",
]
data_filled["price"] = data_filled.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
data_filled["avg_price"] = data_filled[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
data_filled["price_ratio"] = data_filled["price"] / data_filled["avg_price"]
data_filled.drop(price_cols, axis=1, inplace=True)
# Fill missing values
data_filled = data_filled.groupby(["store", "brand"]).apply(
lambda x: x.fillna(method="ffill").fillna(method="bfill")
)
# Create datetime features
data_filled["week_start"] = data_filled["week"].apply(
lambda x: FIRST_WEEK_START + datetime.timedelta(days=(x - 1) * 7)
)
data_filled["year"] = data_filled["week_start"].apply(lambda x: x.year)
data_filled["month"] = data_filled["week_start"].apply(lambda x: x.month)
data_filled["week_of_month"] = data_filled["week_start"].apply(lambda x: week_of_month(x))
data_filled["day"] = data_filled["week_start"].apply(lambda x: x.day)
data_filled.drop("week_start", axis=1, inplace=True)
# Create other features (lagged features, moving averages, etc.)
features = data_filled.groupby(["store", "brand"]).apply(
lambda x: combine_features(x, ["move"], lags, args.window_size, used_columns)
)
train_fea = features[features.week <= TRAIN_END_WEEK_LIST[r]].reset_index(drop=True)
# Drop rows with NaN values
train_fea.dropna(inplace=True)
# Model training and validation
# Create a training/validation split
train_fea, valid_fea, train_label, valid_label = train_test_split(
train_fea.drop("move", axis=1, inplace=False), train_fea["move"], test_size=0.05, random_state=1
)
dtrain = lgb.Dataset(train_fea, train_label)
dvalid = lgb.Dataset(valid_fea, valid_label)
# A dictionary to record training results
evals_result = {}
# Train GBM model
bst = lgb.train(
params, dtrain, valid_sets=[dtrain, dvalid], categorical_feature=categ_fea, evals_result=evals_result
)
# Get final training loss & validation loss
train_loss = evals_result["training"]["mape"][-1]
valid_loss = evals_result["valid_1"]["mape"][-1]
print("Final training loss is {}".format(train_loss))
print("Final validation loss is {}".format(valid_loss))
# Log the validation loss/MAPE
run.log("MAPE", np.float(valid_loss) * 100)


@ -1,9 +1,10 @@
# coding: utf-8
# Utility functions for building the boosted decision tree model.
# Utility functions for building the boosted decision tree model.
import pandas as pd
def week_of_month(dt):
"""Get the week of the month for the specified date.
@ -12,15 +13,17 @@ def week_of_month(dt):
Returns:
wom (Integer): Week of the month of the input date
"""
"""
from math import ceil
first_day = dt.replace(day=1)
dom = dt.day
adjusted_dom = dom + first_day.weekday()
wom = int(ceil(adjusted_dom/7.0))
wom = int(ceil(adjusted_dom / 7.0))
return wom
def df_from_cartesian_product(dict_in):
def df_from_cartesian_product(dict_in):
"""Generate a Pandas dataframe from Cartesian product of lists.
Args:
@ -31,7 +34,8 @@ def df_from_cartesian_product(dict_in):
"""
from collections import OrderedDict
from itertools import product
od = OrderedDict(sorted(dict_in.items()))
cart = list(product(*od.values()))
df = pd.DataFrame(cart, columns=od.keys())
return df
return df


@ -22,7 +22,7 @@ hparams_manual = dict(
learning_rate=0.001,
beta1=0.9,
beta2=0.999,
epsilon=1e-08
epsilon=1e-08,
)
# these are the hyperparameters selected when running 50 trials of SMAC
# hyperparameter tuning.
@ -43,7 +43,7 @@ hparams_smac = dict(
learning_rate=0.001,
beta1=0.7763754022206656,
beta2=0.7923825287287111,
epsilon=1e-08
epsilon=1e-08,
)
# these are the hyperparameters selected when running 100 trials of SMAC
@ -68,5 +68,5 @@ hparams_smac_100 = dict(
learning_rate=0.01,
beta1=0.6011027681578323,
beta2=0.9809964662293627,
epsilon=1e-08
)
epsilon=1e-08,
)


@ -22,7 +22,7 @@ from utils import *
# Add TSPerf root directory to sys.path
file_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
tsperf_dir = os.path.join(file_dir, '../../../../')
tsperf_dir = os.path.join(file_dir, "../../../../")
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
@ -32,13 +32,16 @@ from common.evaluation_utils import MAPE
from smac.configspace import ConfigurationSpace
from smac.scenario.scenario import Scenario
from ConfigSpace.hyperparameters import CategoricalHyperparameter, \
UniformFloatHyperparameter, UniformIntegerHyperparameter
from ConfigSpace.hyperparameters import (
CategoricalHyperparameter,
UniformFloatHyperparameter,
UniformIntegerHyperparameter,
)
from smac.facade.smac_facade import SMAC
LIST_HYPERPARAMETER = ['decoder_input_dropout', 'decoder_state_dropout', 'decoder_output_dropout']
data_relative_dir = '../../data'
LIST_HYPERPARAMETER = ["decoder_input_dropout", "decoder_state_dropout", "decoder_output_dropout"]
data_relative_dir = "../../data"
def eval_function(hparams_dict):
@ -56,10 +59,10 @@ def eval_function(hparams_dict):
hparams_dict[key] = [hparams_dict[key]]
# add the value of other hyper parameters which are not tuned
hparams_dict['encoder_rnn_layers'] = 1
hparams_dict['decoder_rnn_layers'] = 1
hparams_dict['decoder_variational_dropout'] = [False]
hparams_dict['asgd_decay'] = None
hparams_dict["encoder_rnn_layers"] = 1
hparams_dict["decoder_rnn_layers"] = 1
hparams_dict["decoder_variational_dropout"] = [False]
hparams_dict["asgd_decay"] = None
hparams = training.HParams(**hparams_dict)
# use round 1 training data for hyper parameter tuning to avoid data leakage for later rounds
@ -67,106 +70,113 @@ def eval_function(hparams_dict):
make_features_flag = False
train_model_flag = True
train_back_offset = 3 # equal to predict_window
predict_cut_mode = 'eval'
predict_cut_mode = "eval"
# get prediction
pred_o, train_mape = create_round_prediction(data_dir, submission_round, hparams, make_features_flag=make_features_flag,
train_model_flag=train_model_flag, train_back_offset=train_back_offset,
predict_cut_mode=predict_cut_mode)
pred_o, train_mape = create_round_prediction(
data_dir,
submission_round,
hparams,
make_features_flag=make_features_flag,
train_model_flag=train_model_flag,
train_back_offset=train_back_offset,
predict_cut_mode=predict_cut_mode,
)
# get rid of prediction at horizon 1
pred_sub = pred_o[:, 1:].reshape((-1))
# evaluate the prediction on the last two days of the first-round training data
# TODO: get train error and evaluation error for different parameters
train_file = os.path.join(data_dir, 'train/train_round_{}.csv'.format(submission_round))
train_file = os.path.join(data_dir, "train/train_round_{}.csv".format(submission_round))
train = pd.read_csv(train_file, index_col=False)
train_last_week = bs.TRAIN_END_WEEK_LIST[submission_round - 1]
# filter the train data to contain only the last two days' data
train = train.loc[train['week'] >= train_last_week - 1]
train = train.loc[train["week"] >= train_last_week - 1]
# create the data frame without missing dates
store_list = train['store'].unique()
brand_list = train['brand'].unique()
store_list = train["store"].unique()
brand_list = train["brand"].unique()
week_list = range(train_last_week - 1, train_last_week + 1)
item_list = list(itertools.product(store_list, brand_list, week_list))
item_df = pd.DataFrame.from_records(item_list, columns=['store', 'brand', 'week'])
item_df = pd.DataFrame.from_records(item_list, columns=["store", "brand", "week"])
train = item_df.merge(train, how='left', on=['store', 'brand', 'week'])
result = train.sort_values(by=['store', 'brand', 'week'], ascending=True)
result['prediction'] = pred_sub
result['sales'] = result['logmove'].apply(lambda x: round(np.exp(x)))
train = item_df.merge(train, how="left", on=["store", "brand", "week"])
result = train.sort_values(by=["store", "brand", "week"], ascending=True)
result["prediction"] = pred_sub
result["sales"] = result["logmove"].apply(lambda x: round(np.exp(x)))
# calculate MAPE on the evaluate set
result = result.loc[result['sales'].notnull()]
eval_mape = MAPE(result['prediction'], result['sales'])
result = result.loc[result["sales"].notnull()]
eval_mape = MAPE(result["prediction"], result["sales"])
return eval_mape
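MAPE here comes from common.evaluation_utils; for reference, the metric itself is just the mean absolute percentage error. A minimal sketch, not the shared implementation (which may scale to percent internally):

```python
import numpy as np

def mape(predictions, actuals):
    """Mean absolute percentage error as a fraction (multiply by 100 for %)."""
    predictions = np.asarray(predictions, dtype=float)
    actuals = np.asarray(actuals, dtype=float)
    return np.mean(np.abs(predictions - actuals) / actuals)

print(mape([110, 90], [100, 100]))  # 0.1, i.e. 10% average error
```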
if __name__ == '__main__':
if __name__ == "__main__":
# Build Configuration Space which defines all parameters and their ranges
cs = ConfigurationSpace()
# add parameters to the configuration space
train_window = UniformIntegerHyperparameter('train_window', 3, 70, default_value=60)
train_window = UniformIntegerHyperparameter("train_window", 3, 70, default_value=60)
cs.add_hyperparameter(train_window)
batch_size = CategoricalHyperparameter('batch_size', [64, 128, 256, 1024], default_value=64)
batch_size = CategoricalHyperparameter("batch_size", [64, 128, 256, 1024], default_value=64)
cs.add_hyperparameter(batch_size)
rnn_depth = UniformIntegerHyperparameter('rnn_depth', 100, 500, default_value=400)
rnn_depth = UniformIntegerHyperparameter("rnn_depth", 100, 500, default_value=400)
cs.add_hyperparameter(rnn_depth)
encoder_dropout = UniformFloatHyperparameter('encoder_dropout', 0.0, 0.05, default_value=0.03)
encoder_dropout = UniformFloatHyperparameter("encoder_dropout", 0.0, 0.05, default_value=0.03)
cs.add_hyperparameter(encoder_dropout)
gate_dropout = UniformFloatHyperparameter('gate_dropout', 0.95, 1.0, default_value=0.997)
gate_dropout = UniformFloatHyperparameter("gate_dropout", 0.95, 1.0, default_value=0.997)
cs.add_hyperparameter(gate_dropout)
decoder_input_dropout = UniformFloatHyperparameter('decoder_input_dropout', 0.95, 1.0, default_value=1.0)
decoder_input_dropout = UniformFloatHyperparameter("decoder_input_dropout", 0.95, 1.0, default_value=1.0)
cs.add_hyperparameter(decoder_input_dropout)
decoder_state_dropout = UniformFloatHyperparameter('decoder_state_dropout', 0.95, 1.0, default_value=0.99)
decoder_state_dropout = UniformFloatHyperparameter("decoder_state_dropout", 0.95, 1.0, default_value=0.99)
cs.add_hyperparameter(decoder_state_dropout)
decoder_output_dropout = UniformFloatHyperparameter('decoder_output_dropout', 0.95, 1.0, default_value=0.975)
decoder_output_dropout = UniformFloatHyperparameter("decoder_output_dropout", 0.95, 1.0, default_value=0.975)
cs.add_hyperparameter(decoder_output_dropout)
max_epoch = CategoricalHyperparameter('max_epoch', [50, 100, 150, 200], default_value=100)
max_epoch = CategoricalHyperparameter("max_epoch", [50, 100, 150, 200], default_value=100)
cs.add_hyperparameter(max_epoch)
learning_rate = CategoricalHyperparameter('learning_rate', [0.001, 0.01, 0.1], default_value=0.001)
learning_rate = CategoricalHyperparameter("learning_rate", [0.001, 0.01, 0.1], default_value=0.001)
cs.add_hyperparameter(learning_rate)
beta1 = UniformFloatHyperparameter('beta1', 0.5, 0.9999, default_value=0.9)
beta1 = UniformFloatHyperparameter("beta1", 0.5, 0.9999, default_value=0.9)
cs.add_hyperparameter(beta1)
beta2 = UniformFloatHyperparameter('beta2', 0.5, 0.9999, default_value=0.999)
beta2 = UniformFloatHyperparameter("beta2", 0.5, 0.9999, default_value=0.999)
cs.add_hyperparameter(beta2)
epsilon = CategoricalHyperparameter('epsilon', [1e-08, 0.00001, 0.0001, 0.1, 1], default_value=1e-08)
epsilon = CategoricalHyperparameter("epsilon", [1e-08, 0.00001, 0.0001, 0.1, 1], default_value=1e-08)
cs.add_hyperparameter(epsilon)
scenario = Scenario({"run_obj": "quality", # we optimize quality (alternatively runtime)
"runcount-limit": 50, # maximum function evaluations
"cs": cs, # configuration space
"deterministic": "true"
})
scenario = Scenario(
{
"run_obj": "quality", # we optimize quality (alternatively runtime)
"runcount-limit": 50, # maximum function evaluations
"cs": cs, # configuration space
"deterministic": "true",
}
)
# test the default configuration works
# eval_function(cs.get_default_configuration())
# import hyper parameters
# TODO: add EMA to the code to improve the performance
smac = SMAC(scenario=scenario, rng=np.random.RandomState(42),
tae_runner=eval_function)
smac = SMAC(scenario=scenario, rng=np.random.RandomState(42), tae_runner=eval_function)
incumbent = smac.optimize()
inc_value = eval_function(incumbent)
print('the best hyper parameter sets are:')
print("the best hyper parameter sets are:")
print(incumbent)
print('the corresponding MAPE on validation dataset is: {}'.format(inc_value))
print("the corresponding MAPE on validation dataset is: {}".format(inc_value))
# the following is the printout:
# the best hyper parameter sets are:
@ -186,6 +196,3 @@ if __name__ == '__main__':
# train_window, Value: 26
# the corresponding MAPE is: 0.36703585613035433


@ -14,14 +14,14 @@ from sklearn.preprocessing import OneHotEncoder
# Add TSPerf root directory to sys.path
file_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
tsperf_dir = os.path.join(file_dir, "../../../../")
if tsperf_dir not in sys.path:
sys.path.append(tsperf_dir)
import retail_sales.OrangeJuice_Pt_3Weeks_Weekly.common.benchmark_settings as bs
data_relative_dir = "../../data"
def make_features(submission_round):
@@ -37,63 +37,77 @@ def make_features(submission_round):
file_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
data_dir = os.path.join(file_dir, data_relative_dir)
train_file = os.path.join(data_dir, "train/train_round_{}.csv".format(submission_round))
test_file = os.path.join(data_dir, "train/aux_round_{}.csv".format(submission_round))
train = pd.read_csv(train_file, index_col=False)
test = pd.read_csv(test_file, index_col=False)
# select the test data range for test data
train_last_week = train["week"].max()
test = test.loc[test["week"] > train_last_week]
# calculate series popularity
series_popularity = train.groupby(["store", "brand"]).apply(lambda x: x["logmove"].median())
# fill the datetime gaps
# such that every time series has the same length both in train and test
store_list = train["store"].unique()
brand_list = train["brand"].unique()
train_week_list = range(bs.TRAIN_START_WEEK, bs.TRAIN_END_WEEK_LIST[submission_round - 1] + 1)
test_week_list = range(
bs.TEST_START_WEEK_LIST[submission_round - 1] - 1, bs.TEST_END_WEEK_LIST[submission_round - 1] + 1
)
train_item_list = list(itertools.product(store_list, brand_list, train_week_list))
train_item_df = pd.DataFrame.from_records(train_item_list, columns=["store", "brand", "week"])
test_item_list = list(itertools.product(store_list, brand_list, test_week_list))
test_item_df = pd.DataFrame.from_records(test_item_list, columns=["store", "brand", "week"])
train = train_item_df.merge(train, how="left", on=["store", "brand", "week"])
test = test_item_df.merge(test, how="left", on=["store", "brand", "week"])
# sort the train, test, series_popularity by store and brand
train = train.sort_values(by=["store", "brand", "week"], ascending=True)
test = test.sort_values(by=["store", "brand", "week"], ascending=True)
series_popularity = (
series_popularity.reset_index().sort_values(by=["store", "brand"], ascending=True).rename(columns={0: "pop"})
)
# calculate one-hot encoding for brands
enc = OneHotEncoder(categories="auto")
brand_train = np.reshape(train["brand"].values, (-1, 1))
brand_test = np.reshape(test["brand"].values, (-1, 1))
enc = enc.fit(brand_train)
brand_enc_train = enc.transform(brand_train).todense()
brand_enc_test = enc.transform(brand_test).todense()
# calculate price and price_ratio
price_cols = [
"price1",
"price2",
"price3",
"price4",
"price5",
"price6",
"price7",
"price8",
"price9",
"price10",
"price11",
]
train["price"] = train.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
train["avg_price"] = train[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
train["price_ratio"] = train["price"] / train["avg_price"]
test["price"] = test.apply(lambda x: x.loc["price" + str(int(x.loc["brand"]))], axis=1)
test["avg_price"] = test[price_cols].sum(axis=1).apply(lambda x: x / len(price_cols))
test["price_ratio"] = test.apply(lambda x: x["price"] / x["avg_price"], axis=1)
# fill the missing values for feat, deal, price, price_ratio with 0
for cl in ["price", "price_ratio", "feat", "deal"]:
train.loc[train[cl].isna(), cl] = 0
test.loc[test[cl].isna(), cl] = 0
@@ -102,16 +116,15 @@ def make_features(submission_round):
# 2) brand - 11
# 3) price: price and price_ratio
# 4) promo: feat and deal
series_popularity = series_popularity["pop"].values
series_popularity = (series_popularity - series_popularity.mean()) / np.std(series_popularity)
brand_enc_mean = brand_enc_train.mean(axis=0)
brand_enc_std = brand_enc_train.std(axis=0)
brand_enc_train = (brand_enc_train - brand_enc_mean) / brand_enc_std
brand_enc_test = (brand_enc_test - brand_enc_mean) / brand_enc_std
for cl in ["price", "price_ratio", "feat", "deal"]:
cl_mean = train[cl].mean()
cl_std = train[cl].std()
train[cl] = (train[cl] - cl_mean) / cl_std
@@ -132,41 +145,37 @@ def make_features(submission_round):
# ts_value_train
# the target variable fed into the neural network is log(sales + 1), where sales = exp(logmove).
ts_value_train = np.log(np.exp(train["logmove"].values) + 1)
ts_value_train = ts_value_train.reshape((ts_number, train_ts_length))
# fill missing value with zero
ts_value_train = np.nan_to_num(ts_value_train)
# feature_train
series_popularity_train = np.repeat(series_popularity, train_ts_length).reshape((ts_number, train_ts_length, 1))
brand_number = brand_enc_train.shape[1]
brand_enc_train = np.array(brand_enc_train).reshape((ts_number, train_ts_length, brand_number))
price_promo_features_train = train[["price", "price_ratio", "feat", "deal"]].values.reshape(
(ts_number, train_ts_length, 4)
)
feature_train = np.concatenate((series_popularity_train, brand_enc_train, price_promo_features_train), axis=-1)
# feature_test
series_popularity_test = np.repeat(series_popularity, test_ts_length).reshape((ts_number, test_ts_length, 1))
brand_enc_test = np.array(brand_enc_test).reshape((ts_number, test_ts_length, brand_number))
price_promo_features_test = test[["price", "price_ratio", "feat", "deal"]].values.reshape(
(ts_number, test_ts_length, 4)
)
feature_test = np.concatenate((series_popularity_test, brand_enc_test, price_promo_features_test), axis=-1)
# save the numpy arrays
intermediate_data_dir = os.path.join(data_dir, "intermediate/round_{}".format(submission_round))
if not os.path.isdir(intermediate_data_dir):
os.makedirs(intermediate_data_dir)
np.save(os.path.join(intermediate_data_dir, "ts_value_train.npy"), ts_value_train)
np.save(os.path.join(intermediate_data_dir, "feature_train.npy"), feature_train)
np.save(os.path.join(intermediate_data_dir, "feature_test.npy"), feature_test)
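For reference, a minimal sketch (not part of this commit) of how the arrays saved above could be loaded back by the training and prediction scripts; the paths simply mirror the np.save calls, and data_dir / submission_round are assumed to be defined as in make_features.

intermediate_data_dir = os.path.join(data_dir, "intermediate/round_{}".format(submission_round))
ts_value_train = np.load(os.path.join(intermediate_data_dir, "ts_value_train.npy"))
feature_train = np.load(os.path.join(intermediate_data_dir, "feature_train.npy"))
feature_test = np.load(os.path.join(intermediate_data_dir, "feature_test.npy"))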


@@ -13,8 +13,17 @@ from utils import *
IS_TRAIN = False
def rnn_predict(
ts_value_train,
feature_train,
feature_test,
hparams,
predict_window,
intermediate_data_dir,
submission_round,
batch_size,
cut_mode="predict",
):
"""
This function creates predictions by loading the trained RNN model.
@@ -39,13 +48,21 @@ def rnn_predict(ts_value_train, feature_train, feature_test, hparams, predict_wi
(#time series, #predict_window)
"""
# build the dataset
root_ds = tf.data.Dataset.from_tensor_slices((ts_value_train, feature_train, feature_test)).repeat(1)
batch = (
root_ds.map(
lambda *x: cut(
*x,
cut_mode=cut_mode,
train_window=hparams.train_window,
predict_window=predict_window,
ts_length=ts_value_train.shape[1],
back_offset=0
)
)
.map(normalize_target)
.batch(batch_size)
)
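# note: the pipeline above slices each series with cut() into a training window and a
# prediction window, normalizes the target with normalize_target(), and batches the result
# before it is fed to the RNN graph below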
iterator = batch.make_initializable_iterator()
it_tensors = iterator.get_next()
@@ -55,9 +72,9 @@ def rnn_predict(ts_value_train, feature_train, feature_test, hparams, predict_wi
predictions = build_rnn_model(norm_x, feature_x, feature_y, norm_mean, norm_std, predict_window, IS_TRAIN, hparams)
# init the saver
saver = tf.train.Saver(name="eval_saver", var_list=None)
# read the saver from checkpoint
saver_path = os.path.join(intermediate_data_dir, "cpt_round_{}".format(submission_round))
paths = [p for p in tf.train.get_checkpoint_state(saver_path).all_model_checkpoint_paths]
checkpoint = paths[0]
@@ -66,12 +83,10 @@ def rnn_predict(ts_value_train, feature_train, feature_test, hparams, predict_wi
sess.run(iterator.initializer)
saver.restore(sess, checkpoint)
(pred,) = sess.run([predictions])
# invert the prediction back to original scale
pred_o = np.exp(pred) - 1
pred_o = pred_o.astype(int)
return pred_o
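A hypothetical call site (not shown in this diff) illustrating how rnn_predict might be invoked with the arrays produced by make_features; hparams and predict_window are assumed to come from the submission's hyperparameter settings, and the batch_size value is only a placeholder.

pred = rnn_predict(
    ts_value_train,
    feature_train,
    feature_test,
    hparams,
    predict_window,
    intermediate_data_dir,
    submission_round,
    batch_size=64,  # placeholder value, not taken from the submission
)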

Some files were not shown because too many files have changed in this diff.