diff --git a/README.md b/README.md index 1bb9e7a..08b7b09 100644 --- a/README.md +++ b/README.md @@ -37,7 +37,7 @@ This solution analyzes your prices and makes recommendations, All information is provided the fine level at which you need to control your price and inventory. -More detailed description is available in our +Additional detail on the data science of pricing is in our [blog post](https://blogs.msdn.microsoft.com/intel/archives/1015). # Solution Architecture diff --git a/Technical Deployment Guide/TechnicalDeploymentGuide.md b/Technical Deployment Guide/TechnicalDeploymentGuide.md index 284c265..2e72c94 100644 --- a/Technical Deployment Guide/TechnicalDeploymentGuide.md +++ b/Technical Deployment Guide/TechnicalDeploymentGuide.md @@ -282,19 +282,19 @@ CREATE TABLE dbo.Forecasts ( ChannelName varchar(100) not null, CustomerSegment varchar(100) not null, LastDayOfData date not null, -- forecast made using data up and including this - PeriodInDays int not null, - PeriodsAhead int not null, - ForecastPeriodStart date not null, - ForecastPeriodEnd date not null, -- end of the period whose demand is forecasted - UnitPrice float, -- forecast is conditional on this price. Should be decimal (6,2), but that's pulling the ADF tiger's tail. + PeriodInDays int not null, + PeriodsAhead int not null, + ForecastPeriodStart date not null, + ForecastPeriodEnd date not null, -- end of the period whose demand is forecasted + UnitPrice float, -- forecast is conditional on this price. 
Demand float not null, Demand90LB float not null, Demand90UB float not null, - ActualSales float null, - sAPE float null, - qBar float null + ActualSales float null, + sAPE float null, + qBar float null primary key (RunDate, Item, SiteName, ChannelName, - LastDayOfData, ForecastPeriodStart, ForecastPeriodEnd) + LastDayOfData, ForecastPeriodStart, ForecastPeriodEnd) ) ``` @@ -310,10 +310,7 @@ To put the sAPE in perspective, the qBar is a smoothed measure demand around the #### Suggestions dataset The SuggestionRuns table stores the pricing suggestions made from -the elasticities and forecasts. The suggestionRunID is an identifier -referring to the date of model build from which the suggestion are created. - -The +the elasticities and forecasts. ```sql CREATE TABLE [dbo].[SuggestionRuns] ( @@ -345,27 +342,70 @@ CREATE TABLE [dbo].[SuggestionRuns] ( ); ``` -
-TODO: Describe the table columns -
+The suggestionRunID is an identifier referring to the date of model +build from which the suggestions are created. + +PastPeriodStart and PastPeriodEnd describe the time interval for which +the baseline numbers (Orders, Revenue, Margin) are taken (the "past period"). +SuggestionPeriodStart and SuggestionPeriodEnd describe the period for +which the price is proposed (the "suggestion period"). +minOrders is the minimum number of orders that need to have occurred +for the item in the past period to be considered in the suggestion pipeline. + +Then we have the baseline numbers: UnitsLastPeriod, avgSaleUnitPrice, +avgCostUnitPrice, RevenueLastPeriod, MarginLastPeriod +whose interpretations are hopefully clear. +Orders will be more than 1 only if disaggregated data are entered. + +Elasticity comes from the model estimation step, and the optimal prices +follow from it and the marginal cost (avgCostUnitPrice). The exact price maximizing +the gross profit margin is marginOptimalPrice, which is then rounded to +"x.y9". The model predicts that it should be possible to make additional +incrementalMargin dollars over the suggestion period. ## Configuration To set the parameters, update the table dbo.Parameters in the Solution's -SQL database. These parameters are configurable: +SQL database, which stores simple key-value pairs: -* Lead-time (days before start of pricing period) for producing the pricing suggestions. - Insert the key-value pair ('sugLeadTime', '[n]') into the table, replacing [n] - by the number of days in advance. The default value is '1'. -* Maximum allowed deviation of suggested price from current price (percentage). - Please insert the key-value pair ('maxPriceDeviation', '[x.x]') into the table, - replacing [x.x] by the desired fraction. The default value is '0.2', representing - a maximum deviation of 20 percent. 
+```sql +CREATE TABLE [dbo].[Parameters] ( + [paramName] VARCHAR(50) NOT NULL, + [paramValue] VARCHAR(MAX) NULL, + PRIMARY KEY ([paramName]) +) +``` -
-TODO: describe the parameters table and check parameter names -
+The bulk services are the only ML Services running in the ADF +pipeline and therefore need configuration. +### Recognized parameters for bulk services + +|Parameter (paramName) | Meaning | Default paramValue | +|--|--|--| +|BulkElasticities_DeltaX | Elasticity for what change in price*? | -0.1 | +|BulkElasticities_WeekJump | Retrieve elasticity for every n-th week | 1 | +|BulkForecasts_periodsAhead | Forecasts for how many periods ahead? | 1 | +|BulkCrossPrice_WeekJump |Retrieve elasticity for every n-th week | 1 | + +### Adjusting suggestion lead times + +Today, you can adjust the lead times on suggestions by manipulating the date parameters +which Azure Data Factory passes to the spRecommendProducts stored procedure Activity +(see ADF description below). If you would like multiple suggestion periods, please +duplicate the stored procedure activity and call it with different parameters. + +```json +"typeProperties": { + "storedProcedureName": "spRecommendProducts", + "storedProcedureParameters": { + "SliceEnd": "$$Text.Format('{0:yyyy-MM-dd}', SliceEnd)", + "lastDayOfData": "$$Text.Format('{0:yyyy-MM-dd}', SliceEnd)", + "suggestionPeriodStart": "$$Text.Format('{0:yyyy-MM-dd}', Date.AddDays(SliceEnd,1))", + "suggestionPeriodEnd": "$$Text.Format('{0:yyyy-MM-dd}', Date.AddDays(SliceEnd,7))", + "minOrders": "1" +} +``` ## Building Applications @@ -420,13 +460,16 @@ The same information is reflected in the VIEW SCHEMA pane of the AzureML plugin. There are three types of ML service in this solution, batch model build, interactive retrieval and bulk retrieval services. -The batch model build service is BuildModel and is responsible for all estimation tasks. -Depending on data size, it can run minutes to hours. +The batch model build service is BuildModel and is responsible for all estimation +and forecasting tasks. Depending on data size, it can run from several minutes to hours. 
The interactive services are: -* Elasticities -* CrossElasticities -* Forecasts +* Elasticities - retrieve elasticities for one product at all sites, channels, and segments +* CrossElasticities - retrieve cross-elasticities for all products and channels at one site. + The model assumes the same items at different sites don't compete. + Perhaps more questionably, it also assumes that customer segmentation + boundaries are not permeable. +* Forecasts - retrieve forecasts at one site, at a specific price point * PromoSimulation * Outliers * RetrospectiveAnalysis @@ -439,14 +482,30 @@ The bulk services are used to export the data from the model to the database. * BulkCrossElasticities * BulkForecasts -### Storage and ADF architecture +### Storage and ADF structure + +The Azure Data Factory has three Pipelines: +- Configure Services Pipeline. This creates small datasets containing solution parameters. +- Pricing Pipeline. This is the large pipeline, which + * Prepares the data for modeling + * Runs the models + * Extracts forecasts and elasticities from the model + * Loads the model outputs into the database for visualization +- Suggestions Pipeline. This creates pricing suggestions based on outputs of the Pricing Pipeline.
-TODO: describe the parameters table and check parameter names +TODO: describe activities in the pipelines
-- ADF pipeline -- Datasets +The storage account has the following important folders: +- crosselasticity - extracted from the model in Pricing Pipeline +- elasticity - extracted from the model in Pricing Pipeline +- experimentoutput - AzureML experiment internal cache +- forecasts - extracted from the model in Pricing Pipeline +- originaldata - cleaned, processed input data at start of Pricing Pipeline +- pricing - contains the log files from model runs +- serviceparameters - parameter datasets produced by the Configure Services pipeline + ## Troubleshooting @@ -459,7 +518,7 @@ need to use higher tiers of the resources. The blob name is crezted from the datasetName given in the spreadsheet. Open the storage account and make sure a model with the given datasetName exists. The name is case-sensitive. -The default datasetName used by ADF is latestModelBuild +The default datasetName used by ADF is "latestDemoBuild". #### Timeouts on the BuildModel service Try increasing the timeout period in the retrain_AzureML_Model ADF activity.