A good forecaster is no smarter than everyone else, he merely has his ignorance better organized. – Anonymous
I’ve written on the topic of forecast performance measurement from many different angles, particularly in the context of forecasting sales at the point of consumption in retail.
Over the years, I’ve opined that:
- Forecast accuracy (in the traditional sense) is a useless measure
- Reasonableness is more important than accuracy, given that forecasts are, by their nature, forgiving planning elements
- The outsized importance placed on forecast accuracy in supply chain planning is a myth
- Accuracy and precision must be considered simultaneously
- Forecasts should be judged against what is a reasonable expectation for accuracy
- Forecasting at higher levels of aggregation to achieve higher levels of “accuracy” is a waste of time
After going back and re-reading all of that stuff, I realize they are all really just different angles and approaches for delivering the same message: “popular methods of comparing forecasts and actuals may not be as useful as you think, especially in a retail context”.
But in all of this time there is one key aspect of forecast measurement that I have not addressed: forecast lags. In other words, which forecast (or forecasts) should you be comparing to the actual?
Assuming, for example, that you have a rolling 52 week forecasting process where forecasts and actuals are in weekly buckets, then for any given week, you would have 52 choices of forecasts to compare to a single actual. So which one(s) do you choose?
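To make that lag bookkeeping concrete, here is a minimal sketch in Python of how a rolling forecast could be snapshotted each week so that any actual can later be compared against any of the 52 forecasts that were made for it. The structure and names are purely illustrative, not taken from any particular planning system:

```python
from collections import defaultdict

# snapshots[target_week][lag] = forecast for target_week made `lag` weeks earlier
# (illustrative structure only; a real system would also key on item/location)
snapshots: dict[int, dict[int, float]] = defaultdict(dict)

def record_weekly_snapshot(current_week: int, rolling_forecast: dict[int, float]) -> None:
    """Store this week's 52-week rolling forecast, indexed by target week and lag."""
    for target_week, qty in rolling_forecast.items():
        lag = target_week - current_week
        if 1 <= lag <= 52:
            snapshots[target_week][lag] = qty

def forecast_choices(target_week: int) -> dict[int, float]:
    """Every forecast stored for one target week -- up to 52 lag choices."""
    return dict(snapshots[target_week])
```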
Let’s get the easy one out of the way first. Considering that the forecast is being used to drive the supply chain, the conventional wisdom is that the most important lag to capture for measurement is the order lead time, the point at which a firm commitment to purchase must be made based on the forecast. For example, if the lead time is 4 weeks, you’d capture the forecast for 4 weeks from now and measure its accuracy when the actual is posted 4 weeks later.
Nope. To all of that.
While it’s true that measuring the cumulative forecast error over the lead time can be useful for determining safety stock levels, it’s not very useful for measuring the performance of the forecasting process itself, for a couple of reasons:
- It is a flagrant violation of a core demand planning principle. Nothing on the supply side of the equation (inventory levels, lead times, pack rounding, purchasing constraints, etc.) has anything to do with true demand. Customers want the products they want, where they want them, when they want them and at a price they’re willing to pay, period. The amount of time it happens to take to get from the point of origin to a customer accessible location is completely immaterial to the customer.
- A demand planner’s job is to manage the entire continuum of forecasts over the forecast horizon. If they know about something that will affect demand at any point (or at all points) over the next 52 weeks, the forecasts should be amended accordingly.
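As an aside, the “useful for determining safety stock levels” part mentioned above could be sketched roughly as follows: sum the forecasts and the actuals over each lead-time window and look at the spread of the cumulative errors. The function and the snapshot structure carry over from the earlier sketch and remain assumptions, not anyone’s production logic:

```python
def lead_time_error(snapshots, actuals, order_week: int, lead_time: int) -> float:
    """Cumulative forecast error over one lead-time window.

    Compares the total quantity committed to at `order_week` (based on the
    forecasts held at that point) against the demand that actually
    materialized over the following `lead_time` weeks.
    """
    window = range(order_week + 1, order_week + lead_time + 1)
    total_forecast = sum(snapshots[w][w - order_week] for w in window)
    total_actual = sum(actuals[w] for w in window)
    return total_actual - total_forecast

# The spread of these errors across many order weeks is what would feed a
# safety stock calculation -- it says little about the demand planning
# process itself, which is the point being made here.
```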
Suppose that you’re a demand planner who manages the following item/location. The black line is 3 years’ worth of demand history, and a weekly baseline forecast has been calculated for the next 52 weeks.
Because you’re a very good demand planner who keeps tabs on the drivers of demand for this product, you know that:
- The warm weather that drives the demand pattern for this item/location has arrived early and it looks like it’s going to stay that way between now and when the season was originally expected to start.
- There are 2 one week price promotions coming up that have just been signed off and all of the pertinent details (particularly timing and discount) are known.
- For the last 3 years, 3 products similar to this one have been offered at this location. A decision has just been made to broaden the assortment with 2 additional similar products halfway through the selling season.
On that basis, I have 2 questions:
- How does the baseline forecast need to change in order to incorporate this new information?
- How would your answer to question 1 change if you also knew that the order-to-delivery lead time for this item/location was 1 week? 2 weeks? 12 weeks?
Hint: Because it was established at the outset that “you’re a very good demand planner who keeps tabs on the drivers of demand for this product”, the answer to question 2 is: “Not at all.”
So if measuring forecast error at the lead time isn’t the right way to go, then what lag(s) should be captured for measurement?
As with all things forecasting related, there is no definitive answer to this question. But as a matter of principle, the lags chosen to measure the performance of a demand planning process should be based on when facts become “knowable” that could affect future demand and would prompt a demand planner to “grab the stick” and override a baseline forecast modeled on historical patterns.
In some cases, upstream processes that create or shape demand can provide very specific input to the forecasting process.
For example, it’s common for retailers to have promotional planning processes with specific milestones, such as:
- Product selection and price discounts are decided 12 weeks out
- Final design of media to support the ad is decided 8 weeks out
- Last minute adds, deletes and switches are finalized 3 weeks out
At each of those milestones, decisions can be made that might impact a demand planner’s expectation of demand for the promotion, so in this case, it would be valuable to store forecasts at lags 3, 8 and 12. Similar milestone schedules generally exist for assortment decisions as well.
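Building on the snapshot sketch above, measuring performance at those milestone lags might look something like the following. The lag list mirrors the example milestones, and the error metric (a simple MAPE) is just a placeholder, not a recommendation:

```python
MILESTONE_LAGS = [3, 8, 12]  # weeks out, matching the promo planning milestones above

def error_by_lag(snapshots, actuals, weeks: list[int]) -> dict[int, float]:
    """Mean absolute percentage error at each milestone lag, across the given weeks."""
    result = {}
    for lag in MILESTONE_LAGS:
        errors = [
            abs(actuals[w] - snapshots[w][lag]) / actuals[w]
            for w in weeks
            if lag in snapshots[w] and actuals.get(w)
        ]
        result[lag] = sum(errors) / len(errors) if errors else float("nan")
    return result
```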
In other cases, what’s “knowable” to the demand planner can be subject to judgment. For example, if actuals come in higher than forecast for 3 weeks in a row, is that a trend change or a blip? How about 4 weeks in a row?
Lags that are closer in time (say 0 through 4) are often useful in this regard, as they can show error trends forming while they are still fresh.
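One simple way to support that judgment call is to flag runs of same-direction misses at a short lag. A sketch, with the 3-week threshold as an arbitrary assumption:

```python
def consecutive_misses(actuals: list[float], forecasts: list[float], threshold: int = 3) -> bool:
    """Flag when the most recent `threshold` weeks all missed in the same direction.

    A crude signal that a trend change, rather than a blip, may be forming.
    """
    errors = [a - f for a, f in zip(actuals, forecasts)]
    recent = errors[-threshold:]
    if len(recent) < threshold:
        return False
    return all(e > 0 for e in recent) or all(e < 0 for e in recent)

# Example: three weeks in a row above forecast -> worth a closer look
print(consecutive_misses([105, 110, 120], [100, 100, 100]))  # True
```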
Unless tied to a demand shaping process with specific milestones as described above, long term lags are virtually useless. Reviewing actuals posted over the weekend and comparing them to a forecast for that week that was created 6 months ago might be an interesting academic exercise, but it’s a complete waste of time otherwise.
The whole point of measuring is to inform, so that the process can be improved over the long term.
With the right tools and mindset, today’s “I wish I knew that ahead of time” turns into tomorrow’s knowable information.