Re: How to Use Model Metrics to Gauge Uncertainty


Si Chen <sichen@...>
 

Thanks for pointing that out, Phil. It seems that CalTRACK 4.3.2.4 has replaced ASHRAE's 1.26 "empirical coefficient" with a formula, and for M=12 (12 reporting periods) it comes out to 1.30 for billing (monthly) data and 1.39 for daily data.
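
A minimal sketch of that coefficient formula, for reference. The polynomial form a*M^2 + b*M + c and the coefficient values below are my recollection of CalTRACK 4.3.2.4, not something quoted in this thread, so please check them against the current docs before relying on them. The function name and the "daily"/"billing" keys are made up for illustration.

```python
# Coefficients (a, b, c) as recalled from CalTRACK 4.3.2.4.
# ASSUMPTION: verify these against the published methods before use.
_COEFFS = {
    "daily": (-0.00024, 0.03535, 1.00286),
    "billing": (-0.00022, 0.03306, 0.94054),
}


def caltrack_coefficient(months, frequency):
    """Return a*M^2 + b*M + c for M reporting months (hypothetical helper)."""
    a, b, c = _COEFFS[frequency]
    return a * months ** 2 + b * months + c


if __name__ == "__main__":
    for freq in ("billing", "daily"):
        print(freq, round(caltrack_coefficient(12, freq), 4))
```

With M=12 this evaluates to roughly 1.3 for billing data and 1.39 for daily data, in line with the figures above.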

Is P' calculated from P here the same way that n' is calculated from n in the ASHRAE formula, using the autocorrelation coefficient rho?

Finally, how do we get the number of model parameters, or "number of explanatory variables in the baseline model"?

-----
Si Chen
Open Source Strategies, Inc.

Our Mission: https://www.youtube.com/watch?v=Uc7lmvnuJHY




On Wed, Mar 4, 2020 at 4:30 PM <ngo.phil@...> wrote:
1. Correct - autocorr_resid is rho
2. The value of n should be 365, that is correct. It sounds like you have the right idea for m as well (i.e., if you have 30 daily predictions and want to know the uncertainty of the sum of those thirty predictions, m should be 30), with one caveat: CalTRACK suggests handling these calculations with a polynomial correction whose coefficients were derived experimentally. See section 4.3, http://docs.caltrack.org/en/latest/methods.html#section-4-aggregation. In that case there is also an M (capitalized) to keep track of, which is the number of months regardless of data frequency; frequency is accounted for by using different coefficients for daily and monthly billing data.

On Wed, Mar 4, 2020 at 3:01 PM Si Chen <sichen@...> wrote:


We've fitted some models and would like to know how to use the model metrics to understand the quality of the models. The model metrics look like this:



We are comparing them against the ASHRAE 14 guidelines, which give us these formulas:



My questions are:

1. Is autocorr_resid the rho (ρ) in B-14?
2. What are the right parameters for n and m? According to an early page in ASHRAE 14, n and m are "number of observations in the baseline (or pre-retrofit) and the post-ECM periods, respectively". If the model is a daily model, should n be 365, so in this case, n' = 365 * (1-0.4792) / (1+0.4792) = 128.5? If the model is used to compare energy savings over a year, should m be 365? Or should m be 30 if we're comparing the energy savings on a monthly basis?
3.  How many model parameters are there?  In a combined heating and cooling model, should it be 5 -- 2 betas, 2 balance points, and an intercept -- or 3?
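
The n' arithmetic in question 2 can be checked with a short sketch. The function name is made up for illustration; the formula is the one written out above, n' = n * (1 - rho) / (1 + rho), with rho taken from autocorr_resid.

```python
def effective_observations(n, rho):
    """Autocorrelation-adjusted number of observations: n' = n(1 - rho)/(1 + rho)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


if __name__ == "__main__":
    # Values from the example: a daily model over one year, autocorr_resid = 0.4792.
    print(round(effective_observations(365, 0.4792), 1))
```

For n = 365 and rho = 0.4792 this gives about 128.5, matching the figure in question 2.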

Calculating all this from my example model, I get a 25.8% uncertainty for F (energy savings) of 20% at 68% confidence (t = 1). Does that seem reasonable for a daily model with this much CVRMSE?
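
For anyone who wants to reproduce this kind of number, here is a rough sketch of the calculation. It assumes the ASHRAE 14 fractional savings uncertainty takes the form U = t * 1.26 * CVRMSE * sqrt((n/n') * (1 + 2/n') * (1/m)) / F, which is my reading of the guideline rather than something shown in this thread. The function and parameter names are made up, and the CVRMSE passed in below is a placeholder, not the value from my model.

```python
import math


def fractional_savings_uncertainty(cvrmse, n, m, rho, savings_fraction,
                                   t=1.0, coefficient=1.26):
    """Relative uncertainty of savings, assuming the ASHRAE 14 form (see above).

    cvrmse and savings_fraction are fractions (0.2 means 20%).
    coefficient defaults to ASHRAE's 1.26 empirical coefficient.
    """
    # Autocorrelation-adjusted number of baseline observations.
    n_prime = n * (1.0 - rho) / (1.0 + rho)
    spread = math.sqrt((n / n_prime) * (1.0 + 2.0 / n_prime) / m)
    return t * coefficient * cvrmse * spread / savings_fraction


if __name__ == "__main__":
    # Placeholder CVRMSE of 0.5; n, rho, F and t follow the example in this message.
    u = fractional_savings_uncertainty(0.5, n=365, m=365, rho=0.4792,
                                       savings_fraction=0.20, t=1.0)
    print(f"{u:.1%}")
```

Passing a different coefficient would let the same function be used with a CalTRACK-style correction in place of 1.26.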

Thanks.
