Hello,
since openGUTS includes the model efficiency (NSE) as a goodness-of-fit measure, I was wondering whether there is a cut-off value that can be applied to rate the performance of a model fit, i.e., which NSE range describes a good or acceptable model fit?
There is some nice literature about NSE and model performance (e.g., Ritter and Muñoz-Carpena 2013 or Moriasi et al. 2007), which proposes cut-off values of 0.65 or 0.5 for a satisfactory model fit. However, all the studies I could find deal with hydrological models, and I'm not sure to what extent we can adopt those thresholds for GUTS models.
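Just to make sure we are talking about the same measure: below is a minimal sketch (in Python) of how I understand the NSE to be computed. The survivor counts are completely made up and the function name is my own; it only illustrates the definition used in the hydrology papers.

import numpy as np

def nash_sutcliffe_efficiency(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SS_residual / SS_total.
    1 = perfect fit; 0 = no better than the mean of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - np.mean(observed)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical survivor counts from one treatment of a survival test
obs = [20, 19, 17, 14, 10, 8]              # observed survivors per observation time
pred = [20, 18.6, 16.9, 13.5, 10.8, 8.4]   # model-predicted survivors at the same times

print(nash_sutcliffe_efficiency(obs, pred))
# the hydrology literature calls values above roughly 0.5-0.65 'satisfactory'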
Does anybody have experience with NSE values that describe a "good" model fit for GUTS models? I'm looking forward to your answers.
Cheers,
Dino
References:
Moriasi, D.N., Arnold, J.G., Liew, M.W.V., Bingner, R.L., Harmel, R.D., Veith, T.L., 2007. Model Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations. Transactions of the ASABE 50, 885–900. https://doi.org/10.13031/2013.23153
Ritter, A., Muñoz-Carpena, R., 2013. Performance evaluation of hydrological models: Statistical significance for reducing subjectivity in goodness-of-fit assessments. Journal of Hydrology 480, 33–45. https://doi.org/10.1016/j.jhydrol.2012.12.004
Re: Model efficiency (NSE)
Hi Dino,
The goodness-of-fit measures in openGUTS are the ones proposed in the EFSA opinion, plus the NSE (I think that one is the most intuitive). However, all of them are based on comparing observed and predicted survival (or numbers of survivors), and in a sense you could also say that the NRMSE and NSE assume normally distributed residuals, since they square the differences. This does not match how GUTS models are fitted: the likelihood function that is optimised looks at the observed and predicted deaths in each interval between observations (and accounts for the fact that these are discrete numbers). Therefore, all of these goodness-of-fit measures are tricky to interpret quantitatively, and in an absolute sense, for GUTS analyses.
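To illustrate the difference, here is a simplified sketch (in Python, with made-up numbers; this is not openGUTS code) of a multinomial log-likelihood on the deaths per observation interval, which is the kind of function that GUTS fitting optimises:

import numpy as np

def guts_multinomial_loglik(survivor_counts, survival_prob):
    """Simplified multinomial log-likelihood for survival data.

    survivor_counts : observed survivors at each observation time,
                      starting with the initial number of animals.
    survival_prob   : model-predicted survival probability at the same
                      times, relative to t=0 (so survival_prob[0] == 1).
    """
    n = np.asarray(survivor_counts, dtype=float)
    s = np.asarray(survival_prob, dtype=float)

    deaths = n[:-1] - n[1:]               # observed deaths in each interval
    p_death = s[:-1] - s[1:]              # predicted probability of dying in each interval
    p_death = np.clip(p_death, 1e-12, 1)  # guard against log(0)

    loglik = np.sum(deaths * np.log(p_death))    # contribution of the deaths
    loglik += n[-1] * np.log(max(s[-1], 1e-12))  # animals surviving to the end
    return loglik

# Hypothetical data: 20 animals at t=0, then observed survivors and predicted survival probabilities
obs_survivors = [20, 19, 17, 14, 10, 8]
pred_surv_prob = [1.0, 0.93, 0.84, 0.68, 0.54, 0.42]
print(guts_multinomial_loglik(obs_survivors, pred_surv_prob))

Note how this works on discrete deaths per interval and their predicted probabilities, rather than on squared differences between observed and predicted survivor numbers as the NSE does.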
I am not aware of a good fit measure for GUTS models. In section 3.5.7 of the GUTS e-book, we added some discussion on this topic. However, I can see that we need something for practical application (especially in risk assessment). At this moment, I think the only solution would be to 'calibrate' a measure like the NSE on expert opinion, e.g., have a panel of experts judge GUTS fits as 'good enough' or not, and compare that to the NSEs of the fits. However, that's where my imagination stops. I would be very happy to hear other people's ideas and experiences.
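To make that 'calibration' idea a bit more concrete, here is a rough, purely illustrative sketch (hypothetical NSE values and expert verdicts, in Python) of how one could pick the cut-off that best reproduces a panel's judgements:

import numpy as np

def calibrate_nse_cutoff(nse_values, expert_ok):
    """Pick the NSE cut-off that best matches expert 'good enough' judgements.

    nse_values : NSE of each candidate fit.
    expert_ok  : True/False expert judgement for the same fits.
    Returns the cut-off with the highest agreement and that agreement.
    """
    nse = np.asarray(nse_values, dtype=float)
    ok = np.asarray(expert_ok, dtype=bool)

    best_cutoff, best_agreement = None, -1.0
    for cutoff in np.unique(nse):                    # each observed NSE is a candidate cut-off
        agreement = np.mean((nse >= cutoff) == ok)   # fraction of fits classified like the experts
        if agreement > best_agreement:
            best_cutoff, best_agreement = cutoff, agreement
    return best_cutoff, best_agreement

# Made-up example: NSEs of ten fits and a panel's verdicts
nse_vals = [0.92, 0.85, 0.40, 0.71, 0.55, 0.63, 0.30, 0.88, 0.68, 0.50]
verdicts = [True, True, False, True, False, True, False, True, True, False]
print(calibrate_nse_cutoff(nse_vals, verdicts))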