When I think of residuals, I think actual - predicted.

But what’s a residual for a frequency analysis? Because you train models on a binary, 1 or 0, but the predicted result is an expected frequency per unit of exposure…so how would you calculate a “residual” for a frequency analysis?

Severity is much easier: you can calculate actual severity (continuous) minus predicted severity (also continuous).

Till all are one,

Epistemus

One method that I’ve seen is “bunched” residuals: you bucket the data at some consistent level (often aggregating to unique combinations of the model’s input variables, with some grouping for continuous or near-continuous variables) so that the predicted value has very small (or zero) variance within each bin. Then you can take the actual (rate) less the predicted.
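A minimal sketch of that bucketing in pandas. The column names (`claim_count`, `exposure`, `predicted_freq`, `age_band`) are hypothetical placeholders, not anything from a specific package; the idea is just to aggregate actual and expected claims to the bin level and difference the rates.

```python
# Sketch of "bunched"/"crunched" residuals for a frequency model,
# assuming hypothetical columns: 'claim_count', 'exposure',
# 'predicted_freq' (expected frequency per unit of exposure),
# plus the model's input variables (e.g. 'age_band').
import pandas as pd

def bucketed_residuals(df, group_cols):
    """Aggregate to unique combinations of input variables, then compare
    the actual claim rate to the exposure-weighted predicted rate."""
    g = (df.assign(pred_claims=df["predicted_freq"] * df["exposure"])
           .groupby(group_cols)[["claim_count", "exposure", "pred_claims"]]
           .sum())
    g["actual_rate"] = g["claim_count"] / g["exposure"]      # actual frequency in the bin
    g["predicted_rate"] = g["pred_claims"] / g["exposure"]   # model's frequency in the bin
    g["residual"] = g["actual_rate"] - g["predicted_rate"]   # actual less predicted
    return g
```

Within each bin the prediction is (nearly) constant, so the bin-level residual behaves much more like the familiar continuous "actual minus predicted".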


Actually, the correct term to search for would be “crunched residuals” . . . this is available in most statistical packages.

Another method for this is to group the residuals into a fixed number of buckets (e.g., 500 or 10,000 depending on the size of your data) and graph the average residual value.
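A rough sketch of that second approach, assuming you already have arrays of actual and predicted values (names and bucket count are illustrative):

```python
# Group observations into a fixed number of equal-count buckets
# (sorted by predicted value) and average the residual in each bucket.
import numpy as np
import pandas as pd

def binned_avg_residuals(actual, predicted, n_bins=10):
    """Return the average (actual - predicted) residual per bucket,
    ready to plot against the bucket index or mean prediction."""
    df = pd.DataFrame({
        "resid": np.asarray(actual, float) - np.asarray(predicted, float),
        "pred": np.asarray(predicted, float),
    })
    # rank-then-qcut avoids errors when many predictions are tied
    df["bucket"] = pd.qcut(df["pred"].rank(method="first"), n_bins, labels=False)
    return df.groupby("bucket")["resid"].mean()
```

With a well-calibrated model the bucket averages should hover around zero; systematic patterns across buckets point to model misfit.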

Right…but what, exactly, are the residuals in this context? How would you define them?

It’s still going to be some form of “actual - predicted”; but there are various transformations that you might consider to help with the analysis.

Here is a resource from a Google search on “residuals”:

Understanding Deviance Residuals | UVA Library (virginia.edu)

You might also look at chapter 6 of the CAS Monograph #5 (Generalized Linear Models for Insurance Rating).

In practice, I’ve seen the “deviance residual” used quite a bit; but variations of the Pearson residual have also appeared in some of the competitive filings I’ve reviewed (though that was about 10 years ago).
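For a Poisson frequency model (a common assumption for claim counts, though not the only one), both residual types have closed forms: the Pearson residual is (y − μ)/√μ and the deviance residual is sign(y − μ)·√(2·(y·log(y/μ) − (y − μ))), with the log term taken as zero when y = 0. A sketch, where `y` is the observed claim count and `mu` the fitted mean (expected frequency times exposure):

```python
import numpy as np

def poisson_residuals(y, mu):
    """Pearson and deviance residuals for a Poisson frequency model."""
    y = np.asarray(y, float)
    mu = np.asarray(mu, float)
    pearson = (y - mu) / np.sqrt(mu)
    # y * log(y / mu) is defined as 0 where y == 0
    log_term = np.where(y > 0, y * np.log(np.where(y > 0, y / mu, 1.0)), 0.0)
    deviance = np.sign(y - mu) * np.sqrt(2.0 * (log_term - (y - mu)))
    return pearson, deviance
```

Both reduce to zero when actual equals predicted; the deviance form is generally better behaved for the highly skewed, mostly-zero data typical of frequency models, which is probably why it shows up more often in practice.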
