How to interpret the Area Under the Curve (AUC) stat

One of the questions I often ask in data science interviews is ‘How would you explain the area under the curve statistic to a business person?’ Maybe it is too easy a question even for juniors, as I can’t remember anyone getting it wrong. While there is no single correct answer per se, the most logical response is to focus on true positives and false positives, and on how the predictive model can be tuned to capture more true positives at the expense of generating more false positives. Only after that do you even bother to show the ROC curve, and say that we calculate the area under the curve (AUC) as a measure of how well the model can discriminate between the two classes.
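
To make that tradeoff concrete, here is a minimal sketch in plain Python. The labels and scores are made-up toy data, and `confusion_counts` is a hypothetical helper written for this illustration, not a function from any library. It counts true and false positives as the cutoff for flagging a case is lowered.

```python
def confusion_counts(labels, scores, threshold):
    """Count true and false positives when flagging every score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp, fp


# Toy data (assumed for illustration): 1 = positive case, 0 = negative case.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

# Lowering the threshold flags more cases: more true positives,
# but also more false positives.
for threshold in (0.75, 0.5, 0.25):
    tp, fp = confusion_counts(labels, scores, threshold)
    print(f"threshold={threshold}: {tp} true positives, {fp} false positives")
```

On this toy data the counts go from 1 true positive and 1 false positive at a cutoff of 0.75, to 3 and 1 at 0.5, to 4 and 3 at 0.25. Each cutoff gives one point on the ROC curve.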

The most recent time I remember this coming up in real life, I told the business rep that the AUC does not directly translate to revenue, but that it is a good indication the model is good in an absolute sense (we know others typically get AUCs of around 0.7 to 0.8 for this problem, and anything over 0.5 is better than random). It is also often useful in a relative sense: a model with an AUC of 0.8 is typically better than a model with an AUC of 0.75 (although not always; you need to draw the ROC curves and make sure the curve with the larger AUC dominates the other and that the two do not cross). So while I try to do my best explaining technical statistical content, I often punt to simpler ‘here are the end outcomes we care about’ explanations (which don’t technically answer the question) as opposed to ‘here is how the sausage is made’ explanations.

One alternative, simple explanation of AUC for binary models, though, is to borrow the interpretation of Harrell’s C index, which for binary outcomes is equivalent to the AUC statistic. For this statistic you could say something like ‘If I randomly sample a negative case and a positive case, the positive case will have a higher predicted risk {AUC} percent of the time.’ I do like this interpretation, because it illustrates that the AUC is all about rank ordering the predictions, and that a more discriminating model will have a higher AUC (although it says nothing about calibration).
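
As a rough check on that equivalence, the sketch below computes the pairwise statistic directly and compares it with the trapezoidal area under the ROC curve. Everything here is an assumption for illustration: the toy labels and scores are made up, `concordance_auc` and `trapezoid_auc` are hypothetical helper names, and tied scores are counted as half a win, which is one common convention.

```python
def concordance_auc(labels, scores):
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win (an assumed convention).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def trapezoid_auc(labels, scores):
    """Area under the ROC curve, built by sweeping each distinct score as a cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return area


# Toy data (assumed for illustration): 1 = positive case, 0 = negative case.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

print(concordance_auc(labels, scores))  # 0.75
print(trapezoid_auc(labels, scores))    # 0.75
```

Here the positive case outranks the negative case in 12 of the 16 possible pairs, so both calculations give 0.75. Note that only the ordering of the scores matters: rescaling them in any way that preserves their rank order leaves both numbers unchanged, which is why the AUC says nothing about calibration.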
