precision and recall

scroll ↓ to Resources

Note

Precision

  • High fraction of TP within what the model labels as positive class:
  • *bank wants to give a loan ONLY to those who certainly will return it, even if this number is small and excludes some people who also will return the money *
  • ==low precision = wasting resources (processing irrelevant content)==

Recall

  • Powerful “coverage” of the positive class:
  • same as true positive rate
  • bank wants to find all clients who will return the loan, indifferent to those who won’t be able to return it
  • ==low recall = missing important information==

  • See also Precision@k and Recall@k at different cut-off points K
  • ==In practice it is good to track evolution of metrics for various : 1, 10, 50, 1000…==

Difference of PR-curve vs ROC curve

the difference between ROC and precision-recall is in how each treats true negatives. Typically, if true negatives are not meaningful to the problem or negative examples just dwarf the number of positives, precision-recall is typically going to be more useful;

I think intuitively you can say that if your model needs to perform equally well on the positive class as the negative class (for example, for classifying images between cats and dogs, you would like the model to perform well on the cats as well as on the dogs. For this you would use the ROC AUC.
On the other hand, if you’re not really interested in how the model performs on the negative class, but just want to make sure every positive prediction is correct (precision), and that you get as many of the positives predicted as positives as possible (recall), then you should choose PR AUC. For example, for detecting cancer, you don’t care how many of the negative predictions are correct, you want to make sure all the positive predictions are correct, and that you don’t miss any. (In fact, in this case missing a cancer would be worse then a false positive so you’d want to put more weight towards recall.)

Resources


table file.inlinks, filter(file.outlinks, (x) => !contains(string(x), ".jpg") AND !contains(string(x), ".pdf") AND !contains(string(x), ".png")) as "Outlinks" from [[]] and !outgoing([[]])  AND -"Changelog"