Jul 2, 2021
Hi David, great article!
As you explained so well: “choosing the optimal threshold depends on the cost to the business of false negatives vs. false positives.”
I totally agree: in the OCR space, choosing this threshold is more of a business problem than a data science problem. I tried to make this article an easy guide to plotting a PR curve with just a spreadsheet, so that people working on the product can choose their optimal threshold without coding.
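
For anyone who does prefer code over a spreadsheet, here is a minimal sketch of the same idea in Python. It is only an illustration: the function name `pick_threshold`, the sample scores and labels, and the two cost values are all made up for this example, and it assumes a prediction is accepted when its confidence is at or above the threshold.

```python
def pick_threshold(scores, labels, cost_fp, cost_fn):
    """Return (threshold, total_cost) minimizing cost_fp * FP + cost_fn * FN.

    scores: confidence score of each prediction (hypothetical data)
    labels: True if the prediction was actually correct, False otherwise
    A prediction is accepted when score >= threshold (an assumption).
    """
    best = None
    for t in sorted(set(scores)):
        # False positive: accepted, but the prediction was wrong.
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        # False negative: rejected, but the prediction was right.
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        cost = cost_fp * fp + cost_fn * fn
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Made-up example data, not taken from any real OCR system.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40]
labels = [True, True, False, True, False, False]

# When a false positive costs 5x a false negative, a stricter threshold wins.
print(pick_threshold(scores, labels, cost_fp=5, cost_fn=1))
# When a false negative costs 5x a false positive, a looser threshold wins.
print(pick_threshold(scores, labels, cost_fp=1, cost_fn=5))
```

The same predictions give a different "best" threshold depending only on the two cost numbers, which is why I see this as a decision for the product team rather than the data scientist.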