(Baby smoking image designed by Freepik)
As I’ve mentioned before, when the National Institute of Standards and Technology (NIST) tests biometric modalities such as finger and face, they conduct each test in a bunch of different ways.
One of the ramifications of this is that many entities can claim that they are “the best, according to NIST.”
For example, when NIST released its first version of the age estimation tests, 5 of the 6 participating vendors scored first in SOME category.
But NIST doesn’t do this just to make the vendors happy. NIST does this because biometrics are used in many, many ways.
Let’s look at recent age estimation testing, which currently tests 15 algorithms rather than the original 6.
Governments and private entities can estimate ages for people at the pub, people buying weed, or people gambling. And then there’s the use case that is getting a lot of attention these days—people accessing social media.
Child Online Safety, Ages 13-16 (in my country anyway)
When NIST conceived the age estimation tests, the social media providers generally required their users to be 13 years of age or older. For this reason, one of NIST’s age estimation tests focused upon whether age estimation algorithms could reliably identify those who were 13 years old vs. those who were not.

Which, by the way, basically means that the NIST age estimation tests are useless in Australia. After NIST started age estimation testing, Australia passed a law last month requiring social media users to be 16 years old or older.
Returning to America, NIST actually conducted several different tests for the 13-year-old “child online safety” testing. I’m going to focus on one of them:
Age 8-12 – False Positive Rates (FPR) are proportions of subjects aged 8 to 12 but whose age is estimated from 13 to 16 (below 17).
This covers the case in which a social media provider requires people to be 13 years old or older, and someone between 8 and 12 tries to sign up for the social media service anyway…AND SUCCESSFULLY DOES SO.
You want the “false positive rate” to be as low as possible in this case, so that’s what NIST measures.
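If it helps to see that definition as arithmetic, here is a minimal Python sketch of how a false positive rate like this could be computed. To be clear, this is my own illustration and not NIST's code: the function name, its parameters, and the sample ages are all made up, and I'm assuming each person has one true age and one estimated age.

```python
def false_positive_rate(true_ages, estimated_ages,
                        group=(8, 12), lower=13.0, upper=17.0):
    """Share of subjects whose true age falls in `group` (inclusive)
    but whose estimated age lands at or above `lower` and below `upper`.

    Hypothetical helper for illustration only, not part of any NIST tooling.
    """
    # Keep only the estimates for people who really are in the age group.
    estimates = [est for age, est in zip(true_ages, estimated_ages)
                 if group[0] <= age <= group[1]]
    if not estimates:
        raise ValueError("no subjects in the requested age group")
    # A false positive is an underage subject estimated as 13 to 16 (below 17).
    false_positives = sum(1 for est in estimates if lower <= est < upper)
    return false_positives / len(estimates)


# Made-up sample data: five children aged 8 to 12, plus two older subjects
# who are ignored because they fall outside the 8-12 group.
sample_true_ages = [9, 10, 11, 12, 8, 14, 20]
sample_estimates = [10.2, 13.5, 11.9, 16.8, 12.9, 15.0, 14.1]

sample_fpr = false_positive_rate(sample_true_ages, sample_estimates)
print(sample_fpr)
```

In this invented sample, two of the five children get an estimate between 13 and 17, so the rate works out to 2 out of 5.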
Results as of December 10, 2024
The image below was taken from the NIST Face Analysis Technology Evaluation (FATE) Age Estimation & Verification page on December 10, 2024. Because this is a continuous test, the actual results may be different by the time you read this, so be sure to check the latest results.

As of December 10, the best performing algorithm of the 15 tested had a false positive rate (FPR) of 0.0467. The second was close at 0.0542, with the third at 0.0828.
The 15th was a distant last at 0.2929.
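To make those rates a little more concrete, here is a small Python sketch that turns each FPR into a rough count of underage sign-ups that would get through. The four rates are the December 10 figures quoted above; the 1,000 attempts figure is purely my assumption for illustration, and so are the variable and function names.

```python
# FPR values quoted above from the December 10, 2024 results.
reported_fprs = {
    "1st": 0.0467,
    "2nd": 0.0542,
    "3rd": 0.0828,
    "15th": 0.2929,
}

# Hypothetical volume: 1,000 sign-up attempts by children aged 8 to 12.
ASSUMED_ATTEMPTS = 1000


def expected_false_accepts(fpr, attempts=ASSUMED_ATTEMPTS):
    """Rough count of underage attempts estimated as 13 to 16."""
    return round(fpr * attempts)


for rank, fpr in reported_fprs.items():
    print(f"{rank}: about {expected_false_accepts(fpr)} of {ASSUMED_ATTEMPTS}")
```

Under that assumption, the top algorithm would let roughly 47 of every 1,000 children through, while the 15th would let through roughly 293.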
But the lowest-ranked algorithm is much better on other tests
But before you conclude that the 15th algorithm in the “8-12” test is a dud, take a look at how that same algorithm performed on some of the OTHER age estimation tests.
- For the age 17-22 test (“False Positive Rates (FPR) are proportions of subjects aged 17 to 22 but whose age is estimated from 13 to 16 (below 17)”), this algorithm was the second MOST accurate.
- And the algorithm is pretty good at correctly classifying 13-to-16-year-olds.
- It also performs well in the “challenge 25” tests (addressing some of the use cases I mentioned above such as alcohol purchases).

So it looks like this particular algorithm doesn’t (currently) do well with kids, but it does VERY well with adults.
So before you use the NIST tests as a starting point to determine if an algorithm is good for you, make sure you evaluate the CORRECT test, including the CORRECT data.
