Time to Check the Current NIST Face Recognition Vendor Test Results (well, three of them)

It’s been a while since I’ve peeked at the NIST Face Recognition Vendor Test (FRVT) results.

As I’ve stated before, the results can be sliced and diced in so many ways that many vendors can claim to be the #1 NIST FRVT vendor.

What’s more, these results change on a monthly basis, so it’s quite possible that the #1 vendor in some category in February 2022 was no longer the #1 vendor in March 2022. (And if your company markets years-old FRVT results, stop it!)

This is the August 15, 2023 peek at three ways to slice and dice the NIST FRVT results.

And a bunch of vendors will be mad at me because I didn’t choose THEIR preferred slicing and dicing, or their ways to exclude results (not including Chinese algorithms, not including algorithms used in surveillance, etc.). The mad vendors can write their own blog posts (or ask Bredemarket to ghostwrite them on their behalf).

NIST FRVT 1:1, VISABORDER

The phrase “NIST FRVT 1:1, VISABORDER” is shorthand for the NIST one-to-one version of the Face Recognition Vendor Test, using the VISABORDER probe and gallery data. This happens to be the default way in which NIST sorts the 1:1 accuracy results, but of course you can sort them against any other probe/gallery combination, and get a different #1 vendor.
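The “different sort, different #1” point can be sketched in a few lines of Python. The vendor names and FNMR values below are entirely made up for illustration (the real results live on the NIST FRVT pages); the point is just that re-sorting the same leaderboard by a different probe/gallery column crowns a different leader.

```python
# Hypothetical FRVT-style 1:1 leaderboard. All names and numbers are
# invented for illustration only.
results = [
    {"algorithm": "vendor_a_001", "visaborder_fnmr": 0.0041, "mugshot_fnmr": 0.0029},
    {"algorithm": "vendor_b_003", "visaborder_fnmr": 0.0028, "mugshot_fnmr": 0.0044},
    {"algorithm": "vendor_c_002", "visaborder_fnmr": 0.0035, "mugshot_fnmr": 0.0021},
]

# A lower FNMR (false non-match rate) is better, so sort ascending.
by_visaborder = sorted(results, key=lambda r: r["visaborder_fnmr"])
by_mugshot = sorted(results, key=lambda r: r["mugshot_fnmr"])

print(by_visaborder[0]["algorithm"])  # one "#1" for this sort...
print(by_mugshot[0]["algorithm"])     # ...and a different "#1" for this one
```

Same table, two sort keys, two different "#1 NIST FRVT vendors."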

As of August 15, the top two accuracy algorithms for VISABORDER came from Cloudwalk. Here are all of the top ten.

Captured 8/15/2023, sorted by VISABORDER. From https://pages.nist.gov/frvt/html/frvt11.html

NIST FRVT 1:1, Comparison Time (Mate)

But NIST doesn’t just measure accuracy for a bunch of different probe-target combinations. It also measures performance, since the most accurate algorithm in the world won’t do you any good if it takes forever to compare the face templates.

One caveat regarding these measures is that NIST conducts the tests on a standardized set of equipment, so that results between vendors can be compared. This is important to note, because a comparison that takes 103 milliseconds on NIST’s equipment will yield a different time on a customer’s equipment.

One of the many performance measures is “Comparison Time (Mate).” There is also a performance measure for “Comparison Time (Non-mate).”

So in this test, the fastest vendor algorithm comes from Trueface. Again, here are the top 10.

Captured 8/15/2023, sorted by Comparison Time (Mate). From https://pages.nist.gov/frvt/html/frvt11.html

NIST FRVT 1:N, VISABORDER 1.6M

Now I know what some of you are saying. “John,” you say, “the 1:1 test only measures a comparison of one face against one other face, or what NIST calls verification. What if you’re searching against a database of faces, or identification?”

Well, NIST has a 1:N test to measure that particular use case. Or use cases, because again you can slice and dice the results in so many different ways.

When looking at accuracy, the default NIST 1:N sort is by:

  • Probe images from the BORDER database.
  • Gallery images from a 1,600,000 record VISA database.

Cloudwalk happens to be the #1 vendor in this slicing and dicing of the test. Here are the top ten.

Captured 8/15/2023, sorted by Visa, Border, N=1600000. From https://pages.nist.gov/frvt/html/frvt1N.html

Test data is test data

The usual cautions apply that everyone, including NIST, emphasizes that these test results do not guarantee similar results in an operational environment. Even if the algorithm author ported its algorithm to an operational system with absolutely no changes, the operational system will have a different hardware configuration and will have different data.

For example, none of the NIST 1:N tests use databases with more than 12 million records. Even 20 years ago, Behnam Bavarian correctly noted that biometric databases would eventually surpass hundreds of millions of records, or even billions of records. There is no way that NIST could assemble a test database that large.

So you should certainly consider the NIST tests, but before you deploy an operational ABIS, you should follow Mike French’s advice and conduct an ABIS benchmark on your own equipment, with your own data.
