After an absence from the Bredemarket blog (no mentions since November), Pangiam is making an appearance again, thanks to announcements from Biometric Update and from Trueface itself about a new revision of the Trueface facial recognition SDK.
The new revision adds a number of capabilities, including a new model for masked faces and some technical improvements.
So what is this revision called?
“Wait,” you’re asking yourself. “Version 1.0 is the NEW version? It sounds like the ORIGINAL version. Shouldn’t the new version be 2.0?”
Well, no. The original version was V0. Trueface is now ready to release V1.
When I viewed the page on the afternoon of March 28, the latest stable release was 0.33.14634.
If you want to use the version 1.0 that is being “introduced” (Pangiam’s word), you have to go to the latest beta release, which was 1.0.16286.
And if you want to go bleeding edge alpha, you can get release 1.1.16419.
(Again, this was on the afternoon of March 28, and may change by the time you read this.)
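If you’re curious how those three release strings relate to one another, here’s a quick sketch in Python. The version numbers are the ones I saw on March 28, the stable/beta/alpha labels are how the release page described them, and the parsing code is generic illustration on my part, not anything from the Trueface SDK itself.

```python
# Illustrative only: compare the three Trueface release strings mentioned above.
# The channel labels reflect the release page as described in this post; the
# parsing below is generic version handling, not Trueface tooling.

def parse(version: str) -> tuple[int, ...]:
    """Split a dotted version string like '1.0.16286' into comparable integers."""
    return tuple(int(part) for part in version.split("."))

releases = {
    "stable": "0.33.14634",
    "beta": "1.0.16286",
    "alpha": "1.1.16419",
}

# Sort channels from oldest to newest version number.
for channel, version in sorted(releases.items(), key=lambda item: parse(item[1])):
    print(f"{channel:>6}: {version}")

# Confirms that the "new" 1.0 beta really does follow the 0.x stable line.
assert parse(releases["beta"]) > parse(releases["stable"])
```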
Now most biometric vendors don’t expose this much detail about their software. Some don’t even provide any release information, especially for products with long delivery times where the version that a customer will eventually get doesn’t even have locked-down requirements yet. But Pangiam has chosen to provide this level of detail.
Oh, and Pangiam/Trueface also actively participates in the ongoing NIST FRVT testing. Information on the 1:1 performance of the trueface-003 algorithm can be found here. Information on the 1:N performance of the trueface-000 algorithm can be found here.
As I’ve noted before, there are a number of facial recognition companies that claim to be the #1 NIST facial recognition vendor. I’m here to help you cut through the clutter so you know who the #1 NIST facial recognition vendor truly is.
Now how can ALL of these dozen-plus entities be number 1?
The NIST 1:1 and 1:N tests include many different accuracy and performance measurements, and each of the entities listed above placed #1 in at least one of these measurements. And all of the databases, database sizes, and use cases measure very different things.
Visage Technologies was #1 in the 1:1 performance measurements for template generation time, in milliseconds, for 480×720 and 960×1440 data.
Meanwhile, NEC was #1 in the 1:N Identification (T>0) accuracy measurements for gallery border, probe border with a delta T greater than or equal to 10 years, N = 1.6 million.
Not to be confused with the 1:N Identification (T>0) accuracy measurements for gallery visa, probe border, N = 1.6 million, where the #1 algorithm was not from NEC.
And not to be confused with the 1:N Investigation (R = 1, T = 0) accuracy measurements for gallery border, probe border with a delta T greater than or equal to 10 years, N = 1.6 million, where the #1 algorithm was not from NEC.
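To see how a dozen-plus entities can each legitimately claim a #1 result, consider this toy computation. The vendor names and error rates below are completely made up (they are NOT real FRVT numbers); the point is simply that when there are many separate measurements, several different vendors can each top at least one of them.

```python
# Toy illustration with MADE-UP numbers (not real FRVT results): with enough
# separate measurements, many different vendors can each top at least one.

# Error rates per (vendor, measurement); lower is better.
scores = {
    "Vendor A": {"visa_border": 0.0030, "visa_kiosk": 0.0061, "mugshot_12M": 0.0025},
    "Vendor B": {"visa_border": 0.0028, "visa_kiosk": 0.0065, "mugshot_12M": 0.0031},
    "Vendor C": {"visa_border": 0.0042, "visa_kiosk": 0.0049, "mugshot_12M": 0.0044},
}

measurements = {m for per_vendor in scores.values() for m in per_vendor}
leaders = {m: min(scores, key=lambda vendor: scores[vendor][m]) for m in measurements}
print(leaders)
# Three measurements, three different leaders -- and three "#1 NIST vendor"
# press releases waiting to be written.
```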
And can I add a few more caveats?
First caveat: Since all of these tests are ongoing tests, you can probably find a slightly different set of #1 algorithms if you look at the January data, and you will probably find a slightly different set of #1 algorithms when the March data is available.
Second caveat: These are the results for the unqualified #1 NIST categories. You can add qualifiers, such as “#1 non-Chinese vendor” or “#1 western vendor” or “#1 U.S. vendor” to vault a particular algorithm to the top of the list.
Third caveat: You can add even more qualifiers, such as “within the top five NIST vendors” and (one I admit to having used before) “a top tier NIST vendor in multiple categories.” This can mean whatever you want it to mean. (As can “dramatically improved” algorithm, which may mean that you vaulted from position #300 to position #200 in one of the categories.)
Fourth caveat: Even if a particular NIST test applies to your specific use case, #1 performance on a NIST test does not guarantee that a facial recognition system supplied by that entity will yield #1 performance with your database in your environment. The algorithm sent to NIST may or may not make it into a production system. And even if it does, performance against a particular NIST test database may not yield the same results as performance against a Rhode Island criminal database, a French driver’s license database, or a Nigerian passport database. For more information on this, see Mike French’s LinkedIn article “Why agencies should conduct their own AFIS benchmarks rather than relying on others.”
So now that you know who the #1 NIST facial recognition vendor is, do you feel more knowledgeable?
As many of you know, there have been many claims about bias in facial recognition, which have even led to the formation of an Algorithmic Justice League.
Whoops, wrong Justice League. But you get the idea. “Gender Shades” and stuff like that, which I’ve written about before.
Back to Baker’s article, which makes a number of excellent points about bias in facial recognition, including the studies performed by NIST (referenced later in this post), but I loved one comparison that Baker wrote about.
So technical improvements may narrow but not entirely eliminate disparities in face recognition. Even if that’s true, however, treating those disparities as a moral issue still leads us astray. To see how, consider pharmaceuticals. The world is full of drugs that work a bit better or worse in men than in women. Those drugs aren’t banned as the evil sexist work of pharma bros. If the gender differential is modest, doctors may simply ignore the difference, or they may recommend a different dose for women. And even when the differential impact is devastating—such as a drug that helps men but causes birth defects when taken by pregnant women—no one wastes time condemning those drugs for their bias. Instead, they’re treated like any other flawed tool, minimizing their risks by using a variety of protocols from prescription requirements to black box warnings.
As a (tangential) example of this, I recently read an article entitled “To begin addressing racial bias in medicine, start with the skin.” This article does not argue that we should ban dermatology because conditions are more often misdiagnosed in people with darker skin. Instead, the article argues that we should improve dermatology to reduce these biases.
In the same manner, the biometric industry and its stakeholders should strive to minimize bias in facial recognition and other biometrics, not ban the technology. See NIST’s study (NISTIR 8280, PDF) in this regard, referenced in Baker’s article.
In addition to what Baker said, let me again note that when judging the use of facial recognition, it should be compared against the alternatives. While I believe that alternatives (even passwords) should be offered, consider that automated facial recognition supported by trained examiner review is much more accurate than witness (mis)identification, and I don’t think we want to rely solely on the latter.
Because falsely imprisoning someone due to non-algorithmic witness misidentification is as bad as kryptonite.
I’ve worked in the general area of contactless fingerprint capture for years, initially while working for a NIST CRADA partner. While most of the NIST CRADA partners are still pursuing contactless fingerprint technology, there are also new entrants.
In the pre-COVID days, the primary advantage of contactless fingerprint capture was speed. As I noted in an October 2021 post:
Actually this effort launched before that, as there were efforts in 2004 and following years to capture a complete set of fingerprints within 15 seconds; those efforts led, among other things, to the smartphone software we are seeing today.
By 2016, several companies had entered into cooperative research and development agreements with NIST to develop contactless fingerprint capture software, either for dedicated devices or for smartphones. Most of those early CRADA participants are still around today, albeit under different names.
I’ve previously written posts about two of these CRADA partners, Telos ID (previously Diamond Fortress) and Sciometrics (the supplier for Integrated Biometrics).
But these aren’t the only players in the contactless fingerprint market. There are always new entrants in a market where there is opportunity.
T5-AirSnap Finger uses a smartphone’s built-in camera to perform finger detection, enhancement, image processing and scaling, generating images that can be transmitted for identity verification or registration within seconds, according to the announcement. The resulting images are suitable for use with standard AFIS solutions, and comparison against legacy datasets…
“We all carry these awesome computers in our hands,” Parthe explains. “It’s a perfectly packaged hardware device that is ideal for any capture technology. Smartphones are powerful compute devices on the edge, with a nice integrated camera with auto-focus and flash. And now phones also come with multiple cameras which can help with better focus and depth estimation. This allows the users to take photos of their fingers and the software takes care of the rest. I’d just like to point out here that we’re talking about using the phone’s camera to capture biometrics and using a smartphone to take the place of a dedicated reader. We’re not talking about the in-built fingerprint acquisition we’re all familiar with on many devices which is the means of accessing the device itself.”
For those who have never looked at FRVT before, it does not merely report the accuracy results of searches against one database, but reports accuracy results for searches against eight different databases of different types and of different sizes (N).
Gallery Mugshot, Probe Mugshot, N = 12,000,000
Gallery Mugshot, Probe Mugshot, N = 1,600,000
Gallery Mugshot, Probe Webcam, N = 1,600,000
Gallery Mugshot, Probe Profile, N = 1,600,000
Gallery Visa, Probe Border, N = 1,600,000
Gallery Visa, Probe Kiosk, N = 1,600,000
Gallery Border, Probe Border 10+ YRS, N = 1,600,000
Gallery Mugshot, Probe Mugshot 12+ YRS, N = 3,000,000
This is actually good for the vendors who submit their biometric algorithms, because even if the algorithm performs poorly on one of the databases, it may perform wonderfully on one of the other seven. That’s how so many vendors can trumpet that their algorithm is the best. When you throw in other qualifiers such as “top five,” “best non-Chinese vendor,” and even “vastly improved,” you can see how dozens of vendors can issue “NIST says we’re the best” press releases.
Not that I knock the practice; after all, I myself have done this for years. But you need to know how to interpret these press releases, and what they’re really saying. Remember this when you read the vendor announcement toward the end of this post.
Anyway, I went to check the current results, which when you originally visit the page are sorted in the order of the fifth database, the Visa Border database. And this is what I saw this morning (October 27):
For the most part, the top five for the Visa Border test contain the usual players. North Americans will be most familiar with IDEMIA and NEC, and Cloudwalk and Sensetime have been around for a while.
A new algorithm from a not-so-new provider
But I had never noticed Cubox in the NIST testing before. And the number attached to the Cubox algorithm, “000,” indicates that this is Cubox’s first submission.
And Cubox did exceptionally well, especially for a first submission.
As you can see by the superscripts attached to each numeric value, Cubox had the second most accurate algorithm for the Visa Border test, the most accurate algorithm for the Visa Kiosk test, and placed no lower than 12th in the six (of eight) tests in which it participated. Considering that 302 algorithms have been submitted over the years, that’s pretty remarkable for a first-time submission.
Well, as an ex-IDEMIA employee, my curious nature kicked in.
The Cubox that submitted an algorithm to NIST is a South Korean firm with the website cubox.aero, self-described as “The Leading Provider in Biometrics” (aren’t they all?) with fingerprint and face solutions. Cubox competes in the access control and border control markets.
Cubox’s ten-year history and its “overseas” page detail the company’s growth in its markets and the solutions it has provided in South Korea, Mongolia, and Vietnam.
And although Cubox hasn’t trumpeted its performance on its own website (at least in the English version; I don’t know about the Korean version), Cubox has publicized its accomplishment in a LinkedIn post.
Why NIST tests aren’t important
But before you get excited about the NIST results from Cubox, Sensetime, or any of the algorithm providers, remember that the NIST test is just a test. NIST cautions people about this, I have cautioned people about this (see the fourth point in this post), and Mike French has also discussed this.
However, it is also important to remember that NIST does not test operational systems, but rather technology submitted as software development kits or SDKs. Sometimes these submissions are labeled as research (or just not labeled), but in reality it cannot be known if these algorithms are included in the product that an agency will ultimately receive when they purchase a biometric system. And even if they are “the same”, the operational architecture could produce different results with the same core algorithms optimized for use in a NIST study.
The very fact that test results vary between the NIST databases explicitly tells you that a number one ranking on one database does not mean that you’ll get a number one ranking on every database. And as French reminds us, when you take an operational algorithm in an operational system with a customer database, the results may be quite different.
Which is why French recommends that any government agency purchasing a biometric system should conduct its own test, with vendor operational systems (rather than test systems) loaded with the agency’s own data.
Incidentally, if your agency needs a forensic expert to help with a biometric procurement or implementation, check out the consulting services offered by French’s company, Applied Forensic Services.
Let me kick off this post by quoting from another post that I wrote:
I’ve always been of the opinion that technology is moving away from specialized hardware to COTS hardware. For example, the fingerprint processing and matching that used to require high-end UNIX computers with custom processor boards in the 1990s can now be accomplished on consumer-grade smartphones.
Further evidence of this was promoted in advance of #connectID by Integrated Biometrics.
And yes, for those following Integrated Biometrics’ naming conventions, there IS a 1970s movie called “Slap Shot,” but I don’t think it has anything to do with crime solving. Unless you count hockey “enforcers” as law enforcement. And the product apparently wasn’t named by Integrated Biometrics anyway.
SlapShot supports the collection of Fingerprint and facial images suitable for use with state of the art matching algorithms. Fingerprints can now be captured by advanced software that enables the camera in your existing smart phones to generate images with a quality capable of precise identification. Facial recognition and metadata supplement the identification process for any potential suspect or person of interest.
This groundbreaking approach turns almost any smart phone into a biometric capture device, and with minimal integration, your entire force can leverage their existing smart phones to capture fingerprints for identification and verification, receiving matching results in seconds from a centralized repository.
Great, you say! But there’s one more thing. Two more things, actually:
SlapShot functions on Android devices that support Lollipop or later operating systems and relies on the device’s rear high-resolution camera. Images captured from the camera are automatically processed on the device in the background and converted into EBTS files. Once the fingerprint image is taken, the fingerprint matcher in the cloud returns results instantly.
The SlapShot SDK allows developers to capture contactless fingerprints and other biometrics within their own apps via calls to the SlapShot APIs.
Note that SlapShot is NOT intended for end users, but for developers to incorporate into existing applications. Also note that it is (currently) ONLY supported on Android, not iOS.
But this does illustrate the continuing move away from dedicated devices, including Integrated Biometrics’ own line of dedicated devices, to multi-use devices that can also perform forensic capture and either perform forensic matching or receive forensic matching results.
And no, Integrated Biometrics is not cannibalizing its own market. I say this for two reasons.
First, there are still going to be customers who will want dedicated devices, for a variety of reasons.
Second, if Integrated Biometrics doesn’t compete in the smartphone contactless fingerprint capture market, it will lose sales to the companies that DO compete in this market.
Contactless fingerprint capture has been pursued by multiple companies for years, ever since the NIST CRADA was issued a few years ago. (Integrated Biometrics’ partner Sciometrics was one of those early CRADA participants, along with others.) Actually this effort launched before that, as there were efforts in 2004 and following years to capture a complete set of fingerprints within 15 seconds; those efforts led, among other things, to the smartphone software we are seeing today. Not only from Integrated Biometrics/Sciometrics, but also from other CRADA participants. (Don’t forget this one.)
Eric Weiss of FindBiometrics has opined on the Tel Aviv master faces study that I previously discussed.
While he does not explicitly talk about the myriad of facial recognition algorithms that were NOT addressed in the study, he does have some additional details about the test dataset.
The three algorithms that were tested
Here’s what FindBiometrics says about the three algorithms that were tested in the Israeli study.
The researchers described (the master faces) as master keys that could unlock the three facial recognition systems that were used to test the theory. In that regard, they challenged the Dlib, FaceNet, and SphereFace systems, and their nine master faces were able to impersonate more than 40 percent of the 5,749 people in the LFW set.
While it initially sounds impressive to say that three facial recognition algorithms were fooled by the master faces, bear in mind that there are hundreds of facial recognition algorithms tested by NIST alone, and (as I said earlier) the test has NOT been duplicated against any algorithms other than the three open source algorithms mentioned.
…let’s look at the algorithms themselves and evaluate the claim that results for the three algorithms Dlib, FaceNet, and SphereFace can naturally be extrapolated to ALL facial recognition algorithms….NIST’s subsequent study…evaluated 189 algorithms specifically for 1:1 and 1:N use cases….“Tests showed a wide range in accuracy across developers, with the most accurate algorithms producing many fewer errors.”
In short, just because the three open source algorithms were fooled by master faces doesn’t mean that commercial grade algorithms would also be fooled by master faces. Maybe they would be fooled…or maybe they wouldn’t.
What about the dataset?
The three open source algorithms were tested against the dataset from Labeled Faces in the Wild. As I noted in my prior post, the LFW people emphasize some important caveats about their dataset, including the following:
Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over the age of 80, and a relatively small proportion of women. In addition, many ethnicities have very minor representation or none at all.
In the FindBiometrics article, Weiss provides some additional detail about dataset representation.
…there is good reason to question the researchers’ conclusion. Only two of the nine master faces belong to women, and most depicted white men over the age of 60. In plain terms, that means that the master faces are not representative of the global public, and they are not nearly as effective when applied to anyone that falls outside one particular demographic.
That discrepancy can largely be attributed to the limitations of the LFW dataset. Women make up only 22 percent of the dataset, and the numbers are even lower for children, the elderly (those over the age of 80), and for many ethnic groups.
Valid points to be sure, although the definition of a “representative” dataset varies depending upon the use case. For example, a representative dataset for a law enforcement database in the city of El Paso, Texas will differ from a representative dataset for an airport database catering to Air France customers.
So what conclusion can be drawn?
Perhaps it’s just me, but scientific entities that conduct studies are always motivated by the need for additional funding. After a study is concluded, it seems that the entities always conclude that “more research is needed”…which can be self-serving, because as long as more research is needed, the scientific entities can continue to receive necessary funding. Imagine the scientific entity that would dare to say “Well, all necessary research has been conducted. We’re closing down our research center.”
But in this case, there IS a need to perform additional research, to test the master faces against different algorithms and against different datasets. Then we’ll know whether this statement from the FindBiometrics article (emphasis mine) is actually true:
Any face-based identification system would be extremely vulnerable to spoofing…
Modern “journalism” often consists of reprinting a press release without subjecting it to critical analysis. Sadly, I see a lot of this in publications, including both biometric and technology publications.
This post looks at the recently announced master faces study results, the datasets used (and the datasets not used), the algorithms used (and the algorithms not used), and the (faulty) conclusions that have been derived from the study.
Oh, and it also informs you of a way to make sure that you don’t make the same mistakes when talking about biometrics.
Vulnerabilities from master faces
In facial recognition, there is a concept called “master faces” (similar concepts can be found for other biometric modalities). The idea behind master faces is that such data can potentially match against MULTIPLE faces, not just one. This is similar to a master key that can unlock many doors, not just one.
This can conceivably happen because facial recognition algorithms do not match faces to faces, but match derived features from faces to derived features from faces. So if you can create the right “master” feature set, it can potentially match more than one face.
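Here’s a minimal sketch of the idea, assuming a generic embedding-and-threshold matcher. The tiny three-dimensional vectors and the threshold are invented for illustration; this is not the researchers’ actual method, and no production algorithm is anywhere near this simple.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Three enrolled templates (tiny 3-dimensional stand-ins for real feature vectors).
enrolled = {
    "alice":   np.array([0.9, 0.1, 0.1]),
    "bob":     np.array([0.1, 0.9, 0.1]),
    "charlie": np.array([0.1, 0.1, 0.9]),
}
probe_alice = np.array([0.85, 0.15, 0.12])  # an ordinary probe of one person
master = np.array([0.58, 0.58, 0.58])       # a contrived vector sitting near all three

THRESHOLD = 0.60  # invented decision threshold

def hits(candidate: np.ndarray) -> list[str]:
    return [name for name, template in enrolled.items()
            if cosine_similarity(candidate, template) >= THRESHOLD]

print("ordinary probe matches:", hits(probe_alice))  # ['alice']
print("'master' vector matches:", hits(master))      # ['alice', 'bob', 'charlie']
```

An ordinary probe clears the threshold only for its own enrolled template, while the contrived “master” vector clears it for all three. That, in miniature, is the master face concern.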
Ever thought you were being gaslighted by industry claims that facial recognition is trustworthy for authentication and identification? You have been.
The article goes on to discuss an Israeli research project that demonstrated some true “master faces” vulnerabilities. (Emphasis mine.)
One particular approach, which they write was based on Dlib, created nine master faces that unlocked 42 percent to 64 percent of a test dataset. The team also evaluated its work using the FaceNet and SphereFace, which like Dlib, are convolutional neural network-based face descriptors.
They say a single face passed for 20 percent of identities in Labeled Faces in the Wild, an open-source database developed by the University of Massachusetts. That might make many current facial recognition products and strategies obsolete.
Sounds frightening. After all, the study not only used dlib, FaceNet, and SphereFace, but also made reference to a test set from Labeled Faces in the Wild. So it’s obvious why master faces techniques might make many current facial recognition products obsolete.
Let’s look at the datasets
It’s always more impressive to cite an authority, and citations of the University of Massachusetts’ Labeled Faces in the Wild (LFW) are no exception. After all, this dataset has been used for some time to evaluate facial recognition algorithms.
But what does Labeled Faces in the Wild say about…itself? (I know this is a long excerpt, but it’s important.)
Labeled Faces in the Wild is a public benchmark for face verification, also known as pair matching. No matter what the performance of an algorithm on LFW, it should not be used to conclude that an algorithm is suitable for any commercial purpose. There are many reasons for this. Here is a non-exhaustive list:
Face verification and other forms of face recognition are very different problems. For example, it is very difficult to extrapolate from performance on verification to performance on 1:N recognition.
Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over the age of 80, and a relatively small proportion of women. In addition, many ethnicities have very minor representation or none at all.
While theoretically LFW could be used to assess performance for certain subgroups, the database was not designed to have enough data for strong statistical conclusions about subgroups. Simply put, LFW is not large enough to provide evidence that a particular piece of software has been thoroughly tested.
Additional conditions, such as poor lighting, extreme pose, strong occlusions, low resolution, and other important factors do not constitute a major part of LFW. These are important areas of evaluation, especially for algorithms designed to recognize images “in the wild”.
For all of these reasons, we would like to emphasize that LFW was published to help the research community make advances in face verification, not to provide a thorough vetting of commercial algorithms before deployment.
While there are many resources available for assessing face recognition algorithms, such as the Face Recognition Vendor Tests run by the USA National Institute of Standards and Technology (NIST), the understanding of how to best test face recognition algorithms for commercial use is a rapidly evolving area. Some of us are actively involved in developing these new standards, and will continue to make them publicly available when they are ready.
So there are a lot of disclaimers in that text.
LFW is a 1:1 test, not a 1:N test. Therefore, while it can test how one face compares to another face, it cannot test how one face compares to a database of faces. The usual law enforcement use case is to compare a single face (for example, one captured from a video camera) against an entire database of known criminals. That’s a computationally different exercise from the act of comparing a crime scene face against a single criminal face, then comparing it against a second criminal face, and so forth.
The people in the LFW database are not necessarily representative of the world population, the population of the United States, the population of Massachusetts, or any population at all. So you can’t conclude that a master face that matches against a bunch of LFW faces would match against a bunch of faces from your locality.
Captured faces exhibit a variety of quality levels. A face image captured by a camera three feet from you at eye level in good lighting will differ from a face image captured by an overhead camera in poor lighting. LFW doesn’t have a lot of these latter images.
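To make the first of those points concrete (1:1 verification versus 1:N identification), here is a minimal sketch assuming a generic comparison function. The gallery, the toy scoring function, and the threshold are all invented; the only point is that verification is a single comparison, while identification is a search across the entire gallery.

```python
from typing import Callable, Dict

Template = tuple  # stand-in for a real biometric template type

def verify(probe: Template, claimed: Template,
           score: Callable[[Template, Template], float], threshold: float) -> bool:
    """1:1 verification: a single comparison against one claimed identity."""
    return score(probe, claimed) >= threshold

def identify(probe: Template, gallery: Dict[str, Template],
             score: Callable[[Template, Template], float],
             threshold: float) -> list[tuple[str, float]]:
    """1:N identification: compare against every gallery entry, return candidates."""
    candidates = [(name, score(probe, template)) for name, template in gallery.items()]
    return sorted([c for c in candidates if c[1] >= threshold],
                  key=lambda c: c[1], reverse=True)

def toy_score(a: Template, b: Template) -> float:
    # Toy similarity: fraction of matching positions. Real comparison functions
    # are far more sophisticated.
    return sum(x == y for x, y in zip(a, b)) / len(a)

gallery = {"id_001": (1, 0, 1, 1), "id_002": (0, 0, 1, 0), "id_003": (1, 1, 1, 1)}
probe = (1, 0, 1, 0)

print(verify(probe, gallery["id_001"], toy_score, threshold=0.7))  # one comparison
print(identify(probe, gallery, toy_score, threshold=0.7))          # N comparisons
```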
I should mention one more thing about LFW. The researchers allow testers to access the database itself, essentially making LFW an “open book test.” And as any student knows, if a test is open book, it’s much easier to get an A on the test.
Now let’s take a look at another test that was mentioned by the LFW folks itself: namely, NIST’s Face Recognition Vendor Test.
This is actually a series of tests that has evolved over the years; NIST is now conducting ongoing tests for both 1:1 and 1:N (unlike LFW, which only conducts 1:1 testing). This is important because most of the large-scale facial recognition commercial applications that we think about are 1:N applications (see my example above, in which a facial image captured at a crime scene is compared against an entire database of criminals).
In addition, NIST uses multiple data sets that cover a number of use cases, including mugshots, visa photos, and faces “in the wild” (i.e. not under ideal conditions).
It’s also important to note that NIST’s tests are also intended to benefit research, and do not necessarily indicate that a particular algorithm that performs well for NIST will perform well in a commercial implementation. (If the algorithm is even available in a commercial implementation: some of the algorithms submitted to NIST are research algorithms only that never made it to a production system.) For the difference between testing an algorithm in a NIST test and testing an algorithm in a production system, please see Mike French’s LinkedIn article on the topic. (I’ve cited this article before.)
With those caveats, I will note that NIST’s FRVT tests are NOT open book tests. Vendors and other entities give their algorithms to NIST, NIST tests them, and then NIST tells YOU what the results were.
So perhaps it’s more robust than LFW, but it’s still a research project.
Let’s look at the algorithms
Now that we’ve looked at two test datasets, let’s look at the algorithms themselves and evaluate the claim that results for the three algorithms Dlib, FaceNet, and SphereFace can naturally be extrapolated to ALL facial recognition algorithms.
This isn’t the first time that we’ve seen such an attempt at extrapolation. After all, the MIT Media Lab’s Gender Shades study (which evaluated neither 1:1 nor 1:N use cases, but algorithmic attempts to identify gender and race) itself only used three algorithms. Yet the popular media conclusion from this study was that ALL facial recognition algorithms are racist.
Compare this with NIST’s subsequent study, which evaluated 189 algorithms specifically for 1:1 and 1:N use cases. While NIST did find some race/sex differences in algorithms, these were not universal: “Tests showed a wide range in accuracy across developers, with the most accurate algorithms producing many fewer errors.”
In other words, just because an earlier test of three algorithms demonstrated issues in determining race or gender, that doesn’t mean that the current crop of hundreds of algorithms will necessarily demonstrate issues in identifying individuals.
So let’s circle back to the master faces study. How do the results of this study affect “current facial recognition products”?
The answer is “We don’t know.”
Has the master faces experiment been duplicated against the leading commercial algorithms tested by Labeled Faces in the Wild? Apparently not.
Has the master faces experiment been duplicated against the leading commercial algorithms tested by NIST? Well, let’s look at the various ways you can define the “leading” commercial algorithms.
Now you can play with the sort order in many different ways, but the question remains: have the Israeli researchers, or anyone else, performed a “master faces” test (preferably a 1:N test) on the IDEMIA, Paravision, Sensetime, NtechLab, Anyvision, or ANY other commercial algorithm?
Maybe a future study WILL conclude that even the leading commercial algorithms are vulnerable to master face attacks. However, until such studies are actually performed, we CANNOT conclude that commercial facial recognition algorithms are vulnerable to master face attacks.
So naturally journalists approach the results critically…not
But I’m sure that people are going to make those conclusions anyway.
While Bruce Schneier doesn’t go to the extreme of saying that all facial recognition algorithms are now defunct, he does classify the research as “fascinating” WITHOUT commenting on its limitations or applicability. Schneier knows security, but he didn’t vet this one.
Does anyone even UNDERSTAND these studies? (Or do they choose NOT to understand them?)
How can you avoid the same mistakes when communicating about biometrics?
As you can see, people often write about biometric topics without understanding them fully.
Even biometric companies sometimes have difficulty communicating about biometric topics in a way that laypeople can understand. (Perhaps that’s the reason why people misconstrue these studies and conclude that “all facial recognition is racist” and “any facial recognition system can be spoofed by a master face.”)
Are you about to publish something about biometrics that requires a sanity check? (Hopefully not literally, but you know what I mean.)
At Bredemarket, I work with a number of companies that provide biometric systems. And I’ve seen a lot of other systems over the years, including fingerprint, face, DNA, and other systems.
The components of a biometric system
While biometric systems may seem complex, the concept is simple. Years ago, I knew a guy who asserted that a biometric system only needs to contain two elements:
An algorithm that takes a biometric sample, such as a fingerprint image, and converts it into a biometric template.
An algorithm that can take these biometric templates and match them against each other.
If you have these two algorithms, my friend stated, you have everything you need for a biometric system.
Well, maybe not everything.
Today, I can think of a few other things that might be essential, or at least highly recommended. Here they are:
An algorithm that can measure the quality of a biometric sample. In some cases, the quality of the sample may be important in determining how reliable matching results may be.
For fingerprints, an algorithm that can classify the prints. Forensic examiners routinely classify prints as arches, whorls, loops, or variants of these three, and classifications can sometimes be helpful in the matching process.
For some biometric samples, utilities to manage the compression and decompression of the biometric images. Such images can be huge, and if they can be compressed by a reliable compression methodology, then processing and transmission speeds can be improved.
A utility to manage the way in which the biometric data is accessed. To ensure that biometric systems can talk to each other, there are a number of related interchange standards that govern how the biometric information can be read, written, edited, and manipulated.
For fingerprints, a utility to segment the fingerprints, in cases where multiple fingerprints can be found in the same image.
So based upon the two lists above, there are seven different algorithms/utilities that could be combined to form an automated fingerprint identification system, and I could probably come up with an eighth one if I really felt like it.
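If it helps, here’s a sketch of how those seven components might be organized as a set of interfaces. Every name below is invented for illustration; this doesn’t describe any real product (or NBIS, which comes up below).

```python
from dataclasses import dataclass
from typing import Protocol

class Extractor(Protocol):
    def extract(self, image: bytes) -> bytes: ...      # sample -> template

class Matcher(Protocol):
    def score(self, probe: bytes, candidate: bytes) -> float: ...

class QualityChecker(Protocol):
    def quality(self, image: bytes) -> int: ...         # e.g., 1 (best) to 5 (worst)

class Classifier(Protocol):
    def classify(self, image: bytes) -> str: ...        # arch / whorl / loop

class Codec(Protocol):
    def compress(self, image: bytes) -> bytes: ...
    def decompress(self, data: bytes) -> bytes: ...

class RecordIO(Protocol):
    def read(self, path: str) -> bytes: ...             # standards-based interchange
    def write(self, path: str, record: bytes) -> None: ...

class Segmenter(Protocol):
    def segment(self, slap_image: bytes) -> list: ...   # split a slap into fingers

@dataclass
class FingerprintSystem:
    """The 'two algorithms plus five more' system described in this post."""
    extractor: Extractor
    matcher: Matcher
    quality: QualityChecker
    classifier: Classifier
    codec: Codec
    records: RecordIO
    segmenter: Segmenter
```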
My friend knew about this stuff, because he had worked for several different firms that produced fingerprint identification systems. These firms spent a lot of money hiring many engineers and researchers to create all of these algorithms/utilities and sell them to customers.
How to get these biometric system components for free
But what if I told you that all of these firms were wasting their time?
And if I told you that since 2007, you could get source code for ALL of these algorithms and utilities for FREE?
Well, it’s true.
To further its testing work, the National Institute of Standards and Technology (NIST) created the NIST Biometric Image Software (NBIS), which currently has eight algorithms/utilities. (The eighth one, not mentioned above, is a spectral validation/verification metric for fingerprint images.) Some of these algorithms and utilities are available separately or in other utilities: anyone can (and is encouraged to) use the quality algorithm, called NFIQ, and the minutiae detector MINDTCT is used within the FBI’s Universal Latent Workstation (ULW).
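If you want to kick the tires, two of the NBIS utilities can be driven from a script. The sketch below reflects my understanding of the mindtct and nfiq command lines (mindtct writes minutiae files to an output root; nfiq prints a quality score from 1, best, to 5, worst), but check the NBIS user’s guide for the exact usage in the release you download. The file names are hypothetical.

```python
# Hedged sketch of driving two NBIS utilities from Python. The command-line
# usage shown here may differ by NBIS release; consult the NBIS documentation
# before relying on it. File names are hypothetical.

import subprocess

def run_mindtct(image_path: str, output_root: str) -> None:
    # mindtct detects minutiae and writes several files (e.g., <root>.xyt, <root>.min).
    subprocess.run(["mindtct", image_path, output_root], check=True)

def run_nfiq(image_path: str) -> int:
    # nfiq prints a fingerprint quality score from 1 (best) to 5 (worst).
    result = subprocess.run(["nfiq", image_path], check=True,
                            capture_output=True, text=True)
    return int(result.stdout.strip())

if __name__ == "__main__":
    run_mindtct("fingerprint.wsq", "fingerprint_out")
    print("NFIQ quality:", run_nfiq("fingerprint.wsq"))
```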
As I write this, NBIS has not been updated in six years, since Release 5.0.0 came out.
Is anyone using this in a production system?
And no, I am unaware of any law enforcement agency or any other entity that has actually USED NBIS in a production system, outside of the testing realm, with the exception of limited use of selected utilities as noted above. Although Dev Technology Group has compiled NBIS on the Android platform as an exercise. (Would you like an AFIS on your Samsung phone?)
Over the last few days, I’ve run across three stories that deal with two aspects of DNA collection: familial DNA, and DNA mixtures.
(This case was mentioned on Forensics and Law in Focus, a recommended read for all sorts of forensic techniques.)
Of all of the biometrics, DNA has a property that the others don’t: the similarity of DNA between family members. Someone finding my child’s fingerprints won’t necessarily be able to find me, and even someone who finds my child’s face won’t necessarily be able to find me.
Vannieuwenhoven is accused in the July 9, 1976, murders of a Green Bay couple who was camping at McClintock Park in the Town of Silver Cliff. David Schuldes, 25, and Ellen Matheys, 24, were shot and killed at the campground….
A DNA profile obtained through evidence was already on file with the State Crime Lab, according to previous testimony….
Baldwin explained how a breakthrough came in 2018 when Parabon Nanolabs of Virginia developed new technology to examine DNA evidence, which could provide certain genetic characteristics of possible suspects through DNA….
On Dec. 21, 2018, Parabon contacted Baldwin and informed him that a possible suspect was found through the DNA testing. He said they gave him a Green Bay-area family—the Vannieuwenhovens—that had four sons and four grandsons who possibly could be a match.
The detectives then had to test the relatives and compare their DNA to the crime scene DNA. But not ALL of the relatives: this was solely used as an investigative lead, and there was no point in testing the grandsons for a 1976 murder. Raymond was one of those whose DNA was collected (by having him lick an envelope to seal it), and the probabilities indicated a match.
Obviously this technique is controversial in some quarters, since the family members who originally provided the DNA had no idea that it would be used to arrest (or, in some cases, exonerate) another family member in this way. But the technique is being used.
The other story concerns what can be found when a DNA sample is collected. The DNA sample may contain a lot of things, from a lot of people.
With improvements in DNA testing methods, we don’t need much DNA to make a profile and see perhaps if I am a likely contributor to that sample or if you have contributed — even if you never touched the table directly. That level of DNA profiling is useful for many different types of crimes, but also brings up the issue of relevance. We aren’t explaining how DNA got to a location.
As an example, a single item at a crime scene may include the DNA of the person who committed the crime, the crime victim, an innocent bystander who touched the area in question before the crime was committed, and (if the police officer was careless) the police officer investigating the crime.
Now you have to look at the DNA sample that was collected. With DNA mixtures, this gets tough.
If single-source DNA is like basic arithmetic and a two-person mixture is like algebra, then a complicated mixture is like calculus!
The quotes above are from John Butler of the National Institute of Standards and Technology, who has a concern about how all of the different laboratories interpret DNA mixtures. Ideally, all labs should work together to have a consistent, verifiable way to interpret these mixtures.
We wanted to see if there were established methodologies that worked better than others when tested, and where those limits were being drawn. What we found is that there is not enough publicly available data to enable an external and independent assessment of the degree of reliability of DNA mixture interpretation practices.
NIST, as it does in other areas, seeks to advance the science, and is urging stakeholders to work together to do so.
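To get a feel for why mixtures get hard so quickly, here’s a toy enumeration for a single locus. It assumes exactly two contributors and no allele drop-out or drop-in, assumptions that real probabilistic genotyping most certainly does not get to make; the point is only that even a “simple” two-person mixture already has multiple genotype combinations that explain it.

```python
from itertools import combinations_with_replacement

# Toy illustration: how many two-contributor genotype combinations explain the
# alleles observed at a single locus? (Simplifying assumptions: exactly two
# contributors, no allele drop-out or drop-in.)

observed = {"A", "B", "C"}
alleles = sorted(observed)

# Every possible single-contributor genotype at this locus (unordered allele pair).
genotypes = list(combinations_with_replacement(alleles, 2))

# Every unordered pair of contributor genotypes whose alleles jointly account
# for exactly the observed alleles.
explanations = [
    (g1, g2)
    for i, g1 in enumerate(genotypes)
    for g2 in genotypes[i:]
    if set(g1) | set(g2) == observed
]

for g1, g2 in explanations:
    print(f"contributor 1: {g1}, contributor 2: {g2}")
print(f"{len(explanations)} genotype combinations explain this one two-person mixture")
```

Add a third contributor, allow alleles to drop out, and throw in several more loci, and you can see why Butler reaches for the calculus analogy.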
But wait; there’s more on DNA mixtures!
While NIST has been conducting the work above, the National Institute of Justice has been funding other work.
Michael Marciano, research assistant professor and director for research in the Forensic and National Security Sciences Institute (FNSSI) within the College of Arts and Sciences, and Jonathan Adelman, research assistant professor in FNSSI, have invented a novel hybrid machine learning approach (MLA) to mixture analysis (U.S. patent number 10,957,421). Their method combines the strengths of current computational and expert analysis approaches with those in data mining and artificial intelligence.
Marciano and Adelman received funding from the National Institute of Justice to further develop their idea in 2014. Although this intellectual property has not been fully developed for commercial use, they are pursuing funding to transition the technology. Once this is done, they are hopeful that the new method will be used throughout the law enforcement and criminal justice communities, specifically by forensic DNA scientists and the legal community.
Actually, once the intellectual property has been developed for commercial use, it will NOT be used THROUGHOUT the law enforcement and criminal justice communities. It will be used by PORTIONS of the law enforcement and criminal justice communities, while OTHERS within the community will use commercial products from competitors.
Commercialization of a product actually works AGAINST universal acceptance, except in very limited cases. Take commercialization of fingerprinting products. As Chapter 6 of The Fingerprint Sourcebook details, independent research was performed in four separate countries (France, Japan, the UK, and the US) which, after commercialization, led to three (now two) separate fingerprinting products: NEC’s product from Japanese research, and IDEMIA’s product from separate French (Morpho) and United States (Printrak) research. This initial research, combined with subsequent research that led to additional products, led to an interoperability issue, despite efforts from NIST to advance greater interoperability.
Will NIST have to do the same thing to reconcile competing DNA mixture analysis methods?