Facial Recognition Accuracy - It's Complicated

At Blink Identity, we are often asked about the accuracy of our identification system. People want a concise answer, but that's difficult to provide because biometric accuracy is a very complex topic. If you see a biometric vendor quote a single number for accuracy, something like "Acme, Inc.'s system is 98.46% accurate," you should immediately be skeptical for a few reasons.

First, any biometric system can make two types of errors: a false match (accepting the wrong person) and a false non-match (rejecting the right person). You need both numbers to understand how the system performs. Furthermore, you have to understand the environment in which the vendor measured accuracy and how it applies (or doesn't) to your environment. The actual accuracy in any production environment may be better or worse than the vendor's claimed accuracy, especially with face recognition, which is strongly influenced by environmental factors.
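To make the two error types concrete, here is a minimal sketch of how both rates might be computed from comparison scores. The scores, the thresholds, and the `error_rates` helper are all made up for illustration; they are not taken from any real system or from our product.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_match_rate, false_non_match_rate) at a given threshold.

    Assumption: a comparison is declared a match when score >= threshold.
    """
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    fmr = false_matches / len(impostor_scores)
    fnmr = false_non_matches / len(genuine_scores)
    return fmr, fnmr


# Hypothetical similarity scores, for illustration only.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]   # same-person comparisons
impostor = [0.12, 0.33, 0.71, 0.25, 0.40]  # different-person comparisons

for threshold in (0.5, 0.7, 0.8):
    fmr, fnmr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: FMR={fmr:.0%}, FNMR={fnmr:.0%}")
```

In this toy data, raising the threshold lowers the false match rate and raises the false non-match rate. That trade-off is why a single number cannot describe how a system performs.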

Let's take a step back and ask "How good is good enough?" Well, it depends. In most use cases, we are comparing a biometric identification system to a human who is looking at some kind of photo ID: for example, a bartender checking that you are old enough to buy alcohol, or an agent at the airport checking your passport and boarding pass. It would be really difficult for a machine to do that as well as a human, right?

It turns out that humans are both amazing and terrible at face recognition.

With a familiar face (friend, family), we can make out incredibly low quality pictures of those familiar faces in a way that computer systems can't touch. But with the exception of a few rare "super recognizers", humans are not very good at recognizing unfamiliar faces, and especially bad at recognizing faces that are different from the ethnicity we grew up with. The old trope of "they all look alike" is generally true for any "they" that isn't "us."

How bad are humans at face recognition? There have been a few studies, but one of my favorites was performed with the Sydney passport office, using actual passport examiners who do this human face recognition task every day, along with novices without any training. The test included passports with impostor faces that generally looked like the actual subject, as shown here:

Are these two photos of the same person or four different people?

Depending upon how good you are with faces, these may or may not be obvious to you as different people. But remember that you only have ten seconds to check this, and passports often have pictures from 5+ years prior. Lots of people may not look exactly like their passport photo. You also are doing this all day when the vast majority of passports are not fake. How long can you pay really close attention to this task?

The performance of the examiners on this task was actually quite good, relative to novices. The examiners rejected 6% of passports that were genuine (False Non-Match) and, more troubling, they accepted 14% of the passports with impostor photos (False Match). The examiners knew they were participating in a test, making this result even more alarming because presumably they were checking the photos more closely than normal.
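The examiner figures also show why a single "accuracy" number hides so much. As a rough sketch, if we define accuracy as the share of all decisions that are correct, the result depends on how many impostors show up. The 6% and 14% rates come from the study described above; the impostor fractions and the `overall_accuracy` helper are assumptions chosen purely for illustration.

```python
def overall_accuracy(fmr, fnmr, impostor_fraction):
    """Share of correct decisions for a given mix of genuine and impostor attempts."""
    genuine_fraction = 1.0 - impostor_fraction
    correct_genuine = genuine_fraction * (1.0 - fnmr)    # genuine and accepted
    correct_impostor = impostor_fraction * (1.0 - fmr)   # impostor and rejected
    return correct_genuine + correct_impostor


EXAMINER_FMR = 0.14   # impostor passports accepted (from the study above)
EXAMINER_FNMR = 0.06  # genuine passports rejected (from the study above)

# Hypothetical impostor fractions: half, 1 in 100, 1 in 1,000.
for impostor_fraction in (0.5, 0.01, 0.001):
    acc = overall_accuracy(EXAMINER_FMR, EXAMINER_FNMR, impostor_fraction)
    print(f"impostor fraction {impostor_fraction}: accuracy {acc:.2%}")
```

Under these assumptions the same examiners score anywhere from about 90% to about 94% "accurate," and the headline number barely moves even though 14% of impostors get through in every case.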

There is no contest in this scenario between humans and computers. The human examiners performing this task were doing verification (1:1), simply matching a photo against a human standing in front of them, and even these trained examiners had a 14% false match rate. That means 14% of the impostors were accepted. Leading face recognition algorithms currently have a false match rate significantly lower than 1% when performing identification (1:N) against a large background gallery of 1,000,000 people, a much harder problem.
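To see why 1:N is so much harder than 1:1, consider a rough back-of-the-envelope model. If each of the N gallery comparisons were an independent trial with the same per-comparison false match rate (a simplifying assumption that real systems do not strictly satisfy), the chance of at least one false match would be 1 - (1 - FMR)^N. The function names and the 1% target below are our own choices for this sketch.

```python
import math


def false_positive_identification_rate(fmr, gallery_size):
    """Chance of at least one false match across gallery_size comparisons.

    Assumes independent comparisons: 1 - (1 - fmr) ** gallery_size,
    computed with log1p/expm1 for numerical stability at tiny rates.
    """
    return -math.expm1(gallery_size * math.log1p(-fmr))


def required_fmr(target_rate, gallery_size):
    """Per-comparison FMR needed to keep the 1:N false match chance at target_rate."""
    return -math.expm1(math.log1p(-target_rate) / gallery_size)


# A 14% per-comparison false match rate against a gallery of only 10 people.
print(f"{false_positive_identification_rate(0.14, 10):.1%}")

# Per-comparison rate needed to stay at 1% against 1,000,000 people.
print(f"{required_fmr(0.01, 1_000_000):.3e}")
```

Under this simplified model, a matcher with the examiners' 14% rate would produce a false match most of the time against a gallery of just ten people, while holding a 1% rate against a million people requires a per-comparison false match rate on the order of one in a hundred million.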

In short, if your current security involves having a human look at an ID, then it is almost a certainty that a biometric system will dramatically outperform your current approach. And while accuracy is important, the tiny differences in accuracy between vendors may not be relevant in many security applications. Modern algorithms are mostly differentiated by their ability to handle poor quality data. With good data and a smaller scale (e.g., a small office), many algorithms will essentially be perfect. However, it would not be hard to spend 100x or 1,000x as much on a system to get that extra 1% accuracy, only to find out that it makes no difference at all in your particular application.

Ultimately, the key aspect of biometric performance is not some abstract accuracy benchmark, but how it performs in your environment, with the people you want to identify.
