Fast Biometric Identification Considering
User Statistics
Manfred Bromba · Bromba GmbH
First release: 2008-11-17 • Actual status:
2011-02-03
Introduction
| Unlike comparing the identity of simple
numbers, comparing biometric data is a computationally intensive task.
Furthermore, a comparison does not simply deliver the answer "yes" for
equality and "no" for inequality. Instead, the result of a biometric comparison
will initially be a similarity score since biometric data generally do
not show perfect identity (when captured from the same biometric characteristic)
or perfect distinctness (when captured from a different biometric characteristic).
Especially with large biometric data bases where a great number of biometric
references are stored, the former effect leads to large computation times
in the case of identification while the latter effect is responsible for
misidentifications. In contrast to verification where the ID of a reference
is given, identification requires the comparison of all data base
reference data sets with the incoming biometric sample data to deliver
the ID of a matching reference. |
Regarding misidentifications, usually three
types of failure are considered [BioFAQ]:
|
I.
|
|
An "identity" which is known to be included
in the reference data base is not found, e.g., since the actual similarity
score is below a given threshold. This error corresponds to the FRR (False
Reject Rate). |
|
|
|
|
II.
|
|
An "identity" is mistaken for another
identity in the reference data base, e.g., since the similarity score for
the latter is erroneously higher than for the true identity. This error
corresponds to the FIR (False Identification Rate). |
|
|
|
|
III.
|
|
An "identity" which is not included in
the reference data base is erroneously assigned to a foreign identity included
in the data base, e.g., since the similarity score exceeds a given threshold.
This error corresponds to the FAR (False Accept Rate). |
|
| Usually, there is a tradeoff between FRR
and FAR. If the decision threshold is increased, FRR increases and FAR
decreases, and vice versa. This holds for simple systems. Because identification
systems have to cope with the fact that more than one comparison score
may exceed the threshold, the situation becomes more sophisticated
and more than one realization becomes possible. This fact constitutes another
tradeoff: failure rates versus identification time. This is because real-life
identification systems seem to increase error rates if identification times
decrease, and vice versa. For that reason, this paper focuses on
methods to decrease identification time while preserving error rates at
the lowest possible status. Binning (a kind of preclassification to speed
up identification) is not discussed here. Finally, we assume a unique ID
as result without manual interaction as this is common in AFIS data bases
used by police. |
Full Search
| In full search operation mode of an identification
system, all references in a data base are compared against the incoming
sample data. If N is the number of references, N comparisons have to be
performed, delivering N similarity scores. Without loss of generality,
we assume the score throughout this paper to lie between 0 and 100. Usually,
the score value is quantized, e.g., in unit steps between 0 and 100. If
Tc is the comparison time, the identification time in full search
mode Tfs will be N times Tc plus an initialization
time Ti (e.g., caused by loading the references into the main
memory): |
| The failure rates depend on the decision
strategy. Since more than one score value may exceed a decision threshold,
further processing is required to achieve a unique result. For example,
if two score values exceed the threshold, one may consider this case as
rejection with result "unknown". This would increase the FRR and reduce
the FAR. Another decision policy is to decide for the highest-score ID
exceeding the threshold and rejecting only when the highest ID is not unique.
These examples do hardly influence identification time. As a result, Tfs
is the worst case with respect to identification time (and probably the
best case with respect to failure rates). |
Fast Identification
| In fast identification mode, that ID,
say n, is chosen whose score in a process of subsequent comparisons first
exceeds the threshold. After that the comparisons stop. Obviously, if identification
starts with ID 1, proceeds with ID 2, and so on, the identification time
in fast identification Tfi is given by |
| where n is an integer number between 1
and N. Since it is not known in advance how large n is (otherwise we would
only need a verification), Tfi is a random variable, too. If
we know nothing about the a priori probability of the occurrence of ID
n, it is useful to assume a uniform distribution with all n having the
same probability to match the incoming sample data. Under this assumption
and the additional assumption that the effects of FRR and FAR are negligible,
the expected value of Tfi, written E{Tfi}, is given
by |
|
| E{Tfi} = Ti + ½
(N+1) Tc ~ Ti + ½N Tc |
|
|
|
|
| For large N this results in a reduction
of the identification time by a factor of 2 with respect to the full search,
see eq. (1). |
| Now we have to discuss the effect of fast
identification on the error rates. If more than one ID score exceeds the
threshold, the first one may not be the best one! On the other hand, if
the probability that more than one score exceeds the threshold is small,
the error rates will hardly differ. This probability is approximately given
by N times the verification FAR [BioFAQ].
Since it makes sense that this product is very small, we can expect a negligible
effect of the fast identification mode on FAR. |
| A more thorough consideration has to include
the effect of failure cases on identification time. For example, in an
false reject case where no false accepts occur, a full search is done.
That is, the higher the FRR, the longer the mean identification time in
fast identification mode! Thus, a better approximation for large N than
eq.
(3) would be |
|
| E{Tfi} ~ (1 - FRR) (Ti
+ ½N Tc) + FRR Tfs = Ti + ½N
(1 + FRR) Tc |
|
|
(4)
|
|
| For FRR = 0, the mean time approaches
the old approximation eq. (3). For FRR = 1, there is no
benefit with the fast identification mode. For a typical FRR of 0.1, the
factor ½ becomes 0.55 which is still a good value! |
| Until now we have discussed the most unfavorable
case, the uniform distribution of the probabilities pn(t) (n=
1...N) that a given ID n matches the incoming biometric sample. If this
distribution was known a priori to be nonuniform, it would be possible
to reorder the sequence of comparisons such that higher-probability IDs
are compared first. This should further accelerate fast identification
in the mean. The only thing we have to do is to reorder the IDs such that
their probabilities decrease monotonously. Then the mean value is given
by |
|
| E{Tfi} ~ (1 - FRR) (Ti
+ s(t)Tc}) + FRR Tfs |
|
|
(5)
|
|
| where s(t) is the mean value of the (ordered)
distribution with respect to all references (which may depend on time). |
| Example:
The identification system is used for time & attendance control. Not
all employees are coming at the same time, everyone has his own preferences
which expresses in an individual periodical distribution of arrival times.
This periodical distribution can even be statistically measured by the
application itself (remark: this is a privacy issue!). To find the optimum
order at a time, the measured distributions of all employees have to be
compared. The lowest probability estimation at a certain time (of day,
week, or month) is compared last, the highest one first. This principle
can simply be extended to shift-workers. Besides time depending statistics,
there are also event-based statistics. For example, if an employee has
already checked in, it is unlikely that he will check-in within the next
several hours. Thus, after check-in, the corresponding reference only requires
a very low priority and is compared, if at all, only after all employees
who still have to come. |
| If we want to calculate the benefit of
ordered statistics, we have to calculate s(t). It is simply the sum over
the product nqn(t) with index n running from 1 to N. qn(t)
is the ordered distribution such that q1(t) is the largest value
and qN(t) the smallest one. Each qn(t) uniquely represents
one pm(t), where m is the ID and n the index for the comparison
sequence. |
Example: For simplicity,
let N = 1000,
| Before reordering, the mean
value s(t) is given by 500 x 0.5 + 900 x 0.5 = 700. That is, in the mean,
700 instead of 1000 comparisons have to be performed for this situation.
After reordering, we have |
| Then the new mean value
is given by 1 x 0.6 + 2 x 0.4 = 1.4 comparisons instead of 1000, a really
formidable improvement by a factor of 700/1.4 = 500 in this case! |
|
| The last example shows that a considerable
improvement can be expected if suitable statistical data are available.
Such data can be collected and updated automatically by the application,
giving a self-optimizing system with respect to recognition speed. |
Two-stage identification
| Most biometric characteristics allow the
creation of scalable references. For high speed comparison, a subset of
biometric features which may lead to higher failure rates can be used.
This suggests a two-stage identification scheme: |
|
C1:
|
|
Full search with a fast comparison algorithm
C1 adjusted for high FAR and low FRR |
|
|
|
|
C2:
|
|
Full search over all references which
exceeded the score threshold in step C1 with a high-precision but slow
comparison algorithm C2, adjusted for very low FAR at moderate FRR |
|
| An example for such a procedure is fingerprint
identification with fast minutiae comparison and high-precision image correlation
comparison. Minutiae as features allow for fast comparison algorithms,
the image itself enables extremely low error rates at low speed. |
| Example: Let N = 10000 be
the size of the reference data base. In the first step of the 2-stage procedure
we expect an identification with a full search time of 1s. The threshold
is adjusted for a verification FAR = 0.1 at FRR = 0.005.
This yields approximately 1000 references which exceed the threshold, inlucing
the correct one except for false reject cases which occur with probability
0.005. These remaining references which contain the searched ID
except for false reject cases are now identified in the second step where
we assume a full search time of 1s, too, but this time only over 1000 references.
In the second step we expect a verification FAR = 0.00001 at FRR = 0.015.
This may result in the following performance, compared with a full search
of all N references with comparison algorithm C2 which needs 10s for 10k
references (for the determination of the failure rates the ROC of the system
is required [BioFAQ]): |
| Procedure |
total FAR |
total FRR |
Identification time |
| 1-stage C1 |
~0.01
|
~0.20
|
1 s
|
| 1-stage C2 |
~0.01
|
~0.02
|
10 s
|
| 2-stage |
~0.01
|
~0.02
|
~2 s
|
| The total FAR or identification
FAR is nearly given by N times the verification FAR [BioFAQ].
Although the table is fictive (but represents the actual state of the art)
and does not take into account statistical dependencies, it indicates the
trade-off between speed and failure rates. Statistical dependencies should
slightly increase the total FAR in the 2-stage procedure. |
| For a better understanding
we analyze what is happening in the two-stage case. Only the FAR of Comparator
C1 has a considerable effect on identification times in this case. |
|
Stage 1
|
Stage 2
|
Total
|
| Comparator |
C1
|
C2
|
C1 & C2
|
| Comparison speed per reference |
10000/s
|
1000/s
|
~5000/s
|
| Verification FAR |
0.1
|
0.00001
|
~0.000001 |
| Identification FAR |
~1
|
~0.01
|
~0.01
|
| Number of input references |
10000
|
~1000
|
10000
|
| Number of output references |
~1000
|
~1
|
~1
|
| Identification time |
1s
|
~1s
|
~2s
|
|
| To determine the parameters used in the
last example, the Receiver Operating Characteristic (ROC) [BioFAQ]
is very helpful. The following example ROC stems from a fingerprint recognition
system with the two comparators C1 and C2. It can be used to determine
the error rates at different operating points for a 2-stage fingerprint
identification mode. |

Fast Two-Stage Identification
| For the 2-stage procedure in the last
paragraph it makes little sense to activate the fast identification mode
in the first step since there are a large number of references which exceed
the threshold. Here, modified procedures are required. One way is to apply
fast identification only to the second step. This would decrease only a
part of the total identification time. |
| Another solution is partitioning. In the
first step, only a first part of the full data base is compared with the
incoming sample using comparator C1. The ID candidates found in this step
are then processed by the second step C2. If an ID is found, the procedure
stops. Otherwise, the second and further parts of the reference data base
are explored using the combination C1 / C2 until the search was successfull
or the whole reference data base has been explored. |
| The question to be answered is "What is
the optimum size of the partitions". To find a solution, we restrict the
following calculations to the parameters of the last example
and neglect the effect of FRR (=0). M shall be the number of equal sized
partitions and must be larger than or equal to 10 because this is the reduction
rate from stage C1 to stage C2 in our example. For
simplicity, we assume that N is chosen such that K = N/M, the size of a
partition, is an integer. Then the full search time for one partition is
given by |
|
| TP = TC1 + TC2
= Ti1 + K Tc1 + Ti2+ (K/10) Tc2 |
|
|
(6)
|
|
Next we neglect initialization time, i.e.,
Ti1 = Ti2 = 0. Then with Tc2 = 10 Tc1,
we obtain
|
|
|
(7)
|
|
|
|
|
|
| Tfs = M TP = 2MK
Tc1 = 2N Tc1 |
|
|
(8)
|
|
| The full search time for a partitioned
system is the same as for the unpartitioned 2-stage system in the example
and does not depend on the number or size of partitions! If the fast identification
mode is based on partitions, we obtain as identification time Tfi: |
|
|
|
(9)
|
|
|
|
|
|
| E{Tfi} = ½ (M+1) TP
= K(M+1) Tc1 = N(1+1/M) Tc1 < 1.11 N Tc1 |
|
|
(10)
|
|
| The last inequality comes from the requirement
M > 9, see above. Together with the natural upper limit M = N, we get |
|
| (N+1) Tc1 < E{Tfi}
< 1.11 N Tc1 |
|
|
(11)
|
|
| For large N this gives only a small dependence
on M, provided the reference distributions are uniform. However, to take
the best benefit from nonuniform distributions, a large number of partitions
seems appropriate. Since in our example the upper limit for M is on tenth
of N, it seems generally to be advisable to choose M as N divided by the
reduction rate of the number of references from step C1 to step C2. In
our example, this delivers: |
| and shows only a small 10% loss to be
added to the 2-fold loss in the double-step procedure, see eq.
(3). |
| 2009-04-07: In the second table of the
third example, under "Comparison speed per reference", "~500/s" had to
be replaced by "5000/s" |
| 2011-02-03: invalid URN links replaced
by permalinks |
|