Introduction
This document deals with FAR (False Acceptance Rate) measurements for
fingerprint recognition. Each finger is assigned a unique ID. Fingerprint
samples are different measurements of the same finger, i.e. of the same ID.
Score values are the result of comparisons ("matches") between two
fingerprints and describe their similarity. In this investigation, the
scores take values between 0 (least similarity) and 100 (most similarity).
Based on a matrix representation of a fingerprint FAR examination, certain
conclusions on the nature of FAR determination are derived. Two experiments
have been performed. In the first experiment, every fingerprint reference
has been matched against one sample of every request (query) fingerprint
ID, excluding identical IDs. In the second experiment, every request sample
of one impostor fingerprint ID has been matched against all reference IDs.
These experiments are intended to answer questions such as:
- Is there an asymmetry between references and requests?
- What is the effect of several impostor trials with the same finger (ID)?
With regard to the effort of a FAR examination, it should be noted that it
is more expensive to have more (different) IDs than to have more prints
(samples) per ID.
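The two comparison protocols can be sketched roughly as follows. This is only an illustration: the `match` callable, the dictionary layout, the choice of the first sample as "the" sample of an ID, and all function names are assumptions and not part of the original test setup.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical matcher: (reference template, request sample) -> score in 0..100
Matcher = Callable[[str, str], float]


def one_sample_matrix(refs: Dict[int, str], reqs: Dict[int, List[str]],
                      match: Matcher) -> Dict[int, Dict[int, Optional[float]]]:
    """Experiment 1: one sample of every request ID against every reference.

    Cells where request ID and reference ID are identical are left as None.
    """
    matrix = {}
    for req_id, samples in reqs.items():
        sample = samples[0]  # assumption: the first sample represents the ID
        matrix[req_id] = {
            ref_id: None if ref_id == req_id else match(ref, sample)
            for ref_id, ref in refs.items()
        }
    return matrix


def one_id_matrix(refs: Dict[int, str], impostor_id: int, samples: List[str],
                  match: Matcher) -> List[Dict[int, Optional[float]]]:
    """Experiment 2: every sample of one impostor ID against every reference.

    The match against the impostor's own reference is left as None.
    """
    return [
        {ref_id: None if ref_id == impostor_id else match(ref, sample)
         for ref_id, ref in refs.items()}
        for sample in samples
    ]
```

Each function returns one row per request (ID or sample) and one cell per reference ID, which corresponds to the row and column layout used for the score matrices in this document.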
Experimental
Using the database FGA1010x, which encompasses 81 different IDs
(= different fingers), two score value matrices have been calculated:
- The first matrix (Table 1) comprises 68 columns for the 68 successfully
  enrolled IDs used as reference (Ref) and 81 rows for all 81 IDs used as
  request (Req). The cells contain the match score for the corresponding
  IDs, except where the two IDs are identical.
- The second matrix (Table 2) comprises 68 columns for the 68 successfully
  enrolled IDs used as reference (Ref) and 100 rows for all samples
  available for ID 10010 used as request (Req). The cells contain the score
  for the corresponding matches. The match with the identical ID (the
  column for Ref ID 10010, marked red in the original table) is omitted
  from the calculations.
Both matrices also contain the mean values and variances of the scores per
column and per row, calculated with the corresponding MS-Excel functions
"Mean" and "Variance".
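As a rough sketch of how such per-row and per-column statistics can be computed while skipping the excluded cells, consider the following. It assumes that the spreadsheet "Variance" function is the sample variance (denominator n - 1), which the text does not state; the function name and the use of `None` for excluded cells are illustrative choices.

```python
from statistics import fmean, variance


def row_and_column_stats(matrix):
    """Return per-row and per-column (mean, variance) pairs, skipping None cells.

    `matrix` is a list of rows; None marks an excluded match (identical IDs).
    Sample variance (n - 1 denominator) is an assumption about the spreadsheet.
    """
    def stats(values):
        kept = [v for v in values if v is not None]
        return fmean(kept), variance(kept)

    rows = [stats(row) for row in matrix]
    cols = [stats(col) for col in zip(*matrix)]  # zip(*matrix) transposes
    return rows, cols


# Upper-left corner of Table 1: Req IDs 10010, 10011, 10013 (rows)
# against Ref IDs 10010, 10013, 10014 (columns).
corner = [
    [None, 3, 2],
    [0, 1, 0],
    [2, None, 0],
]
rows, cols = row_and_column_stats(corner)
```

The values obtained from this small corner do not reproduce the row and column statistics printed in the tables, because those are taken over all available reference or request IDs.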
Table 1: One sample (rows: Req ID, columns: Ref ID; cells with identical IDs are empty)
| Req ID | Variance | Mean | Ref 10010 | Ref 10013 | Ref 10014 | Ref 10016 | Ref 10022 | Ref 10048 | Ref 10054 | Ref 10059 | Ref 10061 | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance (per Ref ID) | | | 2.33 | 1.45 | 0.96 | 0.99 | 1.22 | 0.76 | 1.21 | 1.52 | 0.96 | ... |
| Mean (per Ref ID) | | | 1.15 | 1.20 | 0.68 | 0.66 | 0.94 | 0.49 | 1.03 | 1.05 | 0.73 | ... |
| 10010 | 1.30 | 1.22 | | 3 | 2 | 2 | 2 | 1 | 1 | 2 | 3 | ... |
| 10011 | 1.02 | 0.62 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... |
| 10013 | 0.65 | 0.51 | 2 | | 0 | 0 | 0 | 0 | 2 | 1 | 0 | ... |
| 10014 | 1.08 | 0.66 | 1 | 2 | | 0 | 2 | 0 | 0 | 1 | 1 | ... |
| 10016 | 1.39 | 1.00 | 0 | 1 | 0 | | 2 | 0 | 3 | 0 | 2 | ... |
| 10022 | 1.05 | 0.99 | 2 | 0 | 0 | 2 | | 0 | 0 | 2 | 2 | ... |
| 10048 | 1.60 | 1.06 | 0 | 2 | 0 | 0 | 0 | | 0 | 2 | 1 | ... |
| 10054 | 1.34 | 0.76 | 0 | 2 | 0 | 3 | 0 | 0 | | 0 | 0 | ... |
| 10059 | 2.78 | 1.28 | 4 | 0 | 2 | 0 | 0 | 0 | 0 | | 4 | ... |
| 10061 | 1.53 | 0.84 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 5 | | ... |
| 10062 | 1.49 | 1.15 | 2 | 2 | 2 | 0 | 0 | 0 | 0 | 2 | 1 | ... |
| 10065 | 1.17 | 0.62 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | ... |
| 10066 | 0.53 | 0.52 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | ... |
| 10068 | 1.77 | 1.25 | 0 | 1 | 0 | 3 | 2 | 0 | 3 | 0 | 0 | ... |
| 10069 | 2.09 | 1.13 | 2 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 4 | ... |
| 10074 | 1.29 | 1.10 | 3 | 2 | 2 | 0 | 3 | 2 | 1 | 2 | 0 | ... |
| 10075 | 0.94 | 0.60 | 2 | 0 | 0 | 1 | 0 | 0 | 1 | 3 | 2 | ... |
| 10076 | 1.70 | 1.40 | 2 | 2 | 4 | 0 | 0 | 1 | 2 | 3 | 1 | ... |
| 10077 | 1.06 | 0.61 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | ... |
| 10099 | 1.28 | 0.81 | 0 | 0 | 1 | 0 | 0 | 4 | 0 | 2 | 2 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Table 2: One ID (rows: request samples of ID 10010, columns: Ref ID; the Ref 10010 column is the identical-ID match and is omitted from the calculations)
| Row | Variance | Mean | Ref 10010 | Ref 10013 | Ref 10014 | Ref 10016 | Ref 10022 | Ref 10048 | Ref 10054 | Ref 10059 | Ref 10061 | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance (per Ref ID) | | | 162.51 | 2.83 | 1.10 | 0.85 | 1.17 | 0.47 | 0.57 | 2.40 | 0.83 | ... |
| Mean (per Ref ID) | | | 89.45 | 1.77 | 1.10 | 1.24 | 1.92 | 0.35 | 0.98 | 1.94 | 0.77 | ... |
| Req sample | 1.30 | 1.22 | 100 | 3 | 2 | 2 | 2 | 1 | 1 | 2 | 3 | ... |
| Req sample | 0.99 | 0.66 | 20 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... |
| Req sample | 1.71 | 1.45 | 100 | 2 | 2 | 1 | 2 | 0 | 1 | 0 | 0 | ... |
| Req sample | 1.33 | 1.22 | 74 | 2 | 2 | 1 | 5 | 0 | 1 | 2 | 0 | ... |
| Req sample | 1.45 | 0.91 | 100 | 2 | 1 | 3 | 1 | 0 | 0 | 2 | 0 | ... |
| Req sample | 1.42 | 1.09 | 98 | 3 | 2 | 0 | 2 | 2 | 1 | 4 | 1 | ... |
| Req sample | 1.87 | 1.09 | 100 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | ... |
| Req sample | 1.74 | 1.12 | 88 | 2 | 0 | 0 | 3 | 0 | 1 | 3 | 0 | ... |
| Req sample | 1.66 | 1.64 | 91 | 5 | 3 | 1 | 2 | 1 | 1 | 4 | 1 | ... |
| Req sample | 1.67 | 1.19 | 93 | 1 | 0 | 2 | 1 | 0 | 1 | 2 | 0 | ... |
| Req sample | 0.76 | 0.70 | 76 | 3 | 1 | 1 | 2 | 0 | 0 | 3 | 1 | ... |
| Req sample | 1.57 | 0.91 | 75 | 2 | 0 | 2 | 1 | 0 | 0 | 3 | 0 | ... |
| Req sample | 1.44 | 1.12 | 100 | 2 | 2 | 0 | 1 | 0 | 0 | 2 | 3 | ... |
| Req sample | 2.16 | 1.55 | 83 | 4 | 2 | 0 | 4 | 0 | 1 | 5 | 1 | ... |
| Req sample | 2.34 | 1.43 | 100 | 2 | 3 | 2 | 3 | 0 | 2 | 3 | 0 | ... |
| Req sample | 1.67 | 1.15 | 100 | 2 | 2 | 3 | 1 | 0 | 1 | 2 | 0 | ... |
| Req sample | 1.49 | 1.24 | 75 | 3 | 2 | 0 | 2 | 1 | 0 | 2 | 1 | ... |
| Req sample | 1.33 | 0.87 | 82 | 2 | 0 | 2 | 1 | 0 | 2 | 0 | 0 | ... |
| Req sample | 2.23 | 1.36 | 97 | 2 | 0 | 1 | 3 | 0 | 1 | 2 | 0 | ... |
| Req sample | 1.48 | 1.00 | 71 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Results
From the score matrices the main statistical
properties have been calculated using the corresponding Excel functions.
The results are given in Table 3:
Table 3

One sample
Ref ID mean: average of one Ref ID over all Req IDs
Req ID mean: average of one Req ID over all Ref IDs
Mean of Ref ID means: average of all Ref ID means, etc.
| Quantity | Value |
|---|---|
| Global score mean | 0.92 |
| Global score variance | 1.43 |
| Global score sigma | 1.19 |
| Mean of Ref ID means | 0.92 |
| Variance of Ref ID means | 0.13 |
| Sigma of Ref ID means | 0.36 |
| Mean of Ref ID variances | 1.32 |
| SQRT of mean of Ref ID variances | 1.15 |
| Variance of Ref ID variances | 0.21 |
| Sigma of Ref ID variances | 0.46 |
| Mean of Req ID means | 0.92 |
| Variance of Req ID means | 0.12 |
| Sigma of Req ID means | 0.35 |
| Mean of Req ID variances | 1.33 |
| SQRT of mean of Req ID variances | 1.15 |
| Variance of Req ID variances | 0.26 |
| Sigma of Req ID variances | 0.51 |
| Number of Ref IDs | 68 |
| Number of Req IDs | 81 |

One ID
Ref ID mean: average of one ID over all samples
Req sample mean: average of one sample over all IDs
Mean of Ref ID means: average of all Ref ID means, etc.
| Quantity | Value |
|---|---|
| Global score mean | 1.18 |
| Global score variance | 1.67 |
| Global score sigma | 1.29 |
| Mean of Ref ID means | 1.18 |
| Variance of Ref ID means | 0.63 |
| Sigma of Ref ID means | 0.80 |
| Mean of Ref ID variances | 1.05 |
| SQRT of mean of Ref ID variances | 1.03 |
| Variance of Ref ID variances | 0.40 |
| Sigma of Ref ID variances | 0.63 |
| Mean of Req sample means | 1.18 |
| Variance of Req sample means | 0.04 |
| Sigma of Req sample means | 0.20 |
| Mean of Req sample variances | 1.65 |
| SQRT of mean of Req sample variances | 1.29 |
| Variance of Req sample variances | 0.11 |
| Sigma of Req sample variances | 0.33 |
| Number of Ref IDs | 68 |
| Number of Req samples | 100 |
The following observations can be derived from Tables 1, 2, and 3:
1. The global mean and variance estimates of the "one sample" case and
   the "one ID" case are "significantly" different, which seems to favor
   the "one sample" case (Table 3). However, this may be an artifact of
   the fact that the "one ID" case is ID-specific whereas the "one sample"
   case is not.
2. In the "one sample" case, the only "significant" deviation between the
   column and row distributions of the score values seems to lie in the
   measurement errors, which are represented by "Sigma of Ref ID variances"
   and "Sigma of Req ID variances" and can be explained by the different
   numbers of IDs (Table 3).
3. The global score means within one case (Table 3) are equal to the means
   of the row and column means. (This is trivial.)
4. The sum of the variance of the column / row means and the mean of the
   column / row variances approaches the global score variance (the square
   of the global score sigma):
      0.13 + 1.32 = 1.45 ~ 1.43   (one sample, Ref IDs)
      0.12 + 1.33 = 1.45 ~ 1.43   (one sample, Req IDs)
      0.63 + 1.05 = 1.68 ~ 1.67   (one ID, Ref IDs)
      0.04 + 1.65 = 1.69 ~ 1.67   (one ID, Req samples)
5. The variance of the means is small while the mean of the variances is
   near the global variance in both cases. (In theory, the variance of the
   means should approach zero and the mean of the variances should approach
   the global variance for sufficiently large numbers of scores.)
6. In the "one ID" case there is an extreme difference between the Ref ID
   results and the Req sample results. In particular, the variance of the
   Ref ID means is unusually high (0.63).
7. In the "one ID" case the mean of the row standard deviations (1.29,
   estimated as the square root of the mean of the Req sample variances)
   is significantly higher than the mean of the column standard deviations
   (1.03, estimated as the square root of the mean of the Ref ID variances).
8. In the "one ID" case the variance of the Req sample variances (0.11) is
   considerably smaller than the variance of the Ref ID variances (0.40).
9. Suppose a test had been made with one request sample per ID (the first
   one in Table 2) and 68 - 1 reference IDs. Then the measured mean value
   would have been 1.22 (compared to 1.18 ± 0.20) and the measured standard
   deviation 1.14 (compared to 1.29).
10. Suppose a test had been made with only one request ID (the first one in
    Table 2) and one reference ID (the second one in Table 2) but 100
    request samples. Then the measured mean value would have been 1.77
    (compared to 1.18 ± 0.80) and the measured standard deviation 1.68
    (compared to 1.03).
11. A second observation from these two scenarios is that the measurement
    error (standard deviation) for the first test scenario is significantly
    smaller than for the second one:
Table 4: Results for different test scenarios
| Calculation method | Mean of impostor score | Sigma of measurement error | Variance of impostor score | Sigma of measurement error |
|---|---|---|---|---|
| Global average: 6700 samples | 1.18 | | 1.67 | |
| ID average: 67 IDs, 100 samples | 1.18 | 0.20 | 1.65 | 0.33 |
| Test scenario 1: 67 IDs, 1 sample | 1.22 | (0.20) | 1.30 | (0.33) |
| Sample average: 67 IDs, 100 samples | 1.18 | 0.80 | 1.05 | 0.63 |
| Test scenario 2: 1 ID, 100 samples | 1.77 | (0.80) | 2.83 | (0.63) |
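The near-equalities between the variance sums and the global variances noted in the observations above are an instance of the law of total variance: the global variance splits into the variance of the group means plus the mean of the group variances. The sketch below uses made-up scores and hypothetical names. It illustrates that the identity is exact for population variances (denominator n) with equally sized groups, but not for sample variances (denominator n - 1); if the spreadsheet function computes the latter, this may account for small gaps such as 1.45 versus 1.43.

```python
from statistics import fmean, pvariance, variance


def decompose(groups, var=pvariance):
    """Return (variance of group means, mean of group variances, global variance)."""
    means = [fmean(g) for g in groups]
    all_scores = [s for g in groups for s in g]
    return var(means), fmean(var(g) for g in groups), var(all_scores)


# Made-up impostor scores: one list per reference ID, equal group sizes.
groups = [[0, 1, 2, 1], [3, 2, 4, 3], [0, 0, 1, 1]]

# Population variances: the two parts add up to the global variance exactly.
between, within, total = decompose(groups)
assert abs(between + within - total) < 1e-12

# Sample variances: the sum of the two parts differs from the global variance.
between_s, within_s, total_s = decompose(groups, var=variance)
gap = between_s + within_s - total_s
```

With only three small groups the gap is large; it shrinks as the number of groups and the number of scores per group grow.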
From these observations we may conclude:
- There are three kinds of impostor distributions, which are not identical:
  one ID to one ID, one ID to many IDs, and many IDs to many IDs
  (observations 1, 6, 7, 8, 11).
- Request and reference prints seem to behave identically in this trial
  (observation 2).
- The one-to-many impostor distribution (mean variance = 1.32) seems to be
  narrower than the many-to-many distribution (global variance = 1.43).
- The one-to-one impostor distribution (mean variance = 1.05) seems to be
  narrower than the one-to-many distribution (variance = 1.67).
Conclusions
- Although the number of IDs is smaller than the number of prints per ID,
  a test based on 67 IDs and 1 sample delivers more accurate results than
  a test based on 1 ID and 100 samples. For the planning of tests this
  means that it is more advisable to have a large number of participants
  than a large number of samples per finger (although the latter is easier
  to achieve).
- Due to the strong personal influences, each ID must be represented by
  the same number of samples when calculating global characteristics.
  Alternatively, the mean value of personal characteristics may be taken,
  provided it delivers the desired result. (Example: The average over all
  Ref ID means delivers an unbiased global mean value. This is not true
  for the average over all Ref ID variances, which does not estimate the
  correct global variance. There should be no problem when calculating the
  impostor distribution or the FAR from the personal impostor distributions
  or the personal FARs by averaging.)
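The point about equal sample counts can be illustrated with a small sketch. The scores, the ID labels, the threshold of 3, and the convention that a score at or above the threshold counts as an acceptance are all assumptions made for this example only.

```python
from statistics import fmean


def far(scores, threshold):
    """Fraction of impostor scores accepted (assumed: score >= threshold)."""
    return sum(s >= threshold for s in scores) / len(scores)


threshold = 3  # arbitrary, for illustration only

# Made-up impostor scores per ID, same number of samples for each ID.
per_id = {"A": [0, 1, 3, 0], "B": [4, 3, 1, 5]}
mean_of_personal = fmean(far(s, threshold) for s in per_id.values())
pooled = far([s for scores in per_id.values() for s in scores], threshold)
# With equal sample counts both estimates agree.

# Same IDs, but ID "B" now contributes three times as many samples.
unbalanced = {"A": [0, 1, 3, 0], "B": [4, 3, 1, 5] * 3}
pooled_unbalanced = far(
    [s for scores in unbalanced.values() for s in scores], threshold
)
# The pooled estimate shifts towards ID "B"; the mean of the personal FARs
# would stay the same.
```

Averaging the personal FARs weights every ID equally regardless of how many samples it contributes, which is why it sidesteps the requirement of equal sample counts.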
Comments
All results are based on mean values and standard deviations, which
represent the position and the width of the score distribution,
respectively. (If the distribution type were known a priori, the
distribution could possibly be determined completely by its mean and
standard deviation.) To obtain a low FAR, the mean of the impostor score
distribution should be as low as possible and, simultaneously, the standard
deviation of the impostor score distribution should be as small as
possible. (This is a necessary condition, but it is not sufficient unless
the distribution function has a known simple form. In particular, the tails
of the distribution, which are most important for small FAR and FRR values,
may show remarkable deviations.)
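To illustrate why mean and standard deviation alone may not capture the tails, the following sketch compares the FAR counted directly from a set of scores with the FAR predicted by a normal distribution of the same mean and sigma. The scores are made up, the normal model is only one possible assumption about the distribution type, and the threshold and function names are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def empirical_far(scores, threshold):
    """Fraction of impostor scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def gaussian_far(scores, threshold):
    """FAR predicted by a normal distribution with the same mean and sigma."""
    model = NormalDist(fmean(scores), pstdev(scores))
    return 1.0 - model.cdf(threshold)


# Made-up, skewed impostor scores: mostly 0..2 with a single outlier at 9.
scores = [0] * 50 + [1] * 30 + [2] * 19 + [9]
threshold = 8

observed = empirical_far(scores, threshold)   # counted from the data
predicted = gaussian_far(scores, threshold)   # from mean and sigma only
```

For this artificial data set the normal model predicts a far smaller FAR than the one counted from the scores, because a single high-scoring impostor barely moves the mean and sigma but dominates the tail.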