VOC AP computation and the effect of the detection confidence threshold


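I am 敏感嚓茶, a blogger on 靠谱客. This article covers how VOC AP is computed and how the choice of detection confidence threshold affects it. I am sharing it here as a reference.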


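Contents

- VOC AP computation and the effect of the detection confidence threshold
  - VOC AP computation
  - Effect of the detection confidence threshold on AP
- Faster RCNN mAP computation code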

VOC AP computation

First, a few definitions.

	Predicted (detection)	Actual (ground truth)
TP (True Positive)	Positive	Positive
FP (False Positive)	Positive	Negative
TN (True Negative)	Negative	Negative
FN (False Negative)	Negative	Positive

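A mnemonic: in a term such as True Positive, the first word says whether the prediction agrees with the ground truth, and the second word says what the prediction was.
True/False are absolute words that can only describe facts; here they state whether the prediction turned out to be correct.
Positive/Negative describe the conclusion reached by a model, algorithm, or instrument (the prediction). Because models and instruments are not fully reliable, these words only express what was claimed. Read this way, the four terms are hard to mix up.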

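An object detector only outputs boxes that it considers positive, so TN is never counted. FN is not tallied per detection either: missed objects enter the computation only through the total number of GT boxes, which is the denominator of recall.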

Suppose the detections are sorted by confidence from high to low, as in the table below:

Detection #	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
IoU > 0.5	T	F	T	F	T	T	T	F	F	T	T	F	T	F	T
tp (+1 on each T)	1	1	2	2	3	4	5	5	5	6	7	7	8	8	9
fp (+1 on each F)	0	1	1	2	2	2	2	3	4	4	4	5	5	6	6

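Suppose the confidence threshold leaves 15 bboxes (lowering the threshold would keep more; this case is analyzed later).
AP is computed over these 15 bboxes. There are 9 GT bboxes in total and 6 false alarms. The recall and precision after the top n detections are: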

n	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
recall:tp/9	1/9	1/9	2/9	2/9	3/9	4/9	5/9	5/9	5/9	6/9	7/9	7/9	8/9	8/9	9/9
precision: tp/(tp+fp)	1/1	1/2	2/3	2/4	3/5	4/6	5/7	5/8	5/9	6/10	7/11	7/12	8/13	8/14	9/15
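
The two tables above can be reproduced with a few lines of NumPy. This is a minimal sketch, not part of the original evaluation code: the names `matches` and `num_gt` are chosen here for illustration, and the T/F flags are copied from the table.

```python
import numpy as np

# IoU > 0.5 flags of the 15 detections, already sorted by descending confidence.
matches = np.array([c == "T" for c in "TFTFTTTFFTTFTFT"])
num_gt = 9  # total number of ground-truth boxes in the example

tp = np.cumsum(matches)    # running count of true positives
fp = np.cumsum(~matches)   # running count of false positives
recall = tp / num_gt
precision = tp / (tp + fp)

for n, (r, p) in enumerate(zip(recall, precision), start=1):
    print(f"n={n:2d}  recall={r:.3f}  precision={p:.3f}")
```

Each printed row corresponds to one column of the recall/precision table, for example recall 9/9 and precision 9/15 at n = 15.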


Effect of the detection confidence threshold on AP

[Figure: precision-recall curve after lowering the confidence threshold]

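Conclusion: the plot shows that lowering the threshold drives precision toward 0. In other words, once recall has reached 1, every extra detection adds another point at recall = 1 with a lower precision.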

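VOC2007 computes AP from 11 fixed recall levels (0, 0.1, ..., 1.0): for each level it takes the maximum precision among the points whose recall is at least that level, then averages the 11 values. Points added by lowering the confidence threshold therefore do not change AP (provided recall has already reached 1 and every added point is a false alarm).
This also explains why AP is computed with a small confidence threshold: it lets recall get as close to 1 as the detector allows, so that as few objects as possible are missed, and the extra false alarms that come with it do not affect the final AP.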
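
The claim can be checked numerically on the 15-detection example. The sketch below is an illustration under stated assumptions: `ap_11_point` is a helper written for this check (it is not the original evaluation function), and the 20 extra false alarms stand in for whatever a lower threshold would let through.

```python
import numpy as np

def ap_11_point(matches, num_gt):
    """11-point AP from match flags that are sorted by descending confidence."""
    matches = np.asarray(matches, dtype=bool)
    tp = np.cumsum(matches)
    fp = np.cumsum(~matches)
    rec = tp / num_gt
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):
        mask = rec >= t
        # Best precision among points whose recall is at least t (0 if none).
        p = prec[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap

base = [c == "T" for c in "TFTFTTTFFTTFTFT"]
extended = base + [False] * 20  # lower threshold: 20 additional false alarms

ap_base = ap_11_point(base, num_gt=9)
ap_ext = ap_11_point(extended, num_gt=9)
print(ap_base, ap_ext)
```

The added points all sit at recall = 1 with precision below 9/15, so the maximum taken at each recall level is unchanged and both calls return the same value.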

Faster RCNN mAP computation code

import numpy as np

# Fragment of the evaluation routine: `lines`, `class_recs`, `npos` and
# `ovthresh` are defined by the surrounding code, which is not shown here.
splitlines = [x.strip().split(' ') for x in lines]
image_ids = [x[0] for x in splitlines]
confidence = np.array([float(x[1]) for x in splitlines])
BB = np.array([[float(z) for z in x[2:]] for x in splitlines])
# sort by confidence
sorted_ind = np.argsort(-confidence)  # argsort is ascending, so negate to get high confidence first
sorted_scores = np.sort(-confidence)  # negated scores, highest confidence first (not used below)
BB = BB[sorted_ind, :]
# Results are stored one bbox per line, so an image with several bboxes
# appears on several lines and image_ids contains repeated ids.
image_ids = [image_ids[x] for x in sorted_ind]
# go down dets and mark TPs and FPs
nd = len(image_ids)
tp = np.zeros(nd)
fp = np.zeros(nd)
for d in range(nd):
    # class_recs maps an image id to its GT record; R holds all GT bboxes
    # of the current class on this image.
    R = class_recs[image_ids[d]]
    bb = BB[d, :].astype(float)
    ovmax = -np.inf
    BBGT = R['bbox'].astype(float)
    if BBGT.size > 0:
        # compute overlaps
        # intersection
        ixmin = np.maximum(BBGT[:, 0], bb[0])
        iymin = np.maximum(BBGT[:, 1], bb[1])
        ixmax = np.minimum(BBGT[:, 2], bb[2])
        iymax = np.minimum(BBGT[:, 3], bb[3])
        iw = np.maximum(ixmax - ixmin + 1., 0.)
        ih = np.maximum(iymax - iymin + 1., 0.)
        inters = iw * ih
        # union
        uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
               (BBGT[:, 2] - BBGT[:, 0] + 1.) *
               (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)
        # IoU of bb with every GT bbox
        overlaps = inters / uni
        ovmax = np.max(overlaps)
        # bb holds the four coordinates of one detection; BBGT holds all
        # ground-truth bboxes on the same image. The GT bbox with the
        # largest IoU is taken as the match candidate for bb.
        jmax = np.argmax(overlaps)
    if ovmax > ovthresh:
        if not R['difficult'][jmax]:
            if not R['det'][jmax]:
                # This GT bbox has not been claimed by a higher-confidence
                # detection yet: true positive.
                tp[d] = 1.
                R['det'][jmax] = 1
            else:
                # The GT bbox was already matched: a duplicate counts as a false positive.
                fp[d] = 1.
    else:
        fp[d] = 1.  # no GT bbox overlaps enough: false positive
# compute precision recall
fp = np.cumsum(fp)
tp = np.cumsum(tp)
rec = tp / float(npos)  # recall
# avoid divide by zero in case the first detection matches a difficult
# ground truth
prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)  # precision
# ap = voc_ap(rec, prec, use_07_metric)
ap = voc_ap(rec, prec, use_07_metric=True)
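
To see the matching rule in isolation, here is a small self-contained sketch. The helper `iou_with_gt`, the `claimed` array, and the toy boxes are all made up for illustration; the sketch keeps the same inclusive-pixel "+1" convention as the code above but leaves out the `difficult` flag.

```python
import numpy as np

def iou_with_gt(bb, bbgt):
    """IoU of one box against an (N, 4) array of boxes.

    Boxes are [xmin, ymin, xmax, ymax] with inclusive pixel coordinates,
    hence the +1 when computing widths and heights.
    """
    ixmin = np.maximum(bbgt[:, 0], bb[0])
    iymin = np.maximum(bbgt[:, 1], bb[1])
    ixmax = np.minimum(bbgt[:, 2], bb[2])
    iymax = np.minimum(bbgt[:, 3], bb[3])
    iw = np.maximum(ixmax - ixmin + 1.0, 0.0)
    ih = np.maximum(iymax - iymin + 1.0, 0.0)
    inters = iw * ih
    union = ((bb[2] - bb[0] + 1.0) * (bb[3] - bb[1] + 1.0)
             + (bbgt[:, 2] - bbgt[:, 0] + 1.0) * (bbgt[:, 3] - bbgt[:, 1] + 1.0)
             - inters)
    return inters / union

# Toy data: one 50x50 ground-truth box and three detections,
# listed in descending confidence order.
gt = np.array([[10.0, 10.0, 59.0, 59.0]])
dets = np.array([
    [10.0, 10.0, 59.0, 59.0],      # exact hit
    [12.0, 12.0, 61.0, 61.0],      # second box on the same object
    [200.0, 200.0, 249.0, 249.0],  # background
])

claimed = np.zeros(len(gt), dtype=bool)  # plays the role of R['det']
labels = []
for bb in dets:
    overlaps = iou_with_gt(bb, gt)
    jmax = int(np.argmax(overlaps))
    if overlaps[jmax] > 0.5 and not claimed[jmax]:
        claimed[jmax] = True
        labels.append("TP")
    else:
        labels.append("FP")

print(labels)  # ['TP', 'FP', 'FP']
```

The second detection overlaps the GT box well (IoU of about 0.85) but is still a false positive, because a higher-confidence detection has already claimed that box.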

voc_ap computation

def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """
    # rec: recall, prec: precision; precision generally falls as recall rises.
    if use_07_metric:
        # 11 point metric: with precision p on the Y axis and recall r on the
        # X axis, take the points (0.0, p0), (0.1, p1), ..., (1.0, p10) and
        # compute ap = (p0 + p1 + ... + p10) / 11.
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                # No point reaches recall t (t is above the maximum recall),
                # so the precision for this level is 0.
                p = 0
            else:
                # Maximum precision among the points with recall >= t.
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))
        mpre = np.concatenate(([0.], prec, [0.]))
        # compute the precision envelope
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]
        # and sum (Delta recall) * prec: the area under the envelope of the PR curve
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap
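
As a usage illustration, the sketch below applies both metrics to the 15-detection example from the first section. `average_precision` is a condensed restatement of the function above, written here only so the snippet runs on its own; the input flags are copied from the example table.

```python
import numpy as np

def average_precision(rec, prec, use_07_metric=False):
    """Condensed restatement of voc_ap for a standalone demo."""
    rec = np.asarray(rec, dtype=float)
    prec = np.asarray(prec, dtype=float)
    if use_07_metric:
        ap = 0.0
        for t in np.arange(0.0, 1.1, 0.1):
            mask = rec >= t
            ap += (prec[mask].max() if mask.any() else 0.0) / 11.0
        return float(ap)
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    # Precision envelope: running maximum taken from right to left.
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    # Sum (delta recall) * precision wherever recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

matches = np.array([c == "T" for c in "TFTFTTTFFTTFTFT"])
tp = np.cumsum(matches)
fp = np.cumsum(~matches)
rec = tp / 9.0
prec = tp / (tp + fp)

ap_11 = average_precision(rec, prec, use_07_metric=True)
ap_area = average_precision(rec, prec, use_07_metric=False)
print(f"11-point AP: {ap_11:.4f}  area AP: {ap_area:.4f}")
```

On this example the 11-point metric comes out at about 0.722 and the area metric at about 0.705, so the two settings of `use_07_metric` are not interchangeable when comparing results.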