maskrcnn_benchmark Code Walkthrough: matcher.py

I am 幽默百合, a blogger on 靠谱客. This article walks through matcher.py in maskrcnn_benchmark; I am sharing it here as a reference.

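Preface:

During object detection, every anchor generated by the RPN has to be paired with a ground-truth (gt) box annotated on the original image. In maskrcnn_benchmark this matching is done by matcher.py. The Matcher class defines the matching rule: anchors whose best IoU with the gt boxes is below a low threshold are marked as background, anchors whose best IoU reaches a high threshold are marked as containing an object, and anchors in between form a third category. This three-way labeling feeds the later stages of the RPN. The code, with detailed comments, follows: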

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import torch


class Matcher(object):
    """
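    Matches every anchor extracted by the RPN to an annotated ground-truth (gt) box.
    Each anchor is assigned one gt box; when its best IoU falls below a threshold,
    no gt box is considered found and the anchor is treated as background. Each gt
    box may be assigned to zero or more anchors.
    Matching works on a precomputed M x N IoU matrix (match_quality_matrix), where M
    is the number of gt boxes and N is the number of anchors. Column n holds the IoU
    of anchor n with every gt box; row m holds the IoU of gt box m with every anchor.
    The Matcher returns a tensor of length N giving the status of each anchor:
    -1 for background, -2 for between background and object, or, for an object
    anchor, the index of its matched gt box.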
    This class assigns to each predicted "element" (e.g., a box) a ground-truth
    element. Each predicted element will have exactly zero or one matches; each
    ground-truth element may be assigned to zero or more predicted elements.

    Matching is based on the MxN match_quality_matrix, that characterizes how well
    each (ground-truth, predicted)-pair match. For example, if the elements are
    boxes, the matrix may contain box IoU overlap values.

    The matcher returns a tensor of size N containing the index of the ground-truth
    element m that matches to prediction n. If there is no match, a negative value
    is returned.
    """
    # Anchors whose best IoU is below low_threshold get gt index -1
    BELOW_LOW_THRESHOLD = -1
    # Anchors whose best IoU lies between the two thresholds get gt index -2
    BETWEEN_THRESHOLDS = -2

    def __init__(self, high_threshold, low_threshold, allow_low_quality_matches=False):
        """
        Arguments:
            high_threshold (float): quality values greater than or equal to
                this value are candidate matches. IoU threshold: an anchor is
                considered to contain an object only when its IoU with a gt
                box is at least this value.
            low_threshold (float): a lower quality threshold used to stratify
                matches into three levels:
                1) matches >= high_threshold
                2) BETWEEN_THRESHOLDS matches in [low_threshold, high_threshold)
                3) BELOW_LOW_THRESHOLD matches in [0, low_threshold)
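                IoU threshold: an anchor is treated as background only when its
                best IoU with the gt boxes is below this value.
                Together the two thresholds split all anchors into three classes:
                1) >= high_threshold: object; the match is the real gt box index
                2) < low_threshold: background; the match is BELOW_LOW_THRESHOLD
                3) in between: the match is BETWEEN_THRESHOLDS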
            allow_low_quality_matches (bool): if True, produce additional matches
                for predictions that have only low-quality match candidates. See
                set_low_quality_matches_ for more details.
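                If True, the matcher lets an anchor keep a gt match even when
                its IoU is low. Generated anchors often do not approximate a gt
                box well, so even the best anchor for a given gt box may have an
                IoU below high_threshold, or even below low_threshold, and would
                normally be marked as in-between or background. With
                allow_low_quality_matches=True such anchors keep a gt match.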
        """
        # Fail if low_threshold is greater than high_threshold
        assert low_threshold <= high_threshold
        # Store the arguments as instance attributes
        self.high_threshold = high_threshold
        self.low_threshold = low_threshold
        self.allow_low_quality_matches = allow_low_quality_matches

    def __call__(self, match_quality_matrix):
        """
        Arguments:
            match_quality_matrix (Tensor[float]): an MxN tensor, containing the
            pairwise quality between M ground-truth elements and N predicted elements.
            This is the precomputed M x N IoU matrix between gt boxes and anchors,
            where M is the number of gt boxes and N is the number of anchors.
            Column n holds the IoU of anchor n with every gt box; row m holds the
            IoU of gt box m with every anchor.

        Returns:
            matches (Tensor[int64]): an N tensor where N[i] is a matched gt in
            [0, M - 1] or a negative value indicating that prediction i could not
            be matched.
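            A tensor of length N whose entry i describes anchor i: -1 for
            background, -2 for between background and object, or the index of
            the matched gt box for an object anchor.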
        """
        # An empty IoU matrix is an error; report which side (gt or proposals) is empty
        if match_quality_matrix.numel() == 0:
            # empty targets or proposals not supported during training
            if match_quality_matrix.shape[0] == 0:
                raise ValueError(
                    "No ground-truth boxes available for one of the images "
                    "during training")
            else:
                raise ValueError(
                    "No proposal boxes available for one of the images "
                    "during training")

        # match_quality_matrix is M (gt) x N (predicted)
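        # Take the max over each column, i.e. for every anchor find the gt box with
        # the highest IoU. This yields both the IoU value and the gt box index.
        matched_vals, matches = match_quality_matrix.max(dim=0)
        # If low-quality matches are allowed, keep a copy of the unthresholded matches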
        if self.allow_low_quality_matches:
            all_matches = matches.clone()

        # Build a boolean mask for each kind of anchor
        # Mask of background anchors
        below_low_threshold = matched_vals < self.low_threshold
        # Mask of anchors that fall between background and object
        between_thresholds = (matched_vals >= self.low_threshold) & (
            matched_vals < self.high_threshold
        )
        # Set the gt index of background anchors to BELOW_LOW_THRESHOLD (-1)
        matches[below_low_threshold] = Matcher.BELOW_LOW_THRESHOLD
        # Set the gt index of in-between anchors to BETWEEN_THRESHOLDS (-2)
        matches[between_thresholds] = Matcher.BETWEEN_THRESHOLDS
        # If low-quality matches are allowed, the anchor(s) with the highest IoU for
        # each gt box keep their gt match, even though that IoU may be very small
        if self.allow_low_quality_matches:
            self.set_low_quality_matches_(matches, all_matches, match_quality_matrix)
        # Return the per-anchor vector of matched gt indices
        return matches

    def set_low_quality_matches_(self, matches, all_matches, match_quality_matrix):
        """
        Produce additional matches for predictions that have only low-quality matches.
        Specifically, for each ground-truth find the set of predictions that have
        maximum overlap with it (including ties); for each prediction in that set, if
        it is unmatched, then match it to the ground-truth with which it has the highest
        quality value.
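        Lets an anchor keep a gt match even when its IoU is low. Generated anchors
        often do not approximate a gt box well, so even the best anchor for a given
        gt box may have an IoU below high_threshold, or even below low_threshold,
        and would normally be marked as in-between or background. With
        allow_low_quality_matches=True such anchors keep a gt match.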
        """
        # For each gt box, find the highest IoU it reaches with any anchor
        highest_quality_foreach_gt, _ = match_quality_matrix.max(dim=1)
        # Find highest quality match available, even if it is low, including ties
        # For each gt box, find the (gt index, anchor index) pairs that reach that maximum
        gt_pred_pairs_of_highest_quality = torch.nonzero(
            match_quality_matrix == highest_quality_foreach_gt[:, None]
        )
        # Example gt_pred_pairs_of_highest_quality:
        #   tensor([[    0, 39796],
        #           [    1, 32055],
        #           [    1, 32070],
        #           [    2, 39190],
        #           [    2, 40255],
        #           [    3, 40390],
        #           [    3, 41455],
        #           [    4, 45470],
        #           [    5, 45325],
        #           [    5, 46390]])
        # Each row is a (gt index, prediction index)
        # Note how gt items 1, 2, 3, and 5 each have two ties
        # Keep only the anchor indices of those best-IoU pairs
        pred_inds_to_update = gt_pred_pairs_of_highest_quality[:, 1]
        # Restore the gt match of these anchors, since a low IoU may have caused
        # their gt index to be overwritten with -1 or -2
        matches[pred_inds_to_update] = all_matches[pred_inds_to_update]
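
To make the matching rule concrete, here is a minimal pure-Python sketch that mirrors the logic of `__call__` and `set_low_quality_matches_` on a small hand-written IoU matrix. It is an illustration only: the helper `match_anchors` and the sample values are hypothetical and not part of maskrcnn_benchmark, nested lists stand in for tensors so it runs without PyTorch, and the empty-matrix check is left out.

```python
# Hypothetical re-implementation of the matching rule for illustration.
BELOW_LOW_THRESHOLD = -1
BETWEEN_THRESHOLDS = -2


def match_anchors(iou, high_threshold, low_threshold, allow_low_quality_matches=False):
    """iou is an M x N nested list: M ground-truth rows, N anchor columns."""
    num_gt, num_anchors = len(iou), len(iou[0])

    # Column-wise max: best ground-truth index and IoU for every anchor.
    best_gt = []
    best_val = []
    for n in range(num_anchors):
        column = [iou[m][n] for m in range(num_gt)]
        value = max(column)
        best_val.append(value)
        best_gt.append(column.index(value))

    # Apply the two thresholds to obtain the three kinds of anchors.
    matches = []
    for gt_index, value in zip(best_gt, best_val):
        if value < low_threshold:
            matches.append(BELOW_LOW_THRESHOLD)
        elif value < high_threshold:
            matches.append(BETWEEN_THRESHOLDS)
        else:
            matches.append(gt_index)

    if allow_low_quality_matches:
        # For every ground-truth row, restore the anchors that tie for its best IoU.
        for m in range(num_gt):
            row_max = max(iou[m])
            for n in range(num_anchors):
                if iou[m][n] == row_max:
                    matches[n] = best_gt[n]
    return matches


iou = [
    [0.80, 0.50, 0.10, 0.20],  # ground-truth box 0
    [0.10, 0.20, 0.05, 0.40],  # ground-truth box 1
]
strict = match_anchors(iou, high_threshold=0.7, low_threshold=0.3)
relaxed = match_anchors(iou, 0.7, 0.3, allow_low_quality_matches=True)
print(strict)   # [0, -2, -1, -2]
print(relaxed)  # [0, -2, -1, 1]
```

With the strict setting, ground-truth box 1 ends up with no anchor at all, because its best anchor (index 3) only reaches an IoU of 0.40. Turning on `allow_low_quality_matches` restores that anchor's match, so every ground-truth box keeps at least one anchor.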