Reproducing PP-YOLO from PaddleDetection: Training, Testing, Fixing the Official-COCO-Labels-at-Test-Time Issue, Plus a Batch Testing Script

Overview

Project Introduction

Code link: https://github.com/PaddlePaddle/PaddleDetection
PaddleDetection, released by PaddlePaddle, is an end-to-end object detection development kit designed to help developers complete the whole pipeline, from model training through accuracy/speed optimization to deployment, faster and better. It implements many mainstream detection algorithms with a modular design, provides rich data augmentation, network components, and loss-function modules, and integrates model compression and cross-platform high-performance deployment. Projects already in production on top of PaddleDetection span industrial quality inspection, remote-sensing image detection, unmanned inspection, and other fields.
PaddleDetection recently released PP-YOLO, a model leading in both accuracy and speed: 45.2% mAP on the COCO dataset and 72.9 FPS on a single Tesla V100. The figure below shows it outperforming YOLOv4 across the board.
[Figure: accuracy/speed comparison on COCO showing PP-YOLO above YOLOv4]

Environment Setup

1. Go to https://github.com/PaddlePaddle/PaddleDetection and install the PaddlePaddle release matching your operating system (PaddlePaddle Fluid v1.6).
Note that the CUDA and cuDNN dependencies must be installed first: [CUDA installation guide](https://blog.csdn.net/wanzhen4330/article/details/81699769)
CUDA >= 8.0
cuDNN >= 5.0
nccl >= 2.1.2
2. Install the COCO API
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
# Install Cython if it is not already installed
pip install Cython
# Install to the global site-packages
make install
# If you lack permissions or prefer not to install to the global site-packages
python setup.py install --user
# Or install directly via pip
pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
3. Clone PaddleDetection with the following commands:
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git
4. Install the Python dependencies
cd into the PaddleDetection directory
pip install -r requirements.txt
5. Quick test
python ppdet/modeling/tests/test_architectures.py

If you see output like the following, the installation succeeded.

(py36) z@z-SYS-4028GR-TR2:~/zuo/PaddleDetection$ python ppdet/modeling/tests/test_architectures.py
ss..........
----------------------------------------------------------------------
Ran 12 tests in 3.133s
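
As an extra sanity check, you can also verify the PaddlePaddle installation itself; a minimal sketch, assuming the Fluid 1.x API (fluid.install_check ships with recent 1.x releases):

import paddle.fluid as fluid

# Runs a tiny program on the available device and reports whether Fluid works
fluid.install_check.run_check()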

Dataset Preparation

The Pascal VOC dataset directory layout is as follows:
dataset/voc/
├── trainval.txt
├── test.txt
├── label_list.txt (optional)
├── VOCdevkit/VOC2007
│   ├── Annotations
│       ├── 001789.xml
│       |   ...
│   ├── JPEGImages
│       ├── 001789.jpg
│       |   ...
│   ├── ImageSets
│       |   ...
├── VOCdevkit/VOC2012
│   ├── Annotations
│       ├── 2011_003876.xml
│       |   ...
│   ├── JPEGImages
│       ├── 2011_003876.jpg
│       |   ...
│   ├── ImageSets
│       |   ...
|   ...

The COCO layout is as follows:
dataset/coco/
├── annotations
│   ├── instances_train2014.json
│   ├── instances_train2017.json
│   ├── instances_val2014.json
│   ├── instances_val2017.json
│   |   ...
├── train2017
│   ├── 000000000009.jpg
│   ├── 000000580008.jpg
│   |   ...
├── val2017
│   ├── 000000000139.jpg
│   ├── 000000000285.jpg
│   |   ...
|   ...

Convert VOC annotations to COCO format:

python tools/x2coco.py \
        --dataset_type voc \
        --voc_anno_dir path/to/VOCdevkit/VOC2007/Annotations/ \
        --voc_anno_list path/to/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt \
        --voc_label_list dataset/voc/label_list.txt \
        --voc_out_name voc_train.json

Following this official command, I rewrote the script's argument defaults in PyCharm as follows:

import argparse

def main():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--dataset_type', help='the type of dataset',default='voc')
    parser.add_argument('--json_input_dir', help='input annotated directory')
    parser.add_argument('--image_input_dir', help='image directory')
    parser.add_argument(
        '--output_dir', help='output dataset directory', default='')
    parser.add_argument(
        '--train_proportion',
        help='the proportion of train dataset',
        type=float,
        default=1.0)
    parser.add_argument(
        '--val_proportion',
        help='the proportion of validation dataset',
        type=float,
        default=0.0)
    parser.add_argument(
        '--test_proportion',
        help='the proportion of test dataset',
        type=float,
        default=0.0)
    parser.add_argument(
        '--voc_anno_dir',
        help='In Voc format dataset, path to annotation files directory.',
        type=str,
        default='./VOCdevkit/VOC2007/Annotations/')
    parser.add_argument(
        '--voc_anno_list',
        help='In Voc format dataset, path to annotation files ids list.',
        type=str,
        default='VOCdevkit/VOC2007/ImageSets/Main/trainval.txt')
    parser.add_argument(
        '--voc_label_list',
        help='In Voc format dataset, path to label list. The content of each line is a category.',
        type=str,
        default='./label_list.txt')
    parser.add_argument(
        '--voc_out_name',
        type=str,
        default='voc_train.json',
        help='In Voc format dataset, path to output json file')
    args = parser.parse_args()
    # ... the rest of main() (the actual conversion logic) is unchanged from the official tools/x2coco.py

Running it generates voc_train.json under tools/.
The conversion process looks like:

Start converting !
100%|██████████| 153/153 [00:00<00:00, 10922.29it/s]

Generate voc_test.json and voc_val.json the same way,
then place them into the COCO directory layout shown above:
i.e. tools/datasets/coco/… The dataset is critical; many later errors trace back to it.
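
Because so many errors trace back to the dataset, it is worth sanity-checking each converted JSON with the COCO API installed earlier; a small sketch (the annotation path is an example):

from pycocotools.coco import COCO

# Load a converted annotation file and print basic statistics
coco = COCO('dataset/coco/annotations/voc_train.json')  # example path
print('images:', len(coco.getImgIds()))
print('annotations:', len(coco.getAnnIds()))
print('categories:', [c['name'] for c in coco.loadCats(coco.getCatIds())])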

Model Selection

PaddleDetection provides a rich model zoo; per-model metrics are listed there, so you can pick a model that fits the compute available for deployment.
With limited compute, use a mobile model; PaddleDetection's mobile models have been iteratively optimized and offer high cost-effectiveness.
With ample compute, use a server-side model, PaddleDetection's practical detection solution for servers. You can also choose by application scenario.
For small-object detection, use a two-stage model such as the Faster RCNN series, also found in the model zoo.
The enhanced YOLOv3 models, YOLOv4, and the anchor-free models developed in PaddleDetection are further options.

After choosing a model, find its configuration file in the configs directory. To train on a custom dataset, a few parameter changes are needed:


    COCO dataset:

  dataset:
     !COCODataSet
     image_dir: val2017  # image directory, relative to the dataset root
     anno_path: annotations/instances_val2017.json  # annotation file, relative to the dataset root
     dataset_dir: dataset/coco  # dataset root directory
     with_background: true  # whether background counts as a class; default is true

Here I chose ppyolo_r18vd.yml; after modifying the paths and classes, training can begin.

Modify the maximum number of iterations:
max_iters = epochs × training-set size / (2 × number of GPUs), where the 2 is evidently the per-GPU batch size.
For example, with 200 training images, one GPU, and epochs = 12: max_iters = 12 × 200 / 2 = 1200.
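
A throwaway helper for this arithmetic (a sketch; the factor of 2 is the per-GPU batch size from the formula above):

def compute_max_iters(epochs, num_train_images, num_gpus=1, batch_size_per_gpu=2):
    # max_iters = epochs * dataset_size / (batch_size_per_gpu * num_gpus)
    return epochs * num_train_images // (batch_size_per_gpu * num_gpus)

print(compute_max_iters(epochs=12, num_train_images=200))  # -> 1200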

Although I chose ppyolo_r18vd.yml, it is based on ppyolo_reader.yml.
Find this section there and modify it; otherwise inference will show the official COCO class labels:

TestReader:
  inputs_def:
    image_shape: [3, 608, 608]
    fields: ['image', 'im_size', 'im_id']
  dataset:
    !ImageFolder
      #anno_path: annotations/instances_val2017.json
      # modified:
      anno_path: dataset/coco/annotations/instances_val2017.json
      with_background: false
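
The class names shown at inference come from the categories field of the file referenced by anno_path, which is easy to confirm; a small sketch (path as in the config above):

import json

# Print the class names the TestReader will attach to detections
with open('dataset/coco/annotations/instances_val2017.json') as f:
    cats = json.load(f)['categories']
print([c['name'] for c in cats])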

Training and Testing

In PyCharm's run-configuration parameters field, enter:

-c /home/z/zuo/PaddleDetection/configs/ppyolo/ppyolo_r18vd.yml --eval
Training output:
2020-09-26 16:16:06,211-INFO: places would be ommited when DataLoader is not iterable
W0926 16:16:06.233088 11782 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 11.0, Runtime API Version: 10.0
W0926 16:16:06.236994 11782 device_context.cc:260] device: 0, cuDNN Version: 7.6.
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-09-26 16:16:14,963-INFO: places would be ommited when DataLoader is not iterable
I0926 16:16:19.645217 11782 build_strategy.cc:361] set enable_sequential_execution:1
W0926 16:16:19.923936 11782 fuse_all_reduce_op_pass.cc:74] Find all_reduce operators: 88. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 72.
2020-09-26 16:16:29,817-INFO: iter: 0, lr: 0.000000, 'loss_xy': '1.268840', 'loss_wh': '5.101564', 'loss_obj': '3439.274658', 'loss_cls': '0.728997', 'loss_iou': '5.795195', 'loss': '3452.169189', time: 0.000, eta: 0:00:00
2020-09-26 16:16:43,796-INFO: iter: 20, lr: 0.000020, 'loss_xy': '1.486864', 'loss_wh': '5.368582', 'loss_obj': '527.408203', 'loss_cls': '0.749239', 'loss_iou': '6.396237', 'loss': '540.208008', time: 1.408, eta: 0:46:28
2020-09-26 16:17:00,762-INFO: iter: 40, lr: 0.000040, 'loss_xy': '1.548631', 'loss_wh': '4.606500', 'loss_obj': '14.239037', 'loss_cls': '0.723093', 'loss_iou': '6.582170', 'loss': '27.592823', time: 0.859, eta: 0:28:03

After training, an output folder containing the saved checkpoints is created under the current tools directory. These are not the final model for deployment; they must be exported with
tools/export_model.py
Enter the following in PyCharm's run parameters field, then run it.

-c configs/ppyolo/ppyolo_r18vd.yml --output_dir=./inference_model

If a model-not-found error occurs, you may need to edit the config's weights line, which reads something like:

weights: output/ppyolo_tiny/model_final

Change that final checkpoint name to the one actually saved, which is likely:

weights: output/ppyolo_r18vd/best_model

Running it creates inference_model, which contains the deployable model.

Output:
2020-09-24 15:55:40,178-INFO: save_inference_model pruned unused feed variables im_id
2020-09-24 15:55:40,178-INFO: Export inference model to ./inference_model/ppyolo_r18vd, input: ['image', 'im_size'], output: ['matrix_nms_0.tmp_0']...
2020-09-24 15:55:40,495-INFO: Load categories from dataset/coco/annotations/instances_val2017.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-09-24 15:55:40,757-INFO: Export inference config file to ./inference_model/ppyolo_r18vd/infer_cfg.yml
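
The exported directory typically holds the serialized program, the parameters, and infer_cfg.yml; a quick hedged check (the file names follow fluid's save_inference_model convention and may differ across versions):

import os

# List the exported files; commonly '__model__', '__params__' and 'infer_cfg.yml'
print(os.listdir('./inference_model/ppyolo_r18vd'))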

For testing, use deploy/python/infer.py.
Again, enter the parameters in PyCharm:

--model_dir=./inference_model/ppyolo_r18vd --image_file=path/to/dataset/2572.jpeg --use_gpu=True
Output:
image_file: /home/z/zuo/PaddleDetection/tools/dataset/coco/test2017/000001.bmp
model_dir: /home/z/zuo/PaddleDetection/tools/inference_model/ppyolo_r18vd
output_dir: output
run_benchmark: False
run_mode: fluid
threshold: 0.5
use_gpu: True
video_file: 
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Use Padddle Executor: False
Transform Order: 
--transform op: Resize
--transform op: Normalize
--transform op: Permute
--------------------------------------------
Inference: 6.660938262939453 ms per batch image
class_id:0, confidence:0.9825,left_top:[471.25,312.21], right_bottom:[689.18,330.45]
save result to: output/000001.bmp

Batch Testing

Modify infer.py.
You can also copy infer.py into a new file and modify the copy,
so that a mistake cannot break the original.
1. Find this line and set the default to the folder of images to test:

parser.add_argument(
  "--image_file", type=str, default='/home/z/zuo/PaddleDetection/tools/dataset/coco/test2017', help="Path of image file.")

2. Modify the predict_image() function:

def predict_image():
    # Batch testing: run inference on every file in the image folder
    detector = Detector(
        FLAGS.model_dir, use_gpu=FLAGS.use_gpu, run_mode=FLAGS.run_mode)
    if FLAGS.run_benchmark:
        # added: loop over the folder instead of a single image
        for filename in os.listdir(FLAGS.image_file):
            path = os.path.join(FLAGS.image_file, filename)
            detector.predict(
                path,
                FLAGS.threshold,
                warmup=100,
                repeats=100,
                run_benchmark=True)
        # original single-image call:
        # detector.predict(
        #     FLAGS.image_file,
        #     FLAGS.threshold,
        #     warmup=100,
        #     repeats=100,
        #     run_benchmark=True)
    else:
        for filename in os.listdir(FLAGS.image_file):
            path = os.path.join(FLAGS.image_file, filename)
            results = detector.predict(path, FLAGS.threshold)
            visualize(
                path,
                results,
                detector.config.labels,
                mask_resolution=detector.config.mask_resolution,
                output_dir=FLAGS.output_dir)
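
Note that os.listdir returns every entry in the folder; if the directory might hold non-image files, a small filter (a sketch; the extension list is an assumption) keeps predict from choking on them:

# Keep only common image extensions before looping over the folder
IMG_EXTS = ('.jpg', '.jpeg', '.png', '.bmp')
image_files = [f for f in os.listdir(FLAGS.image_file)
               if f.lower().endswith(IMG_EXTS)]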

Again, enter the parameter in PyCharm (the image folder now comes from the default set in step 1):

--model_dir=/home/z/zuo/PaddleDetection/tools/inference_model/ppyolo_r18vd
Test output (an excerpt):
Inference: 6.028890609741211 ms per batch image
class_id:0, confidence:0.9979,left_top:[405.45,265.05], right_bottom:[423.90,274.10]
class_id:0, confidence:0.9970,left_top:[81.94,309.99], right_bottom:[109.75,319.11]
class_id:0, confidence:0.9920,left_top:[329.64,352.03], right_bottom:[358.91,361.85]
save result to: output/000476.bmp
Inference: 5.766630172729492 ms per batch image
class_id:0, confidence:0.9989,left_top:[239.34,320.03], right_bottom:[310.86,330.82]
class_id:0, confidence:0.9966,left_top:[65.20,318.06], right_bottom:[92.88,327.04]
class_id:0, confidence:0.9843,left_top:[435.73,322.93], right_bottom:[467.24,332.08]
class_id:0, confidence:0.9819,left_top:[339.03,321.02], right_bottom:[371.97,330.75]
save result to: output/000462.bmp
Inference: 5.4779052734375 ms per batch image
class_id:0, confidence:0.9977,left_top:[240.46,323.91], right_bottom:[299.29,334.81]
class_id:0, confidence:0.9958,left_top:[182.27,328.96], right_bottom:[212.92,335.92]
class_id:0, confidence:0.9956,left_top:[383.70,326.88], right_bottom:[409.83,335.94]
class_id:0, confidence:0.9884,left_top:[117.74,327.63], right_bottom:[152.39,335.65]
save result to: output/000770.bmp
Inference: 8.993148803710938 ms per batch image
class_id:0, confidence:0.9973,left_top:[591.84,326.17], right_bottom:[662.69,335.04]
class_id:0, confidence:0.9960,left_top:[441.39,322.97], right_bottom:[543.67,336.99]
save result to: output/000290.bmp

Process finished with exit code 0

To also write the detections to a txt file, parts of the visualize code (deploy/python/visualize.py) need changes:

# coding: utf-8
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import division
import os
import cv2
import numpy as np
from PIL import Image, ImageDraw


def visualize_box_mask(pic_path, results, labels, mask_resolution=14):
    """
    Args:
        pic_path (str/np.ndarray): path of image, or np.ndarray read by cv2
        results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
                        matrix element:[class, score, x_min, y_min, x_max, y_max]
                        MaskRCNN's results include 'masks': np.ndarray:
                        shape:[N, class_num, mask_resolution, mask_resolution]
        labels (list): labels:['class1', ..., 'classn']
        mask_resolution (int): shape of a mask is:[mask_resolution, mask_resolution]
    Returns:
        im (PIL.Image.Image): visualized image
    """
    if isinstance(pic_path, str):
        im = Image.open(pic_path).convert('RGB')
    else:
        im = Image.fromarray(pic_path)
    if 'masks' in results and 'boxes' in results:
        im = draw_mask(
            im,
            results['boxes'],
            results['masks'],
            labels,
            resolution=mask_resolution)
    if 'boxes' in results:
        im = draw_box(im, pic_path, results['boxes'], labels)
    return im


def get_color_map_list(num_classes):
    """ 
    Args:
        num_classes (int): number of class
    Returns:
        color_map (list): RGB color list 
    """
    color_map = num_classes * [0, 0, 0]
    for i in range(0, num_classes):
        j = 0
        lab = i
        while lab:
            color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
            j += 1
            lab >>= 3
    color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
    return color_map


def expand_boxes(boxes, scale=0.0):
    """ 
    Args:
        boxes (np.ndarray): shape:[N,4], N:number of box,
                            matrix element:[x_min, y_min, x_max, y_max]
        scale (float): scale of boxes
    Returns:
        boxes_exp (np.ndarray): expanded boxes
    """
    w_half = (boxes[:, 2] - boxes[:, 0]) * .5
    h_half = (boxes[:, 3] - boxes[:, 1]) * .5
    x_c = (boxes[:, 2] + boxes[:, 0]) * .5
    y_c = (boxes[:, 3] + boxes[:, 1]) * .5
    w_half *= scale
    h_half *= scale
    boxes_exp = np.zeros(boxes.shape)
    boxes_exp[:, 0] = x_c - w_half
    boxes_exp[:, 2] = x_c + w_half
    boxes_exp[:, 1] = y_c - h_half
    boxes_exp[:, 3] = y_c + h_half
    return boxes_exp


def draw_mask(im, np_boxes, np_masks, labels, resolution=14, threshold=0.5):
    """ 
    Args:
        im (PIL.Image.Image): PIL image
        np_boxes (np.ndarray): shape:[N,6], N: number of box,
                               matrix element:[class, score, x_min, y_min, x_max, y_max]
        np_masks (np.ndarray): shape:[N, class_num, resolution, resolution]
        labels (list): labels:['class1', ..., 'classn']
        resolution (int): shape of a mask is:[resolution, resolution]
        threshold (float): threshold of mask
    Returns:
        im (PIL.Image.Image): visualized image  
    """
    color_list = get_color_map_list(len(labels))
    scale = (resolution + 2.0) / resolution
    im_w, im_h = im.size
    w_ratio = 0.4
    alpha = 0.7
    im = np.array(im).astype('float32')
    rects = np_boxes[:, 2:]
    expand_rects = expand_boxes(rects, scale)
    expand_rects = expand_rects.astype(np.int32)
    clsid_scores = np_boxes[:, 0:2]
    padded_mask = np.zeros((resolution + 2, resolution + 2), dtype=np.float32)
    clsid2color = {}
    for idx in range(len(np_boxes)):
        clsid, score = clsid_scores[idx].tolist()
        clsid = int(clsid)
        xmin, ymin, xmax, ymax = expand_rects[idx].tolist()
        w = xmax - xmin + 1
        h = ymax - ymin + 1
        w = np.maximum(w, 1)
        h = np.maximum(h, 1)
        padded_mask[1:-1, 1:-1] = np_masks[idx, int(clsid), :, :]
        resized_mask = cv2.resize(padded_mask, (w, h))
        resized_mask = np.array(resized_mask > threshold, dtype=np.uint8)
        x0 = min(max(xmin, 0), im_w)
        x1 = min(max(xmax + 1, 0), im_w)
        y0 = min(max(ymin, 0), im_h)
        y1 = min(max(ymax + 1, 0), im_h)
        im_mask = np.zeros((im_h, im_w), dtype=np.uint8)
        im_mask[y0:y1, x0:x1] = resized_mask[(y0 - ymin):(y1 - ymin), (
            x0 - xmin):(x1 - xmin)]
        if clsid not in clsid2color:
            clsid2color[clsid] = color_list[clsid]
        color_mask = clsid2color[clsid]
        for c in range(3):
            color_mask[c] = color_mask[c] * (1 - w_ratio) + w_ratio * 255
        idx = np.nonzero(im_mask)
        color_mask = np.array(color_mask)
        im[idx[0], idx[1], :] *= 1.0 - alpha
        im[idx[0], idx[1], :] += alpha * color_mask
    return Image.fromarray(im.astype('uint8'))


def draw_box(im, pic_path, np_boxes, labels):
    """
    Args:
        im (PIL.Image.Image): PIL image
        pic_path (str): path of the source image, used to record the filename in the txt output
        np_boxes (np.ndarray): shape:[N,6], N: number of box,
                               matrix element:[class, score, x_min, y_min, x_max, y_max]
        labels (list): labels:['class1', ..., 'classn']
    Returns:
        im (PIL.Image.Image): visualized image
    """
    draw_thickness = min(im.size) // 320
    draw = ImageDraw.Draw(im)
    clsid2color = {}
    color_list = get_color_map_list(len(labels))

    for dt in np_boxes:
        clsid, bbox, score = int(dt[0]), dt[2:], dt[1]
        xmin, ymin, xmax, ymax = bbox
        w = xmax - xmin
        h = ymax - ymin
        if clsid not in clsid2color:
            clsid2color[clsid] = color_list[clsid]
        color = tuple(clsid2color[clsid])

        # draw bbox
        draw.line(
            [(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
             (xmin, ymin)],
            width=draw_thickness,
            fill=color)

        # draw label
        text = "{} {:.4f}".format(labels[clsid], score)
        tw, th = draw.textsize(text)
        draw.rectangle(
            [(xmin + 1, ymin - th), (xmin + tw + 1, ymin)], fill=color)
        draw.text((xmin + 1, ymin - th), text, fill=(255, 255, 255))
        # added: record this detection for the txt output
        # (os.path.basename replaces the original pic_path[-10:] slice, which assumed 10-character filenames)
        obj_list = [os.path.basename(pic_path), score, xmin, ymin, xmax, ymax, labels[clsid]]
        print(obj_list)

        with open('./test.txt', 'a') as f:
            f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f} {:s}\n'.format(
                obj_list[0], float(obj_list[1]), float(obj_list[2]), float(obj_list[3]),
                float(obj_list[4]), float(obj_list[5]), obj_list[6]))
    return im

The resulting test.txt looks like this:

000262.bmp 0.997 627.8 334.9 718.2 349.0 lf
000245.bmp 0.999 147.9 405.0 192.1 417.0 lf
000236.bmp 0.998 143.9 289.0 208.0 300.0 lf
000236.bmp 0.992 95.5 253.0 136.9 260.8 lf
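
A matching sketch for reading test.txt back in (field order exactly as written above), e.g. for a later evaluation step:

def load_detections(txt_path='./test.txt'):
    # Each line: filename score xmin ymin xmax ymax label
    dets = []
    with open(txt_path) as f:
        for line in f:
            name, score, x0, y0, x1, y1, label = line.split()
            dets.append((name, float(score),
                         float(x0), float(y0), float(x1), float(y1), label))
    return dets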
