Machine Learning in Production: Lab Notes (1) - Model Deployment


Author: 难过黑夜 (posted on 靠谱客)

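These are my study notes for the first ungraded lab of Machine Learning in Production, Andrew Ng's course on DeepLearning.AI. A teacher recommended the course this semester, and after sampling it I paid for it right away: there are not many authoritative machine learning or deep learning courses that stay close to production practice. The series is high quality and I have learned a lot from it. To push myself to keep studying, I will write up every lab. This first one covers model deployment.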

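Ungraded Lab Part 01 - Deploying a Machine Learning Model

Deploy a pre-trained computer vision model, YOLO v3. The steps are:

Inspect the image dataset used for object detection;
Take a look at the model itself;
Deploy the model with FastAPI.

Object detection with YOLO v3

Inspecting the images

First, take a look at the images that YOLO v3 will be run on.

Import the required packages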

from IPython.display import Image, display

Use Image and display from IPython.display to show each image.

# Names of the images to inspect
image_files = [
    'apple.jpg',
    'clock.jpg',
    'oranges.jpg',
    'car.jpg'
]
for image_file in image_files:
    print(f'\nDisplaying image: {image_file}')
    display(Image(filename=f'course1/week1-ungraded-lab/images/{image_file}'))

Displaying image: apple.jpg

[Image: apple.jpg]

Displaying image: clock.jpg

[Image: clock.jpg]

Displaying image: oranges.jpg

[Image: oranges.jpg]

Displaying image: car.jpg

[Image: car.jpg]

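Model overview

Next, we run YOLO v3 locally to detect objects in these images.
First, create a directory images_with_boxes to hold the detection results. YOLO v3 encloses each object it recognises in a rectangular bounding box, assigns it a label, and returns the box coordinates, the label, and the detection confidence. With those results we can draw the boxes and labels on the original image and save the result to the output directory as a new image.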

import os
dir_name = 'course1/week1-ungraded-lab/images_with_boxes'
if not os.path.exists(dir_name):
    os.makedirs(dir_name)
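The check-then-create pattern above recurs several times in this lab. As a minimal sketch, it can be collapsed into one call with the exist_ok flag of os.makedirs. The helper name ensure_dir is hypothetical and not part of the lab, and the demo uses a temporary directory rather than the course folder.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path

# Demonstrate in a throwaway temporary directory (an assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'images_with_boxes')
    ensure_dir(target)
    ensure_dir(target)  # the second call is a no-op and raises no exception
    created = os.path.isdir(target)
```

Either form works in a notebook; the single call just avoids repeating the existence check.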

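Next, define a function detect_and_draw_box(filename, model, confidence) that does what was just described: run YOLO v3 object detection, draw the boxes and labels on the image from the returned results, and write the image to the results directory. Its parameters are:

filename: file name of the input image
model: the model to use. The default is yolov3-tiny, a lightweight version of YOLO v3
confidence: confidence threshold for predictions. A detection is only reported if its confidence is above this value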
import cv2
import cvlib as cv
from cvlib.object_detection import draw_bbox

def detect_and_draw_box(filename, model='yolov3-tiny', confidence=0.5):
    img_filepath = f'course1/week1-ungraded-lab/images/{filename}'
    # Read the image
    img = cv2.imread(img_filepath)
    # Run detection with the chosen model; returns the boxes, labels and
    # confidences of the detections above the confidence threshold
    bbox, label, conf = cv.detect_common_objects(img, confidence=confidence, model=model)
    print(f'========================\nImage processed: {filename}\n')
    for l, c in zip(label, conf):
        print(f'Detected object: {l} with confidence level of {c}\n')
    # Draw the boxes and labels on the image
    output_image = draw_bbox(img, bbox, label, conf)
    # Save the annotated image
    cv2.imwrite(f'course1/week1-ungraded-lab/images_with_boxes/{filename}', output_image)
    # Display the annotated image
    display(Image(f'course1/week1-ungraded-lab/images_with_boxes/{filename}'))

Finally, call detect_and_draw_box to run the detection.

for image_file in image_files:
    detect_and_draw_box(image_file)

========================
Image processed: apple.jpg
Detected object: apple with confidence level of 0.571720540523529

[Image: detection result for apple.jpg]

========================
Image processed: clock.jpg
Detected object: clock with confidence level of 0.9683184623718262

[Image: detection result for clock.jpg]

========================
Image processed: oranges.jpg
Detected object: orange with confidence level of 0.6185590624809265
Detected object: orange with confidence level of 0.5561691522598267

[Image: detection result for oranges.jpg]

========================
Image processed: car.jpg
Detected object: car with confidence level of 0.6325409412384033

[Image: detection result for car.jpg]

Now try YOLO v3 on an image that contains several kinds of objects.

detect_and_draw_box('fruits.jpg')

========================
Image processed: fruits.jpg
Detected object: apple with confidence level of 0.5818483829498291
Detected object: orange with confidence level of 0.5346482992172241
Detected object: orange with confidence level of 0.515099287033081

[Image: detection result for fruits.jpg]

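Clearly the detection above contains a mistake: the leftmost orange is labelled as an apple. Many of the surrounding objects were also missed, which is related to the confidence threshold being set too high.
Try lowering the confidence threshold to 0.2.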

detect_and_draw_box('fruits.jpg', confidence=0.2)

========================
Image processed: fruits.jpg
Detected object: apple with confidence level of 0.5818483829498291
Detected object: orange with confidence level of 0.5346482992172241
Detected object: orange with confidence level of 0.515099287033081
Detected object: apple with confidence level of 0.34759876132011414
Detected object: orange with confidence level of 0.32876095175743103
Detected object: apple with confidence level of 0.31244680285453796
Detected object: orange with confidence level of 0.2798606753349304
Detected object: orange with confidence level of 0.2749978303909302
Detected object: apple with confidence level of 0.2744506895542145
Detected object: orange with confidence level of 0.21419063210487366

[Image: detection result for fruits.jpg with confidence=0.2]

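With the lower threshold, more objects are detected and labelled correctly. It does not, however, turn the earlier wrong label into a right one. Detection mistakes by the model are unavoidable, which is why Andrew Ng says in the course that "semi-automated" systems are the more common choice in production.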
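The effect of the threshold can be illustrated on the numbers printed above. This is only a sketch of thresholding, not cvlib's implementation (the library applies its threshold internally): the (label, confidence) pairs are copied from the fruits.jpg run at 0.2 and rounded, and filter_by_confidence is a hypothetical helper.

```python
# (label, confidence) pairs from the fruits.jpg run at threshold 0.2, rounded
detections = [
    ('apple', 0.5818), ('orange', 0.5346), ('orange', 0.5151),
    ('apple', 0.3476), ('orange', 0.3288), ('apple', 0.3124),
    ('orange', 0.2799), ('orange', 0.2750), ('apple', 0.2745),
    ('orange', 0.2142),
]

def filter_by_confidence(detections, threshold):
    """Keep only the detections whose confidence reaches the threshold."""
    return [(label, conf) for label, conf in detections if conf >= threshold]

kept_default = filter_by_confidence(detections, 0.5)  # the default threshold
kept_low = filter_by_confidence(detections, 0.2)      # the lowered threshold
print(len(kept_default), len(kept_low))  # prints: 3 10
```

The three detections kept at 0.5 are the same three the first run reported. Lowering the threshold only adds lower-confidence candidates; it does not change the labels of the ones already found.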

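Deploying the model with FastAPI

"FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints." That is FastAPI's official description. In short, it is a web framework for building APIs, similar to Django and Flask. Paired with the ASGI server uvicorn, it makes it quick to stand up a simple server.
This lab builds a client-server system. The client calls the object detection endpoint exposed by the server and uploads an image; the server receives the image, runs YOLO v3 on it, annotates the detected objects, and returns the annotated image to the caller (the client).
First, create the directory images_uploaded, where the server stores each image after detecting and annotating it.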

dir_name = 'course1/week1-ungraded-lab/images_uploaded'
if not os.path.exists(dir_name):
    os.makedirs(dir_name)

Import the FastAPI and uvicorn packages.

import io
import uvicorn
import numpy as np
import nest_asyncio
from enum import Enum
from fastapi import FastAPI, UploadFile, File, HTTPException
from fastapi.responses import StreamingResponse

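Create a FastAPI instance and two endpoints that respond to HTTP requests: home, served at /, and prediction, served at /predict.
The prediction function is the important one: it handles object detection requests from the client. The client calls it with the model to use (model) and the image to analyse (file). The function first checks that the file name's extension is one of jpg, jpeg or png, the only three formats it accepts; otherwise it returns HTTP status 415. If the format is fine, it reads and decodes the upload into an image, runs the model on it, writes the annotated image to the directory created earlier, and streams it back to the caller.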

app = FastAPI(title='Deploying a ML Model with FastAPI')

class Model(str, Enum):
    yolov3tiny = 'yolov3-tiny'
    yolov3 = 'yolov3'

@app.get('/')
def home():
    return 'Congratulations! Your API is working as expected. Now head over to http://localhost:8000/docs.'

@app.post('/predict')
def prediction(model: Model, file: UploadFile = File(...)):
    filename = file.filename
    fileExtension = filename.split('.')[-1] in ('jpg', 'jpeg', 'png')
    if not fileExtension:
        # Only these three image types can be processed
        raise HTTPException(status_code=415, detail='Unsupported file provided.')
    # Read the uploaded file as a byte stream
    image_stream = io.BytesIO(file.file.read())
    image_stream.seek(0)
    # Load the bytes into a NumPy array
    file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
    # Decode the NumPy array into an image
    image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
    # Run object detection with the chosen model
    bbox, label, conf = cv.detect_common_objects(image, model=model)
    # Create an image with the boxes and labels drawn on it
    output_image = draw_bbox(image, bbox, label, conf)
    # Save the annotated image in the server-side directory
    cv2.imwrite(f'course1/week1-ungraded-lab/images_uploaded/{filename}', output_image)
    # Open the annotated image and stream it back to the caller
    file_image = open(f'course1/week1-ungraded-lab/images_uploaded/{filename}', mode='rb')
    return StreamingResponse(file_image, media_type='image/jpeg')
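The extension check in prediction is case-sensitive, so a file named CLOCK.JPG would be rejected with 415, and a name with no dot at all, such as jpg, would pass. The sketch below is one possible stricter variant, not the lab's code: has_supported_extension and ALLOWED_EXTENSIONS are hypothetical names, and it only inspects the file name, not the file contents.

```python
import os

# Assumption: the same three formats the lab's endpoint accepts
ALLOWED_EXTENSIONS = {'jpg', 'jpeg', 'png'}

def has_supported_extension(filename):
    """Return True if the name ends in an allowed extension, ignoring case."""
    # splitext returns ('clock', '.jpg'); a name without a dot yields ''
    ext = os.path.splitext(filename)[1].lstrip('.').lower()
    return ext in ALLOWED_EXTENSIONS

print(has_supported_extension('clock.jpg'))   # True
print(has_supported_extension('CLOCK.JPG'))   # True
print(has_supported_extension('notes.txt'))   # False
print(has_supported_extension('jpg'))         # False
```

Inside the endpoint, the result of such a helper could replace the fileExtension expression, with the 415 response left unchanged.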

Run the FastAPI instance with uvicorn, which starts the server.

nest_asyncio.apply()
host = '0.0.0.0' if os.getenv('DOCKER-SETUP') else '127.0.0.1'
uvicorn.run(app, host=host, port=8000)

INFO:     Started server process [19884]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     127.0.0.1:60163 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:60175 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61009 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61010 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61012 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61084 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61085 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK
INFO:     127.0.0.1:61087 - "POST /predict?model=yolov3-tiny HTTP/1.1" 200 OK

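Once the server is running, you can interact with it through FastAPI's built-in interactive docs page by opening http://127.0.0.1:8000/docs in a browser. You can also write your own client: leave the server notebook running and open a new notebook for the client code.

Ungraded Lab Part 02 - Using the Model

This part uses Python's requests library to write a small client that calls the endpoint exposed by the server above and uses the model to make predictions.

Import the required packages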

import os
import io
import cv2
import requests
import numpy as np
from IPython.display import Image, display

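Defining the URL

The client gets its service by sending HTTP requests to the server's endpoint. From the previous part:

The server address is http://127.0.0.1:8000
The endpoint that serves the model is /predict

In addition, the server needs the model parameter ('yolov3-tiny' or 'yolov3') to select which model to use. With that, the request URL can be assembled.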

base_url = 'http://127.0.0.1:8000'
endpoint = '/predict'
model = 'yolov3-tiny'
url = base_url + endpoint + '?model=' + model
url

'http://127.0.0.1:8000/predict?model=yolov3-tiny'
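String concatenation is fine for a single fixed parameter. As a sketch of an alternative, the query string can be built with urllib.parse.urlencode from the standard library, which also escapes any characters that need it. The helper name build_predict_url is hypothetical.

```python
from urllib.parse import urlencode

def build_predict_url(base_url, endpoint, model):
    """Assemble the request URL with the model as a query parameter."""
    query = urlencode({'model': model})
    return f'{base_url}{endpoint}?{query}'

url = build_predict_url('http://127.0.0.1:8000', '/predict', 'yolov3-tiny')
print(url)  # prints: http://127.0.0.1:8000/predict?model=yolov3-tiny
```

The requests library can also take the query parameters separately through its params argument, in which case the URL would be just the base address plus the endpoint.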

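Sending a request to the server

The endpoint defined on the server accepts POST requests, so the client sends a POST request to it. Define a function respond_from_server that sends the POST request and handles the data returned by the server.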

def respond_from_server(url, image_file, verbose=True):
    # Send a POST request that uploads the image to be analysed
    files = {'file': image_file}
    response = requests.post(url, files=files)
    status_code = response.status_code
    if verbose:
        msg = ('Everything went well!' if status_code == 200
               else 'There was an error when handling the request.')
        print(msg)
    return response
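The client function above reports only success or a generic error. A slightly more informative message could distinguish the status codes this server is known to produce: 415 is raised explicitly by the endpoint for unsupported file types, and FastAPI answers 422 when request validation fails, for example for a model value outside the enum. The helper describe_status below is a hypothetical sketch, not part of the lab.

```python
def describe_status(status_code):
    """Map an HTTP status code from the /predict endpoint to a short message."""
    if status_code == 200:
        return 'Everything went well!'
    if status_code == 415:
        # Raised by the endpoint when the extension is not jpg, jpeg or png
        return 'Unsupported file type: the server accepts jpg, jpeg and png.'
    if status_code == 422:
        # FastAPI's response for request validation errors
        return 'Invalid request: check the model parameter and the file field.'
    return 'There was an error when handling the request.'

print(describe_status(200))  # prints: Everything went well!
```

In the client, the message could be produced with describe_status(response.status_code) in place of the two-way conditional.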

Now upload an image to check that respond_from_server can send a request to the server and receive a response.

with open('course1/week1-ungraded-lab/images/clock2.jpg', 'rb') as f:
    prediction = respond_from_server(url, f)

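Running detection and displaying the results

Now that the server's /predict endpoint can be called successfully, all that remains is to save and display the detection result the server returns.
First, create a directory to hold the results returned by the server.

dir_name = 'course1/week1-ungraded-lab/images_predicted'
if not os.path.exists(dir_name):
    os.makedirs(dir_name)

Next, define a function display_image_from_response that displays the server's detection result.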

def display_image_from_response(response):
    # Extract the byte stream from the response
    image_stream = io.BytesIO(response.content)
    image_stream.seek(0)
    # Convert the byte stream into an ndarray
    file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
    # Decode the array into an image
    image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
    # Save the image locally (dir_name is the directory created above)
    filename = 'image_with_objects.jpeg'
    cv2.imwrite(f'{dir_name}/{filename}', image)
    # Display the image
    display(Image(f'{dir_name}/{filename}'))

Test the function with the prediction obtained from the request above.

display_image_from_response(prediction)

[Image: detection result for clock2.jpg]

The image is displayed successfully. Next, test another set of images.

image_files = [
    'car2.jpg',
    'clock3.jpg',
    'apples.jpg'
]
for image_file in image_files:
    with open(f'course1/week1-ungraded-lab/images/{image_file}', 'rb') as f:
        prediction = respond_from_server(url, f, verbose=False)
    display_image_from_response(prediction)