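Overview
Scraping cat pictures: automated batch download
This post builds on the previous post about scraping cat photos and automates the download so that many pictures are fetched in one run.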
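Problem 1. Exception handling
Solution: catch HTTPError before URLError (HTTPError is a subclass of URLError), and use the else branch for the code that should run only when the request succeeded.
    try:
        (code...)
    except urllib.error.HTTPError as e:
        # The server answered with an error status.
        print(e.reason, e.code, e.headers, sep='\n')
    except urllib.error.URLError as e:
        # The request never got a valid response (DNS failure, refused connection, timeout).
        print(type(e.reason))
        if isinstance(e.reason, socket.timeout):
            print('>>>time out')
    else:
        (code...)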
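The order of the except clauses above can be checked without touching the network. The following is a minimal sketch, not part of the original script: describe_error is a hypothetical helper name, and the extra socket.timeout branch reflects the assumption that a timeout while reading the response body is raised directly instead of being wrapped in URLError.

```python
import socket
import urllib.error


def describe_error(exc):
    """Return a short summary of an exception raised by urllib (hypothetical helper)."""
    # HTTPError is a subclass of URLError, so it has to be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return "HTTP error %d: %s" % (exc.code, exc.reason)
    if isinstance(exc, urllib.error.URLError):
        # URLError.reason is either a string or another exception instance.
        if isinstance(exc.reason, socket.timeout):
            return "timed out"
        return "URL error: %s" % (exc.reason,)
    if isinstance(exc, socket.timeout):
        # Assumption: read timeouts arrive unwrapped.
        return "timed out"
    return "unexpected error: %r" % (exc,)
```

Building the exceptions by hand, for example urllib.error.URLError(socket.timeout()), is enough to exercise each branch.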
Problem 2. The program hangs when the network responds slowly
Solution: retry each request a limited number of times, as in the following code:
    max_try_time = 5
    i = 0  # number of successful downloads so far
    for tries in range(max_try_time):
        try:
            (code...)
        except urllib.error.HTTPError as e:
            print(e.reason, e.code, e.headers, sep='\n')
        except urllib.error.URLError as e:
            print(type(e.reason))
            if isinstance(e.reason, socket.timeout):
                print('>>>time out--try %d times' % (tries+1))
        else:
            (code...)
            i += 1
            print('>>>ok_%d--try %d times' % (i, tries+1))
            break  # success, so stop retrying
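The retry loop can also be pulled out into a small function so that it can be tested with a fake request. This is a sketch under stated assumptions: fetch_with_retries, its fetch argument and the delay parameter are illustrative names that do not appear in the original code, and unlike the loop above it retries only on timeouts and re-raises every other error immediately.

```python
import socket
import time
import urllib.error


def fetch_with_retries(fetch, max_tries=5, delay=0.0):
    """Call fetch() up to max_tries times and return (data, attempts_used).

    fetch is any zero-argument callable that returns the response body
    or raises. Only timeouts are retried; other errors propagate at once.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return fetch(), attempt
        except urllib.error.HTTPError:
            # The server did answer, so repeating the same request rarely helps.
            raise
        except (urllib.error.URLError, socket.timeout) as exc:
            # URLError carries the underlying cause in .reason;
            # a bare socket.timeout is its own cause.
            reason = getattr(exc, "reason", exc)
            if not isinstance(reason, socket.timeout) or attempt == max_tries:
                raise
            time.sleep(delay)  # optional pause before the next attempt
    raise ValueError("max_tries must be at least 1")
```

With the opener from this post, a call could look like fetch_with_retries(lambda: opener.open(url, timeout=5).read()).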
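Problem 3. Batch download
Solution: loop over a range of widths and heights, increasing each in steps of 50. Passing the step to range() avoids walking through every integer and filtering with a modulo test.
    for width in range(500, 1000, 50):
        for height in range(100, 1000, 50):
            (code...)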
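To see exactly which sizes the two loops visit, the size grid can be generated separately from the download. A minimal sketch, where iter_sizes and build_url are hypothetical helper names and the ranges are the ones used in this post:

```python
def iter_sizes(width_range=(500, 1000), height_range=(100, 1000), step=50):
    """Yield (width, height) pairs, both advancing in increments of step."""
    for width in range(width_range[0], width_range[1], step):
        for height in range(height_range[0], height_range[1], step):
            yield width, height


def build_url(width, height):
    """Build the image URL for one size, using the URL pattern from this post."""
    return 'http://placekitten.com/%d/%d' % (width, height)


sizes = list(iter_sizes())
print(len(sizes))            # 10 widths x 18 heights = 180
print(build_url(*sizes[0]))  # http://placekitten.com/500/100
```

The upper bounds are exclusive, so the largest size requested is 950x950.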
The complete code:
    from urllib.request import ProxyHandler, build_opener
    import urllib.error
    import socket

    def main():
        max_try_time = 5
        i = 0  # number of images saved so far
        # An empty dict means no proxy is used.
        proxy_handler = ProxyHandler({})
        opener = build_opener(proxy_handler)
        # Headers must be set on the opener before the request is sent.
        opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3722.400 QQBrowser/10.5.3738.400')]
        for width in range(500, 1000, 50):
            for height in range(100, 1000, 50):
                url = 'http://placekitten.com/%s/%s' % (width, height)
                for tries in range(max_try_time):
                    try:
                        req = opener.open(url, timeout=5)
                        img = req.read()
                    except urllib.error.HTTPError as e:
                        print(e.reason, e.code, e.headers, sep='\n')
                    except urllib.error.URLError as e:
                        print(type(e.reason))
                        if isinstance(e.reason, socket.timeout):
                            print('>>>time out--try %d times' % (tries+1))
                    except socket.timeout:
                        # A timeout while reading the body is not wrapped in URLError.
                        print('>>>time out--try %d times' % (tries+1))
                    else:
                        # The target folder must already exist.
                        with open(r"C:/Users/Administrator/Desktop/png/cat_%dx%d.jpg" % (width, height), 'wb') as f:
                            f.write(img)
                        i += 1
                        print('>>>ok_%d--try %d times' % (i, tries+1))
                        break

    if __name__ == '__main__':
        main()
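The complete script writes into a hard-coded folder, and open() fails if that folder does not exist. One possible way to make the save step more forgiving is sketched below; save_image is a hypothetical helper that is not part of the original code, and it keeps the cat_WIDTHxHEIGHT.jpg naming from the script.

```python
from pathlib import Path


def save_image(data, directory, width, height):
    """Write image bytes to directory/cat_WIDTHxHEIGHT.jpg and return the path."""
    out_dir = Path(directory)
    # Create the folder (and any missing parents) on first use.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / ("cat_%dx%d.jpg" % (width, height))
    path.write_bytes(data)
    return path
```

Inside the else branch of the script, the with open(...) block could then be replaced by a call such as save_image(img, "C:/Users/Administrator/Desktop/png", width, height).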