Extracting text from a web page with Python: how to scrape the text inside a DIV


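I am 鲤鱼服饰, a blogger on 靠谱客. This article covers extracting text from a web page with Python, specifically how to scrape the text inside a DIV. I am sharing it here as a reference.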

图图凌乱给谁看

2020-06-09 15:03:53

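Use BeautifulSoup to parse the HTML. BeautifulSoup has to be installed first; the code also passes 'lxml' as the parser name, so lxml needs to be installed as well.

# coding=utf-8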

# Written for Python 3; requires the beautifulsoup4 and lxml packages.
import http.client
import socket
import urllib.error
import urllib.request

from bs4 import BeautifulSoup

UserAgent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36'


def downloadPage(url):
    """Fetch url and return the raw response body, or b'' on a network error."""
    try:
        opener = urllib.request.build_opener()
        headers = {'User-Agent': UserAgent}
        req = urllib.request.Request(url=url, headers=headers)
        resp = opener.open(req, timeout=30)
        return resp.read()
    # HTTPError is a subclass of URLError, so one clause covers both.
    except (urllib.error.URLError, socket.error, http.client.BadStatusLine) as ex:
        print(ex)
        return b''


if __name__ == '__main__':
    content = downloadPage("put the douban URL here")
    # print(content)
    soup = BeautifulSoup(content, 'lxml')
    # Each <li> under ol.grid_view is one movie entry.
    lst = soup.select('ol.grid_view li')
    for item in lst:
        # Link to the movie's detail page
        print(item.select('div.item > div.pic a')[0].attrs['href'])
        # Image link
        print(item.select('div.item > div.pic a img')[0].attrs['src'])
        # Title
        print(item.select('div.item > div.info > div.hd > a > span.title')[0].get_text())
        # Rating
        print(item.select('div.item > div.info > div.bd > div.star > span.rating_num')[0].get_text())
        print('-------------------------------------------------------------------------')
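The selector logic can be tried without touching the network. The sketch below is only an illustration: SAMPLE_HTML is made-up markup shaped like the selectors used above (it is not real douban HTML), and text_or_none and extract_items are hypothetical helper names. It uses select_one, so an entry that lacks an element yields None where indexing [0] on an empty result would raise IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup shaped like the selectors above; not real douban HTML.
SAMPLE_HTML = """
<ol class="grid_view">
  <li>
    <div class="item">
      <div class="pic"><a href="https://example.com/movie/1"><img src="https://example.com/1.jpg"></a></div>
      <div class="info">
        <div class="hd"><a href="https://example.com/movie/1"><span class="title">Example Movie</span></a></div>
        <div class="bd"><div class="star"><span class="rating_num">9.0</span></div></div>
      </div>
    </div>
  </li>
  <li>
    <div class="item">
      <div class="info">
        <div class="hd"><a><span class="title">No Rating Yet</span></a></div>
      </div>
    </div>
  </li>
</ol>
"""


def text_or_none(node, selector):
    """Return the stripped text of the first element matching selector, or None."""
    found = node.select_one(selector)
    return found.get_text(strip=True) if found is not None else None


def extract_items(html):
    """Collect title and rating text for every <li> under ol.grid_view."""
    # 'html.parser' ships with Python, so this sketch does not need lxml.
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for item in soup.select('ol.grid_view li'):
        rows.append({
            'title': text_or_none(item, 'div.hd span.title'),
            'rating': text_or_none(item, 'div.star span.rating_num'),
        })
    return rows


if __name__ == '__main__':
    for row in extract_items(SAMPLE_HTML):
        print(row)
```

Whether a missing element should be skipped, reported, or treated as an error depends on the page being scraped; returning None is just one option.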


Category: extracting text from web pages with Python
Published: 2023-09-05 17:15:25
