Overview
Using BeautifulSoup, I am scraping the following web source:
Manchester City's Fabian Delph limped off in the first minute of England's Euro 2016 qualifier against Switzerland with a suspected hamstring injury.
The 25-year-old midfielder, who signed for City from Aston Villa in the summer, pulled up suddenly during Tuesday's game at Wembley.
Delph was picked in Roy Hodgson's first XI having been left out of the starting line-up against San Marino on Saturday.
Delph was making his eighth appearance for England.
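I am using the following code:
for item in soup.find_all('div'):
    # Remove newline characters ('\n') from each paragraph's text
    print(item.find('p').text.replace('\n', ''))
This works, but the result looks like this (more like four separate values):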
Manchester City's Fabian Delph limped off in the first minute of England's Euro 2016 qualifier against Switzerland with a suspected hamstring injury.
The 25-year-old midfielder, who signed for City from Aston Villa in the summer, pulled up suddenly during Tuesday's game at Wembley.
Delph was picked in Roy Hodgson's first XI having been left out of the starting line-up against San Marino on Saturday.
Delph was making his eighth appearance for England.
How can I get the output in the following format (more like a single value):
Manchester City's Fabian Delph limped off in the first minute of England's Euro 2016 qualifier against Switzerland with a suspected hamstring injury. The 25-year-old midfielder, who signed for City from Aston Villa in the summer, pulled up suddenly during Tuesday's game at Wembley. Delph was picked in Roy Hodgson's first XI having been left out of the starting line-up against San Marino on Saturday. Delph was making his eighth appearance for England.
Ultimately, I want to save this data in a CSV file. The text above should be treated as a single value in the CSV file (not four values).
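The CSV goal can be sketched with the standard csv module. This is a minimal illustration rather than the asker's actual code: the paragraph texts, the "article" column name, and the in-memory buffer are all assumptions made for the example.

```python
import csv
import io

# Hypothetical paragraph texts standing in for the scraped values.
paragraphs = [
    "Delph was picked in Roy Hodgson's first XI.",
    "Delph was making his eighth appearance for England.",
]

# Collapse the paragraphs into one value, separated by single spaces.
article = " ".join(p.strip() for p in paragraphs)

# Write to an in-memory buffer so the sketch needs no file on disk.
# For a real file, open("articles.csv", "w", newline="") could be used instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["article"])  # header row (assumed column name)
writer.writerow([article])    # one row containing a single cell

# Read the data back to check that the text landed in one cell.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

Because the whole text is passed as a one-element list, csv.writer stores it as a single field and quotes it if it contains commas or line breaks.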
Solution
You can try:
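divs = soup.find_all('div')
# Strip newlines and surrounding whitespace from each paragraph, then join
# with a single space so the sentences do not run together
result = ' '.join([div.find('p').text.replace('\n', '').strip() for div in divs])
print(result)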
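The join line collects the paragraph text of every div into a list and joins the items into a single string. See the documentation for str.join for details.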
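For a version that can be run on its own, here is a minimal sketch. The HTML is an assumption (one p element per div, with stray newlines around the text), since the original markup is not shown, and it uses get_text with strip=True as a variant of the replace call above.

```python
from bs4 import BeautifulSoup

# Assumed markup: one <p> per <div>, with newlines around the text.
html = """
<div><p>
First sentence.
</p></div>
<div><p>
Second sentence.
</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# get_text(" ", strip=True) strips whitespace from each text fragment
# and joins the fragments inside one tag with a single space.
parts = [div.find("p").get_text(" ", strip=True) for div in soup.find_all("div")]

# Join the cleaned paragraphs into one value.
result = " ".join(parts)
print(result)
```

Note that div.find("p") returns None for a div without a p element, so real pages may need a guard before calling get_text.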
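This approach is generally faster than adding the strings together one at a time (which is also valid, correct, and good enough), because repeated concatenation can create a new intermediate string at every step, whereas str.join builds the result once.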
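The two approaches can be compared side by side. This is only an illustrative sketch with made-up strings; it shows that both produce the same result and makes no timing claim.

```python
parts = ["a", "b", "c"]

# Repeated concatenation: each += may build a new intermediate string.
concatenated = ""
for part in parts:
    concatenated += part

# str.join: builds the final string from the whole list in one call.
joined = "".join(parts)
```

For a handful of short paragraphs the difference is negligible, so either form is reasonable here.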