Overview
Using Python 2.7 and BeautifulSoup 4, I'm scraping song names from a table.
Right now the script finds links in the row of a table; how can I specify I want the first column?
Ideally I'd be able to switch numbers around to change which ones got selected.
Right now the code looks like this:
    from bs4 import BeautifulSoup
    import requests

    r = requests.get("http://evamsharma.finosus.com/beatles/index.html")
    data = r.text
    soup = BeautifulSoup(data)

    for table in soup.find_all('table'):
        for row in soup.find_all('tr'):
            for link in soup.find_all('a'):
                print(link.contents)
How do I, in effect, index the <td> tags within each <tr> tag?
The URL in there right now is a page on my site where I basically copied the table source from Wikipedia to make the scraping a little simpler.
Thanks!
evamvid
Solution
Find all td tags inside tr and get the one you need by index:
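    index = 2  # zero-based column index; use 0 for the first column

    for table in soup.find_all('table'):
        # Search within the current table, not the whole document
        for row in table.find_all('tr'):
            try:
                td = row.find_all('td')[index]
            except IndexError:
                # Header rows (<th> only) or short rows have no such cell
                continue
            for link in td.find_all('a'):
                print(link.contents)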
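To make it easier to switch the number around, the same idea can be wrapped in a small function. This is only a sketch: `links_in_column` and `SAMPLE_HTML` are made-up names, the inline markup is a stand-in for the real page so the snippet runs without a network request, and `'html.parser'` is passed explicitly as the parser.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<table>
  <tr><th>Song</th><th>Album</th></tr>
  <tr><td><a href="/s1">Help!</a></td><td><a href="/a1">Help! (album)</a></td></tr>
  <tr><td><a href="/s2">Yesterday</a></td><td><a href="/a1">Help! (album)</a></td></tr>
  <tr><td>No link here</td></tr>
</table>
"""


def links_in_column(soup, index):
    """Return the text of every link in the given zero-based column."""
    texts = []
    for table in soup.find_all('table'):
        for row in table.find_all('tr'):
            try:
                td = row.find_all('td')[index]
            except IndexError:
                # Header rows (<th> only) or short rows have no such cell
                continue
            for link in td.find_all('a'):
                texts.append(link.get_text())
    return texts


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
first_column = links_in_column(soup, 0)
second_column = links_in_column(soup, 1)
print(first_column)
print(second_column)
```

Changing the second argument picks a different column; rows that don't have that many cells are skipped rather than raising an error.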