How to extract content from a JavaScript script in Python

Overview

I have a website where the data I want to fetch is stored in a JavaScript block. How do I fetch it?

I want to read from the "var playersData" line. Specifically, I want the value in "playerId":"showsPlayer", that is, showsPlayer (without the quotes). How do I do so?

I've tried Beautiful Soup. My current script looks like this:

import requests
from bs4 import BeautifulSoup

q = requests.get('websitelink')
soup = BeautifulSoup(q.text)
searching = soup.findAll('script', {'type': 'text/javascript'})
for playerId in searching:
    x = playerId.find_all('var playersData', limit=1)
    print(x)

I'm getting [] as my output. I can't seem to figure out my problem here.

Please help out guys and gals :)

Solution

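BeautifulSoup only helps with locating the desired script tag; it does not parse the JavaScript inside it. That is why find_all('var playersData') returns []: it searches for a tag with that name, not for text. Once the tag is located, you have multiple options: you can extract the desired data with a JavaScript parser, like slimit, or use regular expressions: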

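import re

from bs4 import BeautifulSoup

# Sample page; the script tag matches the type used in the question
page = """
<script type="text/javascript">
    var logged = true;
    var video_id = 59374;
    var item_type = 'official';
    var debug = false;
    var baseUrl = 'http://www.example.com';
    var base_url = 'http://www.example.com/';
    var assetsBaseUrl = 'http://www.example.com/assets';
    var apiBaseUrl = 'http://www.example.com/common';
    var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
</script>
"""

soup = BeautifulSoup(page, "html.parser")

# Non-greedy capture of the value that follows "playerId"
pattern = re.compile(r'"playerId":"(.*?)"', re.MULTILINE | re.DOTALL)

# Find the first script tag whose text matches the pattern
# (older BeautifulSoup versions spell this argument text= instead of string=)
script = soup.find("script", string=pattern)

print(pattern.search(script.string).group(1))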

Prints:

showsPlayer
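
If you need more than the playerId, another option is to parse the whole playersData value rather than one field. The following is only a sketch, and it assumes the value assigned to playersData is valid JSON, which arbitrary JavaScript is not guaranteed to be. The names parse_players_data and SCRIPT_TEXT are made up for illustration, and SCRIPT_TEXT stands in for the text of the script tag once you have located it:

```python
import json
import re

# Matches the start of the assignment, up to (not including) the value
ASSIGNMENT_RE = re.compile(r"var\s+playersData\s*=\s*")


def parse_players_data(script_text):
    """Return the playersData value as Python objects, or None if absent.

    Assumes the assigned value is valid JSON; json raises ValueError otherwise.
    """
    match = ASSIGNMENT_RE.search(script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # stops there, so the trailing ";" and later statements are ignored
    data, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return data


# Hypothetical script text, shortened from the sample page
SCRIPT_TEXT = """
var debug = false;
var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
"""

players = parse_players_data(SCRIPT_TEXT)
print(players[0]["playerId"])
print(players[0]["playlist"][0]["itemId"])
```

This gives you ordinary lists and dicts, so nested fields such as the playlist entries can be read without writing a separate regular expression for each one.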
