The aiohttp module

Posted by 小巧外套 on 靠谱客. This article introduces the aiohttp module and is shared here as a reference.

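asyncio can run many I/O operations concurrently on a single thread. Used only on the client side, the benefit is limited. On the server side, for example in a web server, every HTTP connection is an I/O operation, so a single thread plus coroutines can serve a large number of users concurrently.

asyncio implements protocols such as TCP, UDP and SSL; aiohttp is an HTTP framework built on top of asyncio.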
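
To make the single-thread concurrency claim concrete, here is a minimal sketch that uses only the standard library. The `fake_io` coroutine and its delays are made up for illustration, with `asyncio.sleep` standing in for a network wait; it is not how aiohttp itself is implemented.

```python
import asyncio
import time


async def fake_io(name, delay):
    # asyncio.sleep stands in for a network wait; it hands control back to the event loop.
    await asyncio.sleep(delay)
    return name


async def main():
    start = time.perf_counter()
    # Both coroutines wait at the same time, on one thread.
    results = await asyncio.gather(fake_io("a", 0.2), fake_io("b", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

The two waits overlap, so the total time should be close to one delay rather than the sum of both.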

I. Installing `aiohttp`

pip3 install aiohttp

II. Using `aiohttp`

1. Basic usage of `aiohttp`

import asyncio
import aiohttp

async def fetch_async(url):
    print(url)
    async with aiohttp.request("GET", url) as r:
        # Alternatively, `await r.read()` returns the raw bytes without decoding,
        # which suits images and other binary content.
        response = await r.text(encoding="utf-8")
        print(response)

async def main():
    # Run both requests concurrently on one event loop.
    await asyncio.gather(
        fetch_async('http://www.baidu.com/'),
        fetch_async('http://www.chouti.com/'),
    )

asyncio.run(main())

2. Making a request through a `session`

import asyncio
import aiohttp

async def fetch_async(url):
    print(url)
    # Coroutines nest here; only the outermost one, fetch_async, has to be scheduled.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            print(resp.status)
            # `await` is only allowed inside a function declared with `async def`.
            print(await resp.text())

async def main():
    await asyncio.gather(
        fetch_async('http://www.baidu.com/'),
        fetch_async('http://www.cnblogs.com/ssyfj/'),
    )

asyncio.run(main())

Besides `get`, a session also supports `post`, `put`, `delete` and the other HTTP methods.

session.put('http://httpbin.org/put', data=b'data')
session.delete('http://httpbin.org/delete')
session.head('http://httpbin.org/get')
session.options('http://httpbin.org/get')
session.patch('http://httpbin.org/patch', data=b'data')

Do not create a session for every request. In most cases one session is enough, and every request should go through it.

Each session object holds a connection pool and keeps connections alive for reuse (enabled by default), which improves overall performance.
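
The advice about sharing one session could look roughly like the sketch below. To stay runnable without Internet access it starts a throwaway local server with `aiohttp.test_utils.TestServer`; the `hello` handler, the `fetch` helper and the `n` parameter are assumptions made for this example and do not come from the article.

```python
import asyncio

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer


async def hello(request):
    # Hypothetical handler: echoes the "n" query parameter.
    return web.Response(text="hello " + request.query.get("n", "?"))


async def fetch(session, url, n):
    async with session.get(url, params={"n": str(n)}) as resp:
        return resp.status, await resp.text()


async def main():
    app = web.Application()
    app.router.add_get("/", hello)
    async with TestServer(app) as server:
        # One session (and therefore one connection pool) serves all three requests.
        async with aiohttp.ClientSession() as session:
            url = server.make_url("/")
            return await asyncio.gather(*(fetch(session, url, n) for n in range(3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The session is created once and handed to every `fetch` call, rather than being opened inside `fetch` itself.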

3. Passing parameters in the `url` (much the same as in the `requests` module)

Pass a dict of parameters as the `params` argument.

import asyncio
import aiohttp

async def func1(url, params):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            print(r.url)  # the final URL, including the query string
            print(await r.read())

asyncio.run(func1('https://www.ckook.com/forum.php', {"gid": 6}))
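
As a rough illustration of what passing `params` amounts to, the sketch below builds the same kind of URL with the standard library. The `add_params` helper is hypothetical; aiohttp does this internally through its own URL handling (the yarl library), so details such as percent-encoding may differ.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def add_params(url, params):
    # Append params to the URL's query string, keeping any query already present.
    parts = urlsplit(url)
    extra = urlencode(params)
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


print(add_params("https://www.ckook.com/forum.php", {"gid": 6}))
# https://www.ckook.com/forum.php?gid=6
```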

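4. Reading the response body

Reading the response body is an I/O operation that can take a while, so it is done with await, which lets other coroutines run in the meantime.

(1) The `text()` method

async def func1(url, params):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            print(r.url)
            print(r.charset)  # charset taken from the Content-Type header, e.g. utf-8
            # Without arguments, text() decodes with the detected encoding;
            # pass encoding=... to choose one explicitly.
            print(await r.text())

(2) The `read()` method returns the undecoded body as bytes

async def func1(url, params):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            print(r.url)
            print(await r.read())

Note: `text()` and `read()` load the entire response body into memory. For large responses, consider reading the body as a byte stream through `r.content` (a `StreamReader`) instead.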

5. `json` responses

async def func1(url, params):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            print(r.url)
            print(r.charset)
            print(await r.json())  # accepts an encoding and a custom decoder (the `loads` argument)

6. Reading the body as a byte stream

Unlike `text()` and `read()`, which fetch the whole body at once, a stream is read piece by piece.
Note: `session.get()` yields a `ClientResponse` object, whose `content` attribute is a `StreamReader`.

async def func1(url, params):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            print(await r.content.read(10))  # read the first 10 bytes

The following reads the body as a stream and saves it to a file:

async def func1(url, params, filename):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as r:
            with open(filename, "wb") as fp:
                while True:
                    chunk = await r.content.read(10)  # 10 bytes per read, for demonstration
                    if not chunk:
                        break
                    fp.write(chunk)

asyncio.run(func1('https://www.ckook.com/forum.php', {"gid": 6}, "1.html"))
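
A variant of the streaming download is sketched below using `StreamReader.iter_chunked`, which yields the body in pieces of at most the requested size. To keep it self-contained, the sketch serves a made-up 100,000-byte payload from a local `TestServer`; the `big` handler, the `download` helper, the file name and the chunk size are assumptions for illustration only.

```python
import asyncio
import os
import tempfile

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer

PAYLOAD = b"x" * 100_000  # hypothetical body served by the local test server


async def big(request):
    return web.Response(body=PAYLOAD)


async def download(session, url, filename, chunk_size=8192):
    async with session.get(url) as resp:
        with open(filename, "wb") as fp:
            # Each iteration yields at most chunk_size bytes, until the body ends.
            async for chunk in resp.content.iter_chunked(chunk_size):
                fp.write(chunk)


async def main():
    app = web.Application()
    app.router.add_get("/big", big)
    filename = os.path.join(tempfile.mkdtemp(), "big.bin")
    async with TestServer(app) as server:
        async with aiohttp.ClientSession() as session:
            await download(session, server.make_url("/big"), filename)
    return os.path.getsize(filename)


if __name__ == "__main__":
    print(asyncio.run(main()), "bytes written")
```

Only one chunk is held in memory at a time, which is the point of streaming a large body.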

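Note:

async with session.get(url, params=params) as r:  # asynchronous context manager

with open(filename, "wb") as fp:  # ordinary context manager

The difference between the two:

An asynchronous context manager is a context manager that can suspend execution in its enter and exit methods.

To make this possible it defines two methods, `__aenter__` and `__aexit__`, both of which must return an awaitable.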
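
A minimal sketch of such a class follows, using only the standard library. The `Connection` class and its log list are invented for this example; `asyncio.sleep(0)` merely stands in for real I/O during setup and teardown.

```python
import asyncio


class Connection:
    """Toy resource whose setup and teardown both need to await."""

    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        await asyncio.sleep(0)  # stands in for opening a network connection
        self.log.append("enter")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # stands in for closing it
        self.log.append("exit")
        return False  # do not swallow exceptions


async def main():
    log = []
    async with Connection(log):
        log.append("body")
    return log


print(asyncio.run(main()))  # ['enter', 'body', 'exit']
```

Because both methods are coroutines, the event loop is free to run other tasks while the resource is being opened or closed.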

7. Custom request headers

async def func1(url, params, filename):
    async with aiohttp.ClientSession() as session:
        headers = {'Content-Type': 'text/html; charset=utf-8'}
        async with session.get(url, params=params, headers=headers) as r:
            with open(filename, "wb") as fp:
                while True:
                    chunk = await r.content.read(10)
                    if not chunk:
                        break
                    fp.write(chunk)

8. Custom `cookie`

Note: custom cookies are set on the session, via ClientSession(cookies=...), so that they are sent with every request made through it. The constructor signature shows the parameter:

    def __init__(self, *, connector=None, loop=None, cookies=None,
                 headers=None, skip_auto_headers=None,
                 auth=None, json_serialize=json.dumps,
                 request_class=ClientRequest, response_class=ClientResponse,
                 ws_response_class=ClientWebSocketResponse,
                 version=http.HttpVersion11,
                 cookie_jar=None, connector_owner=True, raise_for_status=False,
                 read_timeout=sentinel, conn_timeout=None,
                 timeout=sentinel,
                 auto_decompress=True, trust_env=False,
                 trace_configs=None):

Usage:

cookies = {'cookies_are': 'working'}
async with aiohttp.ClientSession(cookies=cookies) as session:
    ...

9. Reading the `cookie` values set by the current response

async with session.get(url) as resp:
    print(resp.cookies)

10. Reading the response status code

async with session.get(url) as resp:
    print(resp.status)

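11. Inspecting the response headers

resp.headers       the response headers, as a case-insensitive dict-like mapping
resp.raw_headers   the unmodified response headers, as pairs of bytes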

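12. Inspecting the responses of earlier redirects

If the request was redirected, we are now at the new address; the responses received on the way there are still available:

resp.history  # sequence of the earlier responses (with their headers), empty if there was no redirect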

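13. Timeouts

By default the whole request, including reading the response, is limited to 5 minutes. This can be overridden with the timeout argument:

timeout = aiohttp.ClientTimeout(total=60)  # seconds
async with session.get('https://github.com', timeout=timeout) as r:
    ...

Setting a ClientTimeout field such as total to None or 0 disables that check, so the operation may take arbitrarily long.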
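
What happens when the limit is exceeded can be sketched as follows. The example assumes a local `TestServer` with a deliberately slow, made-up `slow` handler, and a hypothetical `fetch_with_timeout` helper that turns the timeout error into `None`; the delays are arbitrary.

```python
import asyncio

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer


async def slow(request):
    await asyncio.sleep(0.5)  # hypothetical slow endpoint
    return web.Response(text="done")


async def fetch_with_timeout(session, url, seconds):
    timeout = aiohttp.ClientTimeout(total=seconds)
    try:
        async with session.get(url, timeout=timeout) as resp:
            return await resp.text()
    except asyncio.TimeoutError:
        return None  # the request took longer than `seconds`


async def main():
    app = web.Application()
    app.router.add_get("/slow", slow)
    async with TestServer(app) as server:
        async with aiohttp.ClientSession() as session:
            url = server.make_url("/slow")
            return (
                await fetch_with_timeout(session, url, 0.05),  # too short
                await fetch_with_timeout(session, url, 5),     # generous
            )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Catching `asyncio.TimeoutError` keeps a single slow request from failing the whole program; whether to retry or give up is left to the caller.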

14. Using `ClientSession` to share `cookie` values, request headers and the like across requests (to the same site)

async def func1():
    cookies = {'my_cookie': "my_value"}
    async with aiohttp.ClientSession(cookies=cookies) as session:
        async with session.get("https://segmentfault.com/q/1010000007987098") as r:
            print(session.cookie_jar.filter_cookies("https://segmentfault.com"))
        async with session.get("https://segmentfault.com/hottest") as rp:
            print(session.cookie_jar.filter_cookies("https://segmentfault.com"))

Output:

Set-Cookie: PHPSESSID=web2~d8grl63pegika2202s8184ct2q
Set-Cookie: my_cookie=my_value
Set-Cookie: PHPSESSID=web2~d8grl63pegika2202s8184ct2q
Set-Cookie: my_cookie=my_value

It is best to use session.cookie_jar.filter_cookies() to get a site's cookies. Unlike with the requests module, rp.cookies may return some cookies, but it does not appear to return all of them.

async def func1():
    cookies = {'my_cookie': "my_value"}
    async with aiohttp.ClientSession(cookies=cookies) as session:
        async with session.get("https://segmentfault.com/q/1010000007987098") as rp:
            print(session.cookie_jar.filter_cookies("https://segmentfault.com"))
            # Set-Cookie: PHPSESSID=web2~jh3ouqoabvr4e72f87vtherkp6; Domain=segmentfault.com; Path=/
            # The first visit receives the cookie set by the site.
            print(rp.cookies)
        async with session.get("https://segmentfault.com/hottest") as rp:
            print(session.cookie_jar.filter_cookies("https://segmentfault.com"))
            print(rp.cookies)  # empty: the server set no cookie in this response
        async with session.get("https://segmentfault.com/newest") as rp:
            print(session.cookie_jar.filter_cookies("https://segmentfault.com"))
            print(rp.cookies)  # empty: the server set no cookie in this response

Summary

rp.cookies only contains the cookies set by the response for the current URL; it does not track the cookies of the whole site.

session.cookie_jar.filter_cookies("https://segmentfault.com"), on the other hand, keeps every cookie set for the site, including those we supplied when creating the session, and updates them as responses arrive. This is usually what we want. Cookies of our own are likewise set via aiohttp.ClientSession(cookies=cookies).

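15. Limiting the number of simultaneous connections

TCPConnector maintains the connection pool and limits the number of parallel connections. When the pool is full, a new request is admitted only after an earlier one finishes:

async def func1():
    cookies = {'my_cookie': "my_value"}
    conn = aiohttp.TCPConnector(limit=2)  # default is 100; 0 means unlimited
    async with aiohttp.ClientSession(cookies=cookies, connector=conn) as session:
        for i in range(7, 35):
            url = "https://www.ckook.com/list-%s-1.html" % i
            async with session.get(url) as rp:
                print('---------------------------------')
                print(rp.status)

To limit the number of simultaneously open connections to the same endpoint, that is, the same (host, port, is_ssl) triple, set the limit_per_host parameter:
limit_per_host: the maximum number of connections to one endpoint. Two endpoints are the same only if (host, port, is_ssl) match exactly.

conn = aiohttp.TCPConnector(limit_per_host=30)  # default is 0 (no per-host limit)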
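
The effect of `limit` only becomes visible when several requests are in flight at once. The sketch below fires six requests concurrently through a connector limited to two connections and records, on a local `TestServer`, how many requests were being handled at the same moment. The handler, the counters and the numbers are assumptions chosen for this illustration.

```python
import asyncio

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer


async def main(limit=2, count=6):
    in_flight = 0
    peak = 0

    async def handler(request):
        # Track how many requests the server is handling at the same moment.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.05)
        in_flight -= 1
        return web.Response(text="ok")

    async def fetch(session, url):
        async with session.get(url) as resp:
            return resp.status

    app = web.Application()
    app.router.add_get("/", handler)
    async with TestServer(app) as server:
        conn = aiohttp.TCPConnector(limit=limit)
        async with aiohttp.ClientSession(connector=conn) as session:
            url = server.make_url("/")
            statuses = await asyncio.gather(*(fetch(session, url) for _ in range(count)))
    return statuses, peak


if __name__ == "__main__":
    statuses, peak = asyncio.run(main())
    print(statuses, "peak concurrent requests:", peak)
```

All six requests succeed, but the server never sees more than `limit` of them at a time; the others wait for a free connection.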

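16. Ways of sending data with `post`

(1) Emulating a form

payload = {'key1': 'value1', 'key2': 'value2'}
async with session.post('http://httpbin.org/post', data=payload) as resp:
    print(await resp.text())

Note: data passed as a dict is form-encoded, exactly as if an HTML form had been submitted. To send the data without this encoding, pass it as a string (data=str) and it is sent as is.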

(2) `post` `json`

import json

payload = {'some': 'data'}

async with session.post(url, data=json.dumps(payload)) as resp:
    print(await resp.text())

json.dumps(payload) also returns a plain string; it just happens to be one that can be parsed as JSON.
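
As an alternative to serializing by hand, `session.post` also accepts a `json=` argument that serializes the object and sets the JSON Content-Type. The sketch below checks this against a local `TestServer`; the `echo` handler and the `/echo` path are hypothetical and exist only so the example can run without an external service.

```python
import asyncio

import aiohttp
from aiohttp import web
from aiohttp.test_utils import TestServer


async def echo(request):
    # Hypothetical endpoint: reports the Content-Type it received and the parsed body.
    return web.json_response(
        {"content_type": request.content_type, "body": await request.json()}
    )


async def main():
    app = web.Application()
    app.router.add_post("/echo", echo)
    payload = {"some": "data"}
    async with TestServer(app) as server:
        async with aiohttp.ClientSession() as session:
            # json=payload serializes the dict and sets the JSON Content-Type.
            async with session.post(server.make_url("/echo"), json=payload) as resp:
                return await resp.json()


if __name__ == "__main__":
    print(asyncio.run(main()))
```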

(3) `post` a small file

url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}

await session.post(url, data=files)

The file name and content type can be set explicitly with FormData:

url = 'http://httpbin.org/post'
data = aiohttp.FormData()
data.add_field('file',
               open('report.xls', 'rb'),
               filename='report.xls',
               content_type='application/vnd.ms-excel')

await session.post(url, data=data)

If a file object is passed as the data argument, aiohttp streams it to the server automatically.

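(4) `post` a large file

aiohttp can upload several kinds of data as a stream, so a large file can be sent without reading it into memory first.

async def file_sender(file_name=None):
    # An async generator can be passed as `data`; aiohttp sends each chunk as it is produced.
    # (Older aiohttp versions used the @aiohttp.streamer decorator for this; it is now deprecated.)
    with open(file_name, 'rb') as f:
        chunk = f.read(2**16)
        while chunk:
            yield chunk
            chunk = f.read(2**16)

# Then you can use `file_sender` as a data provider:

async with session.post('http://httpbin.org/post',
                        data=file_sender(file_name='huge_file')) as resp:
    print(await resp.text())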

(5) Fetching a file from one `url` and `post`ing it straight to another

r = await session.get('http://python.org')
await session.post('http://httpbin.org/post', data=r.content)

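(6) `post` pre-compressed data

To send data that was already compressed before it is handed to aiohttp, set the Content-Encoding header to the name of the compression algorithm used (usually deflate or gzip):

import zlib

async def my_coroutine(session, headers, my_data):
    data = zlib.compress(my_data)  # my_data must be bytes
    headers = {'Content-Encoding': 'deflate'}
    async with session.post('http://httpbin.org/post',
                            data=data,
                            headers=headers) as resp:
        pass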
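
For context, the `deflate` content coding in HTTP refers to the zlib format, which is what `zlib.compress` produces. The short standard-library sketch below shows the round trip a receiver would perform; the sample bytes are made up.

```python
import zlib

raw = b"hello aiohttp " * 100  # made-up sample payload
compressed = zlib.compress(raw)

# The receiver reverses the step named by Content-Encoding: deflate.
restored = zlib.decompress(compressed)
print(len(raw), "->", len(compressed), "bytes; round trip ok:", restored == raw)
```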

Category: Python modules
Published: 2023-08-11 08:05:04
