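Python Cookbook Study Notes ch6_03: 6.9 Encoding and Decoding Hexadecimal Digits, 6.10 Encoding and Decoding Base64 Data, 6.11 Reading and Writing Binary Arrays of Structures, 6.12 Reading Nested and Variable-Sized Binary Data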

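Posted on 靠谱客 by 开放爆米花. These notes cover recipes 6.9 to 6.12 of the Python Cookbook: encoding and decoding hexadecimal digits, encoding and decoding Base64 data, reading and writing binary arrays of structures, and reading nested and variable-sized binary data. They are shared here as a reference.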

Some of my own code did not run successfully.

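6.9 Encoding and Decoding Hexadecimal Digits

Problem: how to decode a string of hexadecimal digits into a byte string, or encode a byte string as hexadecimal digits.
Solution: use b2a_hex() and a2b_hex() from the binascii module.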

import binascii
s = b'hello'
h = binascii.b2a_hex(s)
h

b'68656c6c6f'

binascii.a2b_hex(h)

b'hello'

The base64 module can also be used:

import base64
h = base64.b16encode(s)
h

b'68656C6C6F'

base64.b16decode(h)

b'hello'

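Note: b16encode() and b16decode() only work with uppercase hexadecimal letters, whereas the binascii functions accept both upper and lower case.
Both modules produce byte strings. To get Unicode text instead, an extra decoding step is needed: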

h = base64.b16encode(s)
h

b'68656C6C6F'

h.decode('ascii')

'68656C6C6F'
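
As a small addition to the note about letter case, here is a minimal sketch. The casefold argument of b16decode() and the bytes.hex() / bytes.fromhex() methods come from the standard library rather than from the original notes, and the variable names are only for illustration.

```python
import base64
import binascii

s = b'hello'
lower_hex = binascii.b2a_hex(s)      # lowercase digits: b'68656c6c6f'

# By default b16decode() rejects lowercase digits with binascii.Error
try:
    base64.b16decode(lower_hex)
    strict_ok = True
except binascii.Error:
    strict_ok = False

# Passing casefold=True makes b16decode() accept lowercase input
folded = base64.b16decode(lower_hex, casefold=True)

# bytes.hex() and bytes.fromhex() work with str directly, so no
# separate .decode('ascii') step is needed
hex_text = s.hex()
round_trip = bytes.fromhex(hex_text)

print(strict_ok, folded, hex_text, round_trip)
```

If only text output is needed, bytes.hex() may be the shortest route; the module functions remain useful when the input is already a byte string of hex digits.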

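6.10 Encoding and Decoding Base64 Data

Problem: binary data needs to be encoded or decoded using Base64.
Solution: use b64encode() and b64decode() from the base64 module.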

import base64
s = b'hello'
a = base64.b64encode(s)
a

b'aGVsbG8='

b = base64.b64decode(a)
b

b'hello'

base64.b64encode(s).decode('ascii')

'aGVsbG8='
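
Two related details that the examples above do not show, sketched here under the assumption that the data may end up in a URL or arrive as text: urlsafe_b64encode() swaps the + and / characters for - and _, and b64decode() also accepts an ASCII str. The sample bytes are chosen only because they produce those characters.

```python
import base64

# Three bytes whose Base64 form uses only the '+' and '/' symbols
data = bytes([0xfb, 0xff, 0xfe])

standard = base64.b64encode(data)           # uses '+' and '/'
urlsafe = base64.urlsafe_b64encode(data)    # uses '-' and '_' instead

# b64decode() accepts an ASCII str as well as a byte string
from_text = base64.b64decode('aGVsbG8=')

print(standard, urlsafe, from_text)
```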

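6.11 Reading and Writing Binary Arrays of Structures

Problem: how to read or write a binary array of uniformly structured records as Python tuples.
Solution: use the struct module.
Writing: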

from struct import Struct
def write_records(records, format, f):
    '''Write a sequence of tuples to a binary file of structures.'''
    record_struct = Struct(format)
    for r in records:
        f.write(record_struct.pack(*r))

records = [(1,2.3,4.5),
          (6,3.4,5.8),
          (12,13.5,56.8)
          ]
with open('data_file/data.b','wb') as f:
    write_records(records,'<idd',f)

Reading (incrementally, in chunks):

from struct import Struct
def read_records(format,f):
    record_struct = Struct(format)
    chunks = iter(lambda: f.read(record_struct.size),b'')
    return (record_struct.unpack(chunk) for chunk in chunks)

with open('data_file/data.b','rb') as f:
    for rec in read_records('<idd',f):
        print(rec)

(1, 2.3, 4.5)
(6, 3.4, 5.8)
(12, 13.5, 56.8)

Reading (load the whole file into one byte string, then unpack it piece by piece):

from struct import Struct
def unpack_records(format,data):
    record_struct = Struct(format)
    return (record_struct.unpack_from(data,offset)
            for offset in range(0,len(data),record_struct.size))

with open('data_file/data.b','rb') as f:
    data = f.read()
    for rec in unpack_records('<idd',data):
        print(rec)

(1, 2.3, 4.5)
(6, 3.4, 5.8)
(12, 13.5, 56.8)

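Structures are declared with format codes such as i, d, and f. Each code stands for a specific binary type: a 32-bit integer, a 64-bit float, and a 32-bit float respectively. The leading < sets the byte order; in this example it means little-endian (least significant byte first). Change it to > for big-endian, or to ! for network byte order.
The size attribute holds the size of the structure in bytes, which is useful when doing I/O. The pack() and unpack() methods pack and unpack the data: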

from struct import Struct
record_struct = Struct('<idd')
record_struct.size

20

record_struct.pack(1,2.0,3.0)

b'\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@'

record_struct.unpack(_)

(1, 2.0, 3.0)
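
To make the byte-order prefixes more concrete, here is a minimal sketch comparing them on a single integer. The variable names are only for illustration, and the size of the prefix-free (native) format depends on the platform, so it is computed but not given as a fixed number.

```python
import struct

# The same integer packed with different byte-order prefixes
little = struct.pack('<i', 1)    # least significant byte first
big = struct.pack('>i', 1)       # most significant byte first
network = struct.pack('!i', 1)   # network order, same layout as big-endian

# With an explicit prefix there is no padding: 4 + 8 + 8 = 20 bytes
sizes = {fmt: struct.calcsize(fmt) for fmt in ('<idd', '>idd', '!idd')}

# Without a prefix, native alignment applies and padding may be added
# after the int, so this value varies by platform
native_size = struct.calcsize('idd')

print(little, big, network, sizes, native_size)
```

This is one reason to always write the prefix explicitly when a file is meant to be read back with a fixed record size.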

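iter() is used here to create an iterator that returns fixed-size chunks. It repeatedly calls a user-supplied callable (such as lambda: f.read(record_struct.size)) until that callable returns a sentinel value (here b''), at which point iteration stops.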

with open('data_file/data.b','rb') as f:
    chunks = iter(lambda: f.read(20),b'')
    for chunk in chunks:
        print(len(chunk))
        print(chunk,'\n')

20
b'\x01\x00\x00\x00ffffff\x02@\x00\x00\x00\x00\x00\x00\x12@'

20
b'\x06\x00\x00\x00333333\x0b@333333\x17@'

20
b'\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00+@ffffffL@'
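
The two-argument form of iter() can also be tried without a file on disk. The sketch below assumes an in-memory io.BytesIO object as a stand-in for the file; the two sample records are only for illustration.

```python
import io
from struct import Struct

record_struct = Struct('<idd')

# An in-memory stand-in for a binary file holding two records
buf = io.BytesIO(record_struct.pack(1, 2.3, 4.5) +
                 record_struct.pack(6, 3.4, 5.8))

# iter(callable, sentinel): keep calling buf.read(20) until it returns b''
chunks = iter(lambda: buf.read(record_struct.size), b'')
records = [record_struct.unpack(chunk) for chunk in chunks]

print(records)
```

Note that if the data length is not a multiple of the record size, the last chunk is shorter than expected and unpack() raises struct.error, so this pattern suits files made only of whole records.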

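unpack_from() is useful for extracting binary data from a large binary array, because it creates no temporary objects and copies no memory. Give it a byte string (or any array) and a byte offset, and it unpacks fields directly from that position.
Named tuples are handy when unpacking: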

from collections import namedtuple
Record = namedtuple('Record',['kind','x','y'])
with open('data_file/data.b','rb') as f:
    records = (Record(*r) for r in read_records('<idd',f))
    for r in records:
        print(r.kind,r.x,r.y)

1 2.3 4.5
6 3.4 5.8
12 13.5 56.8
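
As a sketch of the offset-based access that unpack_from() allows, the example below builds the same three records in memory and reads the second one directly. Struct.iter_unpack() is a standard-library alternative to the hand-written offset loop; it is not used in the original notes and is shown here only for comparison.

```python
from struct import Struct

record_struct = Struct('<idd')
rows = [(1, 2.3, 4.5), (6, 3.4, 5.8), (12, 13.5, 56.8)]

# One contiguous buffer holding all records back to back
data = bytearray(b''.join(record_struct.pack(*r) for r in rows))

# Jump straight to the second record: offset = 1 * record size
second = record_struct.unpack_from(data, record_struct.size)

# iter_unpack() walks the buffer record by record; the buffer length
# must be an exact multiple of the record size
all_records = list(record_struct.iter_unpack(data))

print(second, all_records)
```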

For large amounts of binary data, numpy can be used:

import numpy as np
with open('data_file/data.b','rb') as f:
    records = np.fromfile(f,dtype='<i,<d,<d')
    print(records)

[( 1,  2.3,  4.5) ( 6,  3.4,  5.8) (12, 13.5, 56.8)]

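6.12 Reading Nested and Variable-Sized Binary Data

Problem: read data in a complex binary format that contains nested or variable-sized records.
Solution: the struct module can encode and decode almost any kind of binary data structure.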

# Write
polys = [[ (1.0, 2.5), (3.5, 4.0), (2.5, 1.5) ],
         [ (7.0, 1.2), (5.1, 3.0), (0.5, 7.5), (0.8, 9.0) ],
         [ (3.4, 6.3), (1.2, 0.5), (4.6, 9.2) ],
        ]
import struct
import itertools
def write_polys(filename,polys):
    # Determine the bounding box of all points
    flattened = list(itertools.chain(*polys))
    min_x = min(x for x,y in flattened)
    max_x = max(x for x,y in flattened)
    min_y = min(y for x,y in flattened)
    max_y = max(y for x,y in flattened)
    with open(filename,'wb') as f:
        # '<' disables native alignment, so the header is exactly 40 bytes
        f.write(struct.pack('<iddddi',0x1234,min_x,min_y,max_x,max_y,len(polys)))
        for poly in polys:
            # Each point is a pair of doubles ('<dd', 16 bytes)
            size = len(poly) * struct.calcsize('<dd')
            # The record length includes the 4-byte length field itself
            f.write(struct.pack('<i',size+4))
            for pt in poly:
                f.write(struct.pack('<dd',*pt))

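# Read (relies on the struct import above)
def read_polys(filename):
    with open(filename,'rb') as f:
        # Header: file code, bounding box, polygon count (40 bytes)
        header = f.read(40)
        file_code, min_x,min_y,max_x,max_y,num_polys = struct.unpack('<iddddi',header)
        polys = []
        for n in range(num_polys):
            # unpack() always returns a tuple, so unpack the single value
            pbytes, = struct.unpack('<i',f.read(4))
            poly = []
            # pbytes includes the 4-byte length field; each point is 16 bytes
            for m in range(pbytes//16):
                pt = struct.unpack('<dd',f.read(16))
                poly.append(pt)
            polys.append(poly)
        return polys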
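
A quick way to check that this file layout is consistent is a round trip through a temporary file. The sketch below restates both functions in compact form so that it runs on its own; the temporary file name and the expected size calculation (40-byte header, a 4-byte length per polygon, 16 bytes per point) are assumptions of this sketch rather than part of the original notes.

```python
import itertools
import os
import struct
import tempfile

def write_polys(filename, polys):
    flattened = list(itertools.chain(*polys))
    xs = [x for x, _ in flattened]
    ys = [y for _, y in flattened]
    with open(filename, 'wb') as f:
        # Header: file code, bounding box, number of polygons
        f.write(struct.pack('<iddddi', 0x1234,
                            min(xs), min(ys), max(xs), max(ys), len(polys)))
        for poly in polys:
            size = len(poly) * struct.calcsize('<dd')
            f.write(struct.pack('<i', size + 4))   # length includes itself
            for pt in poly:
                f.write(struct.pack('<dd', *pt))

def read_polys(filename):
    with open(filename, 'rb') as f:
        header = struct.unpack('<iddddi', f.read(40))
        num_polys = header[-1]
        polys = []
        for _ in range(num_polys):
            pbytes, = struct.unpack('<i', f.read(4))
            count = (pbytes - 4) // 16             # points in this polygon
            polys.append([struct.unpack('<dd', f.read(16))
                          for _ in range(count)])
        return polys

polys = [[(1.0, 2.5), (3.5, 4.0), (2.5, 1.5)],
         [(7.0, 1.2), (5.1, 3.0), (0.5, 7.5), (0.8, 9.0)],
         [(3.4, 6.3), (1.2, 0.5), (4.6, 9.2)]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'polys.bin')
    write_polys(path, polys)
    file_size = os.path.getsize(path)
    restored = read_polys(path)

print(file_size, restored == polys)
```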

The code above is rather tedious. Consider the following approach instead:

import struct
class StructField:
    '''Descriptor representing a simple structure field.'''
    def __init__(self,format,offset):
        self.format = format
        self.offset = offset
    def __get__(self,instance,cls):
        if instance is None:
            return self
        else:
            # Unpack the field from the owning instance's buffer
            r = struct.unpack_from(self.format,instance._buffer,self.offset)
            return r[0] if len(r) == 1 else r

class Structure:
    def __init__(self,bytedata):
        # memoryview avoids copying the underlying bytes
        self._buffer = memoryview(bytedata)

class PolyHeader(Structure):
    file_code = StructField('<i',0)
    min_x = StructField('<d',4)
    min_y = StructField('<d',12)
    max_x = StructField('<d',20)
    max_y = StructField('<d',28)
    num_polys = StructField('<i',36)
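
To see how these classes might be used, here is a minimal sketch that restates them so it runs on its own and reads a header from an in-memory byte string instead of a file. The header values are made up for illustration.

```python
import struct

class StructField:
    def __init__(self, format, offset):
        self.format = format
        self.offset = offset
    def __get__(self, instance, cls):
        if instance is None:
            return self
        r = struct.unpack_from(self.format, instance._buffer, self.offset)
        return r[0] if len(r) == 1 else r

class Structure:
    def __init__(self, bytedata):
        self._buffer = memoryview(bytedata)

class PolyHeader(Structure):
    file_code = StructField('<i', 0)
    min_x = StructField('<d', 4)
    min_y = StructField('<d', 12)
    max_x = StructField('<d', 20)
    max_y = StructField('<d', 28)
    num_polys = StructField('<i', 36)

# A 40-byte header with made-up values, laid out as '<iddddi'
raw = struct.pack('<iddddi', 0x1234, 0.5, 0.5, 7.0, 9.2, 3)
header = PolyHeader(raw)

# Each attribute access unpacks just that field from the buffer
print(hex(header.file_code), header.min_x, header.max_y, header.num_polys)
```

Each field is decoded lazily on attribute access, so nothing is unpacked until it is actually read.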