Overview
Pytorch Notebook
Written in English for convenience, since these notes are edited with emacs-org.
Table of Contents
- tensor
- create
- cloning
- operation
- in-place operations
- transpose (permute)
- about size and indexing
- add
- with numpy
- cuda
- autograd
- track and gradient computing
- function
- backward()
- torch.no_grad()
- neural network
- structured construction
- layers (no order)
- forward propagate structure (ordered)
- sequential construction
- data load
- torchvision
- optimizer
- train
- gpu support
- loss function
- train
- about step()s
- optimizer.step(self, closure = None)
- scheduler.step()
- model I/O
- method 1 (recommended)
- method 2
- evaluate
- models
- attributes
- pretrained models
- torchvision.models
- sundry
- problem shooting
- pytorch is deep learning’s numpy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as Data
import torch.optim as optim
import numpy as np
tensor
create
- uninitialized tensor: x = torch.empty(5, 3)
- random tensor: x = torch.rand(5, 3)
- zeros: x = torch.zeros(5, 3)
- define dtype: x = torch.zeros(5, 3, dtype = torch.long)
- from known data: x = torch.tensor([5.5, 3])
cloning
to reuse an existing tensor’s properties.
- new_* methods: x = x.new_ones(5, 3, dtype = torch.double)  # 64-bit
- copy the size: x = torch.randn_like(x, dtype = torch.float)  # 32-bit
operation
in-place operations
append ‘_’ to the method name.
e.g. y.add_(x)  ->  y += x
x.t_()  ->  transposes x in place
transpose (permute)
x = x.permute(1, 2, 0)
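a quick shape check (a minimal sketch; the sizes are only illustrative):
x = torch.randn(2, 3, 4)
y = x.permute(1, 2, 0)   # reorder the dimensions
print(y.size())          # torch.Size([3, 4, 2])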
about size and indexing
- get size: x.size() or x.size(dim)
- resize:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)   # the '-1' dimension's size is inferred from the other dims
# use .item() to get a one-element tensor as a python number
x = torch.randn(1)
num = x.item()
add
# simply
x + y
torch.add(x, y)
# write the result into an existing tensor
result = torch.empty(5, 3)
torch.add(x, y, out = result)
# in-place (+=)
y.add_(x)
with numpy
the numpy array and the torch tensor
share the same memory location,
so changing one changes the other.
- torch.from_numpy(npdata)
- torchdata.numpy()
npdata = np.arange(6).reshape(2, 3)
np2torch = torch.from_numpy(npdata)
'''
tensor([[0, 1, 2],
        [3, 4, 5]], dtype=torch.int32)
'''
torch2np = np2torch.numpy()
cuda
if torch.cuda.is_available():
    device = torch.device('cuda')
    # directly create on GPU
    y = torch.ones_like(x, device = device)
    # copy to GPU
    x = x.to(device)
    # or x.to('cuda')
    z = x + y
    # tensor([0.1034], device='cuda:0')
    z = z.to('cpu')   # .to returns a new tensor; assign it to keep the CPU copy
autograd
track and gradient computing
- set sometensor.requires_grad = True to keep track of all the computations on it (enables training).
- call .backward() to compute all the gradients.
- gradients accumulate into the .grad attribute.
- stop tracking: .detach().
- prevent tracking: wrap the code in a with torch.no_grad(): block.
function
for a tensor created by an operation,
tensor.grad_fn refers to the function that created it;
for a user-created tensor, .grad_fn is None.
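a minimal sketch of the above (values are only illustrative):
x = torch.ones(2, 2, requires_grad = True)
y = x + 2
print(y.grad_fn)     # an AddBackward object, since y was created by an operation
z = (y * y).mean()   # a scalar
z.backward()         # compute the gradients
print(x.grad)        # dz/dx, here every entry is 1.5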
backward()
for a non-scalar tensor, backward(gradient) needs a gradient argument:
a tensor of matching shape (see the sketch below).
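for example (a sketch; the gradient vector here is arbitrary):
x = torch.randn(3, requires_grad = True)
y = x * 2                              # non-scalar output
v = torch.tensor([0.1, 1.0, 0.0001])   # gradient tensor with y's shape
y.backward(v)                          # backward of (y * v).sum()
print(x.grad)                          # tensor([0.2000, 2.0000, 0.0002])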
torch.no_grad()
wrap evaluation code in with torch.no_grad():
when testing the model.
neural network
the typical learning procedure:
- define the network, define the learnable params.
- iterate over a dataset of inputs.
- process the input through the network.
- compute the loss.
- back-propagate.
- update the params.
(weight = weight - learning_rate * gradient)
structured construction
layers (no order)
import torch.nn as nn
define in net_class’s __init__()
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
forward propagate structure (ordered)
import torch.nn.functional as F
define in net_class’s forward()
class LeNet(nn.Module):
    def __init__(self):
        ...   # layers as defined above

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.pool(x)
        # write it compactly with a nested call
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        # each sample is flattened to a row vector
        # -1 is for the batch size
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
sequential construction
net = nn.Sequential(
    nn.Linear(2, 10),
    nn.ReLU(),   # note: this ReLU is a class (a layer), not the functional F.relu
    nn.Linear(10, 2)
)
data load
transforms.ToTensor() <-> transforms.ToPILImage()
import torch.utils.data as Data
mydataset = Data.TensorDataset(x, y)   # (data tensor, target tensor)
mydataloader = Data.DataLoader(
    dataset = mydataset,
    batch_size = BATCH_SIZE,
    shuffle = True,
    num_workers = 2
)
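iterating over the loader then looks like this (a sketch, assuming x and y are tensors with the same first dimension):
for epoch in range(3):
    for step, (batch_x, batch_y) in enumerate(mydataloader):
        pass   # feed batch_x / batch_y to the network here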
torchvision
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root = './data', train = True,
                                        download = True, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = BATCH_SIZE,
                                          shuffle = True, num_workers = 0)
transforms.RandomResizedCrop((height, width))
optimizer
import torch.optim as optim
optimizer = optim.SGD(net.parameters(), lr = 0.001, momentum = 0.9)
train
gpu support
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
net = Net()
net.to(device)
'''...'''
for epoch in range(epochs):
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
loss function
import torch.nn as nn
criterion = nn.CrossEntropyLoss()
train
for epoch in range(2):
    trainingloss = 0.0
    for i, data in enumerate(trainloader, 0):
        # for gpu support
        inputs, labels = data[0].to(device), data[1].to(device)
        # clear the gradient buffer
        optimizer.zero_grad()
        # forward
        outputs = net(inputs)
        # loss computing
        loss = criterion(outputs, labels)
        # back propagate
        loss.backward()
        # update weights
        optimizer.step()
        # accumulate the running loss
        trainingloss += loss.item()
about step()s
optimizer.step(self, closure = None)
usually called every mini-batch to update the weights.
closure (callable, optional): a closure that reevaluates the model,
runs back-propagation, and returns the loss.
if closure isn't passed, backward() must be called before optimizer.step().
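a sketch of the closure form with optim.LBFGS (the surrounding names net, inputs, labels, criterion are illustrative):
optimizer = optim.LBFGS(net.parameters(), lr = 0.1)

def closure():
    optimizer.zero_grad()
    output = net(inputs)
    loss = criterion(output, labels)
    loss.backward()
    return loss

optimizer.step(closure)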
scheduler.step()
usually called once per epoch to adjust the learning rate.
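a sketch with torch.optim.lr_scheduler.StepLR (step_size and gamma are illustrative):
from torch.optim import lr_scheduler
scheduler = lr_scheduler.StepLR(optimizer, step_size = 10, gamma = 0.1)
for epoch in range(epochs):
    # ... train one epoch (optimizer.step() every mini-batch) ...
    scheduler.step()   # multiply the lr by 0.1 every 10 epochs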
model I/O
method 1 (recommended)
only saves the weights, not the structure;
the network must be reconstructed before loading when evaluating.
PATH = './example-model.pth'
# save
torch.save(net.state_dict(), PATH)
# load
net = Net()# reconstruct the network
net.load_state_dict(torch.load(PATH))
method 2
saves the whole model, but is fragile when the code is refactored or the model is moved elsewhere.
PATH = './example-model.pth'
# save
torch.save(net, PATH)
# load
net = torch.load(PATH)
evaluate
class Net(nn.Module): ...   # same structure as in training
net = Net()
net.load_state_dict(torch.load(PATH))
# evaluate
class_correct = list(0. for i in range(10))   # 10-class example
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):   # 4 = test batch size
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1
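printing the per-class accuracy afterwards (a sketch; classes is assumed to be the list of the 10 class names):
for i in range(10):
    print('accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))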
models
attributes
- modules() -> iterates over all the modules in a network (see the sketch below).
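a quick sketch using the LeNet defined above:
net = LeNet()
for m in net.modules():
    print(type(m).__name__)   # LeNet, Conv2d, MaxPool2d, Conv2d, Linear, Linear, Linear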
pretrained models
torchvision.models
import torchvision.models as models
import torchvision.transforms as transforms
vgg16 = models.vgg16(pretrained = True).eval()
# all the models use the same normalization
normalization = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
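a sketch of running the pretrained model on a single image (the file name and the 224 x 224 resize are example choices):
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalization,
])
img = Image.open('example.jpg')
batch = preprocess(img).unsqueeze(0)   # add the batch dimension
with torch.no_grad():
    scores = vgg16(batch)
pred = torch.argmax(scores, dim = 1)   # predicted class index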
sundry
- normalization: x = (x - mean) / std centers and rescales the data,
  which generally improves classification performance.
- torch.max(input, dim) -> (Tensor, LongTensor):
  torch.max(a, 0) returns each column's max value and its row index;
  torch.max(a, 1) returns each row's max value and its column index.
  (see the sketch after this list)
- torch.nn.functional.softmax(input, dim) -> Tensor:
  softmax(a, 0) rescales a so that every column sums to 1;
  softmax(a, 1) rescales a so that every row sums to 1.
- Tensor.squeeze(): removes the size-1 dimensions from the tensor.
  t = torch.Tensor([[1], [2], [3]])
  t.squeeze()   # tensor([1., 2., 3.])
- torch.bmm(batch1, batch2, out = None) -> Tensor: batch matrix multiplication;
  say batch1.size() = [2, 3, 4] and batch2.size() = [2, 4, 5],
  then the result's size() is [2, 3, 5].
- torch.unsqueeze(input, dim) -> Tensor: returns a new tensor with a dimension
  of size one inserted at the specified position.
  The new tensor shares the same underlying data with the input.
  - positive dim: ranges from 0 to input.dim().
  - negative dim: counts backward from the end.
- prediction first, label second: when calling loss functions, pass the prediction
  and then the label, in that order.
- labels are LongTensor (64-bit) by default.
- paddings
  - nn.ReflectionPad1d(padding) ~ nn.ReflectionPad3d(padding): pad with the
    reflection of the input boundary.
    padding as a single number: pad every side by the same length;
    padding as a tuple: (left_padding, right_padding), etc.
  - nn.ReplicationPad1d(padding) ~ nn.ReplicationPad3d(padding): pad with copies
    of the boundary values.
  - nn.ConstantPad1d(padding, value) ~ nn.ConstantPad3d(padding, value): pad every
    side with the same constant value.
  - F.pad(input, pad, mode = 'constant', value = 0)
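a small sketch for the torch.max / softmax / bmm / unsqueeze items above (values are illustrative):
a = torch.tensor([[1., 5.], [4., 2.]])
values, indices = torch.max(a, 1)   # per-row max: values = [5., 4.], indices = [1, 0]
p = F.softmax(a, dim = 1)           # every row of p sums to 1
row = torch.tensor([1., 2., 3.])
col = row.unsqueeze(1)              # shape [3] -> [3, 1]
b1 = torch.randn(2, 3, 4)
b2 = torch.randn(2, 4, 5)
print(torch.bmm(b1, b2).size())     # torch.Size([2, 3, 5])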
problem shooting
- BrokenPipe Error: encountered on Windows when downloading a dataset:
  set num_workers to 0.
- TypeError: 'module' object is not callable: maybe a capitalization problem,
  e.g. it should be datasets.MNIST, not datasets.mnist.
- Adding a softmax layer to the CIFAR10 LeNet makes training slower.
- "trying to backward multiple times without retain_graph=True": check whether
  the shapes of the tensors passed to mse_loss mismatch.