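PyTorch Study Notes
PyTorch is a machine learning framework for Python.
Tensors
Tensors are similar to matrices and vectors, generalized to any number of dimensions. PyTorch uses tensors to encode a model's inputs and outputs.
Construction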
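import torch
import numpy as np

# Build a tensor directly from nested Python lists
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)

# Build a tensor from a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

# Build tensors of a given shape filled with random values, ones, or zeros
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

# 1-D tensor holding the integers 0 through 9
a = torch.arange(10)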
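One difference between the two constructors is worth checking: torch.tensor copies its input, while torch.from_numpy shares memory with the NumPy array. A minimal sketch of that behavior follows; the names copied and shared are illustrative and not from the notes.

```python
import numpy as np
import torch

np_array = np.array([[1, 2], [3, 4]])
copied = torch.tensor(np_array)      # torch.tensor copies the data
shared = torch.from_numpy(np_array)  # from_numpy reuses the array's memory

np_array[0, 0] = 100                 # modify the NumPy array in place

print(copied[0, 0].item())  # 1: the copy is unaffected
print(shared[0, 0].item())  # 100: the shared tensor sees the change
```

If an independent tensor is needed, torch.tensor (or calling .clone() on the shared tensor) is the safer choice.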
Matrix operations
Concatenation
tensor1 = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
tensor2 = torch.tensor([[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]])
# Concatenate along dim 0 (rows)
result1 = torch.cat([tensor1, tensor2], dim=0)
# Concatenate along dim 1 (columns)
result2 = torch.cat([tensor1, tensor2], dim=1)
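To make the effect of dim concrete, here is a small sketch that uses tensors of zeros and ones (chosen only so the two sources are easy to tell apart) and prints the resulting shapes. All dimensions other than the one being concatenated must match.

```python
import torch

tensor1 = torch.zeros(3, 4)
tensor2 = torch.ones(3, 4)

rows = torch.cat([tensor1, tensor2], dim=0)  # tensor2's rows follow tensor1's rows
cols = torch.cat([tensor1, tensor2], dim=1)  # tensor2's columns follow tensor1's columns

print(rows.shape)  # torch.Size([6, 4])
print(cols.shape)  # torch.Size([3, 8])
```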
Repetition
a = torch.tensor([1, 2, 3])
# Repeat every element twice
b = torch.repeat_interleave(a, 2)
# Repeat each element a different number of times
repeats = torch.tensor([2, 3, 1])
c = torch.repeat_interleave(a, repeats)
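The sketch below prints what these calls return and, for contrast, what Tensor.repeat does. Tensor.repeat is not covered in the notes; it is shown here only because it tiles the whole tensor rather than repeating element by element.

```python
import torch

a = torch.tensor([1, 2, 3])

b = torch.repeat_interleave(a, 2)
c = torch.repeat_interleave(a, torch.tensor([2, 3, 1]))
tiled = a.repeat(2)  # tiles the whole tensor instead of each element

print(b.tolist())      # [1, 1, 2, 2, 3, 3]
print(c.tolist())      # [1, 1, 2, 2, 2, 3]
print(tiled.tolist())  # [1, 2, 3, 1, 2, 3]
```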
Saving and loading weights
# Pickles the entire model object (architecture and weights)
torch.save(model, 'model.pth')
model = torch.load('model.pth')
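Saving the whole model ties the file to the exact class definition, and recent PyTorch releases may require passing weights_only=False to torch.load for such files. A common alternative is to save only the state_dict. The sketch below assumes a stand-in nn.Linear model and a temporary file path, neither of which comes from the notes.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in model for illustration

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pth")
    # Save only the parameters, not the pickled model object
    torch.save(model.state_dict(), path)

    # Rebuild the architecture in code, then load the weights into it
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path))

same = torch.equal(model.weight, restored.weight)
print(same)  # True
```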
nn
Module
Base class for all model layers
from torch import nn

class Attention(nn.Module):
    def __init__(self, embed_dim):
        super().__init__()
        self.embed_dim = embed_dim
        # Linear projections for queries, keys, and values
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.value = nn.Linear(embed_dim, embed_dim)

    def forward(self, query, key, value):
        pass
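The notes leave forward empty. One way it could be filled in is single-head scaled dot-product attention; that choice is an assumption made here for illustration, not something the notes specify.

```python
import math

import torch
from torch import nn


class Attention(nn.Module):
    def __init__(self, embed_dim):
        super().__init__()
        self.embed_dim = embed_dim
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.value = nn.Linear(embed_dim, embed_dim)

    def forward(self, query, key, value):
        # Assumption: single-head scaled dot-product attention
        q = self.query(query)  # (batch, q_len, embed_dim)
        k = self.key(key)      # (batch, kv_len, embed_dim)
        v = self.value(value)  # (batch, kv_len, embed_dim)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.embed_dim)
        weights = torch.softmax(scores, dim=-1)  # each row sums to 1
        return weights @ v


attention = Attention(embed_dim=8)
x = torch.rand(2, 5, 8)
out = attention(x, x, x)  # self-attention: same tensor for q, k, v
print(out.shape)  # torch.Size([2, 5, 8])
```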
Sequential
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)

# Same architecture, built by adding named layers one at a time
model2 = nn.Sequential()
model2.add_module("conv1", nn.Conv2d(1, 20, 5))
model2.add_module("relu1", nn.ReLU())
model2.add_module("conv2", nn.Conv2d(20, 64, 5))
model2.add_module("relu2", nn.ReLU())
Sequential implements the forward pass automatically: layers that are chained together are called in order. With ModuleList, you have to write the forward pass yourself.
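A short sketch of what that automatic forward amounts to: calling the Sequential gives the same result as passing the input through each layer by hand. The 28x28 grayscale input is an assumed example size.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
)

images = torch.rand(1, 1, 28, 28)  # assumed input: one 28x28 grayscale image
out = model(images)                # runs the four layers in order

# The same computation written out by hand, one layer at a time
manual = images
for layer in model:
    manual = layer(manual)

print(out.shape)  # torch.Size([1, 64, 20, 20])
```

Each 5x5 convolution without padding trims 4 pixels from the height and width, which is why 28 becomes 24 and then 20.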
ModuleList
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()  # must run before submodules are assigned
        self.down_blocks = nn.ModuleList([])
        for i in xxx:
            down_block = ResnetDownsampleBlock2D(...)
            self.down_blocks.append(down_block)

    def forward(self, sample: torch.Tensor) -> torch.Tensor:
        def create_custom_forward(module):
            def custom_forward(*inputs):
                return module(*inputs)
            return custom_forward

        # ModuleList has no forward of its own, so each block is called explicitly,
        # here through gradient checkpointing
        for down_block in self.down_blocks:
            sample = torch.utils.checkpoint.checkpoint(
                create_custom_forward(down_block), sample, use_reentrant=False
            )
        return sample
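For a version that runs as written, here is a stripped-down sketch of the same pattern. TinyEncoder and its nn.Linear blocks are hypothetical stand-ins for the elided blocks above, and gradient checkpointing is left out to keep the loop visible.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):  # hypothetical stand-in for Encoder
    def __init__(self, dim, num_blocks):
        super().__init__()
        # ModuleList registers each block so its parameters are tracked
        self.blocks = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_blocks)]
        )

    def forward(self, sample):
        # ModuleList is only a container: the loop is written by hand
        for block in self.blocks:
            sample = torch.relu(block(sample))
        return sample


encoder = TinyEncoder(dim=4, num_blocks=3)
out = encoder(torch.rand(2, 4))
num_params = len(list(encoder.parameters()))

print(out.shape)   # torch.Size([2, 4])
print(num_params)  # 6: a weight and a bias for each of the 3 blocks
```

Storing the blocks in a plain Python list instead would leave their parameters unregistered, so an optimizer built from encoder.parameters() would not see them.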
Fully connected layer
query = nn.Linear(in_features, out_features)
- in_features: input dimension (size of each input sample)
- out_features: output dimension (size of each output sample)
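As a quick check of how the two sizes show up in practice, the sketch below applies a layer to a batch and reproduces the result by hand as x @ W.T + b. The sizes 4 and 2 and the batch of 3 are arbitrary example values.

```python
import torch
from torch import nn

linear = nn.Linear(in_features=4, out_features=2)
x = torch.rand(3, 4)  # batch of 3 samples, 4 features each

y = linear(x)
manual = x @ linear.weight.T + linear.bias  # what the layer computes

print(y.shape)              # torch.Size([3, 2])
print(linear.weight.shape)  # torch.Size([2, 4])
```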
Embedding
A lookup table that maps discrete inputs to continuous vectors; commonly used to turn words into vectors
position_encoding = nn.Embedding(targets_length, d_model)
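The lookup behavior can be seen in a small sketch: integer indices go in, one learned row per index comes out. The vocabulary size of 10, the vector size of 4, and the index values are example assumptions.

```python
import torch
from torch import nn

embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)
token_ids = torch.tensor([[1, 2, 3], [4, 5, 1]])  # integer indices in [0, 10)

vectors = embedding(token_ids)

# Index 1 appears twice, so both positions receive the same table row
same_row = torch.equal(vectors[0, 0], vectors[1, 2])

print(vectors.shape)  # torch.Size([2, 3, 4])
print(same_row)       # True
```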
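Convolutional layer
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1)
- in_channels: number of channels in the input, e.g. 3 for RGB and 1 for grayscale
- out_channels: number of channels (depth) of the output
- kernel_size: size of the convolution kernel
- stride: step by which the kernel moves
- padding_mode: rule used to fill the borders; defaults to zeros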
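These parameters determine the output size. With dilation left at 1, each spatial dimension becomes floor((size + 2 * padding - kernel_size) / stride) + 1. The sketch below checks that against an actual layer; conv_out_size is a helper written for this example, and the 32x32 input is an assumed size.

```python
import torch
from torch import nn


def conv_out_size(size, kernel_size, stride=1, padding=0):
    # Output length along one spatial dimension (dilation assumed to be 1)
    return (size + 2 * padding - kernel_size) // stride + 1


conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1)
images = torch.rand(8, 3, 32, 32)  # batch of 8 RGB images, 32x32
features = conv(images)

print(features.shape)        # torch.Size([8, 64, 30, 30])
print(conv_out_size(32, 3))  # 30
```

Passing padding=1 with this kernel size would keep the spatial size at 32.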
Transposed convolution layer
conv_transpose = nn.ConvTranspose2d(in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1)
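A transposed convolution enlarges its input. With dilation 1, each spatial dimension becomes (size - 1) * stride - 2 * padding + kernel_size + output_padding. The sketch below checks this for the layer above; conv_transpose_out_size is a helper written for this example, and the 4x4 input is an assumed size.

```python
import torch
from torch import nn


def conv_transpose_out_size(size, kernel_size, stride=1, padding=0, output_padding=0):
    # Output length along one spatial dimension (dilation assumed to be 1)
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding


conv_transpose = nn.ConvTranspose2d(
    in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1
)
x = torch.rand(1, 1, 4, 4)
y = conv_transpose(x)

print(y.shape)  # torch.Size([1, 1, 7, 7])
print(conv_transpose_out_size(4, kernel_size=3, stride=2, padding=1))  # 7
```

Adding output_padding=1 to the layer would give 8, an exact doubling of the input size.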
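Activation functions
Activation functions introduce non-linearity into the model; a well-chosen one can speed up training and mitigate gradient problems such as vanishing gradients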
Activation function | nn
Sigmoid | nn.Sigmoid()
ReLU | nn.ReLU()
Sigmoid Linear Unit | nn.SiLU()
Softmax | nn.Softmax(dim=None)
Exponential Linear Unit | nn.ELU()
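To see how the listed modules behave, the sketch below applies each of them to the same small input. The input values are arbitrary, and dim=0 is passed to Softmax explicitly because the input is one-dimensional.

```python
import torch
from torch import nn

x = torch.tensor([-1.0, 0.0, 2.0])

sigmoid_out = nn.Sigmoid()(x)       # squashes values into (0, 1)
relu_out = nn.ReLU()(x)             # zeroes out negative values
silu_out = nn.SiLU()(x)             # x * sigmoid(x)
softmax_out = nn.Softmax(dim=0)(x)  # non-negative values that sum to 1
elu_out = nn.ELU()(x)               # exp(x) - 1 for x <= 0, x otherwise

print(relu_out.tolist())  # [0.0, 0.0, 2.0]
print(sigmoid_out[1].item())  # 0.5
print(softmax_out.sum().item())
```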
Dropout
dropout = nn.Dropout(p=dropout_rate)
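Dropout behaves differently in training and evaluation mode, which is easy to miss. In the sketch below, p=0.5 and an input of ones are example choices: in training mode each element is zeroed with probability p and the survivors are scaled by 1 / (1 - p), while in evaluation mode the input passes through unchanged.

```python
import torch
from torch import nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

dropout.train()
train_out = dropout(x)  # each entry is either 0.0 or 2.0

dropout.eval()
eval_out = dropout(x)   # identical to x

print(sorted(set(train_out.tolist())))
print(torch.equal(eval_out, x))  # True
```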
Identity
Identity mapping: performs no processing and returns its input unchanged
self.drop_path = DropPath(drop_path_rate) if drop_path_rate > 0. else nn.Identity()
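The line above uses nn.Identity as a no-op placeholder so that the rest of the model can call the layer unconditionally. Here is a runnable sketch of the same pattern. DropPath is not part of torch.nn, so nn.Dropout is substituted for it here, and make_regularizer is a hypothetical helper.

```python
import torch
from torch import nn


def make_regularizer(rate):  # hypothetical helper, not from the notes
    # nn.Dropout stands in for DropPath, which torch.nn does not provide
    return nn.Dropout(rate) if rate > 0. else nn.Identity()


x = torch.rand(2, 3)
layer = make_regularizer(0.0)
out = layer(x)

print(isinstance(layer, nn.Identity))  # True
print(out is x)                        # True: the input comes back untouched
```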
- transforms.Compose: chains multiple data transforms together
- transforms.CenterCrop: crops the image at its center
- transforms.Resize: resizes the image
- transforms.ToTensor: converts image data to a tensor