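1. Uniform distribution initialization

    torch.nn.init.uniform_(tensor, a=0, b=1)

Fills the tensor with values sampled from the uniform distribution $U(a, b)$. Parameters:

- tensor - the tensor to fill
- a - the lower bound of the uniform distribution
- b - the upper bound of the uniform distribution

Example: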

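import torch
import torch.nn as nn

w = torch.empty(3, 5)
# Fills w in place with samples from U(0, 1) and returns it; values differ on every run.
nn.init.uniform_(w)
"""
tensor([[0.2116, 0.3085, 0.5448, 0.6113, 0.7697],
        [0.8300, 0.2938, 0.4597, 0.4698, 0.0624],
        [0.5034, 0.1166, 0.3133, 0.3615, 0.3757]])
"""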

Uniform distribution in detail:

If $x$ follows a uniform distribution, i.e. $x \sim U(a, b)$, its probability density function (which describes how likely each value of the random variable is) is

$$f(x)=\left\{\begin{array}{ll}\frac{1}{b-a}, & a<x<b \\ 0, & \text{otherwise}\end{array}\right.$$

Its expectation and variance are

$$\begin{array}{c}E(x)=\int_{-\infty}^{\infty} x f(x)\, dx=\frac{1}{2}(a+b) \\ D(x)=E\left(x^{2}\right)-[E(x)]^{2}=\frac{(b-a)^{2}}{12}\end{array}$$
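
The expectation and variance formulas above can be checked numerically. The following is a minimal sketch that uses only the Python standard library; the helper names `uniform_mean`, `uniform_variance` and `sample_moments` are illustrative and are not part of PyTorch.

```python
import random


def uniform_mean(a, b):
    """Closed-form expectation of U(a, b): (a + b) / 2."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Closed-form variance of U(a, b): (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


def sample_moments(a, b, n=200_000, seed=0):
    """Estimate mean and variance from n pseudo-random samples of U(a, b)."""
    rng = random.Random(seed)
    xs = [rng.uniform(a, b) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var


if __name__ == "__main__":
    a, b = 0.0, 1.0  # the defaults of torch.nn.init.uniform_
    print("closed form:", uniform_mean(a, b), uniform_variance(a, b))
    print("estimated:  ", sample_moments(a, b))
```

The estimated values should land close to the closed-form ones, and the gap shrinks as `n` grows.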

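2. Normal (Gaussian) distribution initialization

    torch.nn.init.normal_(tensor, mean=0.0, std=1.0)

Fills the tensor with values drawn from the normal distribution $N\left(\text{mean}, \text{std}^{2}\right)$ with the given mean and standard deviation.

Parameters:

- tensor - the tensor to fill
- mean - the mean of the normal distribution
- std - the standard deviation of the normal distribution

Example: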

import torch

w = torch.empty(3, 5)
# Fills w in place with samples from N(0, 1) and returns it; values differ on every run.
torch.nn.init.normal_(w, mean=0, std=1)
"""
tensor([[-1.3903,  0.4045,  0.3048,  0.7537, -0.5189],
        [-0.7672,  0.1891, -0.2226,  0.2913,  0.1295],
        [ 1.4719, -0.3049,  0.3144, -1.0047, -0.5424]])
"""

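Normal distribution in detail:

If the random variable $x$ follows a normal distribution, i.e. $x \sim N\left(\mu, \sigma^{2}\right)$, its probability density function is

$$f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp \left(-\frac{(x-\mu)^{2}}{2 \sigma^{2}}\right)$$

Some notable probabilities under the normal density:

- 68.268949% of the area lies within one standard deviation $\sigma$ of the mean ($\mu \pm \sigma$)
- 95.449974% of the area lies within two standard deviations $2\sigma$ of the mean ($\mu \pm 2\sigma$)
- 99.730020% of the area lies within three standard deviations $3\sigma$ of the mean ($\mu \pm 3\sigma$)
- 99.993666% of the area lies within four standard deviations $4\sigma$ of the mean ($\mu \pm 4\sigma$)

A normal distribution with $\mu=0$ and $\sigma=1$ is the standard normal distribution.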
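
The percentages listed above follow from the identity $P(|x-\mu|<k\sigma)=\operatorname{erf}(k/\sqrt{2})$. A minimal sketch with the Python standard library reproduces them; `coverage` is an illustrative helper name.

```python
import math


def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(f"within {k} sigma: {coverage(k) * 100:.6f}%")
```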

3. Xavier initialization

3.1 Xavier uniform initialization

    torch.nn.init.xavier_uniform_(tensor, gain=1.0)

Also known as Glorot initialization. Following the method described in Glorot, X. & Bengio, Y. (2010), "Understanding the difficulty of training deep feedforward neural networks", the input tensor is filled with values sampled from the uniform distribution $U(-a, a)$, where $a$ is given by

$$a=\text{gain} \times \sqrt{\frac{6}{\text{fan\_in}+\text{fan\_out}}}$$

Example:

import torch
import torch.nn as nn

w = torch.empty(3, 5)
# calculate_gain('relu') returns the recommended gain for ReLU, sqrt(2).
nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
"""
tensor([[ 0.7695, -0.7687, -0.2561, -0.5307,  0.5195],
        [-0.6187,  0.4913,  0.3037, -0.6374,  0.9725],
        [-0.2658, -0.4051, -1.1006, -1.1264, -0.1310]])
"""

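As a worked instance of the formula for $a$, consider the 3×5 tensor from the example. For a 2-D weight PyTorch takes fan_in from the second dimension and fan_out from the first, and `calculate_gain('relu')` is $\sqrt{2}$. The helper name `xavier_uniform_bound` below is illustrative and is not a PyTorch function.

```python
import math


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Bound a of U(-a, a): gain * sqrt(6 / (fan_in + fan_out))."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


if __name__ == "__main__":
    # Shape (3, 5): fan_out = 3 (rows), fan_in = 5 (columns).
    relu_gain = math.sqrt(2.0)  # value of nn.init.calculate_gain('relu')
    a = xavier_uniform_bound(fan_in=5, fan_out=3, gain=relu_gain)
    print(f"a = {a:.4f}")  # every sampled entry lies in [-a, a]
```

This gives $a=\sqrt{1.5}\approx 1.2247$, consistent with the sample output above, whose largest magnitude is 1.1264.
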
3.2 Xavier normal initialization

    torch.nn.init.xavier_normal_(tensor, gain=1.0)

Also known as Glorot initialization. Following the method described in Glorot, X. & Bengio, Y. (2010), "Understanding the difficulty of training deep feedforward neural networks", the input tensor is filled with values sampled from the normal distribution $N\left(0, \text{std}^{2}\right)$, where std is given by

$$\text{std}=\text{gain} \times \sqrt{\frac{2}{\text{fan\_in}+\text{fan\_out}}}$$

Parameters:

- tensor - the tensor to initialize
- gain - an optional scaling factor

Example:

import torch

# A 2x5 float tensor; its existing values are overwritten by the initializer.
w = torch.arange(10, dtype=torch.float32).view(2, -1)
torch.nn.init.xavier_normal_(w)
"""
tensor([[-0.3139, -0.3557,  0.1285, -0.9556,  0.3255],
        [-0.6212,  0.3405, -0.4150, -1.3227, -0.0069]])
"""

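For tensors with more than two dimensions, such as convolution weights, fan_in and fan_out also count the receptive field (the product of the trailing dimensions). The sketch below mirrors that rule in plain Python and applies the std formula; `fans_from_shape` and `xavier_normal_std` are illustrative names, not PyTorch functions.

```python
import math


def fans_from_shape(shape):
    """Return (fan_in, fan_out) for a weight of the given shape (at least 2-D)."""
    if len(shape) < 2:
        raise ValueError("fan_in and fan_out need a tensor with at least 2 dimensions")
    receptive_field = math.prod(shape[2:])  # 1 for a plain 2-D matrix
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    return fan_in, fan_out


def xavier_normal_std(shape, gain=1.0):
    """std of N(0, std^2): gain * sqrt(2 / (fan_in + fan_out))."""
    fan_in, fan_out = fans_from_shape(shape)
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


if __name__ == "__main__":
    # The 2x5 tensor from the example above.
    print(fans_from_shape((2, 5)), xavier_normal_std((2, 5)))
    # A Conv2d-style weight: 16 output channels, 3 input channels, 3x3 kernel.
    print(fans_from_shape((16, 3, 3, 3)), xavier_normal_std((16, 3, 3, 3)))
```
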
4. Kaiming initialization

4.1 Kaiming uniform initialization

    torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

Also known as He initialization. Following the method described in He, K. et al. (2015), "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification", the input tensor is filled with values sampled from the uniform distribution $U(-\text{bound}, \text{bound})$, where bound is given by

$$\text{bound}=\text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$$

Parameters:

- tensor - the tensor to initialize;
- a - the negative slope of the rectifier used after this layer, used to compute $\text{gain}=\sqrt{\frac{2}{1+a^{2}}}$ (only takes effect when nonlinearity is 'leaky_relu');
- mode - either "fan_in" (default) or "fan_out". "fan_in" preserves the variance of the weights in the forward pass; "fan_out" preserves the variance in the backward pass;
- nonlinearity - the non-linear function (a function name from nn.functional); PyTorch recommends using it only with "relu" or "leaky_relu" (default);

Example:

import torch

w = torch.empty(3, 5)
# fan_in mode with the ReLU gain; values differ on every run.
torch.nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
"""
tensor([[-0.4362, -0.8177, -0.7034,  0.7306, -0.6457],
        [-0.5749, -0.6480, -0.8016, -0.1434,  0.0785],
        [ 1.0369, -0.0676,  0.7430, -0.2484, -0.0895]])
"""

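The bound combines the gain with the fan selected by `mode`. The following is a minimal sketch for the two recommended nonlinearities, using the gain formula from the parameter list; `kaiming_gain` and `kaiming_uniform_bound` are illustrative names, not PyTorch functions.

```python
import math


def kaiming_gain(nonlinearity="leaky_relu", a=0.0):
    """Gain for the two recommended nonlinearities."""
    if nonlinearity == "relu":
        return math.sqrt(2.0)
    if nonlinearity == "leaky_relu":
        return math.sqrt(2.0 / (1.0 + a ** 2))
    raise ValueError(f"nonlinearity not covered by this sketch: {nonlinearity}")


def kaiming_uniform_bound(fan_in, fan_out, a=0.0, mode="fan_in", nonlinearity="leaky_relu"):
    """Bound of U(-bound, bound): gain * sqrt(3 / fan_mode)."""
    if mode not in ("fan_in", "fan_out"):
        raise ValueError(f"mode must be 'fan_in' or 'fan_out', got {mode}")
    fan = fan_in if mode == "fan_in" else fan_out
    return kaiming_gain(nonlinearity, a) * math.sqrt(3.0 / fan)


if __name__ == "__main__":
    # The 3x5 tensor from the example: fan_in = 5, fan_out = 3.
    print(kaiming_uniform_bound(5, 3, mode="fan_in", nonlinearity="relu"))
    print(kaiming_uniform_bound(5, 3, mode="fan_out", nonlinearity="relu"))
```

With mode='fan_in' and 'relu' the bound is $\sqrt{6/5}\approx 1.0954$, which agrees with the sample output above (largest magnitude 1.0369).
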
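4.2 Kaiming normal initialization

    torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

Also known as He initialization. Following the method described in He, K. et al. (2015), "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification", the input tensor is filled with values sampled from the normal distribution $N\left(0, \text{std}^{2}\right)$, where std is given by

$$\text{std}=\frac{\text{gain}}{\sqrt{\text{fan\_mode}}}$$

Parameters:

- tensor - the tensor to initialize;
- a - the negative slope of the rectifier used after this layer, used to compute $\text{gain}=\sqrt{\frac{2}{1+a^{2}}}$ (only takes effect when nonlinearity is 'leaky_relu');
- mode - either "fan_in" (default) or "fan_out". "fan_in" preserves the variance of the weights in the forward pass; "fan_out" preserves the variance in the backward pass;
- nonlinearity - the non-linear function (a function name from nn.functional); PyTorch recommends using it only with "relu" or "leaky_relu" (default);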
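
This subsection has no example in the original, so here is a minimal usage sketch in the same style as the other sections. It assumes PyTorch is installed; the tensor shape and the seed are arbitrary choices for illustration. The tensor is made large so that its empirical standard deviation can be compared with gain / sqrt(fan_in).

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# A large 2-D weight: fan_in = 512 (columns), fan_out = 256 (rows).
w = torch.empty(256, 512)
nn.init.kaiming_normal_(w, mode='fan_in', nonlinearity='relu')

# gain / sqrt(fan_in) with the ReLU gain sqrt(2): sqrt(2 / 512) = 0.0625
expected_std = math.sqrt(2.0) / math.sqrt(512)
print(f"expected std: {expected_std:.4f}, empirical std: {w.std().item():.4f}")
```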

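5. Orthogonal matrix initialization

    torch.nn.init.orthogonal_(tensor, gain=1)

Fills the input tensor with a (semi-)orthogonal matrix, as described in Saxe, A. et al. (2013), "Exact solutions to the nonlinear dynamics of learning in deep linear neural networks". The input tensor must have at least 2 dimensions; for tensors with more than 2 dimensions, the trailing dimensions are flattened.

Orthogonal initialization can make convolution kernels more compact and decorrelate them, which makes it easier for the model to learn effective parameters.

Parameters:

- tensor - the tensor to initialize
- gain - an optional scaling factor

Example: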

import torch

w = torch.empty(3, 5)
# Fills w in place with a semi-orthogonal matrix; values differ on every run.
torch.nn.init.orthogonal_(w)
"""
tensor([[ 0.7395, -0.1503,  0.4474,  0.4321, -0.2090],
        [-0.2625,  0.0112,  0.6515, -0.4770, -0.5282],
        [ 0.4554,  0.6548,  0.0970, -0.4851,  0.3453]])
"""

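For a 3×5 tensor (fewer rows than columns) the result is semi-orthogonal: the rows are orthonormal, so the product of the matrix with its transpose is the 3×3 identity. The check below uses the sample output printed above, copied into plain Python lists; because the printed values are rounded to four decimals, the comparison is only approximate. `gram` is an illustrative helper name.

```python
# Rows of the sample output above (rounded to 4 decimals).
w = [
    [0.7395, -0.1503, 0.4474, 0.4321, -0.2090],
    [-0.2625, 0.0112, 0.6515, -0.4770, -0.5282],
    [0.4554, 0.6548, 0.0970, -0.4851, 0.3453],
]


def gram(rows):
    """Return the matrix of pairwise dot products between rows (w times its transpose)."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in rows] for r1 in rows]


if __name__ == "__main__":
    for row in gram(w):
        # Each printed row is close to the matching row of the 3x3 identity matrix.
        print([round(v, 3) for v in row])
```
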
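6. Sparse matrix initialization

    torch.nn.init.sparse_(tensor, sparsity, std=0.01)

Fills the 2-D input tensor as a sparse matrix, where the non-zero elements are drawn from the normal distribution $N\left(0, \text{std}^{2}\right)$, which is $N\left(0, 0.01^{2}\right)$ with the default std. See Martens, J. (2010), "Deep learning via Hessian-free optimization".

Parameters:

- tensor - the tensor to fill
- sparsity - the fraction of elements in each column to be set to zero
- std - the standard deviation of the normal distribution used to generate the non-zero values

Example: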

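import torch

w = torch.empty(3, 5)
# Zeroes a fraction of each column and draws the rest from N(0, 0.01^2); values differ on every run.
torch.nn.init.sparse_(w, sparsity=0.1)
"""
tensor([[-0.0026,  0.0000,  0.0100,  0.0046,  0.0048],
        [ 0.0106, -0.0046,  0.0000,  0.0000,  0.0000],
        [ 0.0000, -0.0005,  0.0150, -0.0097, -0.0100]])
"""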

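The number of zeros per column is the sparsity fraction times the number of rows, rounded up. With 3 rows and sparsity=0.1 that is ceil(0.3) = 1 zero in each column, which the sample output above confirms. A small sketch in plain Python; `zeros_per_column` is an illustrative name, and the matrix is copied from the printed output.

```python
import math


def zeros_per_column(rows, sparsity):
    """Number of entries set to zero in each column: ceil(sparsity * rows)."""
    return math.ceil(sparsity * rows)


# Sample output above, copied as plain lists.
w = [
    [-0.0026, 0.0000, 0.0100, 0.0046, 0.0048],
    [0.0106, -0.0046, 0.0000, 0.0000, 0.0000],
    [0.0000, -0.0005, 0.0150, -0.0097, -0.0100],
]

if __name__ == "__main__":
    expected = zeros_per_column(rows=len(w), sparsity=0.1)
    # Count the zeros in each column of the printed matrix.
    observed = [sum(1 for row in w if row[c] == 0.0) for c in range(len(w[0]))]
    print(expected, observed)
```
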
7. Constant initialization

    torch.nn.init.constant_(tensor, val)

Fills the tensor with the constant value val.

Example:

import torch
import torch.nn as nn

w = torch.empty(3, 5)
nn.init.constant_(w, 1.2)
"""
tensor([[1.2000, 1.2000, 1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000, 1.2000, 1.2000]])
"""

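8. Identity matrix initialization

    torch.nn.init.eye_(tensor)

Initializes a 2-D tensor as the identity matrix.

Example:

import torch
import torch.nn as nn

w = torch.empty(3, 5)
# For a non-square tensor, ones go on the main diagonal and zeros elsewhere.
nn.init.eye_(w)
"""
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])
"""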

9. Zero initialization

    torch.nn.init.zeros_(tensor)

Fills the tensor with zeros.

Example:

import torch
import torch.nn as nn

w = torch.empty(3, 5)
nn.init.zeros_(w)
"""
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
"""

10. Application

Example:

import torch.nn as nn

# `model` is assumed to be an nn.Module built elsewhere; its structure is shown in the output below.
print('module-----------')
print(model)
print('setup-----------')
# Re-initialize the weight of every Linear layer with Xavier uniform, scaled for ReLU.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('relu'))
"""
module-----------
Sequential(
  (flatten): FlattenLayer()
  (linear1): Linear(in_features=784, out_features=512, bias=True)
  (activation): ReLU()
  (linear2): Linear(in_features=512, out_features=256, bias=True)
  (linear3): Linear(in_features=256, out_features=10, bias=True)
)
setup-----------
"""

Example:

# Fill every parameter of `model` (weights and biases) with samples from U(0, 1).
for param in model.parameters():
    nn.init.uniform_(param)

Example:

def weights_init(m):
    # Dispatch on the module's class name; assumes the layers were built with bias=True.
    classname = m.__class__.__name__
    if classname.find('Conv2d') != -1:
        nn.init.xavier_normal_(m.weight.data)
        nn.init.constant_(m.bias.data, 0.0)
    elif classname.find('Linear') != -1:
        nn.init.xavier_normal_(m.weight)
        nn.init.constant_(m.bias, 0.0)

# apply() recursively visits every module in the network and calls the given function on each one.
model.apply(weights_init)
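
The examples in this section assume a `model` that is defined elsewhere. The following self-contained sketch builds a small stand-in network and applies an initialization function with `apply`. It assumes PyTorch is installed; the layer sizes mirror the printed structure above, and `nn.Flatten` stands in for the custom FlattenLayer, both as assumptions made for illustration. Using `isinstance` and checking for a missing bias is a slightly more defensive variant of the class-name matching shown above.

```python
import torch
import torch.nn as nn


def init_weights(m):
    """Xavier-normal weights and zero biases for Conv2d and Linear layers."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_normal_(m.weight)
        if m.bias is not None:  # layers built with bias=False have no bias tensor
            nn.init.zeros_(m.bias)


# Hypothetical stand-in model; sizes follow the printed structure above.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512),
    nn.ReLU(),
    nn.Linear(512, 256),
    nn.Linear(256, 10),
)
model.apply(init_weights)  # calls init_weights on every submodule and on the model itself

# A batch of 4 fake 28x28 single-channel images, flattened to 784 features.
out = model(torch.randn(4, 1, 28, 28))
print(out.shape)  # torch.Size([4, 10])
```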

- Author: Blair
- Article link: https://blog.csdn.net/BlairGrowing/p/15981694.html
- Copyright notice: unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.
