Learning PyTorch with Examples

Updated: 2020-09-10 14:47
Source: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html

Author: Justin Johnson

This tutorial introduces the fundamental concepts of PyTorch through self-contained examples.

At its core, PyTorch provides two main features:

  • An n-dimensional Tensor, similar to numpy but able to run on GPUs
  • Automatic differentiation for building and training neural networks

We will use a fully-connected ReLU network as our running example. The network will have a single hidden layer, and will be trained with gradient descent to fit random data by minimizing the Euclidean distance between the network output and the true output.
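
Concretely, writing the input as x, the hidden-layer weights as w1, the output weights as w2, and the target as y (the same names used in the code below), the forward pass and the loss being minimized are

$$
h = x\,w_1,\qquad h_{\mathrm{relu}} = \max(h, 0),\qquad \hat{y} = h_{\mathrm{relu}}\,w_2,\qquad \mathrm{loss} = \sum (\hat{y} - y)^2 .
$$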

Tensors

Warm-up: numpy

Before introducing PyTorch, we will first implement the network using numpy.

Numpy provides an n-dimensional array object, and many functions for manipulating these arrays. Numpy is a generic framework for scientific computing; it does not know anything about computation graphs, deep learning, or gradients. However, we can easily use numpy to fit a two-layer network to random data by manually implementing the forward and backward passes through the network using numpy operations:

    # -*- coding: utf-8 -*-
    import numpy as np

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random input and output data
    x = np.random.randn(N, D_in)
    y = np.random.randn(N, D_out)

    # Randomly initialize weights
    w1 = np.random.randn(D_in, H)
    w2 = np.random.randn(H, D_out)

    learning_rate = 1e-6
    for t in range(500):
        # Forward pass: compute predicted y
        h = x.dot(w1)
        h_relu = np.maximum(h, 0)
        y_pred = h_relu.dot(w2)

        # Compute and print loss
        loss = np.square(y_pred - y).sum()
        print(t, loss)

        # Backprop to compute gradients of w1 and w2 with respect to loss
        grad_y_pred = 2.0 * (y_pred - y)
        grad_w2 = h_relu.T.dot(grad_y_pred)
        grad_h_relu = grad_y_pred.dot(w2.T)
        grad_h = grad_h_relu.copy()
        grad_h[h < 0] = 0
        grad_w1 = x.T.dot(grad_h)

        # Update weights
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2

PyTorch: Tensors

Numpy is a great framework, but it cannot utilize GPUs to accelerate its numerical computations. For modern deep neural networks, GPUs often provide speedups of 50x or greater, so unfortunately numpy will not be enough for modern deep learning.

Here we introduce the most fundamental PyTorch concept: the Tensor. A PyTorch Tensor is conceptually identical to a numpy array: a Tensor is an n-dimensional array, and PyTorch provides many functions for operating on these Tensors. Behind the scenes, Tensors can keep track of a computational graph and gradients, but they are also useful as a generic tool for scientific computing.

Unlike numpy, however, PyTorch Tensors can utilize GPUs to accelerate their numeric computations. To run a PyTorch Tensor on a GPU, you simply need to place it on the GPU device.
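
For example, a minimal sketch of placing Tensors on a GPU (assuming a CUDA-capable device is available; otherwise this falls back to the CPU):

    import torch

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    x = torch.randn(64, 1000, device=device)  # created directly on the chosen device
    w = torch.randn(1000, 100).to(device)     # or moved there afterwards with .to()
    h = x.mm(w)                               # this matmul runs on the GPU when one is present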

Here we use PyTorch Tensors to fit a two-layer network to random data. Like the numpy example above, we need to manually implement the forward and backward passes through the network:

    # -*- coding: utf-8 -*-
    import torch

    dtype = torch.float
    device = torch.device("cpu")
    # device = torch.device("cuda:0") # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random input and output data
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Randomly initialize weights
    w1 = torch.randn(D_in, H, device=device, dtype=dtype)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype)

    learning_rate = 1e-6
    for t in range(500):
        # Forward pass: compute predicted y
        h = x.mm(w1)
        h_relu = h.clamp(min=0)
        y_pred = h_relu.mm(w2)

        # Compute and print loss
        loss = (y_pred - y).pow(2).sum().item()
        if t % 100 == 99:
            print(t, loss)

        # Backprop to compute gradients of w1 and w2 with respect to loss
        grad_y_pred = 2.0 * (y_pred - y)
        grad_w2 = h_relu.t().mm(grad_y_pred)
        grad_h_relu = grad_y_pred.mm(w2.t())
        grad_h = grad_h_relu.clone()
        grad_h[h < 0] = 0
        grad_w1 = x.t().mm(grad_h)

        # Update weights using gradient descent
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2

Autograd

PyTorch: Tensors and autograd

In the above examples, we had to manually implement both the forward and backward passes of our neural network. Manually implementing the backward pass is not a big deal for a small two-layer network, but it can quickly become very hairy for large, complex networks.

Thankfully, we can use automatic differentiation to automate the computation of backward passes in neural networks. The autograd package in PyTorch provides exactly this functionality. When using autograd, the forward pass of your network defines a computational graph; nodes in the graph are Tensors, and edges are functions that produce output Tensors from input Tensors. Backpropagating through this graph then lets you easily compute gradients.

This sounds complicated, but it's pretty simple to use in practice. Each Tensor represents a node in a computational graph. If x is a Tensor that has x.requires_grad=True, then x.grad is another Tensor holding the gradient of x with respect to some scalar value.
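
A minimal sketch of this behavior (the values are only illustrative):

    import torch

    x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    loss = (x ** 2).sum()   # a scalar computed from x
    loss.backward()         # autograd backpropagates through the graph
    print(x.grad)           # gradient of loss w.r.t. x: tensor([2., 4., 6.])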

Here we use PyTorch Tensors and autograd to implement our two-layer network; now we no longer need to manually implement the backward pass through the network:

    # -*- coding: utf-8 -*-
    import torch

    dtype = torch.float
    device = torch.device("cpu")
    # device = torch.device("cuda:0") # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold input and outputs.
    # Setting requires_grad=False indicates that we do not need to compute gradients
    # with respect to these Tensors during the backward pass.
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Create random Tensors for weights.
    # Setting requires_grad=True indicates that we want to compute gradients with
    # respect to these Tensors during the backward pass.
    w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

    learning_rate = 1e-6
    for t in range(500):
        # Forward pass: compute predicted y using operations on Tensors; these
        # are exactly the same operations we used to compute the forward pass using
        # Tensors, but we do not need to keep references to intermediate values since
        # we are not implementing the backward pass by hand.
        y_pred = x.mm(w1).clamp(min=0).mm(w2)

        # Compute and print loss using operations on Tensors.
        # Now loss is a Tensor of shape (1,)
        # loss.item() gets the scalar value held in the loss.
        loss = (y_pred - y).pow(2).sum()
        if t % 100 == 99:
            print(t, loss.item())

        # Use autograd to compute the backward pass. This call will compute the
        # gradient of loss with respect to all Tensors with requires_grad=True.
        # After this call w1.grad and w2.grad will be Tensors holding the gradient
        # of the loss with respect to w1 and w2 respectively.
        loss.backward()

        # Manually update weights using gradient descent. Wrap in torch.no_grad()
        # because weights have requires_grad=True, but we don't need to track this
        # in autograd.
        # An alternative way is to operate on weight.data and weight.grad.data.
        # Recall that tensor.data gives a tensor that shares the storage with
        # tensor, but doesn't track history.
        # You can also use torch.optim.SGD to achieve this.
        with torch.no_grad():
            w1 -= learning_rate * w1.grad
            w2 -= learning_rate * w2.grad

            # Manually zero the gradients after updating weights
            w1.grad.zero_()
            w2.grad.zero_()

PyTorch: Defining new autograd functions

Under the hood, each primitive autograd operator is really two functions that operate on Tensors. The forward function computes output Tensors from input Tensors. The backward function receives the gradient of the output Tensors with respect to some scalar value, and computes the gradient of the input Tensors with respect to that same scalar value.

In PyTorch we can easily define our own autograd operator by defining a subclass of torch.autograd.Function and implementing the forward and backward functions. We can then use our new autograd operator by constructing an instance and calling it like a function, passing Tensors containing input data.

In this example we define our own custom autograd function for performing the ReLU nonlinearity, and use it to implement our two-layer network:

    # -*- coding: utf-8 -*-
    import torch

    class MyReLU(torch.autograd.Function):
        """
        We can implement our own custom autograd Functions by subclassing
        torch.autograd.Function and implementing the forward and backward passes
        which operate on Tensors.
        """

        @staticmethod
        def forward(ctx, input):
            """
            In the forward pass we receive a Tensor containing the input and return
            a Tensor containing the output. ctx is a context object that can be used
            to stash information for backward computation. You can cache arbitrary
            objects for use in the backward pass using the ctx.save_for_backward method.
            """
            ctx.save_for_backward(input)
            return input.clamp(min=0)

        @staticmethod
        def backward(ctx, grad_output):
            """
            In the backward pass we receive a Tensor containing the gradient of the loss
            with respect to the output, and we need to compute the gradient of the loss
            with respect to the input.
            """
            input, = ctx.saved_tensors
            grad_input = grad_output.clone()
            grad_input[input < 0] = 0
            return grad_input

    dtype = torch.float
    device = torch.device("cpu")
    # device = torch.device("cuda:0") # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold input and outputs.
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Create random Tensors for weights.
    w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

    learning_rate = 1e-6
    for t in range(500):
        # To apply our Function, we use Function.apply method. We alias this as 'relu'.
        relu = MyReLU.apply

        # Forward pass: compute predicted y using operations; we compute
        # ReLU using our custom autograd operation.
        y_pred = relu(x.mm(w1)).mm(w2)

        # Compute and print loss
        loss = (y_pred - y).pow(2).sum()
        if t % 100 == 99:
            print(t, loss.item())

        # Use autograd to compute the backward pass.
        loss.backward()

        # Update weights using gradient descent
        with torch.no_grad():
            w1 -= learning_rate * w1.grad
            w2 -= learning_rate * w2.grad

            # Manually zero the gradients after updating weights
            w1.grad.zero_()
            w2.grad.zero_()

nn module

PyTorch: nn

Computational graphs and autograd are a very powerful paradigm for defining complex operators and automatically taking derivatives; however, for large neural networks raw autograd can be a bit too low-level.

When building neural networks we frequently think of arranging the computation into layers, some of which have learnable parameters that will be optimized during learning.

In TensorFlow, packages like Keras, TensorFlow-Slim, and TFLearn provide higher-level abstractions over raw computational graphs that are useful for building neural networks.

In PyTorch, the nn package serves this same purpose. The nn package defines a set of Modules, which are roughly equivalent to neural network layers. A Module receives input Tensors and computes output Tensors, but may also hold internal state such as Tensors containing learnable parameters. The nn package also defines a set of useful loss functions that are commonly used when training neural networks.
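
For instance, a single nn.Linear Module already shows this pattern (a minimal sketch; the sizes here are arbitrary):

    import torch

    linear = torch.nn.Linear(3, 2)  # a Module holding learnable weight and bias Tensors
    x = torch.randn(4, 3)           # a batch of 4 inputs
    y = linear(x)                   # forward pass produces a Tensor of shape (4, 2)
    print(linear.weight.shape)      # internal state: torch.Size([2, 3])
    print(linear.bias.shape)        # internal state: torch.Size([2])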

In this example we use the nn package to implement our two-layer network:

    # -*- coding: utf-8 -*-
    import torch

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold inputs and outputs
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Use the nn package to define our model as a sequence of layers. nn.Sequential
    # is a Module which contains other Modules, and applies them in sequence to
    # produce its output. Each Linear Module computes output from input using a
    # linear function, and holds internal Tensors for its weight and bias.
    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, H),
        torch.nn.ReLU(),
        torch.nn.Linear(H, D_out),
    )

    # The nn package also contains definitions of popular loss functions; in this
    # case we will use Mean Squared Error (MSE) as our loss function.
    loss_fn = torch.nn.MSELoss(reduction='sum')

    learning_rate = 1e-4
    for t in range(500):
        # Forward pass: compute predicted y by passing x to the model. Module objects
        # override the __call__ operator so you can call them like functions. When
        # doing so you pass a Tensor of input data to the Module and it produces
        # a Tensor of output data.
        y_pred = model(x)

        # Compute and print loss. We pass Tensors containing the predicted and true
        # values of y, and the loss function returns a Tensor containing the
        # loss.
        loss = loss_fn(y_pred, y)
        if t % 100 == 99:
            print(t, loss.item())

        # Zero the gradients before running the backward pass.
        model.zero_grad()

        # Backward pass: compute gradient of the loss with respect to all the learnable
        # parameters of the model. Internally, the parameters of each Module are stored
        # in Tensors with requires_grad=True, so this call will compute gradients for
        # all learnable parameters in the model.
        loss.backward()

        # Update the weights using gradient descent. Each parameter is a Tensor, so
        # we can access its gradients like we did before.
        with torch.no_grad():
            for param in model.parameters():
                param -= learning_rate * param.grad

PyTorch: optim

Up to this point we have updated the weights of our models by manually mutating the Tensors holding the learnable parameters (using torch.no_grad() or .data to avoid tracking history in autograd). This is not a huge burden for simple optimization algorithms like stochastic gradient descent, but in practice we often train neural networks with more sophisticated optimizers such as AdaGrad, RMSProp, Adam, etc.

The optim package in PyTorch abstracts the idea of an optimization algorithm and provides implementations of commonly used optimization algorithms.
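
Because every optimizer in the optim package shares the same construct / zero_grad / step interface, swapping algorithms is typically a one-line change. A minimal sketch (the single tensor here is only a stand-in for real model parameters):

    import torch

    w = torch.randn(3, 3, requires_grad=True)

    # Any of these can be used interchangeably in the loop below:
    optimizer = torch.optim.SGD([w], lr=1e-2)
    # optimizer = torch.optim.RMSprop([w], lr=1e-2)
    # optimizer = torch.optim.Adam([w], lr=1e-2)

    loss = (w ** 2).sum()
    optimizer.zero_grad()   # clear any accumulated gradients
    loss.backward()         # compute w.grad
    optimizer.step()        # update w in place using the chosen algorithm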

In this example we will use the nn package to define our model as before, but we will optimize the model with the Adam algorithm provided by the optim package:

    # -*- coding: utf-8 -*-
    import torch

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold inputs and outputs
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Use the nn package to define our model and loss function.
    model = torch.nn.Sequential(
        torch.nn.Linear(D_in, H),
        torch.nn.ReLU(),
        torch.nn.Linear(H, D_out),
    )
    loss_fn = torch.nn.MSELoss(reduction='sum')

    # Use the optim package to define an Optimizer that will update the weights of
    # the model for us. Here we will use Adam; the optim package contains many other
    # optimization algorithms. The first argument to the Adam constructor tells the
    # optimizer which Tensors it should update.
    learning_rate = 1e-4
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    for t in range(500):
        # Forward pass: compute predicted y by passing x to the model.
        y_pred = model(x)

        # Compute and print loss.
        loss = loss_fn(y_pred, y)
        if t % 100 == 99:
            print(t, loss.item())

        # Before the backward pass, use the optimizer object to zero all of the
        # gradients for the variables it will update (which are the learnable
        # weights of the model). This is because by default, gradients are
        # accumulated in buffers (i.e., not overwritten) whenever .backward()
        # is called. Check out the docs of torch.autograd.backward for more details.
        optimizer.zero_grad()

        # Backward pass: compute gradient of the loss with respect to model
        # parameters
        loss.backward()

        # Calling the step function on an Optimizer makes an update to its
        # parameters
        optimizer.step()

PyTorch: Custom nn Modules

Sometimes you will want to specify models that are more complex than a sequence of existing Modules. For these cases you can define your own Modules by subclassing nn.Module and defining a forward method which receives input Tensors and produces output Tensors using other Modules or other autograd operations on Tensors.

In this example we implement our two-layer network as a custom Module subclass:

    # -*- coding: utf-8 -*-
    import torch

    class TwoLayerNet(torch.nn.Module):
        def __init__(self, D_in, H, D_out):
            """
            In the constructor we instantiate two nn.Linear modules and assign them as
            member variables.
            """
            super(TwoLayerNet, self).__init__()
            self.linear1 = torch.nn.Linear(D_in, H)
            self.linear2 = torch.nn.Linear(H, D_out)

        def forward(self, x):
            """
            In the forward function we accept a Tensor of input data and we must return
            a Tensor of output data. We can use Modules defined in the constructor as
            well as arbitrary operators on Tensors.
            """
            h_relu = self.linear1(x).clamp(min=0)
            y_pred = self.linear2(h_relu)
            return y_pred

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold inputs and outputs
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Construct our model by instantiating the class defined above
    model = TwoLayerNet(D_in, H, D_out)

    # Construct our loss function and an Optimizer. The call to model.parameters()
    # in the SGD constructor will contain the learnable parameters of the two
    # nn.Linear modules which are members of the model.
    criterion = torch.nn.MSELoss(reduction='sum')
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
    for t in range(500):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x)

        # Compute and print loss
        loss = criterion(y_pred, y)
        if t % 100 == 99:
            print(t, loss.item())

        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

PyTorch: Control Flow + Weight Sharing

As an example of dynamic graphs and weight sharing, we implement a very strange model: a fully-connected ReLU network that on each forward pass chooses a random number between 1 and 4 and uses that many hidden layers, reusing the same weights multiple times to compute the innermost hidden layers.

For this model we can use normal Python flow control to implement the loop, and we can implement weight sharing among the innermost layers by simply reusing the same Module multiple times when defining the forward pass.

We can easily implement this model as a Module subclass:

    # -*- coding: utf-8 -*-
    import random
    import torch

    class DynamicNet(torch.nn.Module):
        def __init__(self, D_in, H, D_out):
            """
            In the constructor we construct three nn.Linear instances that we will use
            in the forward pass.
            """
            super(DynamicNet, self).__init__()
            self.input_linear = torch.nn.Linear(D_in, H)
            self.middle_linear = torch.nn.Linear(H, H)
            self.output_linear = torch.nn.Linear(H, D_out)

        def forward(self, x):
            """
            For the forward pass of the model, we randomly choose either 0, 1, 2, or 3
            and reuse the middle_linear Module that many times to compute hidden layer
            representations.

            Since each forward pass builds a dynamic computation graph, we can use normal
            Python control-flow operators like loops or conditional statements when
            defining the forward pass of the model.

            Here we also see that it is perfectly safe to reuse the same Module many
            times when defining a computational graph. This is a big improvement from Lua
            Torch, where each Module could be used only once.
            """
            h_relu = self.input_linear(x).clamp(min=0)
            for _ in range(random.randint(0, 3)):
                h_relu = self.middle_linear(h_relu).clamp(min=0)
            y_pred = self.output_linear(h_relu)
            return y_pred

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random Tensors to hold inputs and outputs
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)

    # Construct our model by instantiating the class defined above
    model = DynamicNet(D_in, H, D_out)

    # Construct our loss function and an Optimizer. Training this strange model with
    # vanilla stochastic gradient descent is tough, so we use momentum
    criterion = torch.nn.MSELoss(reduction='sum')
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
    for t in range(500):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x)

        # Compute and print loss
        loss = criterion(y_pred, y)
        if t % 100 == 99:
            print(t, loss.item())

        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Examples

You can browse the above examples here.

Tensors

  • Warm-up: numpy
  • PyTorch: Tensors

Autograd

  • PyTorch: Tensors and autograd
  • PyTorch: Defining new autograd functions
  • TensorFlow: Static Graphs

nn module

  • PyTorch: nn
  • PyTorch: optim
  • PyTorch: Custom nn Modules
  • PyTorch: Control Flow + Weight Sharing

