PyTorch Optional: Data Parallelism

Updated: 2020-09-07 17:25
Original: https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html
Authors: Sung Kim, Jenny Kang
Translator: bat67
Proofreaders: FontTian, 片刻, yearing1017

In this tutorial, we will learn how to use multiple GPUs with DataParallel.

It is very easy to use GPUs in PyTorch. You can put a model on a GPU like this:

    device = torch.device("cuda:0")
    model.to(device)

Then you can copy all of your tensors to the GPU:

    mytensor = my_tensor.to(device)

Note that calling my_tensor.to(device) returns a new copy of my_tensor on the GPU rather than rewriting my_tensor in place. You need to assign it to a new tensor and use that tensor on the GPU.
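A quick CPU-only sketch of this behavior (it substitutes a dtype conversion for the device move so it runs without a GPU; to() behaves the same way in both cases, returning a new tensor and leaving the original untouched):

```python
import torch

t = torch.zeros(3)           # a float32 tensor on the CPU
t64 = t.to(torch.float64)    # to() returns a new tensor...
print(t.dtype)               # torch.float32 -- t itself is unchanged
print(t64.dtype)             # torch.float64
print(t64 is t)              # False -- the result is a separate tensor
```

This is why the idiom is always `my_tensor = my_tensor.to(device)`: without the assignment, the converted tensor is simply discarded.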

Executing forward and backward propagation on multiple GPUs comes naturally. However, PyTorch uses only one GPU by default. You can easily run your operations on multiple GPUs by making your model run in parallel with DataParallel:

    model = nn.DataParallel(model)

This is the core idea behind this tutorial, and we will cover it in more detail below.

Imports and parameters

Import PyTorch modules and define the parameters.

    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader

    # Parameters and DataLoaders
    input_size = 5
    output_size = 2
    batch_size = 30
    data_size = 100

Device:

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Dummy dataset

To make a dummy (random) dataset, you just need to implement __getitem__:

    class RandomDataset(Dataset):

        def __init__(self, size, length):
            self.len = length
            self.data = torch.randn(length, size)

        def __getitem__(self, index):
            return self.data[index]

        def __len__(self):
            return self.len

    rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                             batch_size=batch_size, shuffle=True)
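As a quick sanity check of the loader (a self-contained sketch that re-declares the dataset with the tutorial's parameters): with 100 samples and a batch size of 30, the loader yields three full batches followed by one partial batch of 10.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

loader = DataLoader(RandomDataset(5, 100), batch_size=30, shuffle=True)
sizes = [batch.size(0) for batch in loader]
print(sizes)  # [30, 30, 30, 10]
```

The trailing partial batch of 10 is what produces the smaller chunk sizes at the end of the output logs later in this tutorial.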

Simple model

For the demo, our model just takes an input, performs a linear operation, and gives an output. However, you can use DataParallel on any model (CNN, RNN, Capsule Net, etc.).

我們?cè)谀P蛢?nèi)部放置了一條打印語(yǔ)句來(lái)檢測(cè)輸入和輸出向量的大小。請(qǐng)注意批等級(jí)為0時(shí)打印的內(nèi)容。

    class Model(nn.Module):
        # Our model

        def __init__(self, input_size, output_size):
            super(Model, self).__init__()
            self.fc = nn.Linear(input_size, output_size)

        def forward(self, input):
            output = self.fc(input)
            print("\tIn Model: input size", input.size(),
                  "output size", output.size())
            return output

Create a model and DataParallel

This is the core part of the tutorial. First, we need to create a model instance and check whether we have multiple GPUs. If we do, we wrap our model with nn.DataParallel. Then we put the model on the GPU with model.to(device).

    model = Model(input_size, output_size)
    if torch.cuda.device_count() > 1:
        print("Let's use", torch.cuda.device_count(), "GPUs!")
        # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
        model = nn.DataParallel(model)

    model.to(device)

Output:

    Let's use 2 GPUs!

Run the model

Now we can see the sizes of the input and output tensors.

    for data in rand_loader:
        input = data.to(device)
        output = model(input)
        print("Outside: input size", input.size(),
              "output_size", output.size())

Output:

    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

Results

If you have no GPU or only one GPU, then when we batch 30 inputs and 30 outputs, the model receives 30 inputs and produces 30 outputs, as expected. But if you have multiple GPUs, you will see results like the following.
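The per-GPU batch sizes in the logs below follow directly from how the input is split along dim 0. Here is a CPU-only sketch using torch.chunk, whose split sizes match what the scatter step produces for these batches:

```python
import torch

full_batch = torch.randn(30, 5)   # one full batch from the loader
last_batch = torch.randn(10, 5)   # the final, partial batch

for n_gpus in (2, 3, 8):
    # chunk splits along dim 0 into equal pieces, with a smaller remainder last
    sizes = [c.size(0) for c in torch.chunk(full_batch, n_gpus, dim=0)]
    print(n_gpus, "GPUs, batch of 30 ->", sizes)

print("3 GPUs, batch of 10 ->",
      [c.size(0) for c in torch.chunk(last_batch, 3, dim=0)])
```

A batch of 30 splits into [15, 15] on 2 GPUs, [10, 10, 10] on 3, and [4, 4, 4, 4, 4, 4, 4, 2] on 8; the partial batch of 10 splits into [4, 4, 2] on 3 GPUs, which is exactly what the logs show.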

2 GPUs

If you have 2 GPUs, you will see:

    Let's use 2 GPUs!
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

3 GPUs

If you have 3 GPUs, you will see:

    Let's use 3 GPUs!
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

8 GPUs

If you have 8 GPUs, you will see:

    Let's use 8 GPUs!
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

Summary

DataParallel splits your data automatically and dispatches jobs to multiple models on multiple GPUs. After each model finishes its job, DataParallel collects and merges the results before returning them to you.
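That scatter/apply/gather round trip can be sketched in a single process (sequentially, on the CPU; the real DataParallel replicates the module onto each GPU and runs the chunks in parallel):

```python
import torch
import torch.nn as nn

model = nn.Linear(5, 2)
batch = torch.randn(30, 5)

chunks = torch.chunk(batch, 3, dim=0)    # scatter the batch along dim 0
outputs = [model(c) for c in chunks]     # apply the model to each chunk
result = torch.cat(outputs, dim=0)       # gather the results back together

print(result.size())  # torch.Size([30, 2])
```

The gathered result has the same leading dimension as the original batch, which is why the "Outside" lines in the logs always report the full batch size even though each replica only saw a chunk.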

For more information, please see: https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html

