๋”ฅ๋Ÿฌ๋‹/Today I learned :

Colab PyTorch Tutorial, Tensor Operations, AUTOGRAD

์ฃผ์˜ ๐Ÿฑ 2022. 12. 12. 11:21

์ฝ”๋žฉ์—์„œ ํŒŒ์ดํ† ์น˜ ์‹คํ–‰๋ฐฉ๋ฒ•๊ณผ ๊ธฐ๋ณธ์ ์ธ ์‚ฌ์šฉ๋ฒ•์— ๋Œ€ํ•ด ์•Œ์•„๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

First, change the runtime type. In the top menu, select [Runtime] -> [Change runtime type] -> [Hardware accelerator] -> [GPU]. After switching, run the cell below; torch.cuda.is_available() should print True.

import torch
print(torch.__version__)
print(torch.cuda.is_available())

import matplotlib.pyplot as plt
import numpy as np
import scipy as sp

np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

 

PyTorch์—์„œ๋Š” ํ…์„œ๋ผ๋Š” ์ž๋ฃŒ๊ตฌ์กฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์˜ ์ž…๋ ฅ๊ณผ ์ถœ๋ ฅ,  ๋ชจ๋ธ์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค. ํ…์„œ๋ฅผ ๋ฐฐ์—ด, ํ–‰๋ ฌ๊ณผ ์œ ์‚ฌํ•œ ์ ์„ ๋งŽ์ด ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. GPU๋‚˜ ๋‹ค๋ฅธ ์—ฐ์‚ฐ ๊ฐ€์†์„ ์œ„ํ•œ ํŠน์ˆ˜ํ•œ ํ•˜๋“œ์›จ์–ด์—์„œ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์„ ์ œ์™ธํ•˜๋ฉด, ํ…์„œ๋Š” NumPy์˜ ndarray์™€ ๋งค์šฐ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค.

Initializing a tensor: creating it directly from data

data = [[1, 2],[3, 4]]
x = torch.tensor(data)
x

Creating from a NumPy array

np_array = np.array(data)
x = torch.from_numpy(np_array)
x

 

Tensor์—์„œ Numpy array๋กœ ๋ณ€ํ™˜ํ•˜๊ธฐ

x.numpy()
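
Note that this conversion shares memory: a tensor created with torch.from_numpy uses the same underlying buffer as the source NumPy array on the CPU, and x.numpy() returns a view of that buffer, so modifying one side is visible on the other. A quick sketch to check this, not part of the original tutorial:

np_array = np.array(data)
x = torch.from_numpy(np_array)
np_array[0][0] = 100   # change the NumPy array in place
print(x)               # the tensor reflects the change as well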



๋‹ค๋ฅธ ํ…์„œ์™€ ๊ฐ™์€ ๋ชจ์–‘์˜ ํ…์„œ ์ดˆ๊ธฐํ™”ํ•˜๊ธฐ

x_ones = torch.ones_like(x) # keeps the properties (shape, dtype) of x
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x, dtype=torch.float) # overrides the dtype of x, since torch.rand only produces floating-point values
print(f"Random Tensor: \n {x_rand} \n")

 

Initializing with a given shape

shape = (3,4)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

 

ํ…์„œ์˜ ์†์„ฑ์€ ํ…์„œ์˜ ๋ชจ์–‘(shape), ์ž๋ฃŒํ˜•(datatype) ๋ฐ ์–ด๋Š ์žฅ์น˜์— ์ €์žฅ๋˜๋Š”์ง€๋ฅผ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค.
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

 

์•„๋ž˜์™€ ๊ฐ™์ด cpu์— ํ• ๋‹น๋˜์–ด ์žˆ๋Š” tensor๋ฅผ gpu์— ์˜ฎ๊ธธ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

device = torch.device('cuda')
tensor = tensor.to(device)
print(f"Device tensor is stored on: {tensor.device}")

 

Tensor operations

Numpy์‹์˜ ์ธ๋ฑ์‹ฑ๊ณผ ์Šฌ๋ผ์ด์‹ฑ

tensor = torch.ones(3, 4)
tensor[:,1] = 0
print(tensor)
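
A few more NumPy-style accesses on the same tensor, just for illustration:

print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")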

 

Joining tensors

t1 = torch.cat([tensor, tensor, tensor], dim=0)  # concatenate along rows: shape (9, 4)
print(t1)

t1 = torch.cat([tensor, tensor, tensor], dim=1)  # concatenate along columns: shape (3, 12)
print(t1)
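
torch.cat joins tensors along an existing dimension. A related joining operation is torch.stack, which instead adds a new dimension; a brief sketch:

t2 = torch.stack([tensor, tensor, tensor])  # new leading dimension: shape (3, 3, 4)
print(t2.shape)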



Multiplying tensors

# computes the element-wise product
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")

# Alternative syntax:
print(f"tensor * tensor \n {tensor * tensor}")


Performing matrix multiplication between tensors

print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")

# Alternative syntax:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")

 

AUTOGRAD

PyTorch์—๋Š” torch.autograd๋ผ๊ณ  ๋ถˆ๋ฆฌ๋Š” ์ž๋™ ๋ฏธ๋ถ„ ์—”์ง„์ด ๋‚ด์žฅ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Š” ๋ชจ๋“  node์— ๋Œ€ํ•œ ๋ฏธ๋ถ„ ๊ฐ’์„ ์ž๋™์œผ๋กœ ๊ณ„์‚ฐํ•ด์ฃผ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. ์ž…๋ ฅ X, ํŒŒ๋ผ๋ฏธํ„ฐ W , ๊ทธ๋ฆฌ๊ณ  cross-entropy loss๋ฅผ ์‚ฌ์šฉํ•˜๋Š” logistic regression model์˜ gradient๋ฅผ autograd๋ฅผ ์ด์šฉํ•ด์„œ ๊ตฌํ•ด๋ณด๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
print(x)
print(y)
print(w)
print(b)
 
Forward

z = torch.matmul(x, w) + b
z
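
For reference, x has shape (5,), w has shape (5, 3), and b has shape (3,), so z is a 3-element vector of logits. A quick check:

print(x.shape, w.shape, b.shape, z.shape)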

 

Loss Function

PyTorch exposes nodes (operations and layers) through two main kinds of APIs:

torch.nn
torch.nn.functional


torch.nn์€ ์‚ฌ์ „์— node๋ฅผ ์ดˆ๊ธฐํ™” ์‹œ์ผœ๋†“๊ณ , ํ•ด๋‹น node์— ํ…์„œ๋ฅผ ํ†ต๊ณผ์‹œ์ผœ ๊ฐ’์„ ๋ฐ›๋Š” ํ˜•ํƒœ์ธ ๋ฐ˜๋ฉด, torch.nn.functional์€ ์‚ฌ์ „์— ์ดˆ๊ธฐํ™”์—†์ด ๋ฐ”๋กœ ํ•จ์ˆ˜์ฒ˜๋Ÿผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค. ์›ํ•˜์‹œ๋Š” api๋ฅผ ์„ ํƒํ•˜์…”์„œ ์‚ฌ์šฉํ•˜์‹œ๋ฉด ๋ฉ๋‹ˆ๋‹ค.

loss_fn = torch.nn.BCEWithLogitsLoss()
loss = loss_fn(z, y)


loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
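
Both calls compute the same binary cross-entropy with logits, so their values should match. A quick sanity check, not part of the original tutorial:

loss_nn = torch.nn.BCEWithLogitsLoss()(z, y)
loss_functional = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
print(torch.allclose(loss_nn, loss_functional))  # expected: True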

 

Backward

๋ชจ๋ธ์—์„œ ๋งค๊ฐœ๋ณ€์ˆ˜์˜ ๊ฐ€์ค‘์น˜๋ฅผ ์ตœ์ ํ™”ํ•˜๋ ค๋ฉด ํŒŒ๋ผ๋ฏธํ„ฐ์— ๋Œ€ํ•œ loss function์˜ ๋„ํ•จ์ˆ˜(derivative)๋ฅผ ๊ณ„์‚ฐํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋„ํ•จ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•˜๊ธฐ ์œ„ํ•ด, loss.backward() ๋ฅผ ํ˜ธ์ถœํ•œ ๋‹ค์Œ w.grad์™€ b.grad์—์„œ ๊ฐ’์„ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค

loss.backward()
print(x.grad)  # None: x was created without requires_grad=True
print(w.grad)
print(b.grad)
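
As a sanity check (my own addition, not from the tutorial): with the default mean reduction over the 3 outputs, the gradient of the loss with respect to z is (sigmoid(z) - y) / 3, so w.grad should equal the outer product of x with that vector and b.grad should equal the vector itself.

with torch.no_grad():
    dz = (torch.sigmoid(z) - y) / 3                    # dL/dz under mean reduction
    print(torch.allclose(w.grad, torch.outer(x, dz)))  # expected: True
    print(torch.allclose(b.grad, dz))                  # expected: True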

๊ธฐ๋ณธ์ ์œผ๋กœ, requires_grad=True์ธ ๋ชจ๋“  ํ…์„œ๋“ค์€ ์—ฐ์‚ฐ ๊ธฐ๋ก์„ ์ถ”์ ํ•˜๊ณ  ๋ฏธ๋ถ„ ๊ณ„์‚ฐ์„ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ชจ๋ธ์„ ํ•™์Šตํ•œ ๋’ค ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹จ์ˆœํžˆ ์ ์šฉํ•˜๊ธฐ๋งŒ ํ•˜๋Š” ๊ฒฝ์šฐ์™€ ๊ฐ™์ด forward ์—ฐ์‚ฐ๋งŒ ํ•„์š”ํ•œ ๊ฒฝ์šฐ์—๋Š”, ๋ฏธ๋ถ„ ์—ฐ์‚ฐ์„ ์œ„ํ•œ ๊ฐ’๋“ค์„ ์ €์žฅํ•ด๋‘๋Š” ๊ฒƒ์ด ์†๋ ฅ ๋ฐ ๋ฉ”๋ชจ๋ฆฌ์˜ ์ €ํ•˜๋ฅผ ๊ฐ€์ ธ์˜ฌ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฐ์‚ฐ ์ฝ”๋“œ๋ฅผ torch.no_grad() ๋ธ”๋ก์œผ๋กœ ๋‘˜๋Ÿฌ์‹ธ์„œ ๋ฏธ๋ถ„ ์ถ”์ ์„ ๋ฉˆ์ถœ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

z = torch.matmul(x, w)+b
print(z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w)+b
print(z.requires_grad)
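
Another way to get the same effect on a single tensor is detach(), which returns a tensor with the same values but cut off from the computation graph:

z = torch.matmul(x, w) + b
z_det = z.detach()          # same values, detached from the graph
print(z_det.requires_grad)  # False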
๋ฐ˜์‘ํ˜•