
Commit c198460

authored
Translate beginner_source/nlp/pytorch_tutorial.py (#573)

* Translate beginner_source/nlp/pytorch_tutorial.py

1 parent 8d1ac3b commit c198460

1 file changed

Lines changed: 115 additions & 116 deletions

File tree

beginner_source/nlp/pytorch_tutorial.py

@@ -1,15 +1,16 @@
# -*- coding: utf-8 -*-
r"""
Introduction to PyTorch
***********************
**Translation**: `반보영 <https://github.com/2kkeullim>`_

Introduction to Torch's tensor library
======================================

All of deep learning is computation on tensors, which are
generalizations of a matrix that can be indexed in more than 2
dimensions. We will see exactly what this means in depth later. First,
let's look at what we can do with tensors.
"""
# Author: Robert Guthrie

@@ -19,78 +20,77 @@

######################################################################
# Creating Tensors
# ~~~~~~~~~~~~~~~~
#
# Tensors can be created from Python lists with the torch.tensor()
# function.
#

# torch.tensor(data) creates a torch.Tensor object with the given data.
V_data = [1., 2., 3.]
V = torch.tensor(V_data)
print(V)

# Creates a matrix
M_data = [[1., 2., 3.], [4., 5., 6]]
M = torch.tensor(M_data)
print(M)

# Create a 3D tensor of size 2x2x2.
T_data = [[[1., 2.], [3., 4.]],
          [[5., 6.], [7., 8.]]]
T = torch.tensor(T_data)
print(T)

######################################################################
# What is a 3D tensor anyway? Think about it like this. If you have a
# vector, indexing into the vector gives you a scalar. If you have a
# matrix, indexing into the matrix gives you a vector. If you have a 3D
# tensor, then indexing into the tensor gives you a matrix!
#
# A note on terminology: when I say "tensor" in this tutorial, it refers
# to any torch.Tensor object. Matrices and vectors are special cases of
# torch.Tensor, with dimensions 2 and 1 respectively. When I am talking
# about 3D tensors, I will explicitly use the term "3D tensor".
#

# Index into V and get a scalar (0-dimensional tensor)
print(V[0])
# Get a Python number from it
print(V[0].item())

# Index into M and get a vector
print(M[0])

# Index into T and get a matrix
print(T[0])

######################################################################
# You can also create tensors of other data types. To create a tensor of
# an integer type, try torch.tensor([[1, 2], [3, 4]]) (where all elements
# in the list are integers). You can also specify a data type by passing
# in ``dtype=torch.data_type``. Check the documentation for more data
# types; Float and Long will be the most common.
#
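
# For instance, a minimal sketch (these variables are illustrative and
# not part of the original file): integer elements are inferred as Long,
# and ``dtype`` overrides the inference.
x_long = torch.tensor([[1, 2], [3, 4]])
print(x_long.dtype)   # torch.int64 (Long), inferred from the elements
x_float = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
print(x_float.dtype)  # torch.float32, forced by the dtype argument
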
######################################################################
# You can create a tensor with random data and the supplied dimensionality
# with torch.randn()
#

x = torch.randn((3, 4, 5))
print(x)

######################################################################
# Operations with Tensors
# ~~~~~~~~~~~~~~~~~~~~~~~
#
# You can operate on tensors in the ways you would expect.

x = torch.tensor([1., 2., 3.])
y = torch.tensor([4., 5., 6.])
@@ -99,179 +99,178 @@
######################################################################
# See `the documentation <https://pytorch.org/docs/torch.html>`__ for a
# complete list of the massive number of operations available to you. They
# expand beyond just mathematical operations.
#
# One helpful operation that we will make use of later is concatenation.
#

# By default, it concatenates along the first axis (concatenates rows)
x_1 = torch.randn(2, 5)
y_1 = torch.randn(3, 5)
z_1 = torch.cat([x_1, y_1])
print(z_1)

# Concatenate columns:
x_2 = torch.randn(2, 3)
y_2 = torch.randn(2, 5)
# second arg specifies which axis to concat along
z_2 = torch.cat([x_2, y_2], 1)
print(z_2)

# If your tensors are not compatible, torch will complain. Uncomment to see the error
# torch.cat([x_1, x_2])
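
# A hedged illustration of that failure mode (not part of the original
# file): x_1 is 2x5 and x_2 is 2x3, so concatenating along axis 0 fails
# because the remaining dimensions disagree.
try:
    torch.cat([x_1, x_2])
except RuntimeError as err:
    print(err)
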
######################################################################
# Reshaping Tensors
# ~~~~~~~~~~~~~~~~~
#
# Use the .view() method to reshape a tensor. This method receives heavy
# use, because many neural network components expect their inputs to have
# a certain shape. Often you will need to reshape before passing your data
# to the component.
#

x = torch.randn(2, 3, 4)
print(x)
print(x.view(2, 12))  # Reshape to 2 rows, 12 columns
# Same as above. If one of the dimensions is -1, its size can be inferred
print(x.view(2, -1))
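
# The inferred size is just the element count divided by the known sizes
# (a short illustration, not part of the original file): x has 2*3*4 = 24
# elements, so the -1 below is inferred as 6.
print(x.view(4, -1).shape)  # torch.Size([4, 6])
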
######################################################################
# Computation Graphs and Automatic Differentiation
# ================================================
#
# The concept of a computation graph is essential to efficient deep
# learning programming, because it allows you to not have to write the
# back propagation gradients yourself. A computation graph is simply a
# specification of how your data is combined to give you the output. Since
# the graph totally specifies what parameters were involved with which
# operations, it contains enough information to compute derivatives. This
# probably sounds vague, so let's see what is going on using the
# fundamental flag ``requires_grad``.
#
# First, think from a programmer's perspective. What is stored in the
# torch.Tensor objects we were creating above? Obviously the data and the
# shape, and maybe a few other things. But when we add two tensors
# together, we get an output tensor. All this output tensor knows is its
# data and shape. It has no idea that it was the sum of two other tensors
# (it could have been read in from a file, it could be the result of some
# other operation, etc.)
#
# If ``requires_grad=True``, the Tensor object keeps track of how it was
# created. Let's see it in action.
#

# Tensor factory methods have a ``requires_grad`` flag
x = torch.tensor([1., 2., 3], requires_grad=True)

# With requires_grad=True, you can still do all the operations you
# previously could
y = torch.tensor([4., 5., 6], requires_grad=True)
z = x + y
print(z)

# BUT z knows something extra.
print(z.grad_fn)

######################################################################
# So Tensors know what created them. z knows that it wasn't read in from
# a file, and it wasn't the result of a multiplication or exponential or
# whatever. And if you keep following z.grad_fn, you will find yourself
# at x and y.
#
# But how does that help us compute a gradient?
#

# Let's sum up all the entries in z
s = z.sum()
print(s)
print(s.grad_fn)

######################################################################
# So now, what is the derivative of this sum with respect to the first
# component of x? In math, we want
#
# .. math::
#
#    \frac{\partial s}{\partial x_0}
#
# Well, s knows that it was created as a sum of the tensor z. z knows
# that it was the sum x + y. So
#
# .. math::  s = \overbrace{x_0 + y_0}^\text{$z_0$} + \overbrace{x_1 + y_1}^\text{$z_1$} + \overbrace{x_2 + y_2}^\text{$z_2$}
#
# And so s contains enough information to determine that the derivative
# we want is 1!
#
# Of course this glosses over the challenge of how to actually compute
# that derivative. The point here is that s is carrying along enough
# information that it is possible to compute it. In reality, the
# developers of PyTorch program the sum() and + operations to know how to
# compute their gradients, and run the back propagation algorithm. An
# in-depth discussion of that algorithm is beyond the scope of this
# tutorial.
#

######################################################################
# Let's have PyTorch compute the gradient, and see that we were right:
# (note that if you run this block multiple times, the gradient will
# increment. That is because PyTorch *accumulates* the gradient into the
# .grad property, since for many models this is very convenient.)
#

# Calling .backward() on any variable will run backprop, starting from it.
s.backward()
print(x.grad)
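
# A minimal sketch of that accumulation (these tensors are illustrative,
# not part of the original file): backprop twice, and .grad doubles.
a = torch.tensor([1., 2., 3.], requires_grad=True)
loss = (a * a).sum()
loss.backward()
print(a.grad)         # tensor([2., 4., 6.])
loss = (a * a).sum()  # build a fresh graph and backprop again
loss.backward()
print(a.grad)         # accumulated: tensor([4., 8., 12.])
a.grad.zero_()        # zero the buffer in place before the next pass
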
######################################################################
# Understanding what is going on in the block below is crucial for being
# a successful programmer in deep learning.
#

x = torch.randn(2, 2)
y = torch.randn(2, 2)
# By default, user created Tensors have ``requires_grad=False``
print(x.requires_grad, y.requires_grad)
z = x + y
# So you can't backprop through z
print(z.grad_fn)

# ``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``
# flag in-place. The input flag defaults to ``True`` if not given.
x = x.requires_grad_()
y = y.requires_grad_()
# z contains enough information to compute gradients, as we saw above
z = x + y
print(z.grad_fn)
# If any input to an operation has ``requires_grad=True``, so will the output
print(z.requires_grad)

# Now z has the computation history that relates itself to x and y
# Can we just take its values, and **detach** it from its history?
new_z = z.detach()

# ... does new_z have information to backprop to x and y?
# NO!
print(new_z.grad_fn)
# And how could it? ``z.detach()`` returns a tensor that shares the same
# storage as ``z``, but with the computation history forgotten. It doesn't
# know anything about how it was computed.
# In essence, we have broken the Tensor away from its past history
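
# A quick check of that claim (illustrative, not part of the original
# file): the detached tensor tracks no history, yet it points at the very
# same storage as z.
print(new_z.requires_grad)               # False
print(new_z.data_ptr() == z.data_ptr())  # True: same underlying storage
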
###############################################################
# You can also stop autograd from tracking history on Tensors
# with ``.requires_grad=True`` by wrapping the code block in
# ``with torch.no_grad():``
print(x.requires_grad)
print((x ** 2).requires_grad)
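
# A short illustration of the context manager described above (assumed
# usage, not shown in this hunk): inside the block, no history is recorded.
with torch.no_grad():
    print((x ** 2).requires_grad)  # False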
