Commit b362eb4

docs: translate compiling_optimizer.rst (#1049)

* docs: translate compiling_optimizer.rst
* typo: fix typo

1 parent d1d2548 commit b362eb4

File tree

1 file changed: 19 additions & 23 deletions

@@ -1,25 +1,24 @@
-(beta) Compiling the optimizer with torch.compile
+(beta) torch.compile๋กœ ์˜ตํ‹ฐ๋งˆ์ด์ € ์ปดํŒŒ์ผํ•˜๊ธฐ
 ==========================================================================================

-**Author:** `Michael Lazos <https://github.com/mlazos>`_
+**์ €์ž:** `Michael Lazos <https://github.com/mlazos>`_
+**๋ฒˆ์—ญ:** `๊น€์Šนํ™˜ <https://github.com/7SH7>`_

-The optimizer is a key algorithm for training any deep learning model.
-Since it is responsible for updating every model parameter, it can often
-become the bottleneck in training performance for large models. In this recipe,
-we will apply ``torch.compile`` to the optimizer to observe the GPU performance
-improvement.
+์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•˜๋Š” ํ•ต์‹ฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ž…๋‹ˆ๋‹ค.
+๋ชจ๋“  ๋ชจ๋ธ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์—…๋ฐ์ดํŠธํ•˜๋Š” ์—ญํ• ์„ ํ•˜๊ธฐ ๋•Œ๋ฌธ์—, ๋Œ€๊ทœ๋ชจ ๋ชจ๋ธ์—์„œ๋Š” ์ข…์ข… ํ›ˆ๋ จ ์„ฑ๋Šฅ์˜ ๋ณ‘๋ชฉ์ด ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
+์ด ๋ ˆ์‹œํ”ผ์—์„œ๋Š” ์˜ตํ‹ฐ๋งˆ์ด์ €์— ``torch.compile``์„ ์ ์šฉํ•˜์—ฌ GPU ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ๊ด€์ฐฐํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

 .. note::

-    This tutorial requires PyTorch 2.2.0 or later.
+    ์ด ํŠœํ† ๋ฆฌ์–ผ์€ PyTorch 2.2.0 ์ด์ƒ์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

-Model Setup
+๋ชจ๋ธ ์„ค์ •
 ~~~~~~~~~~~~~~~~~~~~~
-For this example, we'll use a simple sequence of linear layers.
-Since we are only benchmarking the optimizer, the choice of model doesn't matter
-because optimizer performance is a function of the number of parameters.
+์ด ์˜ˆ์ œ์—์„œ๋Š” ๊ฐ„๋‹จํ•œ ์„ ํ˜• ๊ณ„์ธต์˜ ์‹œํ€€์Šค๋ฅผ ์‚ฌ์šฉํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
+์šฐ๋ฆฌ๋Š” ์˜ตํ‹ฐ๋งˆ์ด์ €์˜ ์„ฑ๋Šฅ๋งŒ ๋ฒค์น˜๋งˆํ‚นํ•  ๊ฒƒ์ด๊ธฐ ๋•Œ๋ฌธ์—, ๋ชจ๋ธ์˜ ์„ ํƒ์€ ์ค‘์š”ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
+์˜ตํ‹ฐ๋งˆ์ด์ €์˜ ์„ฑ๋Šฅ์€ ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ์ˆ˜์— ๋”ฐ๋ผ ๋‹ฌ๋ผ์ง€๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.

-Depending on what machine you are using, your exact results may vary.
+์‚ฌ์šฉํ•˜๋Š” ๋จธ์‹ ์— ๋”ฐ๋ผ ์ •ํ™•ํ•œ ๊ฒฐ๊ณผ๋Š” ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

 .. code-block:: python

@@ -32,19 +31,17 @@ Depending on what machine you are using, your exact results may vary.
     output = model(input)
     output.sum().backward()

-Setting up and running the optimizer benchmark
+์˜ตํ‹ฐ๋งˆ์ด์ € ๋ฒค์น˜๋งˆํฌ ์„ค์ • ๋ฐ ์‹คํ–‰
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In this example, we'll use the Adam optimizer
-and create a helper function to wrap the step()
-in ``torch.compile()``.
+์ด ์˜ˆ์ œ์—์„œ๋Š” Adam ์˜ตํ‹ฐ๋งˆ์ด์ €๋ฅผ ์‚ฌ์šฉํ•˜๊ณ , ``torch.compile()``์—์„œ step()์„ ๊ฐ์‹ธ๋Š” ๋„์šฐ๋ฏธ ํ•จ์ˆ˜๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

 .. note::

     ``torch.compile`` is only supported on cuda devices with compute capability >= 7.0

 .. code-block:: python

-    # exit cleanly if we are on a device that doesn't support torch.compile
+    # torch.compile์ด ์ง€์›๋˜์ง€ ์•Š๋Š” ๋””๋ฐ”์ด์Šค์—์„œ๋Š” ๊น”๋”ํ•˜๊ฒŒ ์ข…๋ฃŒํ•ฉ๋‹ˆ๋‹ค.
     if torch.cuda.get_device_capability() < (7, 0):
         print("Exiting because torch.compile is not supported on this device.")
         import sys
@@ -59,7 +56,7 @@ in ``torch.compile()``.
         opt.step()


-    # Let's define a helpful benchmarking function:
+    # ์œ ์šฉํ•œ ๋ฒค์น˜๋งˆํ‚น ํ•จ์ˆ˜๋ฅผ ์ •์˜ํ•ด๋ด…์‹œ๋‹ค.
     import torch.utils.benchmark as benchmark


@@ -70,7 +67,7 @@ in ``torch.compile()``.
         return t0.blocked_autorange().mean * 1e6


-    # Warmup runs to compile the function
+    # ํ•จ์ˆ˜๋ฅผ ์ปดํŒŒ์ผํ•˜๊ธฐ ์œ„ํ•œ ์›œ์—… ์‹คํ–‰
     for _ in range(5):
         fn()

@@ -82,13 +79,12 @@ in ``torch.compile()``.
     print(f"eager runtime: {eager_runtime}us")
     print(f"compiled runtime: {compiled_runtime}us")

-Sample Results:
+์ƒ˜ํ”Œ ๊ฒฐ๊ณผ:

 * Eager runtime: 747.2437149845064us
 * Compiled runtime: 392.07384741178us

 See Also
 ~~~~~~~~~

-* For an in-depth technical overview, see
-  `Compiling the optimizer with PT2 <https://dev-discuss.pytorch.org/t/compiling-the-optimizer-with-pt2/1669>`__
+* ์‹ฌ์ธต์ ์ธ ๊ธฐ์ˆ  ๊ฐœ์š”๋ฅผ ์œ„ํ•ด์„œ, `PT2๋กœ ์˜ตํ‹ฐ๋งˆ์ด์ € ์ปดํŒŒ์ผํ•˜๊ธฐ <https://dev-discuss.pytorch.org/t/compiling-the-optimizer-with-pt2/1669>`__ ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.
