
Commit d2e0dc9

Specify nvidia env throughout puzzle 09
1 parent c7f951a commit d2e0dc9

3 files changed (+23, -41 lines)


book/src/puzzle_09/first_case.md

Lines changed: 8 additions & 24 deletions

@@ -7,7 +7,7 @@ This puzzle presents a crashing GPU program where your task is to identify the i
 **Prerequisites**: Complete [Mojo GPU Debugging Essentials](./essentials.md) to understand CUDA-GDB setup and basic debugging commands. Make sure you've run:
 
 ```bash
-pixi run setup-cuda-gdb
+pixi run -e nvidia setup-cuda-gdb
 ```
 
 This auto-detects your CUDA installation and sets up the necessary links for GPU debugging.
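
For context on what the setup task automates: the hand-written symlink instructions this commit removes from second_case.md and third_case.md (see the hunks below) show the underlying mechanism. A minimal sketch, assuming a default toolkit install under /usr/local/cuda:

```bash
# Roughly what setup-cuda-gdb automates, per the symlink commands removed
# elsewhere in this commit; the /usr/local/cuda path is an assumption about
# where the CUDA toolkit lives on your machine.
ln -sf /usr/local/cuda/bin/cuda-gdb-minimal "$CONDA_PREFIX/bin/cuda-gdb-minimal"
ln -sf /usr/local/cuda/bin/cuda-gdb-python3.12-tui "$CONDA_PREFIX/bin/cuda-gdb-python3.12-tui"
```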
@@ -35,31 +35,15 @@ To experience the bug firsthand, run the following command in your terminal (`pi
 pixi run -e nvidia p09 --first-case
 ```
 
-You'll see output like when the program crashes with this error:
+You'll see output like this when the program crashes:
 
 ```txt
-CUDA call failed: CUDA_ERROR_ILLEGAL_ADDRESS (an illegal memory access was encountered)
-[24326:24326:20250801,180816.333593:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
-[24326:24326:20250801,180816.333653:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
-Please submit a bug report to https://github.com/modular/modular/issues and include the crash backtrace along with all the relevant source codes.
-Stack dump:
-0. Program arguments: /home/ubuntu/workspace/mojo-gpu-puzzles/.pixi/envs/default/bin/mojo problems/p09/p09.mojo
-Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
-0  mojo 0x0000653a338d3d2b
-1  mojo 0x0000653a338d158a
-2  mojo 0x0000653a338d48d7
-3  libc.so.6 0x00007cbc08442520
-4  libc.so.6 0x00007cbc0851e88d syscall + 29
-5  libAsyncRTMojoBindings.so 0x00007cbc0ab68653
-6  libc.so.6 0x00007cbc08442520
-7  libc.so.6 0x00007cbc084969fc pthread_kill + 300
-8  libc.so.6 0x00007cbc08442476 raise + 22
-9  libc.so.6 0x00007cbc084287f3 abort + 211
-10 libAsyncRTMojoBindings.so 0x00007cbc097c7c7b
-11 libAsyncRTMojoBindings.so 0x00007cbc097c7c9e
-12 (error) 0x00007cbb5c00600f
-mojo crashed!
-Please file a bug report.
+First Case: Try to identify what's wrong without looking at the code!
+
+stack trace was not collected. Enable stack trace collection with environment variable `MOJO_ENABLE_STACK_TRACE_ON_ERROR`
+Unhandled exception caught during execution: At open-source/max/mojo/stdlib/stdlib/gpu/host/device_context.mojo:2082:17: CUDA call failed: CUDA_ERROR_INVALID_IMAGE (device kernel image is invalid)
+To get more accurate error information, set MODULAR_DEVICE_CONTEXT_SYNC_MODE=true.
+/home/ubuntu/workspace/mojo-gpu-puzzles/.pixi/envs/nvidia/bin/mojo: error: execution exited with a non-zero result: 1
 ```
 
 ## Your task: detective work
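
The replacement output carries its own hint: the message names `MODULAR_DEVICE_CONTEXT_SYNC_MODE` as the switch for more precise error reporting. A minimal follow-up taken directly from that hint (same run command as above, just with the variable set):

```bash
# Re-run with synchronous device-context calls so the failure is reported
# closer to the offending CUDA call, as the error message itself suggests.
MODULAR_DEVICE_CONTEXT_SYNC_MODE=true pixi run -e nvidia p09 --first-case
```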

book/src/puzzle_09/second_case.md

Lines changed: 10 additions & 11 deletions

@@ -11,11 +11,10 @@ Building on your [crash debugging skills from the First Case](./first_case.md),
 
 This intermediate-level debugging challenge covers investigating **algorithmic errors** using `LayoutTensor` operations, where the program runs successfully but produces wrong output - a much more common (and trickier) real-world debugging scenario.
 
-**Prerequisites**: Complete [Mojo GPU Debugging Essentials](./essentials.md) and [Detective Work: First Case](./first_case.md) to understand CUDA-GDB workflow and systematic debugging techniques. Make sure you've run `pixi run setup-cuda-gdb` or similar symlink is available
+**Prerequisites**: Complete [Mojo GPU Debugging Essentials](./essentials.md) and [Detective Work: First Case](./first_case.md) to understand CUDA-GDB workflow and systematic debugging techniques. Make sure you run the setup:
 
 ```bash
-ln -sf /usr/local/cuda/bin/cuda-gdb-minimal $CONDA_PREFIX/bin/cuda-gdb-minimal
-ln -sf /usr/local/cuda/bin/cuda-gdb-python3.12-tui $CONDA_PREFIX/bin/cuda-gdb-python3.12-tui
+pixi run -e nvidia setup-cuda-gdb
 ```
 
 ## Key concepts
@@ -38,7 +37,7 @@ First, examine the kernel without looking at the complete code:
 To experience the bug firsthand, run the following command in your terminal (`pixi` only):
 
 ```bash
-pixi run p09 --second-case
+pixi run -e nvidia p09 --second-case
 ```
 
 You'll see output like this - **no crash, but wrong results**:
@@ -49,10 +48,10 @@ This program computes sliding window sums for each position...
 Input array: [0, 1, 2, 3]
 Computing sliding window sums (window size = 3)...
 Each position should sum its neighbors: [left + center + right]
-Actual result: HostBuffer([0.0, 1.0, 3.0, 5.0])
-Expected: [1.0, 3.0, 6.0, 5.0]
-❌ Test FAILED - Sliding window sums are incorrect!
-Check the window indexing logic...
+stack trace was not collected. Enable stack trace collection with environment variable `MOJO_ENABLE_STACK_TRACE_ON_ERROR`
+Unhandled exception caught during execution: At open-source/max/mojo/stdlib/stdlib/gpu/host/device_context.mojo:2082:17: CUDA call failed: CUDA_ERROR_INVALID_IMAGE (device kernel image is invalid)
+To get more accurate error information, set MODULAR_DEVICE_CONTEXT_SYNC_MODE=true.
+/home/ubuntu/workspace/mojo-gpu-puzzles/.pixi/envs/nvidia/bin/mojo: error: execution exited with a non-zero result: 1
 ```
 
 ## Your task: detective work
@@ -69,7 +68,7 @@ Check the window indexing logic...
 Start with:
 
 ```bash
-pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --second-case
+pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --second-case
 ```
 
 ### GDB command shortcuts (faster debugging)
@@ -116,7 +115,7 @@ pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --second-
 #### Step 1: Start the debugger
 
 ```bash
-pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --second-case
+pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --second-case
 ```
 
 #### Step 2: analyze the symptoms first

@@ -360,7 +359,7 @@ for offset in range(ITER): # ← Only 2 iterations: [0, 1]
 - Focus on the algorithm logic rather than trying to inspect tensor contents
 - Use systematic reasoning to trace what each thread should vs actually accesses
 
-**💡 Key Insight**: This type of off-by-one loop bug is extremely common in GPU programming. The systematic approach you learned here - combining limited debugger info with mathematical analysis and pattern recognition - is exactly how professional GPU developers debug when tools have limitations.
+**Key Insight**: This type of off-by-one loop bug is extremely common in GPU programming. The systematic approach you learned here - combining limited debugger info with mathematical analysis and pattern recognition - is exactly how professional GPU developers debug when tools have limitations.
 
 </div>
 </details>
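
The Key Insight hunk above references the off-by-one loop (`for offset in range(ITER)` running only 2 iterations where a window of size 3 needs 3). A CPU-side bash sketch of the correct logic, illustrative only and not the puzzle's Mojo kernel, reproducing the expected values from the output this commit replaces:

```bash
# Sliding window sum, window size 3 (offsets -1..+1), for input [0, 1, 2, 3].
# The buggy kernel covers only two of the three offsets, producing the
# [0, 1, 3, 5] seen in the removed output; the full 3-iteration window
# yields the expected [1, 3, 6, 5].
input=(0 1 2 3)
n=${#input[@]}
for ((i = 0; i < n; i++)); do
  sum=0
  for ((offset = -1; offset <= 1; offset++)); do  # correct: 3 iterations
    j=$((i + offset))
    if ((j >= 0 && j < n)); then
      sum=$((sum + input[j]))
    fi
  done
  printf '%s ' "$sum"                             # prints: 1 3 6 5
done
echo
```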

book/src/puzzle_09/third_case.md

Lines changed: 5 additions & 6 deletions

@@ -11,11 +11,10 @@ You've learned debugging [memory crashes](./first_case.md) and [logic bugs](./se
 
 This advanced-level debugging challenge teaches you to investigate **thread coordination failures** using shared memory, LayoutTensor operations, and barrier synchronization - combining all the systematic investigation skills from the previous cases.
 
-**Prerequisites**: Complete [Mojo GPU Debugging Essentials](./essentials.md), [Detective Work: First Case](./first_case.md), and [Detective Work: Second Case](./second_case.md) to understand CUDA-GDB workflow, variable inspection limitations, and systematic debugging approaches. Make sure you've run `pixi run setup-cuda-gdb` or similar symlink is available
+**Prerequisites**: Complete [Mojo GPU Debugging Essentials](./essentials.md), [Detective Work: First Case](./first_case.md), and [Detective Work: Second Case](./second_case.md) to understand CUDA-GDB workflow, variable inspection limitations, and systematic debugging approaches. Make sure you run the setup:
 
 ```bash
-ln -sf /usr/local/cuda/bin/cuda-gdb-minimal $CONDA_PREFIX/bin/cuda-gdb-minimal
-ln -sf /usr/local/cuda/bin/cuda-gdb-python3.12-tui $CONDA_PREFIX/bin/cuda-gdb-python3.12-tui
+pixi run -e nvidia setup-cuda-gdb
 ```
 
 ## Key concepts
@@ -37,7 +36,7 @@ First, examine the kernel without looking at the complete code:
 To experience the bug firsthand, run the following command in your terminal (`pixi` only):
 
 ```bash
-pixi run p09 --third-case
+pixi run -e nvidia p09 --third-case
 ```
 
 You'll see output like this - **the program hangs indefinitely**:
@@ -68,7 +67,7 @@ Waiting for GPU computation to complete...
 Start with:
 
 ```bash
-pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --third-case
+pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --third-case
 ```
 
 ### GDB command shortcuts (faster debugging)
@@ -114,7 +113,7 @@ pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --third-c
 
 #### Step 1: start the debugger
 ```bash
-pixi run mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --third-case
+pixi run -e nvidia mojo debug --cuda-gdb --break-on-launch problems/p09/p09.mojo --third-case
 ```
 
 #### Step 2: analyze the hanging behavior
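
Because the third case hangs instead of crashing, the debugger session is where the evidence lives. An illustrative CUDA-GDB triage for a suspected barrier deadlock, written as comments since these are debugger-prompt commands rather than shell commands; this sequence is a common pattern, not the puzzle's official walkthrough:

```bash
# At the (cuda-gdb) prompt, after the launch command above:
#   Ctrl-C                 # interrupt the hung kernel
#   info cuda threads      # see where each GPU thread is parked
#   cuda thread (0,0,0)    # switch focus to one thread
#   bt                     # check whether it is blocked at barrier()
# If some threads sit at the barrier while others took a different branch
# and never reach it, that is the classic divergent-barrier deadlock.
```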
