Commit 2986198

Fix outdated server information in documentation
- Update NVIDIA driver mismatch solution to reference nvidia-upgrade.sh
- Remove hardcoded resource specs (12 CPU cores, 128GB RAM, 256GB SSD)
- Add notes about memory quota differences by server
- Update cluster.md to reflect all 5 servers (roselab1-5)
- Update bandwidth info to 100Gbps (post-migration)
1 parent b25f981 commit 2986198

2 files changed: +11 -7 lines changed

docs/guide/cluster.md

Lines changed: 2 additions & 2 deletions
@@ -56,7 +56,7 @@ This is particularly useful when you've installed many Python packages and pip's
 
 ## Benefits of Using This Utility
 
-- Access to multiple servers (roselab1~4) independently
+- Access to multiple servers (roselab1-5) independently
 - Quick replication of your environment on another server when your primary server is overloaded
 - Easy synchronization of environments across multiple servers
 - Set up external access to JupyterLab, TCP services, or web services without SSH forwarding
@@ -94,7 +94,7 @@ HTTPS adds a security layer and is more browser-friendly but only supports hoste
 
 ## Automation
 
-You can create scripts for frequent or scheduled synchronization. The current inter-server bandwidth is 25Gbps, with plans to upgrade to 100Gbps. The 300MB/s data transfer rate is well within these limits.
+You can create scripts for frequent or scheduled synchronization. The inter-server bandwidth is 100Gbps after the October 2024 server migration. The 300MB/s data transfer rate is well within these limits.
 
 ## Security Measures
 
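Review note on the Automation change: the claim that 300MB/s is "well within these limits" can be checked with a little arithmetic. The sketch below assumes decimal units (1 Gbps = 1000 Mbps, 8 bits per byte) and ignores protocol overhead; the function name `link_utilization` is hypothetical and not part of the documented utility.

```python
def link_utilization(transfer_mb_per_s: float, link_gbps: float) -> float:
    """Return the fraction of a link consumed by a given transfer rate.

    Assumes decimal units: 1 Gbps = 1000 Mbps and 8 bits per byte.
    """
    link_mb_per_s = link_gbps * 1000 / 8  # 100 Gbps -> 12500 MB/s
    return transfer_mb_per_s / link_mb_per_s


# Figures taken from the diff: 300 MB/s transfers, 25 Gbps before, 100 Gbps after.
print(f"25 Gbps link:  {link_utilization(300, 25):.1%}")
print(f"100 Gbps link: {link_utilization(300, 100):.1%}")
```

Under these assumptions the transfer uses roughly a tenth of the old 25Gbps link and a few percent of the 100Gbps link, so the sentence holds for both the old and the new wording.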

docs/guide/index.md

Lines changed: 9 additions & 5 deletions
@@ -45,7 +45,7 @@ Submit your request using the [Account Request Form](https://docs.google.com/for
 - You'll have root permission to install or remove software (except the NVIDIA driver and kernel modules).
 
 ::: warning
-The host and container NVIDIA driver versions must match. (You cannot change host driver versions, which are in the [config table](../config/)) If you see `Failed to initialize NVML: Driver/library version mismatch`, contact the admin.
+The host and container NVIDIA driver versions must match. If you see `Failed to initialize NVML: Driver/library version mismatch`, use the fix script: `sudo /utilities/nvidia-upgrade.sh && sudo reboot`. See the [Troubleshooting](./troubleshooting#nvidia-driver-issues) guide for details.
 :::
 
 ### 2. Access Your Container
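Review note on the warning change: since the new text tells users to fix the mismatch themselves, a quick check can help decide whether the fix script is needed. The sketch below is an illustration only. It assumes the NVML error message quoted in the warning shows up in the output of `nvidia-smi`; the helper names `has_driver_mismatch` and `probe_nvidia_smi` are hypothetical.

```python
import subprocess

# Error text quoted in the warning block of docs/guide/index.md.
MISMATCH_MARKER = "Driver/library version mismatch"


def has_driver_mismatch(output: str) -> bool:
    """Return True if the given command output contains the NVML mismatch error."""
    return MISMATCH_MARKER in output


def probe_nvidia_smi() -> str:
    """Run nvidia-smi and return its combined output, or '' if it cannot run."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=30
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return ""
    return result.stdout + result.stderr


if __name__ == "__main__":
    if has_driver_mismatch(probe_nvidia_smi()):
        # Fix command as given in the updated warning text.
        print("Mismatch detected; run: sudo /utilities/nvidia-upgrade.sh && sudo reboot")
    else:
        print("No driver/library version mismatch reported.")
```

The script only reports the condition and leaves running the fix script and rebooting to the user, because a reboot interrupts any running jobs.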
@@ -184,7 +184,7 @@ If you're still experiencing connection issues after this step, please contact t
 
 ### 4. Explore Your Container
 
-Now let's check the resources assigned to you. First, use `lscpu` to check the CPU cores. Although the CPU indices may differ, you should see 12 online CPU cores. Here's an example output:
+Now let's check the resources assigned to you. First, use `lscpu` to check the CPU cores. Although the CPU indices may differ, you should see your allocated CPU cores. Here's an example output:
 
 ```bash
 $ lscpu
@@ -195,7 +195,7 @@ CPU(s): 56
 ...
 ```
 
-Next, you can inspect the memory assigned to you using the `/proc/meminfo` file. You should see around 128 GB of total RAM.
+Next, you can inspect the memory assigned to you using the `/proc/meminfo` file. The memory quota varies by server (see [Limitations](./limit#memory-quota-by-server) for details).
 
 ```bash
 $ cat /proc/meminfo
@@ -204,9 +204,13 @@ MemFree: 96093828 kB
 MemAvailable: 96883860 kB
 ```
 
-To see the file system, run `df -H` . You would see
+::: tip Note
+Memory quotas differ by server: roselab1-3 have standard quota, roselab4 has 2x quota, and roselab5 has 4x quota. See the [Memory Limits](./limit#memory-limits) section for more information.
+:::
+
+To see the file system, run `df -H` . You would see
 
-* the system SSD with around 256 GB of available space,
+* the system SSD (size varies by server),
 * a 5 TB private data HDD mounted under `/data` that is only accessible to you, and
 * a 5 TB public data HDD mounted under `/public` that is accessible to everyone.
 
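Review note on the resource checks: with the hardcoded figures removed, readers have to read their allocation from the command output themselves. The sketch below shows one way to do that. It assumes `lscpu` reports allocated cores in its `On-line CPU(s) list` field as comma-separated ranges and that `/proc/meminfo` lines follow the `Name: value kB` shape shown in the diff. The function names and the sample CPU list `0-5,28-33` are hypothetical.

```python
def count_cpus(cpu_list: str) -> int:
    """Count CPUs in a range list such as '0-5,28-33' (lscpu on-line list format)."""
    total = 0
    for chunk in cpu_list.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        start, sep, end = chunk.partition("-")
        # A chunk is either a range 'a-b' (inclusive) or a single index 'a'.
        total += (int(end) - int(start) + 1) if sep else 1
    return total


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Hypothetical on-line CPU list; real indices differ per container.
print(count_cpus("0-5,28-33"), "CPUs online")

# Sample lines copied from the example output in the diff.
sample = "MemFree: 96093828 kB\nMemAvailable: 96883860 kB\n"
info = parse_meminfo(sample)
print(f"MemAvailable: {info['MemAvailable'] / 1024**2:.1f} GiB")
```

On a real container, the same functions can be fed the `On-line CPU(s) list` value from `lscpu` and the contents of `/proc/meminfo`, and the result compared with the quota documented for that server.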
