AI-Hypercomputer
diff --git a/.readthedocs.yml b/.readthedocs.yml
Lines changed: 1 addition & 1 deletion
diff --git a/PREFLIGHT.md b/PREFLIGHT.md
Lines changed: 1 addition & 1 deletion
diff --git a/dependencies/dockerfiles/maxtext_db_dependencies.Dockerfile b/dependencies/dockerfiles/maxtext_db_dependencies.Dockerfile
Lines changed: 3 additions & 3 deletions
diff --git a/dependencies/dockerfiles/maxtext_dependencies.Dockerfile b/dependencies/dockerfiles/maxtext_dependencies.Dockerfile
Lines changed: 3 additions & 3 deletions
diff --git a/dependencies/dockerfiles/maxtext_gpu_dependencies.Dockerfile b/dependencies/dockerfiles/maxtext_gpu_dependencies.Dockerfile
Lines changed: 3 additions & 3 deletions
diff --git a/dependencies/dockerfiles/maxtext_jax_ai_image.Dockerfile b/dependencies/dockerfiles/maxtext_jax_ai_image.Dockerfile
Lines changed: 3 additions & 3 deletions
diff --git a/docs/development.md b/docs/development.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/guides/data_input_pipeline/data_input_grain.md b/docs/guides/data_input_pipeline/data_input_grain.md
Lines changed: 6 additions & 6 deletions
diff --git a/docs/guides/data_input_pipeline/data_input_tfds.md b/docs/guides/data_input_pipeline/data_input_tfds.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/guides/knowledge_distillation.md b/docs/guides/knowledge_distillation.md
Lines changed: 3 additions & 2 deletions
@@ -21,4 +21,4 @@ sphinx:
 # See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
 python:
   install:
-    - requirements: requirements_docs.txt
+    - requirements: dependencies/requirements/requirements_docs.txt
@@ -26,7 +26,7 @@ bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m
 ```
 
 For GKE,
-`numactl` should be built into your docker image from [maxtext_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/maxtext_dependencies.Dockerfile), so you can use it directly if you built the maxtext docker image. Here is an example
+`numactl` should be built into your docker image from [maxtext_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/dependencies/dockerfiles/maxtext_dependencies.Dockerfile), so you can use it directly if you built the maxtext docker image. Here is an example
 
 ```
 bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m MaxText.train src/MaxText/configs/base.yml run_name=$YOUR_JOB_NAME
 
@@ -40,9 +40,9 @@ ENV MAXTEXT_REPO_ROOT=/deps
 WORKDIR /deps
 
 # Copy setup files and dependency files separately for better caching
-COPY tools/setup /deps/tools/setup/
-COPY dependencies/requirements/ /deps/dependencies/requirements/
-COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt /deps/dependencies/requirements/
+COPY tools/setup tools/setup/
+COPY dependencies/requirements/ dependencies/requirements/
+COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt src/install_maxtext_extra_deps/
 
 # Install dependencies - these steps are cached unless the copied files change
 RUN echo "Running command: bash setup.sh MODE=$ENV_MODE JAX_VERSION=$ENV_JAX_VERSION LIBTPU_GCS_PATH=${ENV_LIBTPU_GCS_PATH} DEVICE=${ENV_DEVICE}"
 
@@ -40,9 +40,9 @@ ENV MAXTEXT_REPO_ROOT=/deps
 WORKDIR /deps
 
 # Copy setup files and dependency files separately for better caching
-COPY tools/setup /deps/tools/setup/
-COPY dependencies/requirements/ /deps/dependencies/requirements/
-COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt /deps/dependencies/requirements/
+COPY tools/setup tools/setup/
+COPY dependencies/requirements/ dependencies/requirements/
+COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt src/install_maxtext_extra_deps/
 
 # Install dependencies - these steps are cached unless the copied files change
 RUN echo "Running command: bash setup.sh MODE=$ENV_MODE JAX_VERSION=$ENV_JAX_VERSION LIBTPU_GCS_PATH=${ENV_LIBTPU_GCS_PATH} DEVICE=${ENV_DEVICE}"
 
@@ -42,9 +42,9 @@ ENV MAXTEXT_REPO_ROOT=/deps
 WORKDIR /deps
 
 # Copy setup files and dependency files separately for better caching
-COPY tools/setup /deps/tools/setup/
-COPY dependencies/requirements/ /deps/dependencies/requirements/
-COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt /deps/dependencies/requirements/
+COPY tools/setup tools/setup/
+COPY dependencies/requirements/ dependencies/requirements/
+COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt src/install_maxtext_extra_deps/
 
 # Install dependencies - these steps are cached unless the copied files change
 RUN echo "Running command: bash setup.sh MODE=$ENV_MODE JAX_VERSION=$ENV_JAX_VERSION DEVICE=${ENV_DEVICE}"
 
@@ -16,9 +16,9 @@ ENV MAXTEXT_REPO_ROOT=/deps
 WORKDIR /deps
 
 # Copy setup files and dependency files separately for better caching
-COPY tools/setup /deps/tools/setup/
-COPY dependencies/requirements/ /deps/dependencies/requirements/
-COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt /deps/dependencies/requirements/
+COPY tools/setup tools/setup/
+COPY dependencies/requirements/ dependencies/requirements/
+COPY src/install_maxtext_extra_deps/extra_deps_from_github.txt src/install_maxtext_extra_deps/
 
 # For JAX AI tpu training images 0.4.37 AND 0.4.35
 # Orbax checkpoint installs the latest version of JAX,
 
@@ -12,7 +12,7 @@ If you are writing documentation for MaxText, you may want to preview the docume
 First, make sure you install the necessary dependencies. You can do this by navigating to your local clone of the MaxText repo and running:
 
 ```bash
-pip install -r requirements_docs.txt
+pip install -r dependencies/requirements/requirements_docs.txt
 ```
 
 Once the dependencies are installed, you can navigate to the `docs/` folder and run:
 
@@ -29,17 +29,17 @@ Grain ensures determinism in data input pipelines by saving the pipeline's state
 
 ## Using Grain
 1. Grain currently supports two data formats: [ArrayRecord](https://github.com/google/array_record) (random access) and [Parquet](https://arrow.apache.org/docs/python/parquet.html) (partial random-access through row groups). Only the ArrayRecord format supports the global shuffle mentioned above. For converting a dataset into ArrayRecord, see [Apache Beam Integration for ArrayRecord](https://github.com/google/array_record/tree/main/beam). Additionally, other random access data sources can be supported via a custom [data source](https://google-grain.readthedocs.io/en/latest/data_sources.html) class.
-2. When the dataset is hosted on a Cloud Storage bucket, Grain can read it through [Cloud Storage FUSE](https://cloud.google.com/storage/docs/gcs-fuse). The installation of Cloud Storage FUSE is included in [setup.sh](https://github.com/google/maxtext/blob/main/setup.sh). The user then needs to mount the Cloud Storage bucket to a local path for each worker, using the script [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/setup_gcsfuse.sh). The script configures some parameters for the mount.
-```
-bash setup_gcsfuse.sh \
+2. When the dataset is hosted on a Cloud Storage bucket, Grain can read it through [Cloud Storage FUSE](https://cloud.google.com/storage/docs/gcs-fuse). The installation of Cloud Storage FUSE is included in [setup.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup.sh). The user then needs to mount the Cloud Storage bucket to a local path for each worker, using the script [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup_gcsfuse.sh). The script configures some parameters for the mount.
+```sh
+bash tools/setup/setup_gcsfuse.sh \
 DATASET_GCS_BUCKET=$BUCKET_NAME \
 MOUNT_PATH=$MOUNT_PATH \
 [FILE_PATH=$MOUNT_PATH/my_dataset]
 # FILE_PATH is optional, when provided, the script runs "ls -R" for pre-filling the metadata cache
 # https://cloud.google.com/storage/docs/cloud-storage-fuse/performance#improve-first-time-reads
 ```
 3. Set `dataset_type=grain`, `grain_file_type={arrayrecord|parquet}`, `grain_train_files` to match the file pattern on the mounted local path.
-4. Tune `grain_worker_count` for performance. This parameter controls the number of child processes used by Grain (more details in [behind_the_scenes](https://google-grain.readthedocs.io/en/latest/behind_the_scenes.html), [grain_pool.py](https://github.com/google/grain/blob/main/grain/_src/python/grain_pool.py)). If you use a large number of workers, check your config for gcsfuse in [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/setup_gcsfuse.sh) to avoid gcsfuse throttling.
+4. Tune `grain_worker_count` for performance. This parameter controls the number of child processes used by Grain (more details in [behind_the_scenes](https://google-grain.readthedocs.io/en/latest/behind_the_scenes.html), [grain_pool.py](https://github.com/google/grain/blob/main/grain/_src/python/grain_pool.py)). If you use a large number of workers, check your config for gcsfuse in [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup_gcsfuse.sh) to avoid gcsfuse throttling.
 
 5. For multi-source blending, you can specify multiple data sources with their respective weights using semicolon (;) as a separator and colon (:) for weights. The weights will be automatically normalized to sum to 1.0. For example:
 ```
@@ -52,8 +52,8 @@ grain_train_files=/tmp/gcsfuse/dataset1.array_record*:1;/tmp/gcsfuse/dataset2.ar
 Note: When using multiple data sources, only the ArrayRecord format is supported.
 
 6. Example command:
-```
-bash setup_gcsfuse.sh \
+```sh
+bash tools/setup/setup_gcsfuse.sh \
 DATASET_GCS_BUCKET=maxtext-dataset \
 MOUNT_PATH=/tmp/gcsfuse && \
 python3 -m MaxText.train src/MaxText/configs/base.yml \
 
@@ -1,8 +1,8 @@
 # TFDS pipeline
 
 1. Download the Allenai C4 dataset in TFRecord format to a Cloud Storage bucket. For information about cost, see [this discussion](https://github.com/allenai/allennlp/discussions/5056)
-```
-bash download_dataset.sh {GCS_PROJECT} {GCS_BUCKET_NAME}
+```sh
+bash tools/data_generation/download_dataset.sh ${GCS_PROJECT} ${GCS_BUCKET_NAME}
 ```
 2. In `src/MaxText/configs/base.yml` or through command line, set the following parameters:
 ```yaml
 
@@ -47,12 +47,13 @@ export RUN_NAME = <unique name for the run>
 
 #### b. Install dependencies
 
-```
+```sh
 git clone https://github.com/AI-Hypercomputer/maxtext.git
 python3 -m venv ~/venv-maxtext
 source ~/venv-maxtext/bin/activate
+python3 -m pip install uv
 cd maxtext
-uv pip install -r requirements.txt
+uv pip install -r dependencies/requirements/requirements.txt
 ```
 
 ### 1. Obtain and prepare the teacher model