From 85a339470751edd1e6f186277202d6954412cd9a Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Mon, 12 Jun 2023 17:53:58 +0800
Subject: [PATCH 01/12] update readme

---
 configs/rec/crnn/README.md | 72 ++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 18 deletions(-)
diff --git a/configs/rec/crnn/README.md b/configs/rec/crnn/README.md
index 20d560f60..25e49f09f 100644
--- a/configs/rec/crnn/README.md
+++ b/configs/rec/crnn/README.md
@@ -39,33 +39,22 @@ According to our experiments, the evaluation results on public benchmark dataset
 
 <div align="center">
 
-| **Model** | **Context**    | **Backbone** | **Avg Accuracy** | **Train T.** | **Recipe** | **Download** |
-| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-| CRNN      | D910x8-MS1.8-G | VGG7 | 82.03%    | 2445 s/epoch          | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-3a19e349.mindir)   |
-| CRNN      | D910x8-MS1.8-G | ResNet34_vd | 84.45%    | 2118 s/epoch         | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-2f016384.mindir) |
+| **Model** | **Context**  | **Backbone** | **Avg Accuracy** | **Train T.** | **FPS** | **Recipe** | **Download** |
+| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
+| CRNN      | D910x8-MS1.8-G | VGG7 | 82.03%  |  2445 s/epoch | 5802.71 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir)   |
+| CRNN      | D910x8-MS1.8-G | ResNet34_vd | 84.45%  | 2118 s/epoch | 6694.84 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
 </div>
 
-<details open>
+- Detailed accuracy results for each benchmark dataset: 
   <div align="center">
-  <summary>Detailed accuracy results for each benchmark dataset</summary>
 
   | **Model** | **Backbone** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **Average** |
   | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
   | CRNN | VGG7 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
   | CRNN | ResNet34_vd | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
   </div>
-</details>
 
-### Performance
-
-#### Training Perf.
-
-| Device | Model | Backbone | Dataset | Params | Batch size per card | Graph train 8P (s/epoch) | Graph train 8P (ms/step) | Graph train 8P (FPS) |
-| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
-| Ascend910| CRNN | VGG7 | MJ+ST | 8.72 M | 16 | 2488.82 | 22.06 | 5802.71 |
-| Ascend910| CRNN | ResNet34_vd | MJ+ST | 24.48 M | 64 | 2157.18 | 76.48 | 6694.84 |
-
-#### Inference Perf.
+### Inference Perf.
 | Device | Env | Model | Backbone | Params | Test Dataset | Batch size | Graph infer 1P (FPS) |
 | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
 | Ascend310P | Lite2.0 | CRNN | ResNet34_vd | 24.48 M | IC15 | 1 | 361.09 |
@@ -76,6 +65,7 @@ According to our experiments, the evaluation results on public benchmark dataset
 - To reproduce the result on other contexts, please ensure the global batch size is the same.
 - The characters supported by model are lowercase English characters from a to z and numbers from 0 to 9. More explanation on dictionary, please refer to [4. Character Dictionary](#4-character-dictionary).
 - The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#312-dataset-download) section.
+- The input shape for exported MindIR file in the download link is (32, 100).
 
 
 ## 3. Quick Start
@@ -378,13 +368,59 @@ After training, evaluation results on the benchmark test set are as follows, whe
 
 | **Model** | **Language** | **Context**  |**Backbone** | **Scene** | **Web** | **Document** | **Train T.** | **FPS** | **Recipe** | **Download** |
 | :-----: | :-----:  | :--------: | :--------: | :--------: | :--------: | :--------: | :---------: | :--------: | :---------: | :-----------: |
-| CRNN    | Chinese | D910x4-MS1.10-G | ResNet34_vd | 60.45% | 65.95% | 97.68% | 647 s/epoch | 1180 | [crnn_resnet34_ch.yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_ch.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c.ckpt) \| [mindir]() |
+| CRNN    | Chinese | D910x4-MS1.10-G | ResNet34_vd | 60.45% | 65.95% | 97.68% | 647 s/epoch | 1180 | [crnn_resnet34_ch.yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_ch.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c-105bccb2.mindir) |
 </div>
 
+**Notes:**
+- The input shape for exported MindIR file in the download link is (32, 320).
+
 ### Training with Custom Datasets
 You can train models for different languages with your own custom datasets. Loading the pretrained Chinese model to finetune on your own dataset usually yields better results than training from scratch. Please refer to the tutorial [Training Recognition Network with Custom Datasets](../../../docs/en/tutorials/training_recognition_custom_dataset.md).
 
 
+## 6. MindSpore Lite Inference
+
+To inference with MindSpot Lite on Ascend 310, please refer to the tutorial [MindOCR Inference](../../../docs/en/inference/inference_tutorial_en.md). In short, the whole process consists of the following steps:
+
+**1. Model Export**
+
+Please [download](#2-results) the exported MindIR file first, or refer to the [Model Export](../../README.md) tutorial and use the following command to export the trained ckpt model to  MindIR file:
+
+```shell
+python tools/export.py --model_name crnn --data_shape 32 100 --local_ckpt_path /path/to/local_ckpt.ckpt
+# or
+python tools/export.py --model_name configs/rec/crnn/crnn_resnet34.yaml --data_shape 32 100 --local_ckpt_path /path/to/local_ckpt.ckpt
+```
+
+The `data_shape` is the model input shape of height and width for MindIR file. The shape value of MindIR in the download link can be found in [Notes](#2-results) under results table.
+
+
+**2. Environment Installation**
+
+Please refer to [Environment Installation](../../../docs/en/inference/environment_en.md#2-mindspore-lite-inference) tutorial to configure the MindSpore Lite inference environment.
+
+**3. Model Conversion**
+
+Please refer to [Model Conversion](../../../docs/en/inference/convert_tutorial_en.md#1-mindocr-models),
+and use the `converter_lite` tool for offline conversion of the MindIR file, where the `input_shape` in `configFile` needs to be filled in with the value from MindIR export,
+as mentioned above (32, 100), and the format is NCHW.
+
+**4. Inference**
+
+Assuming that you obtain output.mindir after model conversion, go to the `deploy/py_infer` directory, and use the following command for inference:
+
+```shell
+python infer.py \
+    --input_images_dir=/your_path_to/test_images \
+    --device=Ascend \
+    --device_id=0 \
+    --det_model_path=your_path_to/output.mindir \
+    --det_config_path=../../configs/rec/crnn/crnn_resnet34.yaml \
+    --backend=lite \
+    --res_save_dir=results_dir
+```
+
+
 ## References
 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
 

From 575e050b2591372d685be210184be4e78617d6bb Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Mon, 12 Jun 2023 17:57:01 +0800
Subject: [PATCH 02/12] Revert "update readme"

This reverts commit 85a339470751edd1e6f186277202d6954412cd9a.
---
 configs/rec/crnn/README.md | 72 ++++++++++----------------------------
 1 file changed, 18 insertions(+), 54 deletions(-)

diff --git a/configs/rec/crnn/README.md b/configs/rec/crnn/README.md
index 25e49f09f..20d560f60 100644
--- a/configs/rec/crnn/README.md
+++ b/configs/rec/crnn/README.md
@@ -39,22 +39,33 @@ According to our experiments, the evaluation results on public benchmark dataset
 
 <div align="center">
 
-| **Model** | **Context**  | **Backbone** | **Avg Accuracy** | **Train T.** | **FPS** | **Recipe** | **Download** |
-| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-| CRNN      | D910x8-MS1.8-G | VGG7 | 82.03%  |  2445 s/epoch | 5802.71 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir)   |
-| CRNN      | D910x8-MS1.8-G | ResNet34_vd | 84.45%  | 2118 s/epoch | 6694.84 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
+| **Model** | **Context**    | **Backbone** | **Avg Accuracy** | **Train T.** | **Recipe** | **Download** |
+| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
+| CRNN      | D910x8-MS1.8-G | VGG7 | 82.03%    | 2445 s/epoch          | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-3a19e349.mindir)   |
+| CRNN      | D910x8-MS1.8-G | ResNet34_vd | 84.45%    | 2118 s/epoch         | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-2f016384.mindir) |
 </div>
 
-- Detailed accuracy results for each benchmark dataset: 
+<details open>
   <div align="center">
+  <summary>Detailed accuracy results for each benchmark dataset</summary>
 
   | **Model** | **Backbone** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **Average** |
   | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
   | CRNN | VGG7 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
   | CRNN | ResNet34_vd | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
   </div>
+</details>
 
-### Inference Perf.
+### Performance
+
+#### Training Perf.
+
+| Device | Model | Backbone | Dataset | Params | Batch size per card | Graph train 8P (s/epoch) | Graph train 8P (ms/step) | Graph train 8P (FPS) |
+| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
+| Ascend910| CRNN | VGG7 | MJ+ST | 8.72 M | 16 | 2488.82 | 22.06 | 5802.71 |
+| Ascend910| CRNN | ResNet34_vd | MJ+ST | 24.48 M | 64 | 2157.18 | 76.48 | 6694.84 |
+
+#### Inference Perf.
 | Device | Env | Model | Backbone | Params | Test Dataset | Batch size | Graph infer 1P (FPS) |
 | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
 | Ascend310P | Lite2.0 | CRNN | ResNet34_vd | 24.48 M | IC15 | 1 | 361.09 |
@@ -65,7 +76,6 @@ According to our experiments, the evaluation results on public benchmark dataset
 - To reproduce the result on other contexts, please ensure the global batch size is the same.
 - The characters supported by model are lowercase English characters from a to z and numbers from 0 to 9. More explanation on dictionary, please refer to [4. Character Dictionary](#4-character-dictionary).
 - The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#312-dataset-download) section.
-- The input shape for exported MindIR file in the download link is (32, 100).
 
 
 ## 3. Quick Start
@@ -368,59 +378,13 @@ After training, evaluation results on the benchmark test set are as follows, whe
 
 | **Model** | **Language** | **Context**  |**Backbone** | **Scene** | **Web** | **Document** | **Train T.** | **FPS** | **Recipe** | **Download** |
 | :-----: | :-----:  | :--------: | :--------: | :--------: | :--------: | :--------: | :---------: | :--------: | :---------: | :-----------: |
-| CRNN    | Chinese | D910x4-MS1.10-G | ResNet34_vd | 60.45% | 65.95% | 97.68% | 647 s/epoch | 1180 | [crnn_resnet34_ch.yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_ch.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c-105bccb2.mindir) |
+| CRNN    | Chinese | D910x4-MS1.10-G | ResNet34_vd | 60.45% | 65.95% | 97.68% | 647 s/epoch | 1180 | [crnn_resnet34_ch.yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_ch.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_ch-7a342e3c.ckpt) \| [mindir]() |
 </div>
 
-**Notes:**
-- The input shape for exported MindIR file in the download link is (32, 320).
-
 ### Training with Custom Datasets
 You can train models for different languages with your own custom datasets. Loading the pretrained Chinese model to finetune on your own dataset usually yields better results than training from scratch. Please refer to the tutorial [Training Recognition Network with Custom Datasets](../../../docs/en/tutorials/training_recognition_custom_dataset.md).
 
 
-## 6. MindSpore Lite Inference
-
-To inference with MindSpot Lite on Ascend 310, please refer to the tutorial [MindOCR Inference](../../../docs/en/inference/inference_tutorial_en.md). In short, the whole process consists of the following steps:
-
-**1. Model Export**
-
-Please [download](#2-results) the exported MindIR file first, or refer to the [Model Export](../../README.md) tutorial and use the following command to export the trained ckpt model to  MindIR file:
-
-```shell
-python tools/export.py --model_name crnn --data_shape 32 100 --local_ckpt_path /path/to/local_ckpt.ckpt
-# or
-python tools/export.py --model_name configs/rec/crnn/crnn_resnet34.yaml --data_shape 32 100 --local_ckpt_path /path/to/local_ckpt.ckpt
-```
-
-The `data_shape` is the model input shape of height and width for MindIR file. The shape value of MindIR in the download link can be found in [Notes](#2-results) under results table.
-
-
-**2. Environment Installation**
-
-Please refer to [Environment Installation](../../../docs/en/inference/environment_en.md#2-mindspore-lite-inference) tutorial to configure the MindSpore Lite inference environment.
-
-**3. Model Conversion**
-
-Please refer to [Model Conversion](../../../docs/en/inference/convert_tutorial_en.md#1-mindocr-models),
-and use the `converter_lite` tool for offline conversion of the MindIR file, where the `input_shape` in `configFile` needs to be filled in with the value from MindIR export,
-as mentioned above (32, 100), and the format is NCHW.
-
-**4. Inference**
-
-Assuming that you obtain output.mindir after model conversion, go to the `deploy/py_infer` directory, and use the following command for inference:
-
-```shell
-python infer.py \
-    --input_images_dir=/your_path_to/test_images \
-    --device=Ascend \
-    --device_id=0 \
-    --det_model_path=your_path_to/output.mindir \
-    --det_config_path=../../configs/rec/crnn/crnn_resnet34.yaml \
-    --backend=lite \
-    --res_save_dir=results_dir
-```
-
-
 ## References
 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
 

From dfe53139417b7c5e6157a5426a637e6792c92fe8 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Mon, 19 Jun 2023 10:04:30 +0800
Subject: [PATCH 03/12] update rec img transform

---
 mindocr/data/transforms/rec_transforms.py | 173 +++++++++++++---------
 1 file changed, 99 insertions(+), 74 deletions(-)
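
Note on `norm_before_pad`: the sketch below is a minimal, NumPy-only illustration of how the order of normalization and padding affects the padded region. It is not the code in this patch. The helper name `pad_and_norm` and the toy crop are hypothetical, resizing is left out, and the input is cast to float32 up front for simplicity.

```python
import numpy as np


def pad_and_norm(resized, target_w, mean, std, norm_before_pad):
    """Right-pad an (H, W, C) image to target_w and normalize it.

    Hypothetical helper that follows the same order of operations as
    resize_norm_img in this patch; the resize step is omitted.
    """
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    img = resized.astype(np.float32)

    if norm_before_pad:
        # Normalize only the real pixels; the padding added below stays at 0.
        img = (img - mean) / std

    h, w, c = img.shape
    padded = np.zeros((h, target_w, c), dtype=img.dtype)
    padded[:, :w, :] = img

    if not norm_before_pad:
        # Normalize everything, so the zero padding becomes (0 - mean) / std.
        padded = (padded - mean) / std

    return padded


# Toy 2x3 RGB crop with constant pixel value 254, padded to width 5.
crop = np.full((2, 3, 3), 254, dtype=np.uint8)
mean = std = [127.0, 127.0, 127.0]

before = pad_and_norm(crop, 5, mean, std, norm_before_pad=True)
after = pad_and_norm(crop, 5, mean, std, norm_before_pad=False)
```

With the default mean and std of 127, the real pixels map to the same values in both modes. Only the padded columns differ: they are 0 when normalizing first and -1 when padding first.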

diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index 0897844cd..d019e1b9b 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -11,9 +11,9 @@
     "RecCTCLabelEncode",
     "RecAttnLabelEncode",
     "RecResizeImg",
+    "RecResizeNormImg",
     "RecResizeNormForInfer",
     "SVTRRecResizeImg",
-    "Rotate90IfVertical",
     "ClsLabelEncode",
 ]
 
@@ -247,7 +247,13 @@ def str2idx(text: str, label_dict: Dict[str, int], max_text_len: int = 23, lower
 
 
 # TODO: reorganize the code for different resize transformation in rec task
-def resize_norm_img(img, image_shape, padding=True, interpolation=cv2.INTER_LINEAR):
+def resize_norm_img(img,
+                    image_shape,
+                    padding=True,
+                    norm_before_pad=False,
+                    mean=[127.0, 127.0, 127.0],
+                    std=[127.0, 127.0, 127.0],
+                    interpolation=cv2.INTER_LINEAR):
     """
     resize image
     Args:
@@ -261,7 +267,8 @@ def resize_norm_img(img, image_shape, padding=True, interpolation=cv2.INTER_LINE
     w = img.shape[1]
     c = img.shape[2]
     if not padding:
-        resized_image = cv2.resize(img, (imgW, imgH), interpolation=interpolation)
+        resized_image = cv2.resize(
+            img, (imgW, imgH), interpolation=interpolation)
         resized_w = imgW
     else:
         ratio = w / float(h)
@@ -271,32 +278,45 @@ def resize_norm_img(img, image_shape, padding=True, interpolation=cv2.INTER_LINE
             resized_w = int(math.ceil(imgH * ratio))
         resized_image = cv2.resize(img, (resized_w, imgH))
 
-    """
-    resized_image = resized_image.astype('float32')
-    if image_shape[0] == 1:
-        resized_image = resized_image / 255
-        resized_image = resized_image[np.newaxis, :]
-    else:
-        resized_image = resized_image.transpose((2, 0, 1)) / 255
-    resized_image -= 0.5
-    resized_image /= 0.5
-    """
-    padding_im = np.zeros((imgH, imgW, c), dtype=np.uint8)
-    padding_im[:, 0:resized_w, :] = resized_image
     valid_ratio = min(1.0, float(resized_w / imgW))
-    return padding_im, valid_ratio
+
+    if padding:
+        if norm_before_pad:
+            resized_image = (resized_image - mean) / std
+
+        padded_img = np.zeros((imgH, imgW, c), dtype=resized_image.dtype)
+        padded_img[:, 0:resized_w, :] = resized_image
+
+        if not norm_before_pad:
+            padded_img = (padded_img - mean) / std
+
+        return padded_img, valid_ratio
+    else:
+        resized_image = (resized_image - mean) / std
+        return resized_image, valid_ratio
 
 
 # TODO: check diff from resize_norm_img
-def resize_norm_img_chinese(img, image_shape):
-    """adopted from paddle"""
+def resize_norm_img_chinese(img,
+                            image_shape,
+                            norm_before_pad=False,
+                            mean=[127.0, 127.0, 127.0],
+                            std=[127.0, 127.0, 127.0],
+                            interpolation=cv2.INTER_LINEAR):
+    '''
+    resize image with aspect-ratio keeping and padding
+    Args:
+        img: shape (H, W, C)
+        image_shape: target image shape after resizing and padding, in (H, W)
+
+    '''
     imgH, imgW = image_shape
     # todo: change to 0 and modified image shape
     max_wh_ratio = imgW * 1.0 / imgH
     h, w = img.shape[0], img.shape[1]
     c = img.shape[2]
     ratio = w * 1.0 / h
-
+    max_wh_ratio = min(max(max_wh_ratio, ratio), max_wh_ratio)
     imgW = int(imgH * max_wh_ratio)
     if math.ceil(imgH * ratio) > imgW:
         resized_w = imgW
@@ -304,48 +324,80 @@ def resize_norm_img_chinese(img, image_shape):
         resized_w = int(math.ceil(imgH * ratio))
     resized_image = cv2.resize(img, (resized_w, imgH))
 
-    """
-    resized_image = resized_image.astype('float32')
-    if image_shape[0] == 1:
-        resized_image = resized_image / 255
-        resized_image = resized_image[np.newaxis, :]
-    else:
-        resized_image = resized_image.transpose((2, 0, 1)) / 255
-    resized_image -= 0.5
-    resized_image /= 0.5
-    """
-    # padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
-    padding_im = np.zeros((imgH, imgW, c), dtype=np.uint8)
-    # padding_im[:, :, 0:resized_w] = resized_image
-    padding_im[:, 0:resized_w, :] = resized_image
     valid_ratio = min(1.0, float(resized_w / imgW))
-    return padding_im, valid_ratio
 
+    if norm_before_pad:
+        resized_image = (resized_image - mean) / std
 
-# TODO: remove infer_mode and character_dict_path if they are not necesary
-class RecResizeImg(object):
-    """adopted from paddle
-    resize, convert from hwc to chw, rescale pixel value to -1 to 1
-    """
+    padded_img = np.zeros((imgH, imgW, c), dtype=resized_image.dtype)
+    padded_img[:, 0:resized_w, :] = resized_image
 
-    def __init__(self, image_shape, infer_mode=False, character_dict_path=None, padding=True, **kwargs):
+    if not norm_before_pad:
+        padded_img = (padded_img - mean) / std
+
+    return padded_img, valid_ratio
+
+
+class RecResizeNormImg(object):
+    ''' adopted from paddle
+    Resize and normalize image, and pad image if needed.
+
+    Args:
+        norm_before_pad: If True, normalize before padding, so that the padded region stays all zeros (recommended). Otherwise, normalize after padding. Default: False
+    '''
+    def __init__(self,
+                 image_shape,
+                 infer_mode=False,
+                 character_dict_path=None,
+                 padding=True,
+                 norm_before_pad=False,
+                 mean=[127.0, 127.0, 127.0],
+                 std=[127.0, 127.0, 127.0],
+                 **kwargs):
         self.image_shape = image_shape
         self.infer_mode = infer_mode
         self.character_dict_path = character_dict_path
         self.padding = padding
+        self.norm_before_pad = norm_before_pad
+        self.mean = np.array(mean, dtype="float32")
+        self.std = np.array(std, dtype="float32")
 
     def __call__(self, data):
-        img = data["image"]
+        img = data['image']
         if self.infer_mode and self.character_dict_path is not None:
-            norm_img, valid_ratio = resize_norm_img_chinese(img, self.image_shape)
+            norm_img, valid_ratio = resize_norm_img_chinese(img,
+                                                            self.image_shape,
+                                                            self.norm_before_pad,
+                                                            self.mean,
+                                                            self.std
+                                                            )
         else:
-            norm_img, valid_ratio = resize_norm_img(img, self.image_shape, self.padding)
-        data["image"] = norm_img
-        data["valid_ratio"] = valid_ratio
-        # TODO: data['shape_list'] = ?
+            norm_img, valid_ratio = resize_norm_img(img,
+                                                    self.image_shape,
+                                                    self.padding,
+                                                    self.norm_before_pad,
+                                                    self.mean,
+                                                    self.std,
+                                                    )
+        data['image'] = norm_img
+        data['valid_ratio'] = valid_ratio
         return data
 
 
+# TODO: remove infer_mode and character_dict_path if they are not necessary
+class RecResizeImg(RecResizeNormImg):
+    '''
+    Kept for backward compatibility with existing code and configs that still use RecResizeImg; to be removed once they are updated.
+
+    TODO: replace RecResizeImg followed by NormalizeImage in yaml files with the RecResizeNormImg op.
+    '''
+    def __init__(self, image_shape, infer_mode=False, character_dict_path=None, padding=True, **kwargs):
+        super().__init__(
+                image_shape, infer_mode, character_dict_path, padding, norm_before_pad=False,
+                mean=[0., 0., 0.], std=[1., 1., 1.],
+                )
+
+
 class SVTRRecResizeImg(object):
     def __init__(self, image_shape, padding=True, **kwargs):
         self.image_shape = image_shape
@@ -425,9 +477,7 @@ def __call__(self, data):
 
         # TODO: norm before padding
 
-        data["shape_list"] = np.array(
-            [h, w, resize_h / h, resize_w / w], dtype=np.float32
-        )  # TODO: reformat, currently align to det
+        data['shape_list'] = [h, w, resize_h / h, resize_w / w]  # TODO: reformat, currently align to det
         if self.norm_before_pad:
             resized_img = self.norm(resized_img)
 
@@ -444,31 +494,6 @@ def __call__(self, data):
         return data
 
 
-class Rotate90IfVertical:
-    """Rotate the image by 90 degree when the height/width ratio is larger than the given threshold.
-    Note: It needs to be called before image resize."""
-
-    def __init__(self, threshold: float = 1.5, direction: str = "counterclockwise", **kwargs):
-        self.threshold = threshold
-
-        if direction == "counterclockwise":
-            self.flag = cv2.ROTATE_90_COUNTERCLOCKWISE
-        elif direction == "clockwise":
-            self.flag = cv2.ROTATE_90_CLOCKWISE
-        else:
-            raise ValueError("Unsupported direction")
-
-    def __call__(self, data):
-        img = data["image"]
-
-        h, w, _ = img.shape
-        if h / w > self.threshold:
-            img = cv2.rotate(img, self.flag)
-
-        data["image"] = img
-        return data
-
-
 class ClsLabelEncode(object):
     def __init__(self, label_list, **kwargs):
         self.label_list = label_list

From 95c5b52958d3fbfdab03878545e905bbc1d99e67 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Mon, 19 Jun 2023 10:28:36 +0800
Subject: [PATCH 04/12] add config

---
 configs/rec/crnn/crnn_resnet34_server.yaml | 150 +++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100644 configs/rec/crnn/crnn_resnet34_server.yaml
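
Note on the `loss` section: the config comments say `max_label_len` should be smaller than `pred_seq_len`, because CTC cannot align a label that is longer than the prediction sequence. The sketch below checks that constraint on a plain dict holding the two values from this file. The helper `check_ctc_lengths` is hypothetical and not part of MindOCR.

```python
def check_ctc_lengths(cfg):
    """Return the slack between prediction length and maximum label length.

    Hypothetical sanity check; raises ValueError when the label could not
    fit into the prediction sequence.
    """
    loss = cfg["loss"]
    pred_seq_len = loss["pred_seq_len"]
    max_label_len = loss["max_label_len"]
    if max_label_len >= pred_seq_len:
        raise ValueError(
            f"max_label_len ({max_label_len}) must be smaller than "
            f"pred_seq_len ({pred_seq_len})"
        )
    return pred_seq_len - max_label_len


# Values copied from this config: pred_seq_len is 25, max_text_len is 24.
cfg = {"loss": {"pred_seq_len": 25, "max_label_len": 24}}
margin = check_ctc_lengths(cfg)
```

Running a check like this before training would surface a mismatch early if `image_shape` or `max_text_len` is changed without updating `pred_seq_len`.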

diff --git a/configs/rec/crnn/crnn_resnet34_server.yaml b/configs/rec/crnn/crnn_resnet34_server.yaml
new file mode 100644
index 000000000..756266868
--- /dev/null
+++ b/configs/rec/crnn/crnn_resnet34_server.yaml
@@ -0,0 +1,150 @@
+system:
+  mode: 0 # 0 for graph mode, 1 for pynative mode in MindSpore
+  distribute: True
+  amp_level: 'O3'
+  seed: 42
+  log_interval: 100
+  val_while_train: True
+  drop_overflow_update: False
+
+common:
+  character_dict_path: &character_dict_path  mindocr/utils/dict/en_dict.txt
+  num_classes: &num_classes 96 # num_chars_in_dict+1,  TODO: retrieve it from dict or check correctness
+  max_text_len: &max_text_len 24
+  infer_mode: &infer_mode False
+  use_space_char: &use_space_char True
+  lower: &lower False
+  batch_size: &batch_size 64
+
+model:
+  type: rec
+  transform: null
+  backbone:
+    name: rec_resnet34
+    pretrained: False
+  neck:
+    name: RNNEncoder
+    hidden_size: 256
+  head:
+    name: CTCHead
+    weight_init: crnn_customised
+    bias_init: crnn_customised
+    out_channels: *num_classes
+
+postprocess:
+  name: RecCTCLabelDecode
+  character_dict_path: *character_dict_path
+  use_space_char: *use_space_char
+
+metric:
+  name: RecMetric
+  main_indicator: acc
+  character_dict_path: *character_dict_path
+  ignore_space: True
+  print_flag: False
+
+loss:
+  name: CTCLoss
+  pred_seq_len: 25 # TODO: retrieve from the network output shape.
+  max_label_len: *max_text_len  # this value should be smaller than pred_seq_len
+  batch_size: *batch_size
+
+scheduler:
+  scheduler: warmup_cosine_decay
+  min_lr: 0.000001
+  lr: 0.001
+  num_epochs: 30
+  warmup_epochs: 2
+  decay_epochs: 28
+
+optimizer:
+  opt: adamw
+  filter_bias_and_bn: True
+  momentum: 0.95
+  weight_decay: 0.0001
+  nesterov: False
+
+loss_scaler:
+  type: dynamic
+  loss_scale: 512
+  scale_factor: 2.0
+  scale_window: 1000
+
+train:
+  ckpt_save_dir: './crnn_resnet34_server_adj'
+  pred_cast_fp32: False # let CTCLoss cast internally
+  ema: True # added
+  dataset_sink_mode: False
+  dataset:
+    type: LMDBDataset
+    dataset_root: /path/to/data_lmdb_release/
+    data_dir: training/
+    # label_file: # not required when using LMDBDataset
+    sample_ratio: 1.0
+    shuffle: True
+    transform_pipeline:
+      - DecodeImage:
+          img_mode: RGB # changed
+          to_float32: False
+      - RecCTCLabelEncode:
+          max_text_len: *max_text_len
+          character_dict_path: *character_dict_path
+          use_space_char: *use_space_char
+          lower: *lower
+      - RecResizeNormImg:
+          image_shape: [32, 100] # H, W
+          infer_mode: *infer_mode
+          character_dict_path: *character_dict_path
+          padding: True # aspect ratio will be preserved if true. changed
+          norm_before_pad: True # changed
+      - ToCHWImage:
+    #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
+    output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
+    net_input_column_index: [0] # input indices for network forward func in output_columns
+    label_column_index: [1] # input indices marked as label
+    #keys_for_loss: 4 # num labels for loss func
+
+  loader:
+      shuffle: True
+      batch_size: *batch_size
+      drop_remainder: True
+      max_rowsize: 12
+      num_workers: 8
+
+eval:
+  ckpt_load_path: ./crnn_resnet34_server_adj/best.ckpt
+  dataset_sink_mode: False
+  dataset:
+    type: LMDBDataset
+    dataset_root: /path/to/data_lmdb_release/
+    data_dir: validation/
+    # label_file: # not required when using LMDBDataset
+    sample_ratio: 1.0
+    shuffle: False
+    transform_pipeline:
+      - DecodeImage:
+          img_mode: RGB # changed
+          to_float32: False
+      - RecCTCLabelEncode:
+          max_text_len: *max_text_len
+          character_dict_path: *character_dict_path
+          use_space_char: *use_space_char
+          lower: *lower
+      - RecResizeNormImg:
+          image_shape: [32, 100] # H, W
+          infer_mode: *infer_mode
+          character_dict_path: *character_dict_path
+          padding: True # aspect ratio will be preserved if true. changed
+          norm_before_pad: True # changed
+      - ToCHWImage:
+    #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
+    output_columns: ['image', 'text_padded', 'text_length']  # TODO: return the text string padded to a fixed length, and a scalar to indicate the length
+    net_input_column_index: [0] # input indices for network forward func in output_columns
+    label_column_index: [1, 2] # input indices marked as label
+
+  loader:
+      shuffle: False # TODO: tbc
+      batch_size: 64
+      drop_remainder: False
+      max_rowsize: 12
+      num_workers: 8
\ No newline at end of file

From fbf174f60d199c8754843533fda4a0ef7ed9796d Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Tue, 20 Jun 2023 09:31:06 +0800
Subject: [PATCH 05/12] add back rotate transform

---
 configs/rec/crnn/crnn_resnet34_server.yaml |  4 ++--
 mindocr/data/transforms/rec_transforms.py  | 26 ++++++++++++++++++++++
 2 files changed, 28 insertions(+), 2 deletions(-)
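
Note for reviewers: a minimal sketch of the rotation rule this patch adds, assuming only the `h / w > threshold` test shown in `Rotate90IfVertical.__call__`. The helper `rotated_shape` is hypothetical and not part of mindocr; it tracks only the (H, W) shape and does not touch pixel data or call `cv2.rotate`.

```python
def rotated_shape(h: int, w: int, threshold: float = 1.5) -> tuple:
    """Return the (H, W) shape after the rotate-if-vertical rule is applied.

    Hypothetical helper for illustration: a 90-degree rotation in either
    direction swaps height and width, so only the shape is modelled here.
    """
    if h / w > threshold:
        return (w, h)  # tall image: rotated, so H and W swap
    return (h, w)  # wide or near-square image: left unchanged


# A tall 100x32 crop is rotated; a 32x100 crop is not.
tall = rotated_shape(100, 32)
wide = rotated_shape(32, 100)
# The comparison is strict, so a ratio exactly equal to the threshold is kept.
edge = rotated_shape(48, 32, threshold=1.5)
```

Since the rule depends on the original aspect ratio, it presumably has to run before any fixed-size resize such as `RecResizeNormImg`, which matches the note in the class docstring.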

diff --git a/configs/rec/crnn/crnn_resnet34_server.yaml b/configs/rec/crnn/crnn_resnet34_server.yaml
index 756266868..5932284b9 100644
--- a/configs/rec/crnn/crnn_resnet34_server.yaml
+++ b/configs/rec/crnn/crnn_resnet34_server.yaml
@@ -71,7 +71,7 @@ loss_scaler:
   scale_window: 1000
 
 train:
-  ckpt_save_dir: './crnn_resnet34_server_adj'
+  ckpt_save_dir: './crnn_resnet34_server'
   pred_cast_fp32: False # let CTCLoss cast internally
   ema: True # added
   dataset_sink_mode: False
@@ -112,7 +112,7 @@ train:
       num_workers: 8
 
 eval:
-  ckpt_load_path: ./crnn_resnet34_server_adj/best.ckpt
+  ckpt_load_path: ./crnn_resnet34_server/best.ckpt
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index d019e1b9b..3fd0db2d0 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -14,6 +14,7 @@
     "RecResizeNormImg",
     "RecResizeNormForInfer",
     "SVTRRecResizeImg",
+    "Rotate90IfVertical",
     "ClsLabelEncode",
 ]
 
@@ -494,6 +495,31 @@ def __call__(self, data):
         return data
 
 
+class Rotate90IfVertical:
+    """Rotate the image by 90 degrees when the height/width ratio is larger than the given threshold.
+    Note: It needs to be called before the image is resized."""
+
+    def __init__(self, threshold: float = 1.5, direction: str = "counterclockwise", **kwargs):
+        self.threshold = threshold
+
+        if direction == "counterclockwise":
+            self.flag = cv2.ROTATE_90_COUNTERCLOCKWISE
+        elif direction == "clockwise":
+            self.flag = cv2.ROTATE_90_CLOCKWISE
+        else:
+            raise ValueError(f"Unsupported direction: {direction!r}. Expected 'clockwise' or 'counterclockwise'.")
+
+    def __call__(self, data):
+        img = data["image"]
+
+        h, w, _ = img.shape
+        if h / w > self.threshold:
+            img = cv2.rotate(img, self.flag)
+
+        data["image"] = img
+        return data
+
+
 class ClsLabelEncode(object):
     def __init__(self, label_list, **kwargs):
         self.label_list = label_list

From 3c9ac0cb3344b351f59c14447ed301d0c8dec479 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Tue, 20 Jun 2023 14:40:11 +0800
Subject: [PATCH 06/12] add docstrings

---
 configs/rec/crnn/crnn_resnet34.yaml        |  12 +-
 configs/rec/crnn/crnn_resnet34_server.yaml |  10 +-
 mindocr/data/transforms/rec_transforms.py  | 130 ++++++++++++---------
 3 files changed, 87 insertions(+), 65 deletions(-)
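
Note for reviewers: a rough sketch of what the `norm_before_pad` flag documented in this patch changes, using a single row of pixels. It assumes the padded region is zero-filled and that normalization is `(x - mean) / std` with the default mean/std of 127.0; the helpers `normalize` and `resize_pad_row` are hypothetical and are not the mindocr implementation.

```python
MEAN, STD = 127.0, 127.0  # default mean/std used by the transforms in this patch


def normalize(pixels):
    """Apply (x - mean) / std to every value."""
    return [(p - MEAN) / STD for p in pixels]


def resize_pad_row(pixels, target_w, norm_before_pad):
    """Pad one row of pixels to target_w with zeros (assumed padding value)."""
    if norm_before_pad:
        pixels = normalize(pixels)
    padded = list(pixels) + [0.0] * (target_w - len(pixels))
    if not norm_before_pad:
        padded = normalize(padded)
    return padded


row = [254.0, 127.0]
before = resize_pad_row(row, 4, norm_before_pad=True)   # [1.0, 0.0, 0.0, 0.0]
after = resize_pad_row(row, 4, norm_before_pad=False)   # [1.0, 0.0, -1.0, -1.0]
```

Under these assumptions, normalizing first leaves the padded area at exactly zero, whereas normalizing afterwards shifts it to -1.0, which is the distinction the `norm_before_pad` docstring describes.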

diff --git a/configs/rec/crnn/crnn_resnet34.yaml b/configs/rec/crnn/crnn_resnet34.yaml
index 1325467c1..0f9cace2d 100644
--- a/configs/rec/crnn/crnn_resnet34.yaml
+++ b/configs/rec/crnn/crnn_resnet34.yaml
@@ -73,14 +73,14 @@ train:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: path/to/data_lmdb_release/ # Optional, if set, dataset_root will be used as a prefix for data_dir
+    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/ # Optional, if set, dataset_root will be used as a prefix for data_dir
     data_dir: training/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
     shuffle: True
     transform_pipeline:
       - DecodeImage:
-          img_mode: BGR
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
@@ -92,11 +92,7 @@ train:
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-      - NormalizeImage:  # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
-          bgr_to_rgb: True
-          is_hwc: True
-          mean : [127.0, 127.0, 127.0]
-          std : [127.0, 127.0, 127.0]
+          norm_before_pad: False
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
@@ -116,7 +112,7 @@ eval:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: path/to/data_lmdb_release/
+    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
     data_dir: validation/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
diff --git a/configs/rec/crnn/crnn_resnet34_server.yaml b/configs/rec/crnn/crnn_resnet34_server.yaml
index 5932284b9..47104a9b5 100644
--- a/configs/rec/crnn/crnn_resnet34_server.yaml
+++ b/configs/rec/crnn/crnn_resnet34_server.yaml
@@ -3,7 +3,7 @@ system:
   distribute: True
   amp_level: 'O3'
   seed: 42
-  log_interval: 100
+  log_interval: 1000
   val_while_train: True
   drop_overflow_update: False
 
@@ -14,7 +14,7 @@ common:
   infer_mode: &infer_mode False
   use_space_char: &use_space_char True
   lower: &lower False
-  batch_size: &batch_size 64
+  batch_size: &batch_size 32
 
 model:
   type: rec
@@ -77,7 +77,7 @@ train:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /path/to/data_lmdb_release/
+    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
     data_dir: training/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
@@ -116,7 +116,7 @@ eval:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /path/to/data_lmdb_release/
+    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
     data_dir: validation/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
@@ -147,4 +147,4 @@ eval:
       batch_size: 64
       drop_remainder: False
       max_rowsize: 12
-      num_workers: 8
\ No newline at end of file
+      num_workers: 8
diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index 3fd0db2d0..241bd75a0 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -248,19 +248,25 @@ def str2idx(text: str, label_dict: Dict[str, int], max_text_len: int = 23, lower
 
 
 # TODO: reorganize the code for different resize transformation in rec task
-def resize_norm_img(img,
-                    image_shape,
-                    padding=True,
-                    norm_before_pad=False,
-                    mean=[127.0, 127.0, 127.0],
-                    std=[127.0, 127.0, 127.0],
-                    interpolation=cv2.INTER_LINEAR):
+def resize_norm_img(
+    img,
+    image_shape,
+    padding=True,
+    norm_before_pad=False,
+    mean=[127.0, 127.0, 127.0],
+    std=[127.0, 127.0, 127.0],
+    interpolation=cv2.INTER_LINEAR,
+):
     """
     resize image
     Args:
         img: shape (H, W, C)
         image_shape: image shape after resize, in (C, H, W)
-        padding: if Ture, resize while preserving the H/W ratio, then pad the blank.
+        padding (bool): if True, resize while preserving the H/W ratio, then pad the blank area.
+        norm_before_pad (bool): if True, normalize the image array before padding.
+        mean: shape (3), mean value for normalization.
+        std: shape (3), std value for normalization.
+        interpolation: image interpolation mode.
 
     """
     imgH, imgW = image_shape
@@ -268,8 +274,7 @@ def resize_norm_img(img,
     w = img.shape[1]
     c = img.shape[2]
     if not padding:
-        resized_image = cv2.resize(
-            img, (imgW, imgH), interpolation=interpolation)
+        resized_image = cv2.resize(img, (imgW, imgH), interpolation=interpolation)
         resized_w = imgW
     else:
         ratio = w / float(h)
@@ -298,19 +303,25 @@ def resize_norm_img(img,
 
 
 # TODO: check diff from resize_norm_img
-def resize_norm_img_chinese(img,
-                            image_shape,
-                            norm_before_pad=False,
-                            mean=[127.0, 127.0, 127.0],
-                            std=[127.0, 127.0, 127.0],
-                            interpolation=cv2.INTER_LINEAR):
-    '''
+def resize_norm_img_chinese(
+    img,
+    image_shape,
+    norm_before_pad=False,
+    mean=[127.0, 127.0, 127.0],
+    std=[127.0, 127.0, 127.0],
+    interpolation=cv2.INTER_LINEAR,
+):
+    """
     resize image with aspect-ratio keeping and padding
     Args:
         img: shape (H, W, C)
         image_shape: image shape after resize, in (C, H, W)
+        norm_before_pad (bool): if True, normalize the image array before padding.
+        mean: shape (3), mean value for normalization.
+        std: shape (3), std value for normalization.
+        interpolation: image interpolation mode.
 
-    '''
+    """
     imgH, imgW = image_shape
     # todo: change to 0 and modified image shape
     max_wh_ratio = imgW * 1.0 / imgH
@@ -340,21 +351,32 @@ def resize_norm_img_chinese(img,
 
 
 class RecResizeNormImg(object):
-    ''' adopted from paddle
+    """adopted from paddle
     Resize and normalize image, and pad image if needed.
 
     Args:
-        norm_before_pad: If True, perform normalization before padding (by doing so, the padding values will beall zero. Good practice.). Otherwise, per  Default: False
-    '''
-    def __init__(self,
-                 image_shape,
-                 infer_mode=False,
-                 character_dict_path=None,
-                 padding=True,
-                 norm_before_pad=False,
-                 mean=[127.0, 127.0, 127.0],
-                 std=[127.0, 127.0, 127.0],
-                 **kwargs):
+        image_shape: image shape after resize, in (H, W).
+        infer_mode (bool): if True and `character_dict_path` is set, `resize_norm_img_chinese` is used.
+        character_dict_path: path to the character dictionary; see `infer_mode`.
+        padding (bool): if True, resize while preserving the H/W ratio, then pad the blank area.
+        norm_before_pad (bool): if True, perform normalization before padding, so that the
+            padding values will be all zero (good practice). Otherwise, normalization is
+            performed after padding. Default: False.
+        mean: shape (3), mean value for normalization.
+        std: shape (3), std value for normalization.
+    """
+
+    def __init__(
+        self,
+        image_shape,
+        infer_mode=False,
+        character_dict_path=None,
+        padding=True,
+        norm_before_pad=False,
+        mean=[127.0, 127.0, 127.0],
+        std=[127.0, 127.0, 127.0],
+        **kwargs,
+    ):
         self.image_shape = image_shape
         self.infer_mode = infer_mode
         self.character_dict_path = character_dict_path
@@ -364,39 +386,43 @@ def __init__(self,
         self.std = np.array(std, dtype="float32")
 
     def __call__(self, data):
-        img = data['image']
+        img = data["image"]
         if self.infer_mode and self.character_dict_path is not None:
-            norm_img, valid_ratio = resize_norm_img_chinese(img,
-                                                            self.image_shape,
-                                                            self.norm_before_pad,
-                                                            self.mean,
-                                                            self.std
-                                                            )
+            norm_img, valid_ratio = resize_norm_img_chinese(
+                img, self.image_shape, self.norm_before_pad, self.mean, self.std
+            )
         else:
-            norm_img, valid_ratio = resize_norm_img(img,
-                                                    self.image_shape,
-                                                    self.padding,
-                                                    self.norm_before_pad,
-                                                    self.mean,
-                                                    self.std,
-                                                    )
-        data['image'] = norm_img
-        data['valid_ratio'] = valid_ratio
+            norm_img, valid_ratio = resize_norm_img(
+                img,
+                self.image_shape,
+                self.padding,
+                self.norm_before_pad,
+                self.mean,
+                self.std,
+            )
+        data["image"] = norm_img
+        data["valid_ratio"] = valid_ratio
         return data
 
 
 # TODO: remove infer_mode and character_dict_path if they are not necesary
 class RecResizeImg(RecResizeNormImg):
-    '''
+    """
     This is to make compatible with older version code that uses RecResizeImg, which is to be updated.
 
     TODO: replace RecResizeImg followed by NormlaizeImage in yaml files with RecResizeNormImg op.
-    '''
+    """
+
     def __init__(self, image_shape, infer_mode=False, character_dict_path=None, padding=True, **kwargs):
         super.__init__(
-                image_shape, infer_mode, character_dict_path, padding, norm_befoer_pad=False,
-                mean=[0., 0., 0.], std=[1., 1., 1.],
-                )
+            image_shape,
+            infer_mode,
+            character_dict_path,
+            padding,
+            norm_before_pad=False,
+            mean=[0.0, 0.0, 0.0],
+            std=[1.0, 1.0, 1.0],
+        )
 
 
 class SVTRRecResizeImg(object):
@@ -478,7 +504,7 @@ def __call__(self, data):
 
         # TODO: norm before padding
 
-        data['shape_list'] = [h, w, resize_h / h, resize_w / w] # TODO: reformat, currently align to det
+        data["shape_list"] = [h, w, resize_h / h, resize_w / w]  # TODO: reformat, currently align to det
         if self.norm_before_pad:
             resized_img = self.norm(resized_img)
 

From 57da9e2268ef253676d4040f6788500572431222 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Tue, 20 Jun 2023 14:43:48 +0800
Subject: [PATCH 07/12] revert some unnecessary changes

---
 configs/rec/crnn/crnn_resnet34.yaml        | 12 ++++++++----
 configs/rec/crnn/crnn_resnet34_server.yaml |  8 ++++----
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/configs/rec/crnn/crnn_resnet34.yaml b/configs/rec/crnn/crnn_resnet34.yaml
index 0f9cace2d..1325467c1 100644
--- a/configs/rec/crnn/crnn_resnet34.yaml
+++ b/configs/rec/crnn/crnn_resnet34.yaml
@@ -73,14 +73,14 @@ train:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/ # Optional, if set, dataset_root will be used as a prefix for data_dir
+    dataset_root: path/to/data_lmdb_release/ # Optional, if set, dataset_root will be used as a prefix for data_dir
     data_dir: training/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
     shuffle: True
     transform_pipeline:
       - DecodeImage:
-          img_mode: RGB
+          img_mode: BGR
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
@@ -92,7 +92,11 @@ train:
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-          norm_before_pad: False
+      - NormalizeImage:  # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
+          bgr_to_rgb: True
+          is_hwc: True
+          mean : [127.0, 127.0, 127.0]
+          std : [127.0, 127.0, 127.0]
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
@@ -112,7 +116,7 @@ eval:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
+    dataset_root: path/to/data_lmdb_release/
     data_dir: validation/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
diff --git a/configs/rec/crnn/crnn_resnet34_server.yaml b/configs/rec/crnn/crnn_resnet34_server.yaml
index 47104a9b5..ab2c0c6e3 100644
--- a/configs/rec/crnn/crnn_resnet34_server.yaml
+++ b/configs/rec/crnn/crnn_resnet34_server.yaml
@@ -3,7 +3,7 @@ system:
   distribute: True
   amp_level: 'O3'
   seed: 42
-  log_interval: 1000
+  log_interval: 100
   val_while_train: True
   drop_overflow_update: False
 
@@ -14,7 +14,7 @@ common:
   infer_mode: &infer_mode False
   use_space_char: &use_space_char True
   lower: &lower False
-  batch_size: &batch_size 32
+  batch_size: &batch_size 64
 
 model:
   type: rec
@@ -77,7 +77,7 @@ train:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
+    dataset_root: /path/to/data_lmdb_release/
     data_dir: training/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0
@@ -116,7 +116,7 @@ eval:
   dataset_sink_mode: False
   dataset:
     type: LMDBDataset
-    dataset_root: /home/konghuanqi/datasets/data_lmdb_release/
+    dataset_root: /path/to/data_lmdb_release/
     data_dir: validation/
     # label_file: # not required when using LMDBDataset
     sample_ratio: 1.0

From 804af55c102f19a4b5fb8d90079cb9cc2b0e9a2e Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Tue, 20 Jun 2023 14:56:48 +0800
Subject: [PATCH 08/12] replace RecResizeImg followed by NormalizeImage in yaml
 files with RecResizeNormImg op.

---
 configs/rec/crnn/crnn_icdar15.yaml         |  8 ++------
 configs/rec/crnn/crnn_resnet34.yaml        | 10 +++-------
 configs/rec/crnn/crnn_resnet34_ch.yaml     | 14 +++++---------
 configs/rec/crnn/crnn_resnet34_server.yaml | 12 ++++++------
 configs/rec/crnn/crnn_vgg7.yaml            | 10 +++-------
 configs/rec/rare/rare_resnet34.yaml        |  8 ++------
 configs/rec/rare/rare_resnet34_ch.yaml     | 12 ++++--------
 mindocr/data/transforms/rec_transforms.py  |  2 --
 8 files changed, 25 insertions(+), 51 deletions(-)

diff --git a/configs/rec/crnn/crnn_icdar15.yaml b/configs/rec/crnn/crnn_icdar15.yaml
index 18139f435..358a1f31b 100644
--- a/configs/rec/crnn/crnn_icdar15.yaml
+++ b/configs/rec/crnn/crnn_icdar15.yaml
@@ -96,16 +96,12 @@ train:
           character_dict_path: *character_dict_path
           use_space_char: *use_space_char
           lower: True
-      - RecResizeImg: # different from paddle (paddle converts image from HWC to CHW and rescale to [-1, 1] after resize.
+      - RecResizeNormImg:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-      - NormalizeImage:  # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
-          bgr_to_rgb: True
-          is_hwc: True
-          mean : [127.0, 127.0, 127.0]
-          std : [127.0, 127.0, 127.0]
+          norm_before_pad: False
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
diff --git a/configs/rec/crnn/crnn_resnet34.yaml b/configs/rec/crnn/crnn_resnet34.yaml
index 1325467c1..bc37c7ea5 100644
--- a/configs/rec/crnn/crnn_resnet34.yaml
+++ b/configs/rec/crnn/crnn_resnet34.yaml
@@ -80,23 +80,19 @@ train:
     shuffle: True
     transform_pipeline:
       - DecodeImage:
-          img_mode: BGR
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
           character_dict_path: *character_dict_path
           use_space_char: *use_space_char
           lower: True
-      - RecResizeImg: # different from paddle (paddle converts image from HWC to CHW and rescale to [-1, 1] after resize.
+      - RecResizeNormImg:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-      - NormalizeImage:  # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
-          bgr_to_rgb: True
-          is_hwc: True
-          mean : [127.0, 127.0, 127.0]
-          std : [127.0, 127.0, 127.0]
+          norm_before_pad: False
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
diff --git a/configs/rec/crnn/crnn_resnet34_ch.yaml b/configs/rec/crnn/crnn_resnet34_ch.yaml
index bf954cbae..6465a7cd4 100644
--- a/configs/rec/crnn/crnn_resnet34_ch.yaml
+++ b/configs/rec/crnn/crnn_resnet34_ch.yaml
@@ -84,7 +84,7 @@ train:
     max_text_len: *max_text_len
     transform_pipeline:
       - DecodeImage:
-          img_mode: BGR
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
@@ -94,16 +94,12 @@ train:
       - Rotate90IfVertical:
           threshold: 2.0
           direction: counterclockwise
-      - RecResizeImg:
-          image_shape: [32, 320]
+      - RecResizeNormImg:
+          image_shape: [32, 320] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
-          padding: True
-      - NormalizeImage:
-          bgr_to_rgb: True
-          is_hwc: True
-          mean: [127.0, 127.0, 127.0]
-          std: [127.0, 127.0, 127.0]
+          padding: True # aspect ratio will be preserved if true.
+          norm_before_pad: False
       - ToCHWImage:
     output_columns: ["image", "text_seq"]
     net_input_column_index: [0]
diff --git a/configs/rec/crnn/crnn_resnet34_server.yaml b/configs/rec/crnn/crnn_resnet34_server.yaml
index ab2c0c6e3..7518981ea 100644
--- a/configs/rec/crnn/crnn_resnet34_server.yaml
+++ b/configs/rec/crnn/crnn_resnet34_server.yaml
@@ -84,7 +84,7 @@ train:
     shuffle: True
     transform_pipeline:
       - DecodeImage:
-          img_mode: RGB # changed
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
@@ -95,8 +95,8 @@ train:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
-          padding: True # aspect ratio will be preserved if true. changed
-          norm_before_pad: True # changed
+          padding: True # aspect ratio will be preserved if true.
+          norm_before_pad: True
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
@@ -123,7 +123,7 @@ eval:
     shuffle: False
     transform_pipeline:
       - DecodeImage:
-          img_mode: RGB # changed
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
@@ -134,8 +134,8 @@ eval:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
-          padding: True # aspect ratio will be preserved if true. changed
-          norm_before_pad: True # changed
+          padding: True # aspect ratio will be preserved if true.
+          norm_before_pad: True
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_padded', 'text_length']  # TODO return text string padding w/ fixed length, and a scaler to indicate the length
diff --git a/configs/rec/crnn/crnn_vgg7.yaml b/configs/rec/crnn/crnn_vgg7.yaml
index 5647e3421..a5a750463 100644
--- a/configs/rec/crnn/crnn_vgg7.yaml
+++ b/configs/rec/crnn/crnn_vgg7.yaml
@@ -81,23 +81,19 @@ train:
     shuffle: True
     transform_pipeline:
       - DecodeImage:
-          img_mode: BGR
+          img_mode: RGB
           to_float32: False
       - RecCTCLabelEncode:
           max_text_len: *max_text_len
           character_dict_path: *character_dict_path
           use_space_char: *use_space_char
           lower: True
-      - RecResizeImg: # different from paddle (paddle converts image from HWC to CHW and rescale to [-1, 1] after resize.
+      - RecResizeNormImg:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-      - NormalizeImage:  # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
-          bgr_to_rgb: True
-          is_hwc: True
-          mean : [127.0, 127.0, 127.0]
-          std : [127.0, 127.0, 127.0]
+          norm_before_pad: False
       - ToCHWImage:
     #  the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
     output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
diff --git a/configs/rec/rare/rare_resnet34.yaml b/configs/rec/rare/rare_resnet34.yaml
index d910b7c21..85609b5ea 100644
--- a/configs/rec/rare/rare_resnet34.yaml
+++ b/configs/rec/rare/rare_resnet34.yaml
@@ -83,16 +83,12 @@ train:
           character_dict_path: *character_dict_path
           use_space_char: *use_space_char
           lower: True
-      - RecResizeImg: # different from paddle (paddle converts image from HWC to CHW and rescale to [-1, 1] after resize.
+      - RecResizeNormImg:
           image_shape: [32, 100] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
           padding: False # aspect ratio will be preserved if true.
-      - NormalizeImage: # different from paddle (paddle wrongly normalize BGR image with RGB mean/std from ImageNet for det, and simple rescale to [-1, 1] in rec.
-          bgr_to_rgb: True
-          is_hwc: True
-          mean: [127.0, 127.0, 127.0]
-          std: [127.0, 127.0, 127.0]
+          norm_before_pad: False
       - ToCHWImage:
     output_columns: ["image", "text_seq"]
     net_input_column_index: [0, 1] # input indices for network forward func in output_columns
diff --git a/configs/rec/rare/rare_resnet34_ch.yaml b/configs/rec/rare/rare_resnet34_ch.yaml
index 624c70b3d..5bd8fd705 100644
--- a/configs/rec/rare/rare_resnet34_ch.yaml
+++ b/configs/rec/rare/rare_resnet34_ch.yaml
@@ -93,16 +93,12 @@ train:
       - Rotate90IfVertical:
           threshold: 2.0
           direction: counterclockwise
-      - RecResizeImg:
-          image_shape: [32, 320]
+      - RecResizeNormImg:
+          image_shape: [32, 320] # H, W
           infer_mode: *infer_mode
           character_dict_path: *character_dict_path
-          padding: True
-      - NormalizeImage:
-          bgr_to_rgb: True
-          is_hwc: True
-          mean: [127.0, 127.0, 127.0]
-          std: [127.0, 127.0, 127.0]
+          padding: True # aspect ratio will be preserved if true.
+          norm_before_pad: False
       - ToCHWImage:
     output_columns: ["image", "text_seq"]
     net_input_column_index: [0, 1]
diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index 241bd75a0..778b212ba 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -409,8 +409,6 @@ def __call__(self, data):
 class RecResizeImg(RecResizeNormImg):
     """
     This is to make compatible with older version code that uses RecResizeImg, which is to be updated.
-
-    TODO: replace RecResizeImg followed by NormlaizeImage in yaml files with RecResizeNormImg op.
     """
 
     def __init__(self, image_shape, infer_mode=False, character_dict_path=None, padding=True, **kwargs):

From 81cd5ca6897ef918bf36983a949e0d0ac5a821c7 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Wed, 21 Jun 2023 14:43:17 +0800
Subject: [PATCH 09/12] bugfix: stop clamping max_wh_ratio in resize_norm_img_chinese

---
 mindocr/data/transforms/rec_transforms.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
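
Note for reviewers: a small worked example of why the old expression was a no-op. Because `max(a, r) >= a`, `min(max(a, r), a)` always evaluates to `a`, so the target width could never grow for wide inputs. The helpers `target_width_old` and `target_width_new` are hypothetical and reproduce only the lines visible in this hunk, not the rest of `resize_norm_img_chinese`.

```python
def target_width_old(img_h, img_w, h, w):
    """Target width using the expression removed by this patch."""
    max_wh_ratio = img_w * 1.0 / img_h
    ratio = w * 1.0 / h
    # min(max(a, r), a) is always a, so the input ratio is ignored.
    max_wh_ratio = min(max(max_wh_ratio, ratio), max_wh_ratio)
    return int(img_h * max_wh_ratio)


def target_width_new(img_h, img_w, h, w):
    """Target width using the expression added by this patch."""
    max_wh_ratio = img_w * 1.0 / img_h
    ratio = w * 1.0 / h
    max_wh_ratio = max(max_wh_ratio, ratio)
    return int(img_h * max_wh_ratio)


# image_shape (32, 320) with a wide 40x600 input (ratio 15 > 10).
old_wide = target_width_old(32, 320, 40, 600)  # 320: width stays fixed
new_wide = target_width_new(32, 320, 40, 600)  # 480: width grows with the input
# A narrow 40x100 input (ratio 2.5) gives 320 in both versions.
old_narrow = target_width_old(32, 320, 40, 100)
new_narrow = target_width_new(32, 320, 40, 100)
```

One consequence worth checking during review: with the new expression the output width is no longer fixed at `image_shape[1]` for wide images, so callers that batch these outputs may need to account for variable widths.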

diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index 778b212ba..1b0a3621d 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -328,7 +328,7 @@ def resize_norm_img_chinese(
     h, w = img.shape[0], img.shape[1]
     c = img.shape[2]
     ratio = w * 1.0 / h
-    max_wh_ratio = min(max(max_wh_ratio, ratio), max_wh_ratio)
+    max_wh_ratio = max(max_wh_ratio, ratio)
     imgW = int(imgH * max_wh_ratio)
     if math.ceil(imgH * ratio) > imgW:
         resized_w = imgW

From e1aca38a5f87a7f8504ee29de06964db340b37ea Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Mon, 26 Jun 2023 11:33:25 +0800
Subject: [PATCH 10/12] bugfix: call super().__init__() in RecResizeImg

---
 mindocr/data/transforms/rec_transforms.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
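
Note for reviewers: a minimal reproduction of the bug this patch fixes, using hypothetical classes `Base`, `BrokenChild` and `FixedChild` rather than the mindocr ones. `super.__init__` refers to the initializer of the built-in `super` type itself, so calling it with ordinary arguments raises a `TypeError` instead of running the parent class constructor.

```python
class Base:
    def __init__(self, value):
        self.value = value


class BrokenChild(Base):
    def __init__(self, value):
        # Missing parentheses: this targets the `super` type, not Base.
        super.__init__(value)


class FixedChild(Base):
    def __init__(self, value):
        # super() returns a proxy bound to this instance, so Base.__init__ runs.
        super().__init__(value)


try:
    BrokenChild(1)
    broken_error = None
except TypeError as exc:
    broken_error = exc

fixed = FixedChild(1)
```

Here `broken_error` ends up holding the `TypeError`, while `fixed.value` is set by `Base.__init__` as intended.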

diff --git a/mindocr/data/transforms/rec_transforms.py b/mindocr/data/transforms/rec_transforms.py
index 1b0a3621d..f3664e262 100644
--- a/mindocr/data/transforms/rec_transforms.py
+++ b/mindocr/data/transforms/rec_transforms.py
@@ -412,7 +412,7 @@ class RecResizeImg(RecResizeNormImg):
     """
 
     def __init__(self, image_shape, infer_mode=False, character_dict_path=None, padding=True, **kwargs):
-        super.__init__(
+        super().__init__(
             image_shape,
             infer_mode,
             character_dict_path,

From 4b0d8405f5a7cfdcbcab28bf8e34217da68d41e7 Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Wed, 28 Jun 2023 10:05:03 +0800
Subject: [PATCH 11/12] update readme

---
 configs/rec/crnn/README.md    | 20 +++++++++++---------
 configs/rec/crnn/README_CN.md | 20 +++++++++++---------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/configs/rec/crnn/README.md b/configs/rec/crnn/README.md
index 950ade180..52c3a0298 100644
--- a/configs/rec/crnn/README.md
+++ b/configs/rec/crnn/README.md
@@ -39,19 +39,21 @@ According to our experiments, the training (following the steps in [Model Traini
 
 <div align="center">
 
-| **Model** | **Context**  | **Backbone** | **Train Dataset** | **Model Params** | **Batch size per card** | **Graph train 8P (s/epoch)** | **Graph train 8P (ms/step)** | **Graph train 8P (FPS)** | **Avg Eval Accuracy** | **Recipe** | **Download** |
-| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-| CRNN      | D910x8-MS1.8-G | VGG7 | MJ+ST | 8.72 M | 16 |  2488.82 | 22.06 | 5802.71 | 82.03%  | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)  | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir)   |
-| CRNN      | D910x8-MS1.8-G | ResNet34_vd | MJ+ST | 24.48 M | 64 |  2157.18 | 76.48 | 6694.84 | 84.45% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
+| **Model** | **Context**  | **Backbone** | **Train Dataset** | **Num Classes** | **Model Params** | **Batch size per card** | **Graph train (s/epoch)** | **Graph train (ms/step)** | **Graph train (FPS)** | **Avg Eval Accuracy** | **Recipe** | **Download** |
+| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
+| CRNN      | D910x8-MS1.8-G | VGG7 | MJ+ST | 37 | 8.72 M | 16 |  2488.82 | 22.06 | 5802.71 | 82.03%  | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)  | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir)   |
+| CRNN      | D910x8-MS1.8-G | ResNet34_vd | MJ+ST | 37 | 24.48 M | 64 |  2157.18 | 76.48 | 6694.84 | 84.45% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
+| CRNN      | D910x4-MS2.0-G | ResNet34_vd | MJ+ST | 96 | 24.51 M | 64 | 4292.18 | 76.08 | 3364.72 | 83.50% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_server.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_server-e0d66c0c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_server-e0d66c0c-55748731.mindir) |
 </div>
 
 - Detailed accuracy results for each benchmark dataset (IC03, IC13, IC15, IIIT, SVT, SVTP, CUTE):
   <div align="center">
 
-  | **Model** | **Backbone** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **Average** |
-  | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
-  | CRNN | VGG7 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
-  | CRNN | ResNet34_vd | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
+  | **Model** | **Backbone** | **Num Classes** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **Average** |
+  | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
+  | CRNN | VGG7 | 37 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
+  | CRNN | ResNet34_vd | 37 | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
+  | CRNN | ResNet34_vd | 96 | 94.65% | 94.70% | 94.28% | 93.20% | 72.50% | 63.94% | 87.63% | 86.09% | 74.42% | 73.61% | 83.50% |
   </div>
 
 ### Inference Perf.
@@ -70,7 +72,7 @@ The inference performance is tested on Mindspore Lite, please take a look at [Mi
 **Notes:**
 - Context: Training context denoted as {device}x{pieces}-{MS mode}, where mindspore mode can be G-graph mode or F-pynative mode with ms function. For example, D910x8-MS1.8-G is for training on 8 pieces of Ascend 910 NPU using graph mode based on Minspore version 1.8.
 - To reproduce the result on other contexts, please ensure the global batch size is the same.
-- The characters supported by model are lowercase English characters from a to z and numbers from 0 to 9. More explanation on dictionary, please refer to [4. Character Dictionary](#4-character-dictionary).
+- The number of classes is determined by the dictionary used for training. The default dictionary contains the lowercase English letters a to z and the digits 0 to 9. For more details on the dictionary, please refer to [4. Character Dictionary](#4-character-dictionary).
 - The models are trained from scratch without any pre-training. For more dataset details of training and evaluation, please refer to [Dataset Download & Dataset Usage](#312-dataset-download) section.
 - The input Shapes of MindIR of CRNN_VGG7 and CRNN_ResNet34_vd are both (1, 3, 32, 100).
 
diff --git a/configs/rec/crnn/README_CN.md b/configs/rec/crnn/README_CN.md
index 77f508807..f6b407f33 100644
--- a/configs/rec/crnn/README_CN.md
+++ b/configs/rec/crnn/README_CN.md
@@ -39,20 +39,22 @@ Table Format:
 
 <div align="center">
 
-| **模型** | **环境配置** | **骨干网络** | **训练集** | **参数量** | **单卡批量** | **图模式8卡训练 (s/epoch)** | **图模式8卡训练 (ms/step)** | **图模式8卡训练 (FPS)** | **平均评估精度** | **配置文件** | **模型权重下载** |
-| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-| CRNN      | D910x8-MS1.8-G | VGG7 | MJ+ST | 8.72 M | 16 |  2488.82 | 22.06 | 5802.71 | 82.03%  | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir) |
-| CRNN      | D910x8-MS1.8-G | ResNet34_vd | MJ+ST | 24.48 M | 64 |  2157.18 | 76.48 | 6694.84 | 84.45% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
+| **模型** | **环境配置** | **骨干网络** | **训练集** | **类别数** | **参数量** | **单卡批量** | **图模式8卡训练 (s/epoch)** | **图模式8卡训练 (ms/step)** | **图模式8卡训练 (FPS)** | **平均评估精度** | **配置文件** | **模型权重下载** |
+| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :------: |
+| CRNN      | D910x8-MS1.8-G | VGG7 | MJ+ST | 37 |8.72 M | 16 |  2488.82 | 22.06 | 5802.71 | 82.03%  | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir) |
+| CRNN      | D910x8-MS1.8-G | ResNet34_vd | MJ+ST | 37 | 24.48 M | 64 |  2157.18 | 76.48 | 6694.84 | 84.45% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |
+| CRNN      | D910x4-MS2.0-G | ResNet34_vd | MJ+ST | 96 | 24.51 M | 64 | 4292.18 | 76.08 | 3364.72 | 83.50% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34_server.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_server-e0d66c0c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34_server-e0d66c0c-55748731.mindir) |
 </div>
 
 - 在各个基准数据集（IC03，IC13，IC15，IIIT，SVT，SVTP，CUTE）上的准确率：
 
   <div align="center">
 
-  | **模型** | **骨干网络** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **平均准确率** |
-  | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
-  | CRNN | VGG7 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
-  | CRNN | ResNet34_vd | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
+  | **模型** | **骨干网络** | **类别数** | **IC03_860** | **IC03_867** | **IC13_857** | **IC13_1015** | **IC15_1811** | **IC15_2077** | **IIIT5k_3000** | **SVT** | **SVTP** | **CUTE80** | **平均准确率** |
+  | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
+  | CRNN | VGG7 | 37 | 94.53% | 94.00% | 92.18% | 90.74% | 71.95% | 66.06% | 84.10% | 83.93% | 73.33% | 69.44% | 82.03% |
+  | CRNN | ResNet34_vd | 37 | 94.42% | 94.23% | 93.35% | 92.02% | 75.92% | 70.15% | 87.73% | 86.40% | 76.28% | 73.96% | 84.45% |
+  | CRNN | ResNet34_vd | 96 | 94.65% | 94.70% | 94.28% | 93.20% | 72.50% | 63.94% | 87.63% | 86.09% | 74.42% | 73.61% | 83.50% |
   </div>
 
 
@@ -72,7 +74,7 @@ Table Format:
 **注意:**
 - 环境配置：训练的环境配置表示为 {处理器}x{处理器数量}-{MS模式}，其中 Mindspore 模式可以是 G-graph 模式或 F-pynative 模式。例如，D910x8-MS1.8-G 用于使用图形模式在8张昇腾910 NPU上依赖Mindspore1.8版本进行训练。
 - 如需在其他环境配置重现训练结果，请确保全局批量大小与原配置文件保持一致。
-- 模型所能识别的字符都是默认的设置，即所有英文小写字母a至z及数字0至9，详细请看[4. 字符词典](#4-字符词典)
+- 模型的类别数由用于训练的字典决定。默认字典包含小写英文字符从a到z和数字从0到9，详细请看[4. 字符词典](#4-字符词典)
 - 模型都是从头开始训练的，无需任何预训练。关于训练和测试数据集的详细介绍，请参考[数据集下载及使用](#312-数据集下载)章节。
 - CRNN_VGG7和CRNN_ResNet34_vd的MindIR导出时的输入Shape均为(1, 3, 32, 100)。
 

From 4f87096d00055f0d0b358800f3d044fa5b07eb7a Mon Sep 17 00:00:00 2001
From: katekong <hqkate94@gmail.com>
Date: Wed, 28 Jun 2023 10:08:56 +0800
Subject: [PATCH 12/12] minor fix

---
 configs/rec/crnn/README_CN.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configs/rec/crnn/README_CN.md b/configs/rec/crnn/README_CN.md
index f6b407f33..be16c29f5 100644
--- a/configs/rec/crnn/README_CN.md
+++ b/configs/rec/crnn/README_CN.md
@@ -39,7 +39,7 @@ Table Format:
 
 <div align="center">
 
-| **模型** | **环境配置** | **骨干网络** | **训练集** | **类别数** | **参数量** | **单卡批量** | **图模式8卡训练 (s/epoch)** | **图模式8卡训练 (ms/step)** | **图模式8卡训练 (FPS)** | **平均评估精度** | **配置文件** | **模型权重下载** |
+| **模型** | **环境配置** | **骨干网络** | **训练集** | **类别数** | **参数量** | **单卡批量** | **图模式训练 (s/epoch)** | **图模式训练 (ms/step)** | **图模式训练 (FPS)** | **平均评估精度** | **配置文件** | **模型权重下载** |
 | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :------: |
 | CRNN      | D910x8-MS1.8-G | VGG7 | MJ+ST | 37 |8.72 M | 16 |  2488.82 | 22.06 | 5802.71 | 82.03%  | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml)     | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c-573dbd61.mindir) |
 | CRNN      | D910x8-MS1.8-G | ResNet34_vd | MJ+ST | 37 | 24.48 M | 64 |  2157.18 | 76.48 | 6694.84 | 84.45% | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07-eb10a0c9.mindir) |