Skip to content

Commit c7781d7

Browse files
committed
Update README.md
Update README.md Update README.md
1 parent 4182170 commit c7781d7

File tree

4 files changed

+57
-54
lines changed

4 files changed

+57
-54
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
__pycache__/
44
classification/convertor/
55
segmentation/convertor/
6+
detection/convertor/
67
checkpoint_dir/
78
demo/
89
pretrained/

README.md

Lines changed: 21 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -125,12 +125,12 @@ Some other projects related to InternImage include the pretraining algorithm "M3
125125
<br>
126126
<div>
127127

128-
| name | pretrain | resolution | #param | download |
129-
| :------------: | :------------------: | :--------: | :----: | :---------------------------------------------------------------------------------------------------: |
130-
| InternImage-L | IN-22K | 384x384 | 223M | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22k_192to384.pth) |
131-
| InternImage-XL | IN-22K | 384x384 | 335M | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22k_192to384.pth) |
132-
| InternImage-H | Joint 427M -> IN-22K | 384x384 | 1.08B | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_jointto22k_384.pth) |
133-
| InternImage-G | Joint 427M -> IN-22K | 384x384 | 3B | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_pretrainto22k_384.pth) |
128+
| name | pretrain | resolution | #param | download |
129+
| :------------: | :------------------: | :--------: | :----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
130+
| InternImage-L | IN-22K | 384x384 | 223M | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22k_192to384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_l_22k_384) |
131+
| InternImage-XL | IN-22K | 384x384 | 335M | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22k_192to384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_xl_22k_384) |
132+
| InternImage-H | Joint 427M -> IN-22K | 384x384 | 1.08B | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_jointto22k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_h_jointto22k_384) |
133+
| InternImage-G | Joint 427M -> IN-22K | 384x384 | 3B | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_pretrainto22k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_g_jointto22k_384) |
134134

135135
</div>
136136

@@ -141,15 +141,15 @@ Some other projects related to InternImage include the pretraining algorithm "M3
141141
<br>
142142
<div>
143143

144-
| name | pretrain | resolution | acc@1 | #param | FLOPs | download |
145-
| :------------: | :------------------: | :--------: | :---: | :----: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------: |
146-
| InternImage-T | IN-1K | 224x224 | 83.5 | 30M | 5G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_t_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_t_1k_224.yaml) |
147-
| InternImage-S | IN-1K | 224x224 | 84.2 | 50M | 8G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_s_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_s_1k_224.yaml) |
148-
| InternImage-B | IN-1K | 224x224 | 84.9 | 97M | 16G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_b_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_b_1k_224.yaml) |
149-
| InternImage-L | IN-22K | 384x384 | 87.7 | 223M | 108G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22kto1k_384.pth) \| [cfg](configs/without_lr_decay/internimage_l_22kto1k_384.yaml) |
150-
| InternImage-XL | IN-22K | 384x384 | 88.0 | 335M | 163G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22kto1k_384.pth) \| [cfg](configs/without_lr_decay/internimage_xl_22kto1k_384.yaml) |
151-
| InternImage-H | Joint 427M -> IN-22K | 640x640 | 89.6 | 1.08B | 1478G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_22kto1k_640.pth) \| [cfg](configs/without_lr_decay/internimage_h_22kto1k_640.yaml) |
152-
| InternImage-G | Joint 427M -> IN-22K | 512x512 | 90.1 | 3B | 2700G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_22kto1k_512.pth) \| [cfg](configs/without_lr_decay/internimage_g_22kto1k_512.yaml) |
144+
| name | pretrain | resolution | acc@1 | #param | FLOPs | download |
145+
| :------------: | :------------------: | :--------: | :---: | :----: | :---: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
146+
| InternImage-T | IN-1K | 224x224 | 83.5 | 30M | 5G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_t_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_t_1k_224) \| [cfg](configs/without_lr_decay/internimage_t_1k_224.yaml) |
147+
| InternImage-S | IN-1K | 224x224 | 84.2 | 50M | 8G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_s_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_s_1k_224) \| [cfg](configs/without_lr_decay/internimage_s_1k_224.yaml) |
148+
| InternImage-B | IN-1K | 224x224 | 84.9 | 97M | 16G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_b_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_b_1k_224) \| [cfg](configs/without_lr_decay/internimage_b_1k_224.yaml) |
149+
| InternImage-L | IN-22K | 384x384 | 87.7 | 223M | 108G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22kto1k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_l_22kto1k_384) \| [cfg](configs/without_lr_decay/internimage_l_22kto1k_384.yaml) |
150+
| InternImage-XL | IN-22K | 384x384 | 88.0 | 335M | 163G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22kto1k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_xl_22kto1k_384) \| [cfg](configs/without_lr_decay/internimage_xl_22kto1k_384.yaml) |
151+
| InternImage-H | Joint 427M -> IN-22K | 640x640 | 89.6 | 1.08B | 1478G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_22kto1k_640.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_h_22kto1k_640) \| [cfg](configs/without_lr_decay/internimage_h_22kto1k_640.yaml) |
152+
| InternImage-G | Joint 427M -> IN-22K | 512x512 | 90.1 | 3B | 2700G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_22kto1k_512.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_g_22kto1k_512) \| [cfg](configs/without_lr_decay/internimage_g_22kto1k_512.yaml) |
153153

154154
</div>
155155

@@ -202,7 +202,7 @@ Some other projects related to InternImage include the pretraining algorithm "M3
202202
</details>
203203

204204
<details>
205-
<summary> Main Results of FPS </summary>
205+
<summary> Main Results of FPS </summary>
206206
<br>
207207
<div>
208208

@@ -270,10 +270,11 @@ For more details on building custom ops, please referring to [this document](htt
270270
If this work is helpful for your research, please consider citing the following BibTeX entry.
271271

272272
```bibtex
273-
@article{wang2022internimage,
274-
title={InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions},
273+
@inproceedings{wang2023internimage,
274+
title={Internimage: Exploring large-scale vision foundation models with deformable convolutions},
275275
author={Wang, Wenhai and Dai, Jifeng and Chen, Zhe and Huang, Zhenhang and Li, Zhiqi and Zhu, Xizhou and Hu, Xiaowei and Lu, Tong and Lu, Lewei and Li, Hongsheng and others},
276-
journal={arXiv preprint arXiv:2211.05778},
277-
year={2022}
276+
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
277+
pages={14408--14419},
278+
year={2023}
278279
}
279280
```

README_CN.md

Lines changed: 20 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -124,12 +124,12 @@ InternImage 是一个由上海人工智能实验室、清华大学等机构的
124124
<br>
125125
<div>
126126

127-
| name | pretrain | resolution | #param | download |
128-
| :------------: | :----------: | :--------: | :----: | :---------------------------------------------------------------------------------------------------: |
129-
| InternImage-L | ImageNet-22K | 384x384 | 223M | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22k_192to384.pth) |
130-
| InternImage-XL | ImageNet-22K | 384x384 | 335M | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22k_192to384.pth) |
131-
| InternImage-H | Joint 427M | 384x384 | 1.08B | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_jointto22k_384.pth) |
132-
| InternImage-G | Joint 427M | 384x384 | 3B | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_pretrainto22k_384.pth) |
127+
| name | pretrain | resolution | #param | download |
128+
| :------------: | :----------: | :--------: | :----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
129+
| InternImage-L | ImageNet-22K | 384x384 | 223M | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22k_192to384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_l_22k_384) |
130+
| InternImage-XL | ImageNet-22K | 384x384 | 335M | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22k_192to384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_xl_22k_384) |
131+
| InternImage-H | Joint 427M | 384x384 | 1.08B | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_jointto22k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_h_jointto22k_384) |
132+
| InternImage-G | Joint 427M | 384x384 | 3B | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_pretrainto22k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_g_jointto22k_384) |
133133

134134
</div>
135135

@@ -140,15 +140,15 @@ InternImage 是一个由上海人工智能实验室、清华大学等机构的
140140
<br>
141141
<div>
142142

143-
| name | pretrain | resolution | acc@1 | #param | FLOPs | download |
144-
| :------------: | :----------: | :--------: | :---: | :----: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------: |
145-
| InternImage-T | ImageNet-1K | 224x224 | 83.5 | 30M | 5G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_t_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_t_1k_224.yaml) |
146-
| InternImage-S | ImageNet-1K | 224x224 | 84.2 | 50M | 8G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_s_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_s_1k_224.yaml) |
147-
| InternImage-B | ImageNet-1K | 224x224 | 84.9 | 97M | 16G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_b_1k_224.pth) \| [cfg](configs/without_lr_decay/internimage_b_1k_224.yaml) |
148-
| InternImage-L | ImageNet-22K | 384x384 | 87.7 | 223M | 108G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22kto1k_384.pth) \| [cfg](configs/without_lr_decay/internimage_l_22kto1k_384.yaml) |
149-
| InternImage-XL | ImageNet-22K | 384x384 | 88.0 | 335M | 163G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22kto1k_384.pth) \| [cfg](configs/without_lr_decay/internimage_xl_22kto1k_384.yaml) |
150-
| InternImage-H | Joint 427M | 640x640 | 89.6 | 1.08B | 1478G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_22kto1k_640.pth) \| [cfg](configs/without_lr_decay/internimage_h_22kto1k_640.yaml) |
151-
| InternImage-G | Joint 427M | 512x512 | 90.1 | 3B | 2700G | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_22kto1k_512.pth) \| [cfg](configs/without_lr_decay/internimage_g_22kto1k_512.yaml) |
143+
| name | pretrain | resolution | acc@1 | #param | FLOPs | download |
144+
| :------------: | :----------: | :--------: | :---: | :----: | :---: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
145+
| InternImage-T | ImageNet-1K | 224x224 | 83.5 | 30M | 5G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_t_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_t_1k_224) \| [cfg](configs/without_lr_decay/internimage_t_1k_224.yaml) |
146+
| InternImage-S | ImageNet-1K | 224x224 | 84.2 | 50M | 8G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_s_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_s_1k_224) \| [cfg](configs/without_lr_decay/internimage_s_1k_224.yaml) |
147+
| InternImage-B | ImageNet-1K | 224x224 | 84.9 | 97M | 16G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_b_1k_224.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_b_1k_224) \| [cfg](configs/without_lr_decay/internimage_b_1k_224.yaml) |
148+
| InternImage-L | ImageNet-22K | 384x384 | 87.7 | 223M | 108G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_l_22kto1k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_l_22kto1k_384) \| [cfg](configs/without_lr_decay/internimage_l_22kto1k_384.yaml) |
149+
| InternImage-XL | ImageNet-22K | 384x384 | 88.0 | 335M | 163G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_xl_22kto1k_384.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_xl_22kto1k_384) \| [cfg](configs/without_lr_decay/internimage_xl_22kto1k_384.yaml) |
150+
| InternImage-H | Joint 427M | 640x640 | 89.6 | 1.08B | 1478G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_h_22kto1k_640.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_h_22kto1k_640) \| [cfg](configs/without_lr_decay/internimage_h_22kto1k_640.yaml) |
151+
| InternImage-G | Joint 427M | 512x512 | 90.1 | 3B | 2700G | [pth](https://huggingface.co/OpenGVLab/InternImage/resolve/main/internimage_g_22kto1k_512.pth) \| [hf](https://huggingface.co/OpenGVLab/internimage_g_22kto1k_512) \| [cfg](configs/without_lr_decay/internimage_g_22kto1k_512.yaml) |
152152

153153
</div>
154154

@@ -269,10 +269,11 @@ pip install -e .
269269
若这个工作对您的研究有帮助,请参考如下 BibTeX 对我们的工作进行引用。
270270

271271
```bibtex
272-
@article{wang2022internimage,
273-
title={InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions},
272+
@inproceedings{wang2023internimage,
273+
title={Internimage: Exploring large-scale vision foundation models with deformable convolutions},
274274
author={Wang, Wenhai and Dai, Jifeng and Chen, Zhe and Huang, Zhenhang and Li, Zhiqi and Zhu, Xizhou and Hu, Xiaowei and Lu, Tong and Lu, Lewei and Li, Hongsheng and others},
275-
journal={arXiv preprint arXiv:2211.05778},
276-
year={2022}
275+
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
276+
pages={14408--14419},
277+
year={2023}
277278
}
278279
```

0 commit comments

Comments
 (0)