Commit e733433

Update README.md
1 parent 88cf801 commit e733433

1 file changed: +46 -0 lines changed

README.md

Lines changed: 46 additions & 0 deletions
@@ -8,6 +8,52 @@ This is the official PyTorch implementation for the following EMNLP 2021 paper f
![CodeT5 demo](codet5.gif)

## Updates
**Oct 25, 2021**

We release a CodeT5-base fine-tuned checkpoint ([Salesforce/codet5-base-multi-sum](https://huggingface.co/Salesforce/codet5-base-multi-sum)) for multilingual code summarization. Below is how to use this model:

```python
from transformers import RobertaTokenizer, T5ForConditionalGeneration

if __name__ == '__main__':
    tokenizer = RobertaTokenizer.from_pretrained('Salesforce/codet5-base')
    model = T5ForConditionalGeneration.from_pretrained('Salesforce/codet5-base-multi-sum')

    text = """def svg_to_image(string, size=None):
    if isinstance(string, unicode):
        string = string.encode('utf-8')
    renderer = QtSvg.QSvgRenderer(QtCore.QByteArray(string))
    if not renderer.isValid():
        raise ValueError('Invalid SVG data.')
    if size is None:
        size = renderer.defaultSize()
    image = QtGui.QImage(size, QtGui.QImage.Format_ARGB32)
    painter = QtGui.QPainter(image)
    renderer.render(painter)
    return image"""

    input_ids = tokenizer(text, return_tensors="pt").input_ids

    generated_ids = model.generate(input_ids, max_length=20)
    print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
    # this prints: "Convert a SVG string to a QImage."
```
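
If you want to summarize more than one snippet, the same three steps (tokenize, generate, decode) can be wrapped in a small helper. The sketch below is only an illustration: `summarize` and `load_summarizer` are hypothetical names that are not part of this repository, and the import is deferred so the helper can be tried with any objects that expose the same `__call__`, `generate`, and `decode` methods.

```python
def summarize(code, tokenizer, model, max_length=20):
    """Return a short natural-language summary for one code snippet."""
    # Tokenize the source code into a batch of size one.
    input_ids = tokenizer(code, return_tensors="pt").input_ids
    # Generate at most `max_length` summary tokens.
    generated_ids = model.generate(input_ids, max_length=max_length)
    # Decode the first (and only) sequence, dropping special tokens.
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)


def load_summarizer():
    """Load the tokenizer and checkpoint named in the example above."""
    # Imported lazily so that `summarize` itself has no hard dependency.
    from transformers import RobertaTokenizer, T5ForConditionalGeneration

    tokenizer = RobertaTokenizer.from_pretrained('Salesforce/codet5-base')
    model = T5ForConditionalGeneration.from_pretrained('Salesforce/codet5-base-multi-sum')
    return tokenizer, model
```

A typical call would be `tokenizer, model = load_summarizer()` followed by `summarize(text, tokenizer, model)` for each snippet, so the checkpoint is loaded only once.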

It outperforms previous methods on code summarization across all six languages of the [CodeXGLUE benchmark](https://github.com/microsoft/CodeXGLUE/tree/main/Code-Text/code-to-text), raising the overall smoothed BLEU-4 score from 18.32 (PLBART) to 19.69:
| Model | Ruby | JavaScript | Go | Python | Java | PHP | Overall |
| ----------- | :-------: | :--------: | :-------: | :-------: | :-------: | :-------: | :-------: |
| Seq2Seq | 9.64 | 10.21 | 13.98 | 15.93 | 15.09 | 21.08 | 14.32 |
| Transformer | 11.18 | 11.59 | 16.38 | 15.81 | 16.26 | 22.12 | 15.56 |
| [RoBERTa](https://arxiv.org/pdf/1907.11692.pdf) | 11.17 | 11.90 | 17.72 | 18.14 | 16.47 | 24.02 | 16.57 |
| [CodeBERT](https://arxiv.org/pdf/2002.08155.pdf) | 12.16 | 14.90 | 18.07 | 19.06 | 17.65 | 25.16 | 17.83 |
| [PLBART](https://aclanthology.org/2021.naacl-main.211.pdf) | 14.11 | 15.56 | 18.91 | 19.30 | 18.45 | 23.58 | 18.32 |
| [CodeT5-base-multi-sum](https://arxiv.org/abs/2109.00859) | **15.24** | **16.18** | **19.95** | **20.42** | **20.26** | **26.10** | **19.69** |
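
The Overall column is consistent with an unweighted mean of the six per-language scores. The short check below, with the rows copied from the table, illustrates this; the names `SCORES` and `overall` are ours and are not part of the CodeT5 code.

```python
# Per-language scores copied from the table, in the order
# Ruby, JavaScript, Go, Python, Java, PHP.
SCORES = {
    "Seq2Seq": [9.64, 10.21, 13.98, 15.93, 15.09, 21.08],
    "Transformer": [11.18, 11.59, 16.38, 15.81, 16.26, 22.12],
    "RoBERTa": [11.17, 11.90, 17.72, 18.14, 16.47, 24.02],
    "CodeBERT": [12.16, 14.90, 18.07, 19.06, 17.65, 25.16],
    "PLBART": [14.11, 15.56, 18.91, 19.30, 18.45, 23.58],
    "CodeT5-base-multi-sum": [15.24, 16.18, 19.95, 20.42, 20.26, 26.10],
}


def overall(scores):
    """Unweighted mean of the per-language scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(f"{name}: {overall(scores):.2f}")
```
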

**Oct 18, 2021**

We add a [model card](https://github.com/salesforce/CodeT5/blob/main/CodeT5_model_card.pdf) for CodeT5! Please reach out if you have any questions about it.

**Sep 24, 2021**

CodeT5 is now available on [Hugging Face](https://huggingface.co/)!
