Commit 38d775d

Fix unittest -- Scholarly API now has openAccess Licensing info
1 parent 430f6d8 commit 38d775d

1 file changed: 27 additions & 19 deletions
@@ -1,20 +1,28 @@
----
-title: 'StereoSet: Measuring stereotypical bias in pretrained language models'
-venue: Annual Meeting of the Association for Computational Linguistics
-names: Moin Nadeem, Anna Bethke, Siva Reddy
-tags:
-- Annual Meeting of the Association for Computational Linguistics
-link: https://arxiv.org/abs/2004.09456
-categories: Publications
-
----
-
-*{{ page.names }}*
-
-**{{ page.venue }}**
-
-{% include display-publication-links.html pub=page %}
-
-## Abstract
-
+---
+title: 'StereoSet: Measuring stereotypical bias in pretrained language models'
+venue: Annual Meeting of the Association for Computational Linguistics
+openAccessPdf:
+  url: https://aclanthology.org/2021.acl-long.416.pdf
+  status: HYBRID
+  license: CCBY
+  disclaimer: 'Notice: This abstract is extracted from the open access paper or abstract
+    available at https://arxiv.org/abs/2004.09456, which is subject to the license
+    by the author or copyright owner provided with this content. Please go to the
+    source to verify the license and copyright information for your use.'
+names: Moin Nadeem, Anna Bethke, Siva Reddy
+tags:
+- Annual Meeting of the Association for Computational Linguistics
+link: https://arxiv.org/abs/2004.09456
+categories: Publications
+
+---
+
+*{{ page.names }}*
+
+**{{ page.venue }}**
+
+{% include display-publication-links.html pub=page %}
+
+## Abstract
+
 A stereotype is an over-generalized belief about a particular group of people, e.g., Asians are good at math or African Americans are athletic. Such beliefs (biases) are known to hurt target groups. Since pretrained language models are trained on large real-world data, they are known to capture stereotypical biases. It is important to quantify to what extent these biases are present in them. Although this is a rapidly growing area of research, existing literature lacks in two important aspects: 1) they mainly evaluate bias of pretrained language models on a small set of artificial sentences, even though these models are trained on natural data 2) current evaluations focus on measuring bias without considering the language modeling ability of a model, which could lead to misleading trust on a model even if it is a poor language model. We address both these problems. We present StereoSet, a large-scale natural English dataset to measure stereotypical biases in four domains: gender, profession, race, and religion. We contrast both stereotypical bias and language modeling ability of popular models like BERT, GPT-2, RoBERTa, and XLnet. We show that these models exhibit strong stereotypical biases. Our data and code are available at https://stereoset.mit.edu.
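
The new `openAccessPdf` block (url, status, license, plus a licensing disclaimer) is the data the commit message says the Scholarly API now returns. The field names and values match the `openAccessPdf` object exposed by the Semantic Scholar Graph API, so the sketch below shows how such metadata could be fetched for this paper; the endpoint, the arXiv-style paper id, and the use of `requests` are assumptions for illustration, not code from this repository.

```python
# Minimal sketch (assumption: the front matter's openAccessPdf fields come from
# the Semantic Scholar Graph API; this is not the repository's own fetch code).
import requests

# Identify the paper by its arXiv id, taken from the front matter's `link` field.
paper_id = "arXiv:2004.09456"
url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}"

resp = requests.get(url, params={"fields": "title,openAccessPdf"}, timeout=30)
resp.raise_for_status()

# The openAccessPdf object carries the url/status/license mirrored into the YAML.
oa = resp.json().get("openAccessPdf") or {}
print(oa.get("url"))     # e.g. https://aclanthology.org/2021.acl-long.416.pdf
print(oa.get("status"))  # e.g. HYBRID
print(oa.get("license")) # e.g. CCBY
```

In the page template, these values would then be reachable as `{{ page.openAccessPdf.url }}` and so on, assuming the nesting shown in the diff above.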
