@@ -16,7 +16,7 @@ Key features:
1616 to end users).
1717* (optionally) Checks deliverability: Does the domain name resolve? And you can override the default DNS resolver.
1818* Supports internationalized domain names and (optionally)
19- internationalized local parts.
19+ internationalized local parts, but blocks unsafe characters.
2020* Normalizes email addresses (super important for internationalized
2121 addresses! see below).
2222
@@ -172,12 +172,28 @@ The second sort of internationalization is internationalization in the
172172* local* part of the address (before the @-sign). In non-internationalized
173173email addresses, only English letters, numbers, and some punctuation
174174(` ._!#$%&'^``*+-=~/?{|} ` ) are allowed. In internationalized email address
175- local parts, all Unicode characters are allowed by this library, although
176- it's possible that not all characters will be allowed by all mail systems.
177-
178- To deliver email to addresses with Unicode, non-English characters, your mail
175+ local parts, a wider range of Unicode characters is allowed.
176+
177+ A surprisingly large number of Unicode characters are not safe to display,
178+ especially when the email address is concatenated with other text, so this
179+ library tries to protect you by not permitting reserved, non-, private use,
180+ formatting (which can be used to alter the display order of characters),
181+ whitespace, and control characters, as well as combining characters
182+ as the first character (so that they cannot combine with something outside
183+ of the email address string). See https://qntm.org/safe and https://trojansource.codes/
184+ for relevant prior work. (Other than whitespace, these are checks that
185+ you should be applying to nearly all user inputs in a security-sensitive
186+ context.)
187+
188+ These character checks are performed after Unicode normalization (see below),
189+ so you are only fully protected if you replace all user-provided email addresses
190+ with the normalized email address string returned by this library. This does not
191+ guard against the well-known problem that many Unicode characters look alike
192+ (or are identical), which can be used to fool humans reading displayed text.
193+
194+ Email addresses with these non-ASCII characters require that your mail
179195submission library and the mail servers along the route to the destination,
180- including your own outbound mail server, must all support the
196+ including your own outbound mail server, all support the
181197[ SMTPUTF8 (RFC 6531)] ( https://tools.ietf.org/html/rfc6531 ) extension.
182198Support for SMTPUTF8 varies. See the ` allow_smtputf8 ` parameter.
183199
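The character checks described in the added README text can be illustrated with a small standard-library sketch. This is not the library's implementation: the function name `find_unsafe_characters`, the `UNSAFE_CATEGORIES` set, and the choice of NFC normalization are assumptions made for illustration, and the mapping from the README's wording to Unicode general categories is a best guess.

```python
from __future__ import annotations

import unicodedata

# Hypothetical mapping of the README's wording to Unicode general categories:
# Cn = unassigned/reserved code points and noncharacters, Co = private use,
# Cf = formatting (includes bidirectional overrides), Cc = control,
# Zs/Zl/Zp = space, line, and paragraph separators.
UNSAFE_CATEGORIES = {"Cn", "Co", "Cf", "Cc", "Zs", "Zl", "Zp"}


def find_unsafe_characters(local_part: str) -> list[str]:
    """Return a description of each unsafe character found (hypothetical helper).

    The checks run after normalization, mirroring the order the README
    describes. NFC is an assumption; the README only says "Unicode
    normalization".
    """
    normalized = unicodedata.normalize("NFC", local_part)
    problems = []
    for index, char in enumerate(normalized):
        category = unicodedata.category(char)
        if category in UNSAFE_CATEGORIES:
            problems.append(f"U+{ord(char):04X} ({category}) at index {index}")
        elif index == 0 and category.startswith("M"):
            # A leading combining mark could attach to whatever text
            # precedes the address when it is concatenated for display.
            problems.append(f"U+{ord(char):04X} (leading combining mark)")
    return problems


if __name__ == "__main__":
    for candidate in ["jos\u00e9", "a\u202eb", "\u0301abc", "a b"]:
        print(repr(candidate), find_unsafe_characters(candidate))
```

An empty list means none of the sketched checks fired. As the README text notes, checks like these do not detect look-alike characters, and they only protect you if the normalized string is what you store and display.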