|
| 1 | +# UNICODE2ASCII(3) |
| 2 | + |
| 3 | +## NAME |
| 4 | +unicode2ascii - Unicode to ASCII Python library |
| 5 | + |
| 6 | +## SYNOPSIS |
| 7 | +**import unicode2ascii** |
| 8 | + |
| 9 | +*Boolean* unicode2ascii.**is_unicode_category**(String *character*, String *category*) |
| 10 | + |
| 11 | +*Boolean* unicode2ascii.**is_unicode_letter**(String *character*) |
| 12 | + |
| 13 | +*Boolean* unicode2ascii.**is_unicode_mark**(String *character*) |
| 14 | + |
| 15 | +*Boolean* unicode2ascii.**is_unicode_number**(String *character*) |
| 16 | + |
| 17 | +*Boolean* unicode2ascii.**is_unicode_punctuation**(String *character*) |
| 18 | + |
| 19 | +*Boolean* unicode2ascii.**is_unicode_symbol**(String *character*) |
| 20 | + |
| 21 | +*Boolean* unicode2ascii.**is_unicode_separator**(String *character*) |
| 22 | + |
| 23 | +*Boolean* unicode2ascii.**is_unicode_other**(String *character*) |
| 24 | + |
| 25 | +*String* unicode2ascii.**unicode_category**(String *category*) |
| 26 | + |
| 27 | +*String* unicode2ascii.**unicode_to_ascii_character**(String *character*, [String *default* = '']) |
| 28 | + |
| 29 | +*String* unicode2ascii.**unicode_to_ascii_string**(String *string*, [String *default* = '']) |
| 30 | + |
| 31 | +unicode2ascii.**analyze_unicode_character**(String *character*) |
| 32 | + |
| 33 | +## DESCRIPTION |
| 34 | +The **is_unicode_category**() function returns True if *character* belongs to the Unicode category *category*, and False otherwise. |
| 35 | + |
| 36 | +All the other **is_unicode_XXX**() functions return True if *character* belongs to the XXX category. |
| 37 | + |
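The category tests above can be illustrated with Python's standard `unicodedata` module. This is only a sketch of the idea, not the library's implementation: the helper name `has_general_category` is hypothetical, and it assumes that *category* is a standard Unicode general category abbreviation (for example `L` for letters or `Nd` for decimal digits), which this page does not specify.

```python
import unicodedata


def has_general_category(character: str, category: str) -> bool:
    """Return True if the character's general category starts with `category`.

    Hypothetical helper: a one-letter argument such as "L" matches every
    subcategory ("Lu", "Ll", ...), while a two-letter argument such as "Ll"
    matches only that subcategory.
    """
    return unicodedata.category(character).startswith(category)


if __name__ == "__main__":
    for char, cat in (("é", "L"), ("5", "N"), ("!", "P"), ("5", "L")):
        print(repr(char), cat, has_general_category(char, cat))
```

Matching on the prefix lets one function cover both the major classes (letter, mark, number, punctuation, symbol, separator, other) and their subcategories.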
| 38 | +The **unicode_category**() function returns a one-line description of the specified *category*. |
| 39 | + |
| 40 | +The **unicode_to_ascii_character**() function returns the ASCII equivalent of a Unicode *character*, or the character unchanged if it is a non-Unicode character. |
| 41 | +If there is no ASCII equivalent, it returns the *default* string (an empty string if not provided). |
| 42 | + |
| 43 | +The **unicode_to_ascii_string**() function applies the same conversion to every character in *string*. |
| 44 | + |
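The "ASCII equivalent or *default*" behaviour described above can be approximated with the standard library alone. The sketch below uses a different technique from whatever the library does internally (which this page does not document): it applies NFKD compatibility decomposition and keeps only the ASCII code points. The names `to_ascii_character` and `to_ascii_string` are hypothetical stand-ins, and the results for a given character may differ from the library's own mappings.

```python
import unicodedata


def to_ascii_character(character: str, default: str = "") -> str:
    """Approximate an ASCII equivalent via NFKD decomposition (assumed technique)."""
    if character.isascii():
        # Already ASCII: return it unchanged.
        return character
    decomposed = unicodedata.normalize("NFKD", character)
    # Keep only the ASCII code points, dropping combining marks and the like.
    ascii_part = "".join(c for c in decomposed if c.isascii())
    # Fall back to the caller's default when nothing ASCII remains.
    return ascii_part if ascii_part else default


def to_ascii_string(string: str, default: str = "") -> str:
    """Apply to_ascii_character() to every character of the string."""
    return "".join(to_ascii_character(c, default) for c in string)


if __name__ == "__main__":
    print(to_ascii_string("Crème brûlée"))
    print(to_ascii_string("Ærø", default="?"))
```

Decomposition only helps for characters that Unicode defines as a base letter plus marks. Letters such as `Æ` or `ø` have no decomposition and fall through to *default* here, which is one reason a library with hand-written mappings can do better than this sketch.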
| 45 | +The **analyze_unicode_character**() function returns all available information about the Unicode *character*. |
| 46 | + |
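This page does not say which pieces of information **analyze_unicode_character**() reports, or in what form. As a rough illustration of the kind of data available for a character, the following sketch collects a few properties from the standard `unicodedata` module. The function name `describe_character` and the dictionary layout are assumptions made for this example.

```python
import unicodedata


def describe_character(character: str) -> dict:
    """Collect a few standard properties of one character (hypothetical helper)."""
    return {
        "code_point": f"U+{ord(character):04X}",
        # The second argument is returned for characters without a name.
        "name": unicodedata.name(character, "<unnamed>"),
        "category": unicodedata.category(character),
        # Empty string when the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(character),
    }


if __name__ == "__main__":
    for key, value in describe_character("é").items():
        print(f"{key}: {value}")
```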
| 47 | +## ENVIRONMENT |
| 48 | +The UNICODE2ASCII_DEBUG environment variable can be set to any value to enable debug mode. |
| 49 | + |
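Since any value enables debug mode, a presence check is one plausible way to read the variable. The sketch below is an assumption about how such a switch could be tested, not the library's actual code, and `debug_enabled` is a hypothetical name.

```python
import os


def debug_enabled(environ=os.environ) -> bool:
    """Return True when UNICODE2ASCII_DEBUG is present, whatever its value."""
    # Assumption: presence alone is enough, so even an empty value counts.
    return "UNICODE2ASCII_DEBUG" in environ


if __name__ == "__main__":
    print("debug mode:", debug_enabled())
```

From a shell, the variable would typically be set for a single run with something like `UNICODE2ASCII_DEBUG=1 python script.py`, where `script.py` is a placeholder.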
| 50 | +## SEE ALSO |
| 51 | +[unicode2ascii(1)](https://github.com/HubTou/unicode2ascii/blob/main/UNICODE2ASCII.1.md) |
| 52 | +[iconv(3)](https://www.freebsd.org/cgi/man.cgi?query=iconv&sektion=3) |
| 53 | + |
| 54 | +## STANDARDS |
| 55 | +The **unicode2ascii** library tries to follow the PEP 8 style guide for Python code. |
| 56 | + |
| 57 | +## HISTORY |
| 58 | +This library was made for [The PNU project](https://github.com/HubTou/PNU). |
| 59 | + |
| 60 | +## LICENSE |
| 61 | +This library is available under the [3-clause BSD license](https://opensource.org/licenses/BSD-3-Clause). |
| 62 | + |
| 63 | +## AUTHORS |
| 64 | +[Hubert Tournier](https://github.com/HubTou) |
| 65 | + |
| 66 | +## CAVEATS |
| 67 | +So far, only the following Unicode character sets are processed for missing ASCII equivalents: |
| 68 | +* C0 control characters |
| 69 | +* C1 control characters |
| 70 | +* Basic Latin characters |
| 71 | +* Latin-1 Supplement |
| 72 | +* Latin Extended-A |
| 73 | +* Latin Extended-B |
| 74 | +* Latin Extended Additional |
| 75 | +* IPA Extensions |
| 76 | +* Spacing Modifier Letters |
| 77 | +* Unicode symbols |
| 78 | +* General Punctuation |
| 79 | +* Number Forms |
| 80 | + |