Pinned
- enabling-languages/python-i18n (Public)
  Random notes on Python internationalisation
  Jupyter Notebook 17
- enabling-languages/library-i18n (Public)
  Exploration of internationalisation issues for libraries.
  Jupyter Notebook 1
- Grapheme tokenisation in Python

# Grapheme tokenisation in Python

When working with tokenisation and break iterators, it is sometimes necessary to work at the character, syllable, line, or sentence level. Character-level tokenisation is an interesting case. By character, I mean a user-perceivable unit of text, which the Unicode standard refers to as a grapheme. The usual way I see developers handle character-level tokenisation of English is with a list comprehension or by typecasting a string to a list.
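The two approaches mentioned above can be sketched roughly as follows. This is a minimal illustration, not the original author's code: the sample strings and variable names are assumptions chosen for the example. It also shows why the approach only suits text where each grapheme is a single code point.

```python
# Illustrative sample text (an assumption, not taken from the original notes).
text = "hello"

# Approach 1: typecast the string to a list.
chars_cast = list(text)

# Approach 2: the equivalent list comprehension.
chars_comp = [c for c in text]

# Both approaches split on Unicode code points, not graphemes.
# For plain English text the two coincide, but a base letter followed
# by a combining mark is one user-perceived character made of two
# code points, so it gets split apart.
decomposed = "e\u0301"  # "e" + COMBINING ACUTE ACCENT, rendered as one character
code_points = list(decomposed)

print(chars_cast)
print(len(code_points))
```

For text that uses combining marks or other multi-code-point sequences, a segmenter that follows the Unicode grapheme cluster rules is needed in place of `list()`.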
- enabling-languages/australian_indigenous (Public)
  Keyboard layouts and web support for Aboriginal and Torres Strait Islander languages