MeirKriheli/python-bidi

Python BiDi

Bi-directional (BiDi) layout for Python providing 2 implementations:

  • V5 of the algorithm implemented with Python.
  • A wrapper for the unicode-bidi Rust crate.

Package documentation

For the Python implementation, which is compatible with previous versions, use:

from bidi.algorithm import get_display

For the newer Rust-based one, which seems to implement a higher version of the algorithm (albeit with some parts missing, see #25), use the top-level import:

from bidi import get_display

API

The algorithm has a single entry point, get_display (see above for selecting the implementation).

Required arguments:

  • str_or_bytes: The string or bytes (i.e., the text in storage order). If it's bytes, use the optional argument encoding to specify its encoding.

Optional arguments:

  • encoding: If str_or_bytes is bytes, specifies its encoding. The algorithm uses unicodedata, which requires Unicode text, so this encoding is used to decode the input and to encode the result back to bytes before returning (default: "utf-8").
  • base_dir: 'L' or 'R', overrides the calculated base_level.
  • debug: True to display the Unicode levels as seen by the algorithm (default: False).
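To make the base_dir override concrete, here is a minimal sketch of how a base direction can be calculated from the first strong character, using the standard unicodedata module. The helper guess_base_dir is hypothetical and is not part of python-bidi; it is a simplification that ignores isolates and explicit embeddings, which a full implementation has to handle.

```python
import unicodedata


def guess_base_dir(text: str) -> str:
    """Return 'R' if the first strong character is right-to-left, else 'L'.

    Hypothetical helper for illustration only; not python-bidi's code.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew, Arabic and similar letters
            return "R"
        if bidi_class == "L":  # Latin and other left-to-right letters
            return "L"
    return "L"  # no strong character found: fall back to left-to-right


# Escapes are used so that viewers do not reorder the Hebrew letters.
print(guess_base_dir("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew "shalom"
print(guess_base_dir("hello"))
```

Passing base_dir='L' or base_dir='R' to get_display skips this kind of detection and forces the direction.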

The Python implementation adds one more optional argument:

  • upper_is_rtl: True to treat upper case chars as strong 'R' for debugging (default: False).
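The idea behind upper_is_rtl can be sketched as a small change to character classification: uppercase characters are reported as strong 'R', everything else keeps its real Unicode class. The function debug_bidi_class below is a hypothetical illustration under that assumption, not the library's actual code.

```python
import unicodedata


def debug_bidi_class(ch: str, upper_is_rtl: bool = False) -> str:
    """Return the bidi class of ch, optionally treating uppercase as 'R'.

    Hypothetical helper for illustration only.
    """
    if upper_is_rtl and ch.isupper():
        return "R"  # pretend the uppercase letter is right-to-left
    return unicodedata.bidirectional(ch)


print([debug_bidi_class(c, upper_is_rtl=True) for c in "car IS"])
# ['L', 'L', 'L', 'WS', 'R', 'R']
```

This makes it possible to write test strings in plain ASCII and still exercise right-to-left handling.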

get_display returns the display layout, either as str or as bytes encoded with encoding (depending on the type of str_or_bytes).
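The "same type in, same type out" behaviour can be pictured as a decode, layout, encode round trip. The wrapper below is a sketch under that assumption: layout_like_input and the stand-in reverse function are hypothetical names, and reversing a string is only a correct layout for purely right-to-left text.

```python
from typing import Callable, Union


def layout_like_input(
    data: Union[str, bytes],
    layout: Callable[[str], str],
    encoding: str = "utf-8",
) -> Union[str, bytes]:
    """Apply layout to data and return the same type that was passed in."""
    if isinstance(data, bytes):
        # Decode to text, lay it out, then encode back to bytes.
        return layout(data.decode(encoding)).encode(encoding)
    return layout(data)


def reverse(text: str) -> str:
    """Stand-in for a real layout function; valid only for pure RTL text."""
    return text[::-1]


hello_heb = "\u05e9\u05dc\u05d5\u05dd"  # escapes avoid viewer reordering
print(type(layout_like_input(hello_heb, reverse)).__name__)
print(type(layout_like_input(hello_heb.encode("utf-8"), reverse)).__name__)
```

In real use the round trip is not needed, since get_display accepts bytes directly together with the encoding argument.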

Example:

>>> from bidi import get_display
>>> # keep as list with char per line to prevent browsers from changing display order
>>> HELLO_HEB = "".join([
...     "ש",
...     "ל",
...     "ו",
...     "ם"
... ])
>>>
>>> HELLO_HEB_DISPLAY = "".join([
...     "ם",
...     "ו",
...     "ל",
...     "ש",
... ])
>>>
>>> get_display(HELLO_HEB) == HELLO_HEB_DISPLAY
True

CLI

pybidi is a command line utility (calling bidi.main) for running the display algorithm. The script can get a string as a parameter or read text from stdin.

Usage:

$ pybidi -h
usage: pybidi [-h] [-e ENCODING] [-u] [-d] [-b {L,R}] [-r] [-v]

options:
-h, --help            show this help message and exit
-e ENCODING, --encoding ENCODING
                        Text encoding (default: utf-8)
-u, --upper-is-rtl    Treat upper case chars as strong 'R' for debugging (default: False), Ignored in Rust algo
-d, --debug           Output to stderr steps taken with the algorithm
-b {L,R}, --base-dir {L,R}
                        Override base direction [L|R]
-r, --rust            Use the Rust unicode-bidi implemention instead of the Python one
-v, --version         show program's version number and exit

Examples:

$ pybidi -u 'Your string here'
$ cat ~/Documents/example.txt | pybidi

Installation

At the command line (assuming you're using some virtualenv):

pip install python-bidi

Running tests

To run the tests:

pip install nox
nox