php-mb-string

A high-performance multibyte string implementation for frequent read/write operations.

Why Did I Write This Package?

Suppose you have a LONG multibyte string and want to perform many of the following operations on it:

  • Random reading/writing such as $char = $str[5]; or $str[5] = '許';.
  • Replacement such as str_replace($search, $replace, $str);.
  • Insertion such as substr_replace($str, $insert, $position, 0);.
  • Substring extraction such as substr($str, $start, $length);.

Because PHP strings are plain byte sequences with no notion of character encoding, performing the operations above safely on multibyte text requires either the mb_*() functions or calculating the byte offsets yourself. Calling mb_*() functions frequently can hurt performance, because each call has to decode the source string again according to the given encoding. The longer the string, the worse the problem becomes.
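
To make the problem concrete, here is a small sketch that uses only PHP built-ins (it assumes the mbstring extension is loaded and that the source file is saved as UTF-8). The sample string and variable names are chosen purely for illustration.

```php
<?php
// 'a' and 'b' take 1 byte each in UTF-8; '許' (U+8A31) takes 3 bytes.
$str = 'a許b';

$byteLength = strlen($str);             // counts bytes
$charLength = mb_strlen($str, 'UTF-8'); // counts characters

// Byte indexing returns only the first byte of '許', not the character.
$brokenChar = $str[1];

// mb_substr() returns the whole character, but it has to walk the
// string from the start on every call to find the right byte offset.
$safeChar = mb_substr($str, 1, 1, 'UTF-8');

echo $byteLength, ' bytes, ', $charLength, ' characters', PHP_EOL;
echo bin2hex($brokenChar), ' vs ', $safeChar, PHP_EOL;
```

The byte-indexed read yields a lone lead byte that is not valid UTF-8 on its own, which is why plain indexing is unsafe here.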

Instead, this class stores the string internally in UTF-32, a fixed-width encoding in which every character occupies exactly 4 bytes, so random access is fast. With random access available, the byte-oriented str_*() functions can do the work internally.
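
The idea can be sketched with a few standalone helpers. This is only an illustration of the fixed-width technique, not this package's actual code or API: the function names toUtf32, toUtf8, charAt, setCharAt and insertAt are hypothetical, and the choice of UTF-32LE as the internal form is an assumption. It requires the mbstring extension.

```php
<?php
// Hypothetical helpers for illustration only; not the package's API.

// Convert once on the way in ...
function toUtf32(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-32LE', 'UTF-8');
}

// ... and once on the way out.
function toUtf8(string $utf32): string
{
    return mb_convert_encoding($utf32, 'UTF-8', 'UTF-32LE');
}

// Character $i always starts at byte offset $i * 4.
function charAt(string $utf32, int $i): string
{
    return toUtf8(substr($utf32, $i * 4, 4));
}

// Overwrite exactly one 4-byte unit with another character.
function setCharAt(string $utf32, int $i, string $utf8Char): string
{
    return substr_replace($utf32, toUtf32($utf8Char), $i * 4, 4);
}

// Insert text before character $i (length 0 means nothing is removed).
function insertAt(string $utf32, int $i, string $utf8Text): string
{
    return substr_replace($utf32, toUtf32($utf8Text), $i * 4, 0);
}

$buf = toUtf32('a許b');
$length = intdiv(strlen($buf), 4); // character count from byte count
$second = charAt($buf, 1);         // random read
$buf = setCharAt($buf, 1, '功');    // random write
$buf = insertAt($buf, 3, '!');     // append at the end
$result = toUtf8($buf);

echo $length, ' ', $second, ' ', $result, PHP_EOL;
```

The conversion cost is paid only at the boundaries; every operation in between works on byte offsets that are simple multiples of 4.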

Installation

composer require jfcherng/php-mb-string

Example

See tests/MbStringTest.php.

Benchmark

See benchmark/_results.txt.

What Are You Doing With This Package?

I developed this for a PHP diff package, jfcherng/php-diff.
