diff --git a/CHANGES.SIMD.rst b/CHANGES.SIMD.rst
new file mode 100644
index 00000000000..033b16dc652
--- /dev/null
+++ b/CHANGES.SIMD.rst
@@ -0,0 +1,144 @@
+Changelog (Pillow-SIMD)
+=======================
+
+9.0.0.post1
+-----------
+
+- Fixed possible overflow in LUT processing
+- Restored compatibility with Visual C Compiler
+
+
+7.0.0.post4
+-----------
+
+- Filter: fixed wrong offset handling for 3x3 single-band version
+
+7.0.0.post3
+-----------
+
+- ColorLUT: fixed potential access violation, up to 2x faster
+
+7.0.0.post2
+-----------
+
+- ColorLUT: SSE4 & AVX2
+
+7.0.0.post1 & 6.2.2.post1 & 6.1.0.post1 & 6.0.0.post2
+-----------------------------------------------------
+
+- Bands: access violation in getband in some environments
+
+7.0.0.post0
+-----------
+
+- Reduce: SSE4
+
+6.0.0.post1
+-----------
+
+- GCC 9.0+: fixed unaligned read for ``_**_cvtepu8_epi32`` functions.
+
+6.0.0.post0 and 5.3.0.post1
+---------------------------
+
+- Resampling: Correct max coefficient calculation. Some rare combinations of
+  initial and requested sizes lead to black lines.
+
+4.3.0.post0
+-----------
+
+- Float-based filters, single-band: 3x3 SSE4, 5x5 SSE4
+- Float-based filters, multi-band: 3x3 SSE4 & AVX2, 5x5 SSE4
+- Int-based filters, multi-band: 3x3 SSE4 & AVX2, 5x5 SSE4 & AVX2
+- Box blur: fast path for radius < 1
+- Alpha composite: fast div approximation
+- Color conversion: RGB to L SSE4, fast div in RGBa to RGBA
+- Resampling: optimized coefficients loading
+- Split and get_channel: SSE4
+
+3.4.1.post1
+-----------
+
+- Critical memory error for some combinations of source/destination
+  sizes is fixed.
+
+3.4.1.post0
+-----------
+
+- A lot of optimizations in resampling including 16-bit
+  intermediate color representation and heavy unrolling.
+
+3.3.2.post0
+-----------
+
+- Maintenance release
+
+3.3.0.post2
+-----------
+
+- Fixed error in RGBa -> RGBA conversion
+
+3.3.0.post1
+-----------
+
+Alpha compositing
+~~~~~~~~~~~~~~~~~
+
+- SSE4 and AVX2 fixed-point full loading implementation.
+  Up to 4.6x faster.
+
+3.3.0.post0
+-----------
+
+Resampling
+~~~~~~~~~~
+
+- SSE4 and AVX2 fixed-point full loading horizontal pass.
+- SSE4 and AVX2 fixed-point full loading vertical pass.
+
+Conversion
+~~~~~~~~~~
+
+- RGBA -> RGBa SSE4 and AVX2 fixed-point full loading implementations.
+  Up to 2.6x faster.
+- RGBa -> RGBA AVX2 implementation using gather instructions.
+  Up to 5x faster.
+
+
+3.2.0.post3
+-----------
+
+Resampling
+~~~~~~~~~~
+
+- SSE4 and AVX2 float full loading horizontal pass.
+- SSE4 float full loading vertical pass.
+
+
+3.2.0.post2
+-----------
+
+Resampling
+~~~~~~~~~~
+
+- SSE4 and AVX2 float full loading horizontal pass.
+- SSE4 float per-pixel loading vertical pass.
+
+
+2.9.0.post1
+-----------
+
+Resampling
+~~~~~~~~~~
+
+- SSE4 and AVX2 float per-pixel loading horizontal pass.
+- SSE4 float per-pixel loading vertical pass.
+- SSE4: Up to 2x for downscaling. Up to 3.5x for upscaling.
+- AVX2: Up to 2.7x for downscaling. Up to 3.5x for upscaling.
+
+
+Box blur
+~~~~~~~~
+
+- Simple SSE4 fixed-point implementations with per-pixel loading.
+- Up to 2.1x faster.
diff --git a/PyPI.rst b/PyPI.rst
new file mode 100644
index 00000000000..e599b429340
--- /dev/null
+++ b/PyPI.rst
@@ -0,0 +1,6 @@
+
+`Pillow-SIMD repo and readme `_
+
+`Pillow-SIMD changelog `_
+
+`Pillow documentation `_
diff --git a/README.md b/README.md
index af1ca57c25b..488ee595558 100644
--- a/README.md
+++ b/README.md
@@ -1,121 +1,130 @@
-


-
-# Pillow
-
-## Python Imaging Library (Fork)
-
-Pillow is the friendly PIL fork by [Jeffrey A. Clark (Alex) and
-contributors](https://github.com/python-pillow/Pillow/graphs/contributors).
-PIL is the Python Imaging Library by Fredrik Lundh and Contributors.
-As of 2019, Pillow development is
-[supported by Tidelift](https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=readme&utm_campaign=enterprise).
-
-
-## Overview
-
-The Python Imaging Library adds image processing capabilities to your Python interpreter.
-
-This library provides extensive file format support, an efficient internal representation, and fairly powerful image processing capabilities.
-
-The core image library is designed for fast access to data stored in a few basic pixel formats. It should provide a solid foundation for a general image processing tool.
-
-## More Information
-
-- [Documentation](https://pillow.readthedocs.io/)
-  - [Installation](https://pillow.readthedocs.io/en/latest/installation.html)
-  - [Handbook](https://pillow.readthedocs.io/en/latest/handbook/index.html)
-- [Contribute](https://github.com/python-pillow/Pillow/blob/main/.github/CONTRIBUTING.md)
-  - [Issues](https://github.com/python-pillow/Pillow/issues)
-  - [Pull requests](https://github.com/python-pillow/Pillow/pulls)
-- [Release notes](https://pillow.readthedocs.io/en/stable/releasenotes/index.html)
-- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
-  - [Pre-fork](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst#pre-fork)
-
-## Report a Vulnerability
-
-To report a security vulnerability, please follow the procedure described in the [Tidelift security policy](https://tidelift.com/docs/security).
+# Pillow-SIMD
+
+Pillow-SIMD is a highly optimized version of the [Pillow library][original-docs]
+for the x86 architecture (mainly Intel and AMD CPUs).
+
+Pillow-SIMD "follows" Pillow, which means each release is a
+drop-in replacement for the Pillow release of the same version.
+For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for
+`Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`.
+
+For more information on the original Pillow, you can
+[read the documentation][original-docs],
+[check the changelog][original-changelog] and
+[find out how to contribute][original-contribute].
+
+
+## Why SIMD
+
+There are multiple ways to tweak image processing performance.
+To name a few: using better algorithms, optimizing existing implementations,
+or using more processing power and/or resources.
+A great example of using a more efficient algorithm is [replacing][gaussian-blur-changes]
+a convolution-based Gaussian blur with a sequential-box one.
+
+Such examples are rather rare, though. It is also known that certain processes can be optimized
+by using parallel processing to run the respective routines.
+But a more practical key to optimization might be making things work faster
+using the resources at hand. SIMD computing is one such case.
+
+SIMD stands for "single instruction, multiple data". Its essence is
+performing the same operation on multiple data points simultaneously
+by using multiple processing elements.
+Common CPU SIMD instruction sets are MMX, SSE-SSE4, AVX, AVX2, AVX512, and NEON.
+
+Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) or AVX2 support.
+
+
+## Status
+
+The Pillow-SIMD project is production-ready.
+The project is supported by Uploadcare, a SaaS for cloud-based image storage and processing.
+
+[![Uploadcare][uploadcare.logo]][uploadcare.com]
+
+In fact, Uploadcare has been running Pillow-SIMD since 2015.
+
+The following image operations are currently SIMD-accelerated:
+
+- Resize (convolution-based resampling): SSE4, AVX2
+- Gaussian and box blur: SSE4
+- Alpha composition: SSE4, AVX2
+- RGBA → RGBa (alpha premultiplication): SSE4, AVX2
+- RGBa → RGBA (division by alpha): SSE4, AVX2
+- RGB → L (grayscale): SSE4
+- 3x3 and 5x5 kernel filters: SSE4, AVX2
+- Split and get_channel: SSE4
+
+
+## Benchmarks
+
+Tons of tests can be found on the [Pillow Performance][pillow-perf-page] page.
+There are benchmarks against different versions of Pillow and Pillow-SIMD
+as well as ImageMagick, Skia, OpenCV and IPP.
+
+The results show that for resizing, Pillow is always faster than ImageMagick, and
+Pillow-SIMD, in turn, is faster than the original Pillow by a factor of 4-6.
+In general, Pillow-SIMD with AVX2 is always **16 to 40 times faster** than
+ImageMagick and outperforms Skia, the high-speed graphics library used in Chromium.
+
+
+## Why Pillow itself is so fast
+
+No cheats involved. We've used identical high-quality resize and blur methods for the benchmark.
+Outcomes produced by different libraries are in almost pixel-perfect agreement.
+The difference in measured rates comes only from the performance of each algorithm involved.
+
+
+## Why Pillow-SIMD is even faster
+
+Because of SIMD computing, of course. But there's more to it:
+heavy loop unrolling and specific instructions that aren't available for scalar data types.
+
+
+## Why not contribute SIMD to the original Pillow
+
+Well, it's not that simple. First of all, the original Pillow supports
+a large number of architectures, not just x86.
+But even for x86 platforms, Pillow is often distributed via precompiled binaries.
+In order for us to integrate SIMD into the precompiled binaries
+we'd need to execute runtime CPU capability checks.
+To compile the code this way we need to pass the `-mavx2` option to the compiler.
+But with the option included, a compiler will inject AVX instructions even
+for SSE functions (i.e. interchange them) since every SSE instruction has its AVX equivalent.
+So there is no easy way to compile such a library, especially with setuptools.
+
+
+## Installation
+
+If there's a copy of the original Pillow installed, it has to be removed first
+with `$ pip uninstall -y pillow`.
+Please install [prerequisites](https://pillow.readthedocs.io/en/stable/installation.html#building-from-source) for your platform.
+The installation itself is as simple as running `$ pip install pillow-simd`,
+and if you're using an SSE4-capable CPU everything should run smoothly.
+If you'd like to install the AVX2-enabled version,
+you need to pass an additional flag to the C compiler.
+The easiest way to do so is to define the `CC` variable during compilation.
+
+```bash
+$ pip uninstall pillow
+$ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
+```
+
+
+## Contributing to Pillow-SIMD
+
+Please be aware that Pillow-SIMD and Pillow are two separate projects.
+Please submit bugs and improvements not related to SIMD to the [original Pillow][original-issues].
+All bugfixes to the original Pillow will then be transferred to the next Pillow-SIMD version automatically.
+
+
+  [original-homepage]: https://python-pillow.org/
+  [original-docs]: https://pillow.readthedocs.io/
+  [original-issues]: https://github.com/python-pillow/Pillow/issues/new
+  [original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst
+  [original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md
+  [gaussian-blur-changes]: https://pillow.readthedocs.io/en/stable/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask
+  [pillow-perf-page]: https://python-pillow.github.io/pillow-perf/
+  [pillow-perf-repo]: https://github.com/python-pillow/pillow-perf
+  [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd
+  [uploadcare.logo]: https://ucarecdn.com/8eca784b-bbe5-4f7e-8cdf-98d75aab8cec/logotransparent.svg
diff --git a/setup.cfg b/setup.cfg
index d6057f1599d..53f9f4663d8 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,9 +1,8 @@
 [metadata]
-name = Pillow
+name = Pillow-SIMD
 description = Python Imaging Library (Fork)
-long_description = file: README.md
-long_description_content_type = text/markdown
-url = https://python-pillow.org
+long_description = file: PyPI.rst
+url = https://github.com/uploadcare/pillow-simd
 author = Jeffrey A. Clark (Alex)
 author_email = aclark@aclark.net
 license = HPND
@@ -27,7 +26,7 @@ classifiers =
 keywords = Imaging
 project_urls =
     Documentation=https://pillow.readthedocs.io
-    Source=https://github.com/python-pillow/Pillow
+    Source=https://github.com/uploadcare/pillow-simd
     Funding=https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=pypi
     Release notes=https://pillow.readthedocs.io/en/stable/releasenotes/index.html
     Changelog=https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst
diff --git a/setup.py b/setup.py
index 07d6c66d655..aa9095f72eb 100755
--- a/setup.py
+++ b/setup.py
@@ -980,7 +980,7 @@ def debug_build():
 for src_file in _LIB_IMAGING:
     files.append(os.path.join("src/libImaging", src_file + ".c"))
 ext_modules = [
-    Extension("PIL._imaging", files),
+    Extension("PIL._imaging", files, extra_compile_args=["-msse4"]),
     Extension("PIL._imagingft", ["src/_imagingft.c"]),
     Extension("PIL._imagingcms", ["src/_imagingcms.c"]),
     Extension("PIL._webp", ["src/_webp.c"]),
diff --git a/src/PIL/_version.py b/src/PIL/_version.py
index d94d3593440..4eb2d54f530 100644
--- a/src/PIL/_version.py
+++ b/src/PIL/_version.py
@@ -1,2 +1,2 @@
 # Master version for Pillow
-__version__ = "9.5.0"
+__version__ = "9.5.0.post0"