forked from python-pillow/Pillow
SIMD. fix list (+6 squashed commits)
Squashed commits: [c45b871] update for Pillow-SIMD 3.4.0 [bedd83f] no alpha compositing in this release [e8fe730] update results for latest version add Skia results [a16ff97] add SIMD changes [82ffbd6] fix readme (+4 squashed commits) Squashed commits: [85677f9] fix error [f44ebb1] update results for unrolled implementation [83968c3] fix #4 [cd73c51] update link (+11 squashed commits) Squashed commits: [5882178] correct spelling [a0e5956] Why Pillow-SIMD is even faster [108e72e] Why Pillow itself is so fast [e8eeda1] spelling fixes [e816e9c] spelling [d2eefef] methodology, why not contributed [2e55786] installation and conclusion [9f6415e] more info [67e55b7] more benchmarks test files [471d4c5] remove spaces [904d89d] add performance tests [4fe17fe] simple readme SIMD. clarify Following fork SIMD. update readme SIMD. update versions in readme SIMD. Changes
Showing 2 changed files with 266 additions and 104 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
Changelog (Pillow-SIMD) | ||
======================= | ||
|
||
3.4.1.post1 | ||
----------- | ||
|
||
- Critical memory error for some combinations of source/destination | ||
sizes is fixed. | ||
|
||
3.4.1.post0 | ||
----------- | ||
|
||
- A lot of optimizations in resampling including 16-bit | ||
intermediate color representation and heavy unrolling. | ||
|
||
3.3.2.post0 | ||
----------- | ||
|
||
- Maintenance release | ||
|
||
3.3.0.post2 | ||
----------- | ||
|
||
- Fixed error in RGBa -> RGBA conversion | ||
|
||
3.3.0.post1 | ||
----------- | ||
|
||
Alpha compositing | ||
~~~~~~~~~~~~~~~~~ | ||
|
||
- SSE4 and AVX2 fixed-point full loading implementation. | ||
Up to 4.6x faster. | ||
|
||
3.3.0.post0 | ||
----------- | ||
|
||
Resampling | ||
~~~~~~~~~~ | ||
|
||
- SSE4 and AVX2 fixed-point full loading horizontal pass. | ||
- SSE4 and AVX2 fixed-point full loading vertical pass. | ||
|
||
Conversion | ||
~~~~~~~~~~ | ||
|
||
- RGBA -> RGBa SSE4 and AVX2 fixed-point full loading implementations. | ||
Up to 2.6x faster. | ||
- RGBa -> RGBA AVX2 implementation using gather instructions. | ||
Up to 5x faster. | ||
|
||
|
||
3.2.0.post3 | ||
----------- | ||
|
||
Resampling | ||
~~~~~~~~~~ | ||
|
||
- SSE4 and AVX2 float full loading horizontal pass. | ||
- SSE4 float full loading vertical pass. | ||
|
||
|
||
3.2.0.post2 | ||
----------- | ||
|
||
Resampling | ||
~~~~~~~~~~ | ||
|
||
- SSE4 and AVX2 float full loading horizontal pass. | ||
- SSE4 float per-pixel loading vertical pass. | ||
|
||
|
||
2.9.0.post1 | ||
----------- | ||
|
||
Resampling | ||
~~~~~~~~~~ | ||
|
||
- SSE4 and AVX2 float per-pixel loading horizontal pass. | ||
- SSE4 float per-pixel loading vertical pass. | ||
- SSE4: Up to 2x for downscaling. Up to 3.5x for upscaling. | ||
- AVX2: Up to 2.7x for downscaling. Up to 3.5x for upscaling. | ||
|
||
|
||
Box blur | ||
~~~~~~~~ | ||
|
||
- Simple SSE4 fixed-point implementations with per-pixel loading. | ||
- Up to 2.1x faster. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,104 +1,177 @@ | ||
<p align="center"> | ||
<img width="248" height="250" src="https://raw.githubusercontent.com/python-pillow/pillow-logo/master/pillow-logo-248x250.png" alt="Pillow logo"> | ||
</p> | ||
|
||
# Pillow | ||
|
||
## Python Imaging Library (Fork) | ||
|
||
Pillow is the friendly PIL fork by [Alex Clark and | ||
Contributors](https://github.com/python-pillow/Pillow/graphs/contributors). | ||
PIL is the Python Imaging Library by Fredrik Lundh and Contributors. | ||
As of 2019, Pillow development is | ||
[supported by Tidelift](https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=readme&utm_campaign=enterprise). | ||
|
||
<table> | ||
<tr> | ||
<th>docs</th> | ||
<td> | ||
<a href="https://pillow.readthedocs.io/?badge=latest"><img | ||
alt="Documentation Status" | ||
src="https://readthedocs.org/projects/pillow/badge/?version=latest"></a> | ||
</td> | ||
</tr> | ||
<tr> | ||
<th>tests</th> | ||
<td> | ||
<a href="https://travis-ci.org/python-pillow/Pillow"><img | ||
alt="Travis CI build status (Linux)" | ||
src="https://img.shields.io/travis/python-pillow/Pillow/master.svg?label=Linux%20build"></a> | ||
<a href="https://travis-ci.org/python-pillow/pillow-wheels"><img | ||
alt="Travis CI build status (macOS)" | ||
src="https://img.shields.io/travis/python-pillow/pillow-wheels/master.svg?label=macOS%20build"></a> | ||
<a href="https://ci.appveyor.com/project/python-pillow/Pillow"><img | ||
alt="AppVeyor CI build status (Windows)" | ||
src="https://img.shields.io/appveyor/build/python-pillow/Pillow/master.svg?label=Windows%20build"></a> | ||
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3ALint"><img | ||
alt="GitHub Actions build status (Lint)" | ||
src="https://github.com/python-pillow/Pillow/workflows/Lint/badge.svg"></a> | ||
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3ATest"><img | ||
alt="GitHub Actions build status (Test Linux and macOS)" | ||
src="https://github.com/python-pillow/Pillow/workflows/Test/badge.svg"></a> | ||
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3A%22Test+Windows%22"><img | ||
alt="GitHub Actions build status (Test Windows)" | ||
src="https://github.com/python-pillow/Pillow/workflows/Test%20Windows/badge.svg"></a> | ||
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3A%22Test+Docker%22"><img | ||
alt="GitHub Actions build status (Test Docker)" | ||
src="https://github.com/python-pillow/Pillow/workflows/Test%20Docker/badge.svg"></a> | ||
<a href="https://codecov.io/gh/python-pillow/Pillow"><img | ||
alt="Code coverage" | ||
src="https://codecov.io/gh/python-pillow/Pillow/branch/master/graph/badge.svg"></a> | ||
</td> | ||
</tr> | ||
<tr> | ||
<th>package</th> | ||
<td> | ||
<a href="https://zenodo.org/badge/latestdoi/17549/python-pillow/Pillow"><img | ||
alt="Zenodo" | ||
src="https://zenodo.org/badge/17549/python-pillow/Pillow.svg"></a> | ||
<a href="https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=badge"><img | ||
alt="Tidelift" | ||
src="https://tidelift.com/badges/package/pypi/Pillow?style=flat"></a> | ||
<a href="https://pypi.org/project/Pillow/"><img | ||
alt="Newest PyPI version" | ||
src="https://img.shields.io/pypi/v/pillow.svg"></a> | ||
<a href="https://pypi.org/project/Pillow/"><img | ||
alt="Number of PyPI downloads" | ||
src="https://img.shields.io/pypi/dm/pillow.svg"></a> | ||
</td> | ||
</tr> | ||
<tr> | ||
<th>social</th> | ||
<td> | ||
<a href="https://gitter.im/python-pillow/Pillow?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge"><img | ||
alt="Join the chat at https://gitter.im/python-pillow/Pillow" | ||
src="https://badges.gitter.im/python-pillow/Pillow.svg"></a> | ||
<a href="https://twitter.com/PythonPillow"><img | ||
alt="Follow on https://twitter.com/PythonPillow" | ||
src="https://img.shields.io/badge/tweet-on%20Twitter-00aced.svg"></a> | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
## Overview | ||
|
||
The Python Imaging Library adds image processing capabilities to your Python interpreter. | ||
|
||
This library provides extensive file format support, an efficient internal representation, and fairly powerful image processing capabilities. | ||
|
||
The core image library is designed for fast access to data stored in a few basic pixel formats. It should provide a solid foundation for a general image processing tool. | ||
|
||
## More Information | ||
|
||
- [Documentation](https://pillow.readthedocs.io/) | ||
- [Installation](https://pillow.readthedocs.io/en/latest/installation.html) | ||
- [Handbook](https://pillow.readthedocs.io/en/latest/handbook/index.html) | ||
- [Contribute](https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md) | ||
- [Issues](https://github.com/python-pillow/Pillow/issues) | ||
- [Pull requests](https://github.com/python-pillow/Pillow/pulls) | ||
- [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst) | ||
- [Pre-fork](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst#pre-fork) | ||
|
||
## Report a Vulnerability | ||
|
||
To report a security vulnerability, please follow the procedure described in the [Tidelift security policy](https://tidelift.com/docs/security). | ||
# Pillow-SIMD | ||
|
||
Pillow-SIMD is a "following" fork of Pillow (which is itself a fork of PIL). | ||
"Following" means that each Pillow-SIMD version is a 100% compatible | ||
drop-in replacement for the Pillow release with the same version number. | ||
For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for | ||
`Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`. | ||
|
||
For more information about the original Pillow, please | ||
[read the documentation][original-docs], | ||
[check the changelog][original-changelog] and | ||
[find out how to contribute][original-contribute]. | ||
|
||
|
||
## Why SIMD | ||
|
||
There are many ways to improve the performance of image processing. | ||
You can use a better algorithm for the same task, write a better | ||
implementation of the current algorithm, or use more processing unit | ||
resources. The ideal case is simply switching to a more efficient algorithm, as | ||
when gaussian blur based on convolutions [was replaced][gaussian-blur-changes] | ||
by sequential box filters. But the number of such improvements is very limited. | ||
It is also very tempting to use more processing unit resources | ||
(via parallelization) when they are available. But it is handier just | ||
to make things faster on the same resources. And that is where SIMD works better. | ||
|
||
SIMD stands for "single instruction, multiple data". It is a way to perform | ||
the same operation on a large amount of homogeneous data. | ||
Modern CPUs have various SIMD instruction sets, such as | ||
MMX, SSE-SSE4, AVX, AVX2, AVX512, and NEON. | ||
|
||
Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) | ||
and AVX2 support. | ||
|
||
|
||
## Status | ||
|
||
[![Uploadcare][uploadcare.logo]][uploadcare.com] | ||
|
||
Pillow-SIMD can be used in production. Pillow-SIMD has been operating on | ||
[Uploadcare][uploadcare.com] servers for more than 1 year. | ||
Uploadcare is a SaaS for storing and processing images in the cloud | ||
and the main sponsor of the Pillow-SIMD project. | ||
|
||
Currently, the following operations are accelerated: | ||
|
||
- Resize (convolution-based resampling): SSE4, AVX2 | ||
- Gaussian and box blur: SSE4 | ||
- Alpha composition: SSE4, AVX2 | ||
- RGBA → RGBa (alpha premultiplication): SSE4, AVX2 | ||
- RGBa → RGBA (division by alpha): AVX2 | ||
|
||
See [CHANGES](CHANGES.SIMD.rst). | ||
|
||
|
||
## Benchmarks | ||
|
||
The numbers in the table represent megapixels of the source RGB 2560x1600 | ||
image processed per second. For example, if a resize of a 2560x1600 image takes | ||
0.5 seconds, the result is 8.2 Mpx/s. | ||
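
As a quick check of the arithmetic above, the metric can be computed as follows. This is a minimal sketch and the function name `megapixels_per_second` is hypothetical. Note that the pixel count is always that of the source image, regardless of the output size.

```python
def megapixels_per_second(width: int, height: int, seconds: float) -> float:
    """Throughput in source megapixels per second."""
    return width * height / 1_000_000 / seconds


# The example from the text: a 2560x1600 source processed in 0.5 seconds.
rate = megapixels_per_second(2560, 1600, 0.5)
print(f"{rate:.1f} Mpx/s")  # 8.2 Mpx/s
```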
|
||
- Skia 53 | ||
- ImageMagick 6.9.3-8 Q8 x86_64 | ||
- Pillow 3.4.1 | ||
- Pillow-SIMD 3.4.1.post1 | ||
|
||
Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 | ||
------------------------|---------|------|-------|----------|----------|-------- | ||
**Resize to 16x16** | Bilinear| 41.37| 317.28| 1282.85| 1601.85| 809.49 | ||
| Bicubic | 20.58| 174.85| 712.95| 900.65| 453.10 | ||
| Lanczos | 14.17| 117.58| 438.60| 544.89| 292.57 | ||
**Resize to 320x180** | Bilinear| 29.46| 195.21| 863.40| 1057.81| 592.76 | ||
| Bicubic | 15.75| 118.79| 503.75| 504.76| 327.68 | ||
| Lanczos | 10.80| 79.59| 312.05| 384.92| 196.92 | ||
**Resize to 1920x1200** | Bilinear| 17.80| 68.39| 215.15| 268.29| 192.30 | ||
| Bicubic | 9.99| 49.23| 170.41| 210.62| 112.84 | ||
| Lanczos | 6.95| 37.71| 130.00| 162.57| 104.76 | ||
**Resize to 7712x4352** | Bilinear| 2.54| 8.38| 22.81| 29.17| 20.58 | ||
| Bicubic | 1.60| 6.57| 18.23| 23.94| 16.52 | ||
| Lanczos | 1.09| 5.20| 14.90| 20.40| 12.05 | ||
**Blur** | 1px | 6.60| 16.94| 35.16| | | ||
| 10px | 2.28| 16.94| 35.47| | | ||
| 100px | 0.34| 16.93| 35.53| | | ||
|
||
|
||
### Some conclusions | ||
|
||
Pillow is always faster than ImageMagick, and Pillow-SIMD is | ||
4-5 times faster than Pillow. In general, Pillow-SIMD with AVX2 is | ||
**16-40 times faster** than ImageMagick and outperforms Skia, the | ||
high-speed graphics library used in Chromium, by up to 2 times. | ||
|
||
### Methodology | ||
|
||
All tests were performed on Ubuntu 14.04 64-bit running on an | ||
Intel Core i5 4258U CPU with AVX2, on a single thread. | ||
|
||
ImageMagick performance was measured with the command-line tool `convert` with the | ||
`-verbose` and `-bench` arguments. I use the command line because | ||
I need to test the latest version and this is the easiest way to do that. | ||
|
||
All operations produce exactly the same results. | ||
Correspondence between resize filters: | ||
|
||
- PIL.Image.BILINEAR == Triangle | ||
- PIL.Image.BICUBIC == Catrom | ||
- PIL.Image.LANCZOS == Lanczos | ||
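
The filter correspondence above can be written down as a small lookup table. The sketch below only builds an argument list for ImageMagick's `-filter` and `-resize` options. The helper name `imagemagick_resize_args` is hypothetical, and the exact arguments used in the benchmark are in the linked script, not reproduced here.

```python
from typing import List

# Pillow filter name -> ImageMagick filter name, as listed above.
FILTER_EQUIVALENTS = {
    "BILINEAR": "Triangle",
    "BICUBIC": "Catrom",
    "LANCZOS": "Lanczos",
}


def imagemagick_resize_args(pillow_filter: str, width: int, height: int) -> List[str]:
    """Build ImageMagick resize arguments matching a Pillow filter name."""
    im_filter = FILTER_EQUIVALENTS[pillow_filter]
    return ["-filter", im_filter, "-resize", f"{width}x{height}"]


print(imagemagick_resize_args("BICUBIC", 320, 180))
```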
|
||
In ImageMagick, the radius of the gaussian blur is called sigma, and a second | ||
parameter is called radius. In fact, *gaussian blur* should not need any | ||
additional parameters: if the radius is too small, the result is *not* a | ||
gaussian blur anymore, and if the radius is large, it brings no | ||
advantage but makes the operation slower. For the test, I set the radius | ||
to sigma × 2.5. | ||
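
To make the parameter mapping concrete, here is a minimal sketch that turns a Pillow blur radius into the `radiusxsigma` geometry string accepted by ImageMagick's blur options. The function name `imagemagick_blur_geometry` is hypothetical, and which ImageMagick blur option the benchmark used is not stated here, so only the geometry is shown.

```python
def imagemagick_blur_geometry(pillow_radius: float) -> str:
    """Return a "radiusxsigma" geometry for a given Pillow blur radius.

    Pillow's radius plays the role of ImageMagick's sigma; ImageMagick's
    own radius is set to sigma * 2.5, as described above.
    """
    sigma = pillow_radius
    radius = sigma * 2.5
    return f"{radius:g}x{sigma:g}"


# The three blur sizes from the benchmark table.
for px in (1, 10, 100):
    print(px, imagemagick_blur_geometry(px))
```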
|
||
The following script was used for testing: | ||
https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63 | ||
|
||
|
||
## Why Pillow itself is so fast | ||
|
||
There are no cheats. High-quality resize and blur methods are used for all | ||
benchmarks. Results are almost pixel-perfect. The only difference is more efficient | ||
algorithms. Resampling in Pillow was rewritten in version 2.7 with | ||
minimal use of floating point numbers, precomputed coefficients and | ||
cache-aware transposition. This result was improved in 3.3 & 3.4 with | ||
integer-only arithmetic and other optimizations. | ||
|
||
|
||
## Why Pillow-SIMD is even faster | ||
|
||
Because of SIMD, of course. But that is not all: heavy loop unrolling and | ||
specific instructions that are not available in scalar code also help. | ||
|
||
|
||
## Why not contribute SIMD to the original Pillow | ||
|
||
Well, that's not simple. First of all, Pillow supports a large number | ||
of architectures, not only x86. But even for x86 platforms, Pillow is often | ||
distributed via precompiled binaries. To integrate SIMD into precompiled binaries, | ||
we need to do runtime checks of CPU capabilities. | ||
To compile the code with runtime checks, we need to pass the `-mavx2` option | ||
to the compiler. But with that option the compiler will inject AVX instructions | ||
even into SSE functions, because every SSE instruction has an AVX equivalent. | ||
So there is no easy way to compile such a library, especially with setuptools. | ||
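
To illustrate what a runtime check of CPU capabilities means, here is a minimal Python sketch that reads the `flags` line in the format used by Linux `/proc/cpuinfo`. It is not how Pillow or Pillow-SIMD select code paths: a real check would live in the C extension. The function names are hypothetical, and treating the `sse4_1` flag as "SSE4" is an assumption.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Extract the CPU feature flags from /proc/cpuinfo-style text (Linux, x86)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def best_simd_level(flags: set) -> str:
    """Pick the widest instruction set mentioned in this README that the CPU has."""
    if "avx2" in flags:
        return "avx2"
    if "sse4_1" in flags:  # assumption: "SSE4" is taken to mean at least SSE4.1
        return "sse4"
    return "scalar"


# A shortened, made-up sample of /proc/cpuinfo content.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 sse4_1 sse4_2 avx avx2\n"
print(best_simd_level(cpu_flags(sample)))
```

On a Linux machine, the same functions can be applied to the contents of `/proc/cpuinfo` to decide which build makes sense before installing.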
|
||
|
||
## Installation | ||
|
||
In general, you need to run `pip install pillow-simd` as usual, and if you | ||
are using an SSE4-capable CPU everything should run smoothly. | ||
Do not forget to remove the original Pillow package first. | ||
|
||
If you want the AVX2-enabled version, you need to pass an additional flag to the C | ||
compiler. The easiest way to do that is to define the `CC` variable during compilation. | ||
|
||
```bash | ||
$ pip uninstall pillow | ||
$ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd | ||
``` | ||
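
After installing, it can be useful to confirm which distribution is actually present, since both packages provide the same `PIL` module. The sketch below uses the standard-library `importlib.metadata` (Python 3.8+). The helper name `installed_version` is hypothetical, and the distribution names are assumed to be `Pillow` and `Pillow-SIMD`, as used in this README.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Ideally only Pillow-SIMD is reported, with a ".postN" version suffix.
for name in ("Pillow", "Pillow-SIMD"):
    print(name, installed_version(name))
```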
|
||
|
||
## Contributing to Pillow-SIMD | ||
|
||
Pillow-SIMD and Pillow are two separate projects. | ||
Please submit bugs and improvements not related to SIMD to the | ||
[original Pillow][original-issues]. All fixes made in Pillow | ||
will appear in the next Pillow-SIMD version automatically. | ||
|
||
|
||
[original-docs]: http://pillow.readthedocs.io/ | ||
[original-issues]: https://github.com/python-pillow/Pillow/issues/new | ||
[original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst | ||
[original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md | ||
[gaussian-blur-changes]: http://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask | ||
[uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd | ||
[uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/ |