Image Processing docs #388
Merged
Commits:
- 3a07830 Docs for Harris and Hessian (simmplecoder)
- e14f182 Fill values, links and images (simmplecoder)
- b7a7bb5 Replace Mathjax with code blocks (simmplecoder)
- a4f56a3 Remove section on affine transformation (simmplecoder)
- 65b1ae7 Add basic explanation of convolution (simmplecoder)
- fba0c7a Convert markdown to rst (simmplecoder)
- 7d24cc1 Add some more relevant papers (simmplecoder)
- 16d95aa Move to new concept name (simmplecoder)
- 512362d Fix mistakes in docs (simmplecoder)
- b5ff9f2 Add ip docs to index.rst (simmplecoder)
- 8aa7968 Fix formatting (simmplecoder)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
Affine region detectors | ||
----------------------- | ||
|
||
What is being detected? | ||
~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
An affine region is any region of the image | ||
that is stable under affine transformations. It can be | ||
an edge under affinity conditions, a corner (a small patch of an image), | ||
or any other stable feature. | ||
|
||
-------------- | ||
|
||
Available detectors | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
At the moment, the following detectors are implemented: | ||
|
||
- Harris detector | ||
|
||
- Hessian detector | ||
|
||
-------------- | ||
|
||
Algorithm steps | ||
~~~~~~~~~~~~~~~ | ||
|
||
Harris and Hessian | ||
^^^^^^^^^^^^^^^^^^ | ||
|
||
Both are derived from a concept called the Moravec window. Let's have a look | ||
at the image below: | ||
|
||
.. figure:: ./Moravec-window-corner.png | ||
:alt: Moravec window corner case | ||
|
||
Moravec window corner case | ||
|
||
As can be seen, moving the yellow window in any direction causes | ||
a large change in intensity. Now, let's have a look at the edge case: | ||
|
||
.. figure:: ./Moravec-window-edge.png | ||
:alt: Moravec window edge case | ||
|
||
Moravec window edge case | ||
|
||
In this case, the intensity changes only when the window moves in one | ||
particular direction (across the edge). | ||
|
||
This is the key concept in understanding how the two corner detectors | ||
work. | ||
|
||
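The window idea above can be illustrated with a minimal Python sketch. It is not the library's implementation: the function name ``window_change``, the 7x7 synthetic images and the sum-of-squared-differences measure are assumptions chosen for illustration only.

```python
def window_change(image, x, y, dx, dy, size=3):
    """Sum of squared differences between the window at (x, y)
    and the same window shifted by (dx, dy)."""
    total = 0
    for j in range(size):
        for i in range(size):
            original = image[y + j][x + i]
            shifted = image[y + j + dy][x + i + dx]
            total += (original - shifted) ** 2
    return total


# Synthetic 7x7 test images (hypothetical data, not taken from the figures).
# corner: a bright square filling the bottom-right part of the image.
corner = [[255 if r >= 3 and c >= 3 else 0 for c in range(7)] for r in range(7)]
# edge: a vertical edge, the right part of the image is bright.
edge = [[255 if c >= 3 else 0 for c in range(7)] for r in range(7)]

shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
corner_changes = [window_change(corner, 2, 2, dx, dy) for dx, dy in shifts]
edge_changes = [window_change(edge, 2, 2, dx, dy) for dx, dy in shifts]

print(corner_changes)  # every shift changes the window content
print(edge_changes)    # only the horizontal shifts change it
```

For the corner image every shift yields a non-zero change, while for the vertical edge only the horizontal shifts do.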
The algorithms have the same structure: | ||
|
||
1. Compute image derivatives | ||
|
||
2. Compute weighted sum | ||
|
||
3. Compute response | ||
|
||
4. Threshold (optional) | ||
|
||
Harris and Hessian differ in what **derivatives they compute**. Harris | ||
computes the following derivatives: | ||
|
||
``HarrisMatrix = [(dx)^2, dxdy], [dxdy, (dy)^2]`` | ||
|
||
Note that ``(dx)^2`` and ``(dy)^2`` are **numerical** powers (the first | ||
derivative squared), not the gradient applied a second time. | ||
|
||
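As a rough sketch of the Harris terms, the first derivatives can be multiplied element by element to obtain the three images. The central-difference gradient and the names ``central_gradients`` and ``harris_terms`` are assumptions made for this example; a real implementation would more likely use a Sobel or Scharr filter.

```python
def central_gradients(image):
    """First derivatives by central differences; border pixels stay 0."""
    h, w = len(image), len(image[0])
    dx = [[0.0] * w for _ in range(h)]
    dy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            dy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return dx, dy


def harris_terms(dx, dy):
    """The three distinct entries of the Harris matrix as three images."""
    dx2 = [[a * a for a in row] for row in dx]                          # (dx)^2
    dxdy = [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(dx, dy)]  # dx * dy
    dy2 = [[b * b for b in row] for row in dy]                          # (dy)^2
    return dx2, dxdy, dy2


# Horizontal ramp: intensity grows by 10 per pixel along x only.
ramp = [[10 * x for x in range(5)] for _ in range(5)]
dx, dy = central_gradients(ramp)
dx2, dxdy, dy2 = harris_terms(dx, dy)
print(dx2[2][2], dxdy[2][2], dy2[2][2])
```

On the ramp only ``(dx)^2`` is non-zero, because the intensity does not change along y.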
The three distinct terms of the matrix can be separated into three images, | ||
to simplify implementation. Hessian, on the other hand, computes | ||
second-order derivatives: | ||
|
||
``HessianMatrix = [dxdx, dxdy], [dxdy, dydy]`` | ||
|
||
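The second-order terms can be sketched with finite differences. This is only an illustration under stated assumptions: the function name ``hessian_terms`` and the particular finite-difference stencils are not taken from the library.

```python
def hessian_terms(image):
    """Second derivatives by finite differences; border pixels stay 0."""
    h, w = len(image), len(image[0])
    dxx = [[0.0] * w for _ in range(h)]
    dxy = [[0.0] * w for _ in range(h)]
    dyy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dxx[y][x] = image[y][x + 1] - 2 * image[y][x] + image[y][x - 1]
            dyy[y][x] = image[y + 1][x] - 2 * image[y][x] + image[y - 1][x]
            dxy[y][x] = (image[y + 1][x + 1] - image[y + 1][x - 1]
                         - image[y - 1][x + 1] + image[y - 1][x - 1]) / 4.0
    return dxx, dxy, dyy


# Quadratic surface with known second derivatives: dxx = 2, dxy = 3, dyy = 4.
surface = [[x * x + 3 * x * y + 2 * y * y for x in range(5)] for y in range(5)]
dxx, dxy, dyy = hessian_terms(surface)
print(dxx[2][2], dxy[2][2], dyy[2][2])
```

A quadratic test surface is convenient here because its second derivatives are constant and can be checked by hand.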
**Weighted sum** is the same for both. Usually a Gaussian blur | ||
matrix is used as the weights, because corners should have hill-like | ||
curvature in gradients, and other weights might produce a noisy result. | ||
Overlay the weights matrix over a corner, compute the sum of | ||
``s[i,j]=image[x + i, y + j] * weights[i, j]`` for ``i, j`` | ||
from zero to the weight matrix dimensions, then move the window | ||
and compute again until the whole image is covered. | ||
|
||
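A minimal sketch of the weighted sum, assuming row-major nested lists (``image[y][x]``) and a 3x3 binomial approximation of a Gaussian as the weights. The name ``weighted_sum`` and the choice to skip border pixels are assumptions of this example.

```python
def weighted_sum(image, weights):
    """Slide the weights over the image and sum the products at each position.
    Only positions where the window fits entirely are computed."""
    wh, ww = len(weights), len(weights[0])
    h, w = len(image), len(image[0])
    result = []
    for y in range(h - wh + 1):
        row = []
        for x in range(w - ww + 1):
            s = 0.0
            for j in range(wh):
                for i in range(ww):
                    s += image[y + j][x + i] * weights[j][i]
            row.append(s)
        result.append(row)
    return result


# 3x3 binomial approximation of a Gaussian blur matrix (weights sum to 1).
gaussian3 = [[1 / 16, 2 / 16, 1 / 16],
             [2 / 16, 4 / 16, 2 / 16],
             [1 / 16, 2 / 16, 1 / 16]]

# A single bright pixel in the middle of a 5x5 image.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 16

blurred = weighted_sum(impulse, gaussian3)
for row in blurred:
    print(row)
```

The single bright pixel is spread into a small hill that is highest in the middle, which is the shape the weights are meant to favour.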
**Response computation** is a matter of choice. Given the general form | ||
of both matrices above | ||
|
||
``[a, b], [c, d]`` | ||
|
||
One of the response functions is: | ||
|
||
``response = det - k * trace^2 = a * d - b * c - k * (a + d)^2`` | ||
|
||
``k`` is called the discrimination constant. Usual values are between ``0.04`` and | ||
``0.06``. | ||
|
||
The other is simply the determinant: | ||
|
||
``response = det = a * d - b * c`` | ||
|
||
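Both response functions can be written as small helpers. This is a sketch rather than the library's code: the function names and the sample matrix entries are hypothetical, and the determinant of a 2x2 matrix ``[a, b], [c, d]`` is taken as ``a * d - b * c``.

```python
def determinant(a, b, c, d):
    """Determinant of the 2x2 matrix [a, b], [c, d]."""
    return a * d - b * c


def harris_response(a, b, c, d, k=0.04):
    """det - k * trace^2, with k as the discrimination constant."""
    trace = a + d
    return determinant(a, b, c, d) - k * trace ** 2


def determinant_response(a, b, c, d):
    """The response that is simply the determinant."""
    return determinant(a, b, c, d)


# Hypothetical weighted sums at three kinds of pixels.
corner_like = harris_response(100.0, 0.0, 0.0, 100.0)  # strong change in x and y
edge_like = harris_response(100.0, 0.0, 0.0, 0.0)      # strong change in x only
flat = harris_response(0.0, 0.0, 0.0, 0.0)             # no change at all

print(corner_like, edge_like, flat)
```

With these inputs the Harris response is positive for the corner-like entry, negative for the edge-like entry and zero for the flat one, which is why thresholding the response separates corners from edges.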
**Thresholding** is optional, but without it the result will be | ||
extremely noisy. For complex images, such as outdoor scenes, the | ||
threshold for Harris will be on the order of 100000000 and for Hessian on the order | ||
of 10000. For simpler images, values on the order of 100s and 1000s should be | ||
enough. The numbers assume a ``uint8_t`` gray image. | ||
|
||
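Thresholding itself is a simple comparison per pixel. The sketch below assumes the response is a nested list and returns the coordinates that pass; the name ``threshold_response`` and the sample values are made up for illustration.

```python
def threshold_response(response, threshold):
    """Return (x, y) coordinates whose response is strictly above the threshold."""
    return [(x, y)
            for y, row in enumerate(response)
            for x, value in enumerate(row)
            if value > threshold]


# Hypothetical response values for a tiny 3x3 region.
response = [[12.0, 950.0, 3.0],
            [-400.0, 8400.0, 15.0],
            [0.0, 20.0, 1200.0]]

corners = threshold_response(response, 1000.0)
print(corners)
```

Raising the threshold keeps fewer, stronger candidates; lowering it lets more noise through.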
For a deeper explanation, please refer to the following **papers**: | ||
|
||
`Harris, Christopher G., and Mike Stephens. "A combined corner and edge | ||
detector." In Alvey Vision Conference, pp. 147-151. | ||
1988. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.434.4816&rep=rep1&type=pdf>`__ | ||
|
||
`Mikolajczyk, Krystian, and Cordelia Schmid. "An affine invariant interest point detector." In European conference on computer vision, pp. 128-142. Springer, Berlin, Heidelberg, 2002. <https://hal.inria.fr/inria-00548252/document>`__ | ||
|
||
`Mikolajczyk, Krystian, Tinne Tuytelaars, Cordelia Schmid, Andrew Zisserman, Jiri Matas, Frederik Schaffalitzky, Timor Kadir, and Luc Van Gool. "A comparison of affine region detectors." International journal of computer vision 65, no. 1-2 (2005): 43-72. <https://hal.inria.fr/inria-00548528/document>`__ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
Basics | ||
------ | ||
|
||
Here are some basic concepts that might help in understanding the documentation | ||
written in this folder: | ||
|
||
Convolution | ||
~~~~~~~~~~~ | ||
|
||
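The simplest way to look at this is "tweaking the input so that it would | ||
look like the shape provided". More precisely, every output pixel is a | ||
weighted sum of the input pixels around it, with the weights taken from | ||
the kernel. What exact tweaking is applied depends on the kernel. | ||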
|
||
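A minimal sketch of 2D convolution on nested lists. The name ``convolve2d`` and the decision to compute only positions where the kernel fits entirely inside the image are assumptions of this example. Note that a true convolution flips the kernel before sliding it; for symmetric kernels this makes no difference.

```python
def convolve2d(image, kernel):
    """2D convolution over the region where the kernel fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    # Convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(h - kh + 1):
        row = []
        for x in range(w - kw + 1):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    acc += flipped[j][i] * image[y + j][x + i]
            row.append(acc)
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

# The identity kernel leaves the (interior of the) image unchanged.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(convolve2d(image, identity))

# A single weight to the right of the center picks the left neighbour,
# because of the kernel flip.
shift = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
print(convolve2d(image, shift))
```

The two kernels show how the same loop produces different "tweaks" depending only on the weights.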
-------------- | ||
|
||
Filters, kernels, weights | ||
~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
These three words usually mean the same thing, unless the context makes | ||
a different usage clear. Simply put, they are matrices that are used to | ||
achieve certain effects on the image. Let's consider a simple one, the 3 by 3 | ||
Scharr filter: | ||
|
||
``ScharrX = [3,0,-3], [10,0,-10], [3,0,-3]`` | ||
|
||
The filter above, when convolved with a single channel image | ||
(intensity/luminance strength), will produce a gradient in the X | ||
(horizontal) direction. Some filtering cannot be done with a | ||
kernel, though; one good example is the median filter (the mean is the | ||
arithmetic mean, whereas the median is the center element of a sorted | ||
array). | ||
|
||
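The difference between a mean filter (a kernel) and a median filter (not a kernel) can be sketched as follows. The helper names and the test image are hypothetical; only the 3x3 neighbourhood of interior pixels is handled.

```python
from statistics import median


def neighborhood(image, x, y):
    """The nine values of the 3x3 neighbourhood centered at (x, y)."""
    return [image[y + j][x + i] for j in (-1, 0, 1) for i in (-1, 0, 1)]


def median_filter3(image):
    """Middle element of the sorted neighbourhood; not expressible as a kernel."""
    h, w = len(image), len(image[0])
    return [[median(neighborhood(image, x, y)) for x in range(1, w - 1)]
            for y in range(1, h - 1)]


def mean_filter3(image):
    """Arithmetic mean of the neighbourhood; equivalent to a kernel of 1/9s."""
    h, w = len(image), len(image[0])
    return [[sum(neighborhood(image, x, y)) / 9 for x in range(1, w - 1)]
            for y in range(1, h - 1)]


# Flat image with one outlier pixel in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255

median_result = median_filter3(noisy)
mean_result = mean_filter3(noisy)
print(median_result[1][1], mean_result[1][1])
```

The median removes the outlier completely, while the mean only spreads it over the neighbouring pixels.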
-------------- | ||
|
||
Derivatives | ||
~~~~~~~~~~~ | ||
|
||
A derivative of an image is a gradient in one of two directions: x | ||
(horizontal) and y (vertical). To compute a derivative, one can use | ||
Scharr, Sobel and other gradient filters. | ||
|
||
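As an illustration, the sketch below computes both derivatives with 3x3 Sobel kernels. The helper ``apply_kernel`` is hypothetical and slides the kernel without flipping it (correlation), so the sign of the result depends on this convention.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel):
    """Slide a kernel over the image without flipping it (correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(kernel[j][i] * image[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]


# Horizontal ramp: intensity grows by 10 per pixel along x only.
ramp = [[10 * x for x in range(5)] for _ in range(5)]
gx = apply_kernel(ramp, SOBEL_X)
gy = apply_kernel(ramp, SOBEL_Y)
print(gx[0][0], gy[0][0])
```

On the ramp the x derivative is constant and non-zero, and the y derivative is zero everywhere. The Sobel weights sum to a scale factor, so the raw output is a multiple of the true slope.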
-------------- | ||
|
||
Curvature | ||
~~~~~~~~~ | ||
|
||
The word, when used alone, means the curvature of the surface that would be | ||
obtained if the values of an image were plotted in a 3D graph. The X and Z | ||
axes (which form the horizontal plane) correspond to the X and Y indices | ||
of the image, and the Y axis corresponds to the value at that pixel. With | ||
a little stretch of the imagination, filters (also called kernels or | ||
weights) can be considered images (or any 2D matrix). A mean filter | ||
would draw a flat plane, whereas a Gaussian filter would draw a hill that | ||
gets sharper as its sigma value decreases. |
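
To make the "flat plane versus hill" picture concrete, the two kernels can be generated and compared. The function names and kernel sizes are assumptions of this sketch; both kernels are normalized so that their weights sum to 1.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel centered in the middle."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
            for x in range(size)]
           for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def mean_kernel(size):
    """Normalized size x size mean kernel: every weight is the same."""
    return [[1 / (size * size)] * size for _ in range(size)]


flat = mean_kernel(5)
sharp = gaussian_kernel(5, 0.5)
wide = gaussian_kernel(5, 2.0)

# The mean kernel is a flat plane, the Gaussian is a hill whose peak
# is higher (sharper) for the smaller sigma.
print(flat[2][2], flat[0][0])
print(sharp[2][2], wide[2][2])
```

Plotting the values of each kernel as heights would show the plane and the two hills described above.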
Any reason you don't follow GIL's markup for headings?
GIL follows (or should follow) the section markup recommended by the Python documentation guide and the equivalent Sphinx reStructuredText Primer, that is:
However, in the future this may change, and we may decide to distinguish chapters and parts as well.