OpenCV basics: reading, converting, transforming images
This section shows code snippets for basic image operations in OpenCV, using the well-known Lena image as the example.
OpenCV provides a range of functionality, split into submodules. Require only the ones you need.
local cv = require 'cv'
require 'cv.imgcodecs' -- reading/writing images
require 'cv.imgproc' -- image processing
require 'cv.highgui' -- GUI
require 'cv.videoio' -- Video input/output
-- cv.ml and cv.flann return a separate table,
-- while other submodules just update 'cv' table
cv.ml = require 'cv.ml' -- Machine Learning
...
OpenCV stores images in row-major order with shape (height, width, channels). The exception is an image loaded as grayscale, or a grayscale image loaded with the cv.IMREAD_UNCHANGED flag, whose shape is (height, width). The functions cv.imread and cv.imwrite reverse the channel order: an image stored as RGB on disk becomes BGR in memory after reading, and a BGR image in memory is written to disk as RGB.
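To make the layout concrete, here is a minimal pure-Lua sketch (it needs neither OpenCV nor Torch) of how a (row, column, channel) triple maps to a flat row-major offset, and how reversing the channel order turns a BGR pixel into an RGB one. The helper names `flatOffset` and `reverseChannels` are hypothetical, and indices are 1-based to match Torch tensors.

```lua
-- Hypothetical helpers for illustration; not part of the cv package.

-- 1-based flat offset of element (row, col, ch) in a row-major
-- (height, width, channels) array.
local function flatOffset(row, col, ch, width, channels)
    return ((row - 1) * width + (col - 1)) * channels + ch
end

-- Return a copy of a pixel with its channel order reversed (BGR <-> RGB).
local function reverseChannels(pixel)
    local out = {}
    for i = 1, #pixel do
        out[i] = pixel[#pixel - i + 1]
    end
    return out
end

local width, channels = 512, 3
print(flatOffset(1, 1, 1, width, channels))   -- first element of the image
print(flatOffset(2, 1, 1, width, channels))   -- first element of the second row

local bgr = {255, 128, 0}                     -- blue, green, red
local rgb = reverseChannels(bgr)
print(rgb[1], rgb[2], rgb[3])                 -- red, green, blue
```

Moving one step along a row advances the offset by `channels`, and moving down one row advances it by `width * channels`.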
To load the image as it is stored on disk, use the cv.IMREAD_UNCHANGED flag.
loadType = cv.IMREAD_UNCHANGED
src = cv.imread{imagePath, loadType}
print(src:size())
512
512
3
[torch.LongStorage of size 3]
To load the image as a color image, use the cv.IMREAD_COLOR flag.
-- cv.IMREAD_COLOR always loads a 3-channel image
loadType = cv.IMREAD_COLOR
src = cv.imread{imagePath, loadType}
print(src:size())
512
512
3
[torch.LongStorage of size 3]
Use cv.IMREAD_GRAYSCALE to load the image as grayscale. A color image is converted to grayscale on load; for this conversion the channels of the image are assumed to be in RGB order.
loadType = cv.IMREAD_GRAYSCALE
src = cv.imread{imagePath, loadType}
print(src:size())
512
512
[torch.LongStorage of size 2]
To save an image to disk, use cv.imwrite. The compression format is determined by the file extension of imagePath.
cv.imwrite{imagePath, src}
The third argument to the function can hold format-specific compression parameters. For example, if the format is JPEG, the parameter is the JPEG compression quality. If it is omitted, default values are used.
OpenCV provides optimized color conversion functions. Here are a couple of examples.
Convert BGR to YUV
dst = src:clone()
cv.cvtColor{src, dst, cv.COLOR_BGR2YUV}
print(dst:size())
512
512
3
[torch.LongStorage of size 3]
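As a rough illustration of what this conversion does per pixel, the pure-Lua sketch below applies the commonly documented 8-bit RGB to YUV coefficients to a single BGR pixel. The helper `bgrToYuv` is hypothetical, and treating these coefficients as exactly what cv.cvtColor uses is an assumption; the real function also rounds and saturates the result to the 0..255 range, which this sketch does not.

```lua
-- Hypothetical helper; the coefficients are the commonly documented ones for
-- 8-bit RGB -> YUV and are an assumption about what cv.cvtColor computes.
local function bgrToYuv(b, g, r)
    local y = 0.299 * r + 0.587 * g + 0.114 * b
    local u = -0.147 * r - 0.289 * g + 0.436 * b + 128
    local v = 0.615 * r - 0.515 * g - 0.100 * b + 128
    return y, u, v
end

-- A neutral gray pixel carries no chroma, so U and V sit at the 128 offset.
print(bgrToYuv(128, 128, 128))

-- A saturated blue pixel pushes U well above 128 and V below it.
print(bgrToYuv(255, 0, 0))
```

The output keeps three channels per pixel, which is why the shape printed above is unchanged.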
Convert to grayscale
-- pass the code by name: the second positional argument of cv.cvtColor is dst
dst = cv.cvtColor{src=src, code=cv.COLOR_BGR2GRAY}
print(dst:size())
512
512
[torch.LongStorage of size 2]
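The grayscale value of each pixel is a weighted sum of its color channels. The pure-Lua sketch below uses the standard 0.299/0.587/0.114 luma weights on a tiny nested-table image; `bgrToGray` and `toGray` are hypothetical helpers, and nearest-integer rounding is an assumption about how the result is quantized.

```lua
-- Hypothetical helpers for illustration; not part of the cv package.

-- Luma of one BGR pixel using the standard 0.299/0.587/0.114 weights.
local function bgrToGray(b, g, r)
    return 0.299 * r + 0.587 * g + 0.114 * b
end

-- Convert a height x width x 3 nested table (BGR) to a height x width table,
-- rounding each value to the nearest integer.
local function toGray(image)
    local gray = {}
    for row = 1, #image do
        gray[row] = {}
        for col = 1, #image[row] do
            local p = image[row][col]
            gray[row][col] = math.floor(bgrToGray(p[1], p[2], p[3]) + 0.5)
        end
    end
    return gray
end

local image = {
    {{255, 0, 0}, {0, 255, 0}},        -- blue, green
    {{0, 0, 255}, {255, 255, 255}},    -- red, white
}
local gray = toGray(image)
print(#gray, #gray[1])                 -- height and width; no channel axis
print(gray[1][1], gray[1][2], gray[2][1], gray[2][2])
```

Green contributes most to the result and blue least, and the channel dimension disappears, matching the two-dimensional size printed above.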
Here is an example that resizes an image to a fixed size.
-- named arguments avoid relying on the positional order of dst and dsize
dst = cv.resize{src=src, dsize={1024, 1024}, interpolation=cv.INTER_CUBIC}
print(dst:size())
1024
1024
3
[torch.LongStorage of size 3]
We can also resize an image using scaling factors. You can use a different scaling factor for height and width.
scaleX = 0.25
scaleY = 0.35
dst = cv.resize{src, fx=scaleX, fy=scaleY, interpolation=cv.INTER_AREA}
print(dst:size())
179
128
3
[torch.LongStorage of size 3]
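The output size follows from the scale factors: the width is scaled by fx and the height by fy, each rounded to an integer. The pure-Lua sketch below reproduces that arithmetic; `scaledSize` is a hypothetical helper, and it rounds half up, which may differ from OpenCV's rounding for values that land exactly on .5.

```lua
-- Hypothetical helper for illustration; not part of the cv package.

-- Round to the nearest integer (half up).
local function round(x)
    return math.floor(x + 0.5)
end

-- Output (height, width) for a source of the given size and scale factors.
-- fx scales the width (x axis) and fy scales the height (y axis).
local function scaledSize(height, width, fx, fy)
    return round(fy * height), round(fx * width)
end

local h, w = scaledSize(512, 512, 0.25, 0.35)
print(h, w)   -- height 179, width 128
```

This is why the 512 x 512 source above ends up 179 pixels high and 128 pixels wide.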
Affine transformation is one of the most widely used image processing functions. We will walk through it using the following image as an example.
Source Image
Affine transformation is a two-step process.
- Get the affine rotation/scaling matrix.
height = src:size(1)
width = src:size(2)
-- rotate counter clockwise about center (in image coordinate system)
center = cv.Point2f{width/2, height/2}
angle = 45 -- in degrees
scale = 0.5
-- get rotation matrix
M = cv.getRotationMatrix2D{center, angle, scale}
print(M:size())
2
3
[torch.LongStorage of size 2]
The transformation matrix M returned by cv.getRotationMatrix2D encodes rotation and scaling about the given center. To add a translation, add [translationX translationY] to the last column of M.
- Transform the image (affine warp) using the rotation matrix.
dsize = cv.Size{width, height} -- if omitted or zero, the source image's size is used
-- named arguments avoid relying on the positional order of dst and M
dst = cv.warpAffine{src=src, M=M, dsize=dsize, flags=cv.INTER_LINEAR}
print(dst:size())
512
512
3
[torch.LongStorage of size 3]
Affine Transformed Image
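To show what the two steps compute, here is a pure-Lua sketch of the 2x3 matrix and of how it maps a source point to its destination position. It follows the formulas documented for getRotationMatrix2D, with alpha = scale * cos(angle) and beta = scale * sin(angle), but the helpers `rotationMatrix2D`, `addTranslation`, and `applyAffine` are hypothetical and are not the cv implementation.

```lua
-- Hypothetical pure-Lua helpers; they mirror the documented formulas but are
-- not the cv implementation.

-- 2x3 matrix that rotates by angleDeg (counter-clockwise in image coordinates,
-- origin at the top-left) and scales by `scale` about the point (cx, cy).
local function rotationMatrix2D(cx, cy, angleDeg, scale)
    local rad = math.rad(angleDeg)
    local a = scale * math.cos(rad)
    local b = scale * math.sin(rad)
    return {
        { a, b, (1 - a) * cx - b * cy},
        {-b, a, b * cx + (1 - a) * cy},
    }
end

-- Add an extra shift (tx, ty) to the last column; returns a new matrix.
local function addTranslation(M, tx, ty)
    return {
        {M[1][1], M[1][2], M[1][3] + tx},
        {M[2][1], M[2][2], M[2][3] + ty},
    }
end

-- Map a source point (x, y) to its destination position.
local function applyAffine(M, x, y)
    return M[1][1] * x + M[1][2] * y + M[1][3],
           M[2][1] * x + M[2][2] * y + M[2][3]
end

local width, height = 512, 512
local M = rotationMatrix2D(width / 2, height / 2, 45, 0.5)

-- The rotation center is a fixed point of M.
print(applyAffine(M, width / 2, height / 2))

-- After adding a translation, the center moves by exactly that shift.
local shifted = addTranslation(M, 10, -20)
print(applyAffine(shifted, width / 2, height / 2))
```

The last column of the matrix already holds the offset that keeps the center in place, so an additional translation is simply added on top of it.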