Using OpenCV and PointNet for object classification validation
Images in their purest form provide information only in two dimensions, lacking the detail found in the three-dimensional space of the real world. Volumetric properties of an object are lost, and such properties often help classify complex objects. Photogrammetry is a popular way to leverage the versatility of images and transform them into a medium with more detail: with programs like COLMAP and Meshroom, the user simply supplies images of an object, and the program attempts to produce a 3D version of the object using keypoint detection. Paired with 3D object classification, images can thus be transformed into 3D objects that can be used for robust classification. However, while both programs produce 3D objects with color, they require specialized hardware such as a dedicated GPU. Even setting the GPU requirement aside, these programs still take a long time to perform reconstruction. These factors limit the practicality of photogrammetry in settings where image recognition alone is not adequate.

I created a lightweight photogrammetry package that can run on a wide range of form factors, from laptops to Raspberry Pis, and can perform up to 26 times faster than COLMAP while producing up to 21 times more vertices. When leveraging the object classification network to validate the quality of the mesh, we achieve an accuracy rate up to 27% higher than COLMAP. This highlights that a high-quality, lightweight photogrammetry package is possible in the age of edge computing.
- Python 3.9+ based reconstruction system
- Can be configured to use ORB or SIFT
- CPU based reconstruction
- Uses Alpha Surface for mesh reconstruction
- Uses OpenCV for image processing
- Use PointNet for object classification validation tests
- Can sync with Arduino rigs for automated reconstruction
- Arduino - Contains Arduino code for turntable and rail
- PAR - Contains Python Automated Reconstruction (PAR) code for photogrammetry
- PointNet - Contains PointNet code for object classification
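PointNet expects fixed-size point-cloud inputs, so a reconstructed cloud has to be resampled and normalized before the classification-validation step. The sketch below shows this standard PointNet preprocessing; the function name and exact details are illustrative assumptions, not PAR's API.

```python
# Hypothetical sketch of preparing a reconstructed point cloud for PointNet:
# sample a fixed number of points, center them, and scale into the unit sphere.
# Standard PointNet preprocessing; names and defaults here are assumptions.
import numpy as np

def to_pointnet_input(points, n=1024, seed=0):
    rng = np.random.default_rng(seed)
    # Sample with replacement only if the cloud has fewer than n points
    idx = rng.choice(len(points), n, replace=len(points) < n)
    cloud = points[idx] - points[idx].mean(axis=0)       # center at origin
    return cloud / np.linalg.norm(cloud, axis=1).max()   # scale into unit sphere

# Example: normalize a synthetic 5000-point cloud down to 1024 points
cloud = to_pointnet_input(np.random.default_rng(1).normal(size=(5000, 3)))
```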
pip3 install opencv-python numpy matplotlib open3d-python scipy tensorflow
Note: Open3D might not be available on Linux, therefore surface reconstruction may not work. You will also need to compile TensorFlow for your environment.
Optional: Follow instructions to assemble Arduino rail or turntable and pair it with PAR.
import PAR
PAR.calibration(checkerboard_x, checkerboard_y, image_path, output_matrix_path, output_distortion_path)
Performs camera calibration on checkerboard images. Crucial for reconstruction.
checkerboard_x
: Number of checkerboard squares on the x axis

checkerboard_y
: Number of checkerboard squares on the y axis

image_path
: Path to image folder containing checkerboard images (should include wildcard at the end)

output_matrix_path
: Path to output camera matrix file

output_distortion_path
: Path to output camera distortion coefficients file
Undistorts images using the calibration matrix and distortion coefficients. Not required for reconstruction.
PAR.undistort(image_path, output_path, calibration_matrix_path, distortion_coefficients_path)
image_path
: Path to image folder containing images to be undistorted (should include wildcard at the end)

output_path
: Path to output folder for undistorted images

calibration_matrix_path
: Path to camera calibration matrix file

distortion_coefficients_path
: Path to camera distortion coefficients file
Finds triangulated points from two images. Performs ORB or SIFT feature detection and matching, then passes the matches through the essential matrix to find the triangulated points.
PAR.triangulate(image_path_1, image_path_2, calibration_matrix_path, distortion_coefficients_path, algorithm, output_path)
image_path_1
: Path to image 1

image_path_2
: Path to image 2

calibration_matrix_path
: Path to camera calibration matrix file

distortion_coefficients_path
: Path to camera distortion coefficients file

algorithm
: Algorithm to use for feature detection and matching. Options are ORB or SIFT

output_path
: Path to output file for triangulated points
Finds triangulated points from a sequential set of images. Performs BRISK, ORB, or SIFT feature detection and matching, then passes the matches through the essential matrix to find the triangulated points. Returns a numpy list of triangulated points or point clouds.
PAR.multitriangulate(images_path, calibration_matrix_path, distortion_coefficients_path, keypoint_algorithm, recon_algorithm, ply)
images_path
: Path to image folder containing images to be triangulated (should include wildcard at the end)

calibration_matrix_path
: Path to camera calibration matrix file

distortion_coefficients_path
: Path to camera distortion coefficients file

keypoint_algorithm
: Algorithm to use for feature detection and matching. Options are ORB, BRISK, or SIFT

recon_algorithm
: Algorithm to use for surface reconstruction. Options are either ALPHA or POISSON

ply
: Option to output point cloud as a PLY file instead of numpy. False by default.
The author would like to thank Jingnan Shi, Prof. Luca Carlone, Scott Balicki, Dr. Tina Kapur, and Dr. Alex Golby for their insight and mentoring. The author would also like to acknowledge Isa Gonzalez, Caleb Kohn, Ben Jacobson, Kimberly Nguyen, Sidney Trzepacz, Parker Hastings, Zoe Colimon, and Quyen Vo for their help with obtaining data for this project. This work was partially funded by the NSF CAREER award “Certifiable Perception for Autonomous Cyber-Physical Systems”.