A final project for CS497: Computer Vision at Western Colorado University. Our goal is to control a DJI Tello EDU drone via hand gestures.
Our process and attempts:
- Optical-flow flight-path replication from a recorded video; we found the camera and model struggled with large, fast movements (see the first sketch after this list)
- Following a user; this raised human tracking/identification concerns, and tracking in 3D space demanded far more time and cognition than we had available
- A self-constructed gesture-recognition CNN; our training data was limited and established methodology was lacking
- MediaPipe-supported recognition; this was very successful and allowed us to focus on integrating the model's outputs with the flight controls (see the second sketch after this list)
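For context, here is a minimal sketch of the dense optical-flow idea behind the first attempt, using OpenCV's Farneback method. The file name flight.mp4 and the per-frame mean-flow summary are illustrative assumptions, not our original code:

```python
import cv2

cap = cv2.VideoCapture("flight.mp4")  # hypothetical recorded flight video
ok, frame = cap.read()
if not ok:
    raise SystemExit("could not read video")
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dense optical flow between consecutive frames (Farneback method).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # The mean flow vector approximates apparent camera motion; large,
    # fast movements are exactly where this estimate degraded for us.
    dx, dy = flow[..., 0].mean(), flow[..., 1].mean()
    print(f"mean flow: dx={dx:.2f}, dy={dy:.2f}")
    prev_gray = gray

cap.release()
```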
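And a minimal sketch of the MediaPipe-based approach we settled on. The webcam stand-in, the single toy gesture rule, and the printed command are assumptions for illustration; my_main.py uses its own gesture set and sends real commands to the drone:

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
cap = cv2.VideoCapture(0)  # webcam stand-in for the drone's video feed

with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Toy rule: index fingertip above its knuckle means "up".
            # (Illustrative only; not the gesture set my_main.py uses.)
            command = "up" if lm[8].y < lm[5].y else "down"
            print(command)  # my_main.py sends a flight command here instead
        cv2.imshow("gesture preview", frame)
        if cv2.waitKey(1) & 0xFF in (ord('q'), ord('Q')):
            break

cap.release()
cv2.destroyAllWindows()
```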
Results and lessons learned:
- Training data management
- Erratic behavior of a live, motor-controlled device
- System state management
- Recognizing actions from an arbitrary person
Signed: Marcos Ordonos, Jake Teeter, and Andrew Nyland
These scripts handle recording training data and preprocessing it into a form better suited for training a neural network.
- create.py: Initial blurring and processing of hand-gesture images recorded from the drone
- preprocessing.py: Generates many output images from each input image (one-to-many); see the first sketch after this list
  - Generated images are brightness-adjusted and edge-detected
- droneCNN_image.py: Trains and saves a CNN model on recorded training images
- droneCNN_oordinates: Script to train the model on CSV coordinate data
- drone.py: Tests label identification of the CNN on a live feed from the drone
- drone_t.py: Same as drone.py but with MediaPipe integration
- my_main.py: Performs hand recognition and sends commands to the drone
- test1.py: Basic flight path for the drone; takes off, traces a 1 m cube, rotates 90 degrees, and repeats that path before landing (see the second sketch after this list)
- test2.py: Same as test1.py with video streaming
- test3.py: Same as test2.py with optical flow implemented
- getspecs.py: Basic situational information about the drone, no controls
- recframes.py: Records frames from the live stream for training data; no flight (see the third sketch after this list)
  - 'SPACE' starts/stops recording; the current state is shown in the title bar of the live-feed window
  - 'q' or 'Q' exits the program
  - Uses counter.txt to number recordings and saves frames to a folder as individual images
- counter.txt: A text file required by recframes.py; a new user should fill it with the string '-1' so that numbering starts at 0
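A minimal sketch of the one-to-many augmentation idea in preprocessing.py, assuming PNG inputs in inimgs and a hypothetical outimgs folder; the brightness offsets and Canny thresholds are illustrative, not the script's actual values:

```python
import glob
import os

import cv2

os.makedirs("outimgs", exist_ok=True)  # hypothetical output folder

for path in glob.glob("inimgs/*.png"):  # inimgs is the expected source folder
    img = cv2.imread(path)
    name = os.path.splitext(os.path.basename(path))[0]
    # One-to-many: several brightness-shifted copies per source image...
    for i, beta in enumerate((-50, 0, 50)):
        bright = cv2.convertScaleAbs(img, alpha=1.0, beta=beta)
        cv2.imwrite(f"outimgs/{name}_b{i}.png", bright)
    # ...plus a Canny edge-detected copy.
    edges = cv2.Canny(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 100, 200)
    cv2.imwrite(f"outimgs/{name}_edges.png", edges)
```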
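A minimal sketch of the test1.py flight path using djitellopy. The exact edge order of the cube trace is an assumption; it is simplified here to one vertical square face per pass:

```python
from djitellopy import Tello

tello = Tello()
tello.connect()
tello.takeoff()

for _ in range(2):  # fly the path, rotate 90 degrees, repeat
    # Trace edges of a 1 m (100 cm) cube; shown as a single vertical
    # square face per pass, so test1.py's actual order may differ.
    for move in (tello.move_forward, tello.move_up,
                 tello.move_back, tello.move_down):
        move(100)
    tello.rotate_clockwise(90)

tello.land()
```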
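A minimal sketch of the recframes.py recording loop, assuming djitellopy for the stream; the recImages/recN folder layout, file naming, and exact counter semantics are guesses:

```python
import os

import cv2
from djitellopy import Tello

# Read and bump the recording counter ('-1' in counter.txt starts us at 0).
with open("counter.txt") as f:
    counter = int(f.read().strip()) + 1
with open("counter.txt", "w") as f:
    f.write(str(counter))

tello = Tello()
tello.connect()
tello.streamon()

recording, frame_num = False, 0
outdir = f"recImages/rec{counter}"  # hypothetical folder layout
os.makedirs(outdir, exist_ok=True)

while True:
    frame = tello.get_frame_read().frame
    cv2.imshow("recframes", frame)
    # Show the current state in the window's title bar.
    cv2.setWindowTitle("recframes", "RECORDING" if recording else "PAUSED")
    if recording:
        cv2.imwrite(f"{outdir}/frame{frame_num}.png", frame)
        frame_num += 1
    key = cv2.waitKey(1) & 0xFF
    if key == ord(' '):                 # SPACE toggles recording
        recording = not recording
    elif key in (ord('q'), ord('Q')):   # q/Q exits
        break

tello.streamoff()
cv2.destroyAllWindows()
```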
Below is a sample output for getspecs.py:
[INFO] tello.py - 122 - Tello instance was initialized. Host: '192.168.10.1'. Port: '8889'.
[INFO] tello.py - 437 - Send command: 'command'
[INFO] tello.py - 461 - Response command: 'ok'
Battery: 100
Temperature: (55, 59)
Barometer: 232850.0
RPY: 0, 0, -153
Key: battery is in percent; temperature is a tuple of minimum and maximum temperatures in degrees Celsius; barometer is in centimeters above sea level; and RPY (roll, pitch, yaw) is in degrees.
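A minimal sketch of what getspecs.py might look like with djitellopy; the assumption that the temperature tuple is (lowest, highest) comes from the sample output above:

```python
from djitellopy import Tello

tello = Tello()
tello.connect()  # sends 'command' and waits for 'ok', as in the log above

print(f"Battery: {tello.get_battery()}")  # percent
# Temperature tuple assumed to be (lowest, highest) in degrees Celsius.
print(f"Temperature: ({tello.get_lowest_temperature()}, "
      f"{tello.get_highest_temperature()})")
print(f"Barometer: {tello.get_barometer()}")  # cm above sea level
print(f"RPY: {tello.get_roll()}, {tello.get_pitch()}, {tello.get_yaw()}")
```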
Below is a state-machine representation of recframes.py. The title state refers to the title of the live-feed preview window generated by OpenCV.
Data folders:
- recImages: Raw frames recorded from the drone, not yet cleaned
- mygestures: Processed frames cleaned from recImages
- inimgs: Source folder, by name and structure, expected by preprocess.py
- new_mine: MediaPipe images of some hand shapes
Links:
- Project git repository
- Link to access images on OneDrive (Western Colorado University account required)