commaai-speed-challenge

🚘 Solving comma.ai's speed prediction challenge.

Overview

My solution is based on a Computer Vision technique known as Motion History Images. This technique was proposed by Aaron Bobick and James Davis in the late 90's. The main idea was to capture the movement in images throughout some predefined time period.

In particular, an MHI is constructed by assigning a value $\tau$ to the pixels that exhibit movement with respect to the previous frame. And decaying the previous values of those pixels that don't. This is expressed using the following formula:

where $D(x,y,t)$ is a binary image describing whether there is motion at time $t$ or not.

The motion threshold, for computing the binary images, and $\tau$ are hyperparameters.

Originally, Bobick and Davis applied this technique to human action recognition. However, I decided to apply it to this problem under the hypothesis that the MHI's could describe the change in motion of the vehicle, and therefore be able to predict its speed. My theory was that if the vehicle is moving slowly then the pixels in the MHI would exhibit more decayment (i.e. more "gray areas"). On the other hand, if the vehicle is moving faster then the MHI would look brighter since more pixels would be assigned the max value of $\tau$ . The following image shows an example of an MHI for the training video.

Model Architecture

I used a pre-trained ResNet-18 model, modifying the last layer to output a single real value (instead of the Softmax classification). This value is the predicted speed.

One more important aspect to note is that since the network takes in color (3-channel) images, the MHI is replicated across the 3 channels. As shown in the figure above.

Results

The model achieves an MSE of ~5.22 on a dev set of frames obtained from a random split of the training video.

Since the correct speeds for the testing video aren't available, I can't report the MSE. I produced a new video, from the original test.mp4, with all frames tagged with their speed. After inspecting it visually, it seems that the model might not actually perform as well as I had expected though.

Future Work

One of the main issues with my approach is that the MHI's don't take into account the direction of movement. When the vehicle is stopped and cars are crossing in front of it, the motion is captured in the MHI. This causes the model to predict speed values incorrectly. One potential way of addressing this is by using Directed Motion History Images; an extension to the original technique that takes direction into account.

On the other hand, as mentioned above, the idea of using Motion History Images didn't prove as fruitful as I had hoped. Therefore, another avenue for future research is to move away completely from this technique. An idea that could be explored, for example, is feeding in Optical Flow images instead.

References

The Representation and Recognition of Action Using Temporal Templates by Aaron Bobick and James Davis, 1997.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
data		data
images		images
model		model
output		output
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
speed_challenge.ipynb		speed_challenge.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

commaai-speed-challenge

Overview

Model Architecture

Results

Future Work

References

About

Languages

rey-allan/commaai-speed-challenge

Folders and files

Latest commit

History

Repository files navigation

commaai-speed-challenge

Overview

Model Architecture

Results

Future Work

References

About

Topics

Resources

Stars

Watchers

Forks

Languages