Behavioral Cloning

This project was done as part of Udacity's Self-Driving Car Nanodegree Program. Model performance was tested at a resolution of 320x240 with the graphics quality set to 'fastest'.

To see the model performance click the following links:

* The model was trained only on track 1 data.

Goals and Objectives

The goals / steps of this project are the following:

  • Use the simulator to collect data of good driving behavior
  • Build a convolutional neural network in Keras that predicts steering angles from images
  • Train and validate the model with a training and validation set
  • Test that the model successfully drives around track one without leaving the road

Code Structure

My project includes the following files:

  • containing the script to create, train and save the convolutional neural network. The file shows the pipeline I used for training and validating the model, and it contains comments to explain how the code works. Change DATA_PATH and LABEL_PATH to match the location of the data on your machine.
  • for driving the car in autonomous mode
  • model.h5 containing a trained convolutional neural network
  • contains code for Image Generator and data augmentation
  • DataVisialization.ipynb contains code for images being used in the writeup
  • (also in summarizing the reports

Using the Udacity provided simulator and my file, the car can be driven autonomously around the track by executing

python model.h5

Model Architecture

To begin with, I used the AlexNet architecture with a fully connected layer with one unit as the last layer. I also experimented with the model described by (github link). These models performed well, but did not generalize to the second track. In addition, the models had more parameters than my GPU (GeForce 840M) could handle, and training on the CPU was slow, which made experimenting with parameters time-consuming.

Finally, I used a model similar to the one described in this paper from NVIDIA.

The model uses ELU (Exponential Linear Unit) activations to introduce nonlinearity. The data is resized and normalized inside the model using Keras lambda layers, and dropout is applied at various stages to reduce overfitting.

The following image describes the model architecture (the lambda_1 layer resizes the image, and the lambda_2 layer normalizes it).
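To make the normalization and activation steps concrete, here is a minimal NumPy sketch of what they compute. It is not the project's Keras code: the scaling to the range [-1, 1] is an assumption (the text does not state the normalization range), and the function names `normalize` and `elu` are chosen for illustration only.

```python
import numpy as np


def normalize(images):
    """Scale uint8 pixel values from [0, 255] to [-1.0, 1.0].

    The target range is an assumption; the write-up only says that
    images are normalized inside the model.
    """
    return images.astype(np.float32) / 127.5 - 1.0


def elu(x, alpha=1.0):
    """Exponential Linear Unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=np.float32)
    # np.minimum keeps exp() from overflowing on large positive inputs.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))


pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = normalize(pixels)   # darkest pixel maps to -1, brightest to +1
activated = elu(scaled)      # negative inputs saturate smoothly toward -alpha
```

Unlike ReLU, ELU keeps a small non-zero gradient for negative inputs, which is one common reason for choosing it in regression networks like this one.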

Model Definition

The complete model definition code can be found at line 20 in

Training and Validation Data

Directory Structure

The model was trained using data provided by Udacity.

├── IMG/
└── driving_log.csv

The IMG folder contains the center, left and right camera frames of the drive, and each row in driving_log.csv associates these images with the steering angle, throttle, brake, and speed of the car.

Training and Validation Data Split

The data provided by Udacity was split so that 80% was used for training and the remaining 20% for validation. The validation data was used to check that the chosen hyperparameters did not cause the model to overfit.
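An 80/20 split of this kind can be sketched as follows. This is an illustration rather than the project's code: shuffling before the split, the fixed seed, and the example file names are all assumptions.

```python
import random


def train_validation_split(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into training and validation lists."""
    shuffled = list(rows)
    # Shuffling first is an assumption; it avoids validating only on the
    # last stretch of the recorded drive.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (image path, steering angle) rows standing in for driving_log.csv.
rows = [("IMG/center_%03d.jpg" % i, 0.0) for i in range(10)]
train_rows, valid_rows = train_validation_split(rows)
```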

Data Preprocessing

When the model was trained on the raw data provided by Udacity, the car tended to go straight and left the track, particularly at turns. This led me to explore various data preprocessing and data augmentation techniques (the augmentation techniques are discussed in the Data Augmentation section).

  • Cropping Image

    The original image was cropped to remove the redundant top portion (sky and other details that are not needed to decide the steering angle). The bottom of the image, which shows the car hood, was also cropped out. The code for the same is at line 18 of

    Cropped Image

  • Reduction of Low Steering Angle Data

    A quick look at the histogram of the steering angle shows that the data is biased towards low steering angles (which is expected as the car would mostly be driving straight).

    Steering Angle

    To combat this, about 70% of the low steering angle samples, selected at random, were dropped from the training data (check line 72 in and the corresponding function definition at line 7 of
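The two preprocessing steps above can be sketched as follows. The crop margins, the 160x320 frame size, the threshold for what counts as a "low" steering angle, and the function names are all assumptions made for illustration; only the roughly 70% drop rate comes from the text.

```python
import random

import numpy as np


def crop(image, top=60, bottom=25):
    """Remove `top` rows (sky) and `bottom` rows (car hood) from an H x W x C image.

    The margins are hypothetical; the write-up does not give the exact values.
    """
    return image[top:image.shape[0] - bottom, :, :]


def drop_low_steering(samples, threshold=0.05, drop_prob=0.7, seed=0):
    """Randomly discard about `drop_prob` of the samples with |angle| < threshold."""
    rng = random.Random(seed)
    kept = []
    for path, angle in samples:
        if abs(angle) < threshold and rng.random() < drop_prob:
            continue  # skip this near-straight sample
        kept.append((path, angle))
    return kept


frame = np.zeros((160, 320, 3), dtype=np.uint8)   # assumed camera frame size
cropped = crop(frame)

# Hypothetical log: 1000 near-straight samples and 100 turning samples.
samples = [("straight.jpg", 0.0)] * 1000 + [("turn.jpg", 0.3)] * 100
balanced = drop_low_steering(samples)
```

Dropping samples at random rather than by position keeps the remaining straight-driving frames spread across the whole lap.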

Image Generators

Image generators were used to generate training batches in real time (this avoids the high memory usage of pre-caching all the images in memory). The code for the same can be found at line 71 of The code is well commented, and a few of the steps in the image generator are explained in the Data Augmentation section.
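The idea behind such a generator can be sketched in plain Python. This is a simplified illustration, not the project's implementation: `batch_generator`, `fake_load_image`, the batch size, and the file names are hypothetical, and a real loader would read each frame from disk instead of returning zeros.

```python
import random

import numpy as np


def batch_generator(samples, load_image, batch_size=32, seed=0):
    """Yield (images, angles) batches forever, loading images only when needed."""
    rng = random.Random(seed)
    samples = list(samples)
    while True:
        rng.shuffle(samples)  # new order on every pass over the data
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            images = np.stack([load_image(path) for path, _ in batch])
            angles = np.array([angle for _, angle in batch], dtype=np.float32)
            yield images, angles


def fake_load_image(path):
    """Stand-in loader for the example; returns a blank frame."""
    return np.zeros((160, 320, 3), dtype=np.uint8)


samples = [("IMG/frame_%d.jpg" % i, 0.01 * i) for i in range(100)]
images, angles = next(batch_generator(samples, fake_load_image, batch_size=32))
```

Only one batch of images is held in memory at a time, which is the property the paragraph above relies on.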

Data Augmentation

Despite the removal of low steering data, the car would deviate from the track at a few places. Since the training data provided by Udacity is focused on driving down the middle of the road, the model did not learn what to do when the car drifts to the side of the road.

Approach 1: Collecting More Data

To teach the car what to do when it is off to the side of the road, I recorded recovery data, that is, data capturing how to steer when the car deviates from the track. I recorded the car driving from the side of the road back toward the center line.

This approach didn't work because the collected data did not have smooth steering angles across the laps (I did not have fine control over the steering angle when driving the simulator with a keyboard).

Approach 2: Image Transformations

Applying image transformation techniques to the existing data increases the volume of data available for training. Moreover, these transformations make the model less prone to overfitting.

The following techniques were used for image augmentation:

  • Flipping Image

    Mirroring the image and reversing the sign of the steering angle gives an equally valid image for training. In the image generator, 50% of the images were flipped.

    Flipped Images

  • Using Left and Right Camera Images

    A car positioned where the left camera sits has to move right to get to the center, and one positioned at the right camera has to move left. Adding a small angle of 0.25 to the steering angle for the left camera image and subtracting 0.25 for the right camera image does the trick.

    Corrected Left and Right Camera Images

  • Applying Horizontal and Vertical Shifts

    The camera images were shifted horizontally and vertically to simulate the effect of the car being at different positions on the road, and an offset corresponding to the shift was added to the steering angle (line 58 of

    Translated Image

* The approaches used for image transformation are described in this paper by NVIDIA and this blog post by Vivek Yadav.
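The three augmentations can be sketched with NumPy as below. The 0.25 camera correction comes from the text; the per-pixel steering offset `ANGLE_PER_PIXEL`, the zero fill at the image border, and the function names are assumptions for illustration. Only the horizontal shift is shown.

```python
import numpy as np

CAMERA_CORRECTION = 0.25   # value stated in the text
ANGLE_PER_PIXEL = 0.004    # hypothetical steering offset per shifted pixel


def flip(image, angle):
    """Mirror the image left-to-right and reverse the sign of the steering angle."""
    return image[:, ::-1, :], -angle


def camera_angle(center_angle, camera):
    """Steering angle to pair with the 'left', 'center' or 'right' camera image."""
    offsets = {"left": CAMERA_CORRECTION, "center": 0.0, "right": -CAMERA_CORRECTION}
    return center_angle + offsets[camera]


def shift_horizontal(image, angle, dx):
    """Shift the image dx pixels to the right (negative dx shifts left).

    Vacated columns are filled with zeros, and the steering angle is adjusted
    in proportion to the shift.
    """
    shifted = np.zeros_like(image)
    width = image.shape[1]
    if dx > 0:
        shifted[:, dx:, :] = image[:, :width - dx, :]
    elif dx < 0:
        shifted[:, :width + dx, :] = image[:, -dx:, :]
    else:
        shifted[:] = image
    return shifted, angle + dx * ANGLE_PER_PIXEL


frame = np.arange(2 * 4 * 1, dtype=np.uint8).reshape(2, 4, 1)
flipped, flipped_angle = flip(frame, 0.1)
moved, moved_angle = shift_horizontal(frame, 0.0, 1)
```

The sign convention matches the camera correction: content shifted to the right looks like the view from the left camera, so the angle increases.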

Parameter Tuning

The model used the Adam optimizer to minimize mean squared error (MSE) as the loss function. The initial learning rate was set to 0.001, as the model did not converge well with a learning rate of 0.01 ( line 25). The number of samples per epoch was based on the length of the training data, and the model was trained for 50 epochs, a number at which the model did not overfit (this was checked by monitoring the validation loss during training).

Model Generalization

The model was trained using images from track 1 alone, and it worked on track 2 without any further tuning. That the model works on a track it has never seen indicates that it generalizes well.

What more can be done?

  • The training data can be augmented with brightness and hue jitter.
  • Random patches of black tiles can be overlaid on the training images to simulate shadows and make the model less sensitive to them.
  • The image cropping can be done as part of the model to take advantage of CUDA acceleration.
  • The model does not perform well at higher resolutions, even when the images are resized to the current input size during preprocessing. This is most likely due either to the additional lag from the extra preprocessing, or to the resizing needing a particular interpolation strategy to match the current input specification.
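As a starting point for the first two ideas, brightness jitter and a random black patch could look roughly like this. Nothing here is part of the current project: the jitter range, the patch size, and the function names are hypothetical, and hue jitter is not covered.

```python
import numpy as np


def jitter_brightness(image, rng, low=0.5, high=1.5):
    """Scale all pixel values by one random factor and clip back to [0, 255]."""
    factor = rng.uniform(low, high)
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def add_black_patch(image, rng, patch_height=40, patch_width=80):
    """Return a copy of the image with one randomly placed black rectangle."""
    out = image.copy()
    height, width = image.shape[:2]
    top = rng.integers(0, height - patch_height + 1)
    left = rng.integers(0, width - patch_width + 1)
    out[top:top + patch_height, left:left + patch_width] = 0
    return out


rng = np.random.default_rng(0)
frame = np.full((160, 320, 3), 100, dtype=np.uint8)   # assumed frame size
bright = jitter_brightness(frame, rng)
shadowed = add_black_patch(frame, rng)
```

Neither transformation changes the steering angle, so both can be applied inside the image generator without touching the labels.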


