Final project for the Machine Learning Nanodegree on Udacity - recognizing house numbers with tensorflow

ilyusha/udacity-final-project

This project requires the following libraries:

  • Tensorflow (at least 0.10.0, but less than 1.0)
  • PIL
  • web.py
  • werkzeug
  • h5py
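
The Tensorflow version constraint above (at least 0.10.0, below 1.0) can be checked before running any of the modules. The following is a minimal sketch, not part of the project: the helper names `parse_version` and `is_supported` are hypothetical, and the parsing assumes a dotted version string such as "0.12.1".

```python
import re

# Bounds taken from the requirements list: >= 0.10.0 and < 1.0
MIN_VERSION = (0, 10, 0)
MAX_VERSION = (1, 0, 0)


def parse_version(version):
    """Turn a string such as '0.12.1' or '0.12.0rc1' into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits, so 'rc1'-style suffixes are ignored
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '1.0' to three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version):
    """Return True if the version lies in the range [0.10.0, 1.0)."""
    return MIN_VERSION <= parse_version(version) < MAX_VERSION
```

With Tensorflow installed, the check would be called as `is_supported(tensorflow.__version__)`.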

Top-level modules:

  • flags.py: Various flags used throughout the system
  • generate.py: Module used to generate synthetic data
  • image.py: Various image-processing utility functions
  • numberlocator.py: Module to train and test the number locator
  • readnumber.py: Module to train the digit recognizer model; also contains the primary function locate_and_read_number
  • server.py: Web application code
  • synthetic_models.py: Library of some of the neural network configurations attempted during training on synthetic data
  • svhn_models.py: Library of some of the neural network configurations attempted during training on SVHN data

The "inputs" directory contains modules that deal with pre-processing input data, as well as utility classes for iterating over large datasets:

  • inputs/datasource.py: Classes encapsulating generating input data for the classifiers
  • inputs/batch.py: Classes for batch iterating over input data
  • svhn.py: Functions for dealing with SVHN metadata
  • sample_digit_data.py: Functions for dealing with synthetic data generated for digit classifiers
  • sample_length_data.py: Functions for dealing with synthetic data generated for a stand-alone length classifier

The "models" directory contains classes representing various neural network configurations:

  • models/base.py: Primary base class for all neural networks. Contains the core classifier code.
  • models/multilayer.py: Multi-Layer Perceptron class
  • models/convolution.py: CNN class. Includes single-logit and multi-logit variants.

Other directories:

  • "classifiers" contains the Tensorflow checkpoint and metadata files necessary for loading neural network parameters.
  • "uploaded" contains all images uploaded to the web application.
  • "templates" contains template files used by web.py to render the web application.

Input data:

  • SVHN data can be downloaded from http://ufldl.stanford.edu/housenumbers/. Before the data can be used, its metadata must be extracted by running "svhn.py parse [train|test|extra]".
  • Synthetic data can be generated by the "generate.py" module. Command is "generate.py [by_digit|by_length] {# of images per label}".

Training for the number locator can be kicked off with "numberlocator.py train". "numberlocator.py locate {imgfile}" will locate potential number bounding boxes on the provided image.

Digit recognizer training can be started with "readnumber.py --train --[synthetic|svhn] --[joint|digit {1-5}|length]"

  • The "--joint" option will train a multi-logit network to output the sequence length and all digit positions at once
  • The "--digit {1-5}" option will train the network for the specified digit position
  • The "--length" option will train the network on the digit sequence length
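
To illustrate what the three training targets mean, here is a hedged sketch of how a house number could be turned into a sequence length plus five per-position digit labels. It is not taken from the project code: the names `encode_number` and `decode_number` are hypothetical, and using class index 10 as a "blank" label for unused positions is an assumption. Only the limit of five digit positions comes from the options above.

```python
MAX_DIGITS = 5  # the "--digit" option accepts positions 1-5
BLANK = 10      # assumption: extra class index marking an unused position


def encode_number(number):
    """Return (length, positions) for a house number given as str or int.

    positions always has MAX_DIGITS entries; unused slots hold BLANK.
    """
    digits = [int(ch) for ch in str(number)]
    if not 1 <= len(digits) <= MAX_DIGITS:
        raise ValueError("expected between 1 and %d digits" % MAX_DIGITS)
    padded = digits + [BLANK] * (MAX_DIGITS - len(digits))
    return len(digits), padded


def decode_number(length, positions):
    """Rebuild the number string from a length and per-position labels."""
    return "".join(str(d) for d in positions[:length])
```

Under these assumptions, a "--joint" network would predict the length and all five positions together, a "--digit 2" network would predict only the second entry of `positions`, and a "--length" network would predict only the first element of the returned pair.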

The entire algorithm can be invoked via command line using "readnumber.py --run {imgfile}"

Deploying the web application:

The application can be deployed by simply starting server.py. This launches a development server on 0.0.0.0:8080, which does not perform well under high load. Alternatively, server.py can be run through any WSGI-compatible web server. For example, it can be served with uWSGI by running "uwsgi --http :8080 --wsgi-file server.py".
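
For readers unfamiliar with WSGI, the sketch below shows the kind of callable a WSGI server loads from a file. It is a standalone illustration using only plain Python, not the contents of server.py: the response text and the `call_app` helper are hypothetical. uWSGI looks for a callable named `application` by default, and a web.py application can usually expose one through its `wsgifunc()` method.

```python
def application(environ, start_response):
    """Minimal WSGI callable: echoes the requested path as plain text."""
    path = environ.get("PATH_INFO", "/")
    body = ("requested path: %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings
    return [body]


def call_app(app, path):
    """Hypothetical helper: invoke a WSGI app in-process, without a server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call_app(application, "/upload")` exercises the callable the same way a WSGI server would, which makes it convenient for quick checks before deploying.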
