-
Notifications
You must be signed in to change notification settings - Fork 0
/
Tensorflow-Random-Net-Image-Classification.Rmd
249 lines (196 loc) · 10 KB
/
Tensorflow-Random-Net-Image-Classification.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
---
title: "Tensorflow-Random-Net-Image-Classification"
author: "Dale Kube"
date: "4/17/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(reticulate)
use_python('/usr/bin/python3', required=T)
```
This code uses many weak dense neural networks in a "random net" to achieve better classification and generalization. The commonly known random forest algorithm develops many weak decision trees; however, this example shows how a data scientist can take the same approach with neural networks. Instead of saying "random forest", one might call it a "random net" to reflect the use of neural networks.
This example is motivated by Jason Brownlee’s article called ‘How to Develop an Ensemble of Deep Learning Models in Keras’ found here: https://machinelearningmastery.com/model-averaging-ensemble-for-deep-learning-neural-networks/. Brownlee’s example represents a bagging ensemble. The random net, in this publication, goes a step further with the random selection of a subset of flattened pixel features for each neural network.
GitHub: https://github.com/dalekube/neural-networks/blob/master/Tensorflow-MLP-Random-Net-Image-Classification.py
```{python pkgImports, echo=FALSE}
import pandas as pd
import numpy as np
from matplotlib import pyplot
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, GaussianNoise, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.datasets import load_digits
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
```
## Load the data (handwritten digits from sklearn)
The handwritten digits dataset from sklearn is loaded using the load_digits() function from sklearn. The data set is documented here: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html. A 10% hold-out set is defined for future evaluation after the random net and weak learners are trained. This allows us to evaluate the performance of each model using unseen observations.
```{python loadData, results='hide'}
# Load the handwritten digits dataset from sklearn
data = load_digits()
# Define label encoder
le = LabelEncoder()
le.fit(data.target)
n_digits = len(data.target_names)
# Separate a 10% hold out data set for future predictions
# After the random forest is trained
train_x, new_x, train_y, new_y = train_test_split(data.data, data.target, stratify=data.target, test_size=0.1, random_state=77)
train_x = pd.DataFrame(train_x)
new_x = pd.DataFrame(new_x)
```
## Data set totals
```{python dataTotals}
# Data set totals
print("Training Samples =", '{:,}'.format(len(train_x)))
print("Hold Out Samples =", '{:,}'.format(len(new_x)))
```
## Display four random images
```{python imageDisplay}
# Display four random images from the data
# in a 2x2 plot grid
imgs = np.random.randint(0,len(data.images)-1,4)
fig, ax = pyplot.subplots(2,2)
ax[0,0].title.set_text('Digit = ' + str(data.target[imgs[0]]))
ax[0,0].imshow(data.images[imgs[0]])
ax[0,1].title.set_text('Digit = ' + str(data.target[imgs[1]]))
ax[0,1].imshow(data.images[imgs[1]])
ax[1,0].title.set_text('Digit = ' + str(data.target[imgs[2]]))
ax[1,0].imshow(data.images[imgs[2]])
ax[1,1].title.set_text('Digit = ' + str(data.target[imgs[3]]))
ax[1,1].imshow(data.images[imgs[3]])
```
## Function to train a single neural network (weak learner)
Define the function that accepts a random subset of training and testing data with a random subset of features (flattened pixel vectors) and trains an individual multi-layer perceptron network (also called a dense network). The function is called multiple times over the training process.
```{python weakLearner, results='hide'}
# Define function to evaluate a single MLP model
def train_model(rf_train_x, rf_train_y, rf_test_x, rf_test_y, n_features):
# Encode targets
rf_train_y = to_categorical(rf_train_y, num_classes=n_digits)
rf_test_y = to_categorical(rf_test_y, num_classes=n_digits)
# Define sequential model
model = Sequential()
model.add(Dense(50,input_dim=n_features))
model.add(Activation('relu'))
model.add(Dropout(0.05))
model.add(Dense(15))
model.add(Activation('relu'))
model.add(GaussianNoise(0.10))
model.add(Dense(n_digits))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=Adam(0.01), metrics=['accuracy'])
# Define callbacks
early_stop = EarlyStopping(monitor="val_loss", patience=3, verbose=1, restore_best_weights=True)
# Fit model
model.fit(rf_train_x.values, rf_train_y, epochs=100, batch_size=1000,
verbose=0, validation_data=(rf_test_x.values, rf_test_y),
callbacks=[early_stop])
# Evaluate model
_, test_acc = model.evaluate(rf_test_x.values, rf_test_y, verbose=0)
return model, test_acc
```
## Execute multiple train/test splits
Finally, we execute multiple splits and train the individual weak learners. This example trains 31 weak learners in total. A performance summary is provided as a benchmark once the training round is complete. The code calculates the average and standard deviation classification accuracy across the weak learners.
```{python fitModels, results='hide'}
# Execute multiple train/test splits
n_splits = 51
scores, models, cols = list(), list(), list()
for i in range(n_splits):
# Select a random subset of features
# Take the standard square root
n = int(round(np.sqrt(train_x.shape[1])))
rf_train_x = train_x.sample(n,axis=1)
n_features = rf_train_x.shape[1]
# Split data with 75% random bagging fraction
rf_train_x, rf_test_x, rf_train_y, rf_test_y = train_test_split(rf_train_x, train_y, test_size=0.50, random_state=77)
# Train the model
model, test_acc = train_model(rf_train_x, rf_train_y, rf_test_x, rf_test_y, n_features)
print('Round %.0f Complete > %.3f' % (i+1,test_acc))
scores.append(test_acc)
models.append(model)
cols.append(rf_train_x.columns)
```
## Evaluate the random net on the hold out set
Consecutively evaluate the ensembled performance of the weak learners on the hold out set by adding the resulting probabilities from each model. The class with the highest probability is used as the final prediction for the random net and measured against the true class in the hold out set. This allows us to observe the increase in performance as the weak learners produce a consensus prediction.
```{python randomNetEval, results='hide'}
# evaluate a specific number of members in an ensemble
def evaluate_n_members(models, n_models, new_x, new_y):
# Select a subset of models
# Isolate relevant columns
subset = models[:n_models]
subset_cols = cols[:n_models]
# Make predictions
yhats = list()
for i in range(0,len(subset)):
new_x_subset = new_x[subset_cols[i]]
yhati = subset[i].predict(new_x_subset.values)
yhati = np.array(yhati)
yhats.append(yhati)
# Compute bootstrapped accuracy
summed = np.sum(yhats, axis=0)
yhat = np.argmax(summed, axis=1)
return accuracy_score(new_y, yhat)
# Evaluate the random net on hold out set
single_scores, ensemble_scores = list(), list()
for i in range(1,n_splits+1):
# Compute ensemble score
ensemble_score = evaluate_n_members(models, i, new_x, new_y)
# Compute single score
new_y_enc = to_categorical(new_y, num_classes=n_digits)
new_subset = new_x[cols[i-1]]
_, single_score = models[i-1].evaluate(new_subset.values, new_y_enc, verbose=0)
# Print Results
print('> %d: single=%.3f, random net=%.3f' % (i, single_score, ensemble_score))
ensemble_scores.append(ensemble_score)
single_scores.append(single_score)
```
## Weak Learner Performance
```{python weakLearnerPerformance}
# Performance for the best weak learner
print('Maximum Weak Learner Accuracy %.3f' % max(scores))
```
## Ensemble Performance (Random Net)
```{python ensemblePerformance}
# Random Net Performance
print('Random Net Accuracy %.3f' % ensemble_scores[-1])
```
## Performance Plot
Plot the additive performance of the random net as each weak learner is included. This can be visually compared to the individual scores of the weak learners in the same sequential order.
```{python performancePlot, results='hide'}
# Plot individual scores vs. ensembled scores
x_axis = [i for i in range(1, n_splits+1)]
pyplot.plot(x_axis, single_scores, marker='o', linestyle='None')
pyplot.plot(x_axis, ensemble_scores, marker='o')
pyplot.title('Classification Performance\nRandom Net v. Individual Weak Learners')
pyplot.legend(['Weak Learners','Random Net'])
pyplot.ylabel('Classification Accuracy')
pyplot.xlabel('Weak Learner (Neural Net)')
pyplot.show()
```
## Single Strong Neural Network Benchmark
Train a single neural network using all of the training data and features to produce a benchmark. The benchmark comparison offers evidence that the additional generalization from the random net will produce better performance.
```{python benchmark}
# Encode targets
rf_train_y = to_categorical(train_y, num_classes=n_digits)
rf_new_y = to_categorical(new_y, num_classes=n_digits)
# Define sequential model
model = Sequential()
model.add(Dense(50,input_dim=train_x.shape[1]))
model.add(Activation('relu'))
model.add(Dropout(0.05))
model.add(Dense(15))
model.add(Activation('relu'))
model.add(GaussianNoise(0.10))
model.add(Dense(n_digits))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=Adam(0.01), metrics=['accuracy'])
# Define callbacks
early_stop = EarlyStopping(monitor="val_loss", patience=3, verbose=1, restore_best_weights=True)
# Fit model
history = model.fit(train_x, rf_train_y, epochs=100, batch_size=1000,
verbose=0, validation_data=(new_x, rf_new_y),
callbacks=[early_stop])
print('Single Strong Neural Network Accuracy =', '{:.2%}'.format(history.history['accuracy'][-1]))
```