Added Training Text for Housing Problem #183

79 changes: 69 additions & 10 deletions other/housing/housing.jl
@@ -1,12 +1,32 @@
# # Machine Learning Problem: Housing Dataset
#
# The housing problem is a classic starting point in machine learning.
# We'll demonstrate how to solve it using Julia's [Flux Package](https://fluxml.ai/).
#
# The data replicates the housing data example from the Knet.jl readme. Although we
# could have reused more of Flux (see the mnist example), the library's
# abstractions are very lightweight and don't force you into any particular
# strategy.
#
# [These lecture notes](http://www.mit.edu/~6.s085/notes/lecture3.pdf) cover the fundamentals
# of what we're about to do. Anything there that isn't mentioned in this file can
# safely be skipped (or looked up later to satisfy your curiosity).

using Flux.Tracker
using Flux.Tracker: Params, gradient, update!
using Flux: gpu
using DelimitedFiles, Statistics

# ## Getting the data and other pre-processing
# We'll start by downloading `housing.data` and splitting it into
# training and test sets.
# The training set is the sample of data used to **fit** the model, while
# the test set is the sample used to provide an unbiased evaluation
# of the final model fit on the training data.

# Our aim is to predict the price of a house. In this dataset, the last
# feature is the price, so it is our target.

cd(@__DIR__)

@@ -16,30 +36,60 @@ isfile("housing.data") ||

rawdata = readdlm("housing.data")'

#-

# Specify the train/test split ratio, and extract the feature matrix **x** and
# the target **y** (the 14th row, the price).
split_ratio = 0.1

x = rawdata[1:13,:] |> gpu
y = rawdata[14:14,:] |> gpu
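
# A quick shape check: the Boston housing set has 506 rows of 14 columns, so
# after the transpose above we expect a 13×506 feature matrix and a 1×506 target:
@show size(x)
@show size(y)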

# ### Normalising
# Why do we need it?
# Normalisation is a technique often applied as part of data preparation for
# machine learning. Its goal is to bring the values of numeric columns in the
# dataset onto a common scale, without distorting differences in their ranges.
# Not every dataset requires normalisation; it is needed only when features
# have very different ranges, as they do here.

x = (x .- mean(x, dims = 2)) ./ std(x, dims = 2)
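
# As a quick sanity check, every feature row should now have mean ≈ 0 and
# standard deviation ≈ 1 (up to floating-point error):
@assert all(abs.(mean(x, dims = 2)) .< 1e-8)
@assert all(isapprox.(std(x, dims = 2), 1))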

# ### Splitting into training and test sets

split_index = floor(Int,size(x,2)*split_ratio)
x_train = x[:,1:split_index]
y_train = y[:,1:split_index]
x_test = x[:,split_index+1:size(x,2)]
y_test = y[:,split_index+1:size(x,2)]
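
# With a `split_ratio` of 0.1, the first 10% of the columns form the training
# set and the remaining 90% the test set; a quick look at the shapes confirms it:
@show size(x_train)
@show size(x_test)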

# ## The Model
# Here comes everyone's favourite part: implementing a machine learning model.
#
# We'll now define the weight matrix `W` and the bias `b`. These are our model's
# parameters, which gradient descent tunes to improve the predictions.
# For an intuition about how gradient descent actually works, check out Andrew Ng's
# excellent explanations:
# [Video 1: Intuition](https://www.youtube.com/watch?v=rIVLE3condE) |
# [Video 2: The Algorithm](https://www.youtube.com/watch?v=yFPLyDwVifc)
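#
# In symbols, each gradient-descent step nudges every parameter against the
# gradient of the loss `L`: `W ← W - η ∂L/∂W` and `b ← b - η ∂L/∂b`, where `η`
# is the learning rate we choose below.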

W = param(randn(1,13)/10) |> gpu
b = param([0.]) |> gpu

# Here are our prediction and loss functions.
# - The prediction function returns our estimate of the price of a house, as
# determined by our two parameters `W` and `b`.
# - The mean squared error (MSE) is the loss function used for least-squares
# regression. It is the sum, over all data points, of the squared difference
# between the predicted and actual target values, divided by the number of
# data points.
#
# A loss function evaluates how well your algorithm models your dataset:
# if the predictions are off, the loss is high; if they're good, it is low.

predict(x) = W*x .+ b
meansquarederror(ŷ, y) = sum((ŷ .- y).^2)/size(y, 2)
loss(x, y) = meansquarederror(predict(x), y)
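
# Before any training, the loss is just the error of a randomly initialised
# linear model; a useful sanity check is that this number drops as we train:
@show loss(x_train, y_train)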

# ### Gradient Descent
# We now optimise our parameters to make the predictions more accurate; the
# videos linked above explain the algorithm in detail.

η = 0.1
θ = Params([W, b])

@@ -51,6 +101,15 @@ for i = 1:10
  # Take one descent step: compute the gradient of the loss with respect to
  # W and b, then move each parameter a small distance against its gradient.
  g = gradient(() -> loss(x_train, y_train), θ)
  for p in θ
    update!(p, -η .* g[p])
  end
  @show loss(x_train, y_train)
end

# ## Predictions
# Now we're in a position to see how well our model does on data it has never
# seen, by computing the mean squared error on the test set.

err = meansquarederror(predict(x_test),y_test)
println(err)
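
# The squared error is in squared-price units; taking its square root gives the
# root mean squared error (RMSE), which is in the same units as the price itself:
println(sqrt(err))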

# The model trained here is very simple and may still predict housing prices
# with a high error. The results can be improved with many other machine
# learning algorithms and techniques.
# If this was your first ML project in Flux, congrats!
#
# You should now have a feel for basic machine learning in Julia with the Flux package.