13. Regularization: Sparsity
Topic: Sparsity
Course: GMLC
Date: 17 March 2019
Professor: Not specified
- Models with many feature dimensions (e.g., from feature crosses) have many weights, which consume a lot of RAM
- Zeroing out weights that are close to 0 saves RAM and can reduce noise in the model
- L1 vs. L2 regularization:
  - L2 penalizes weight² (the square of each weight)
  - L1 penalizes |weight| (the absolute value of each weight)
  - The derivative of the L2 term is 2 * weight
  - The derivative of the L1 term is a constant k, independent of the weight's value (see the sketch below)
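A minimal numpy sketch contrasting the two penalty terms and their derivatives; the weight values and the strength `k` below are arbitrary choices for illustration:

```python
import numpy as np

# Hypothetical weight values, chosen just to illustrate the penalties.
w = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
k = 1.0  # regularization strength (lambda); an assumed value

l2_penalty = k * w ** 2     # L2 penalizes the square of each weight
l1_penalty = k * np.abs(w)  # L1 penalizes the absolute value of each weight

l2_grad = 2 * k * w         # L2 derivative shrinks as the weight shrinks
l1_grad = k * np.sign(w)    # L1 derivative is a constant k (times the sign);
                            # it stays the same size no matter how small the
                            # weight is (undefined at exactly 0)

print("L2 penalty:", l2_penalty, "gradient:", l2_grad)
print("L1 penalty:", l1_penalty, "gradient:", l1_grad)
```

Because the L1 gradient never fades near 0, it keeps pushing small weights all the way to exactly 0, which is what produces sparsity.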
- Convex optimization: using techniques such as gradient descent to find the minimum of a convex function (a minimal sketch follows below)
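A minimal sketch of gradient descent finding the minimum of a simple convex function, f(w) = (w - 3)²; the learning rate and iteration count are assumed values:

```python
# Gradient descent on the convex function f(w) = (w - 3)^2,
# whose single minimum is at w = 3.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * (w - 3)          # derivative of (w - 3)^2
    w -= learning_rate * grad   # step downhill against the gradient
print(w)  # converges to ~3.0, the minimum of the convex function
```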
- Know when to use L1 vs. L2 based on how each one nudges the model's weights:
  - L1 encourages weights that are close to 0 to become exactly 0, producing a sparse model that saves RAM (see the example after this list)
  - L2 brings weight values close to 0, but not exactly 0
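A small sketch of this difference using scikit-learn's Lasso (L1-regularized) and Ridge (L2-regularized) linear regression on synthetic data; the dataset shape and the alpha values are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 100 features, only 10 of which actually matter.
X, y = make_regression(n_samples=200, n_features=100,
                       n_informative=10, noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)  # L1-regularized linear regression
ridge = Ridge(alpha=1.0).fit(X, y)  # L2-regularized linear regression

# L1 drives most uninformative weights to exactly 0 (a sparse model);
# L2 only shrinks them toward 0, so essentially none end up exactly 0.
print("L1 zero weights:", np.sum(lasso.coef_ == 0.0))
print("L2 zero weights:", np.sum(ridge.coef_ == 0.0))
```

The zero weights in the L1 model need not be stored at all, which is the RAM saving the notes describe.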