13. Regularization: Sparsity
Topic: Sparsity
Course: GMLC
Date: 17 March 2019
Professor: Not specified
- Models with many feature dimensions (e.g., from feature crosses) have many weights, which consume a lot of RAM
- Zeroing out weights that are close to 0 saves RAM and can reduce noise in the model
- L1 vs. L2 regularization:
  - L2 penalizes weight² (the square of each weight)
  - L1 penalizes |weight| (the absolute value of each weight)
  - The derivative of the L2 term is 2 * weight
  - The derivative of the L1 term is a constant k, independent of the weight's value (see the sketch below)
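A minimal numpy sketch contrasting the two penalty terms and their derivatives; the weight values and the strength `k` below are arbitrary choices for illustration:

```python
import numpy as np

# Hypothetical weight values, chosen just to illustrate the penalties.
w = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
k = 1.0  # regularization strength (lambda); an assumed value

l2_penalty = k * w ** 2     # L2 penalizes the square of each weight
l1_penalty = k * np.abs(w)  # L1 penalizes the absolute value of each weight

l2_grad = 2 * k * w         # L2 derivative shrinks as the weight shrinks
l1_grad = k * np.sign(w)    # L1 derivative is a constant k (times the sign);
                            # it stays the same size no matter how small the
                            # weight is (undefined at exactly 0)

print("L2 penalty:", l2_penalty, "gradient:", l2_grad)
print("L1 penalty:", l1_penalty, "gradient:", l1_grad)
```

Because the L1 gradient never fades near 0, it keeps pushing small weights all the way to exactly 0, which is what produces sparsity.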
- Convex optimization: using techniques such as gradient descent to find the minimum of a convex function (a minimal sketch follows below)
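A minimal sketch of gradient descent finding the minimum of a simple convex function, f(w) = (w - 3)²; the learning rate and iteration count are assumed values:

```python
# Gradient descent on the convex function f(w) = (w - 3)^2,
# whose single minimum is at w = 3.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * (w - 3)          # derivative of (w - 3)^2
    w -= learning_rate * grad   # step downhill against the gradient
print(w)  # converges to ~3.0, the minimum of the convex function
```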
- Know when to use L1 vs. L2 based on how each one nudges the model's weights:
  - L1 encourages weights that are close to 0 to become exactly 0, producing a sparse model that saves RAM (see the example after this list)
  - L2 brings weight values close to 0, but not exactly 0
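A small sketch of this difference using scikit-learn's Lasso (L1-regularized) and Ridge (L2-regularized) linear regression on synthetic data; the dataset shape and the alpha values are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 100 features, only 10 of which actually matter.
X, y = make_regression(n_samples=200, n_features=100,
                       n_informative=10, noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)  # L1-regularized linear regression
ridge = Ridge(alpha=1.0).fit(X, y)  # L2-regularized linear regression

# L1 drives most uninformative weights to exactly 0 (a sparse model);
# L2 only shrinks them toward 0, so essentially none end up exactly 0.
print("L1 zero weights:", np.sum(lasso.coef_ == 0.0))
print("L2 zero weights:", np.sum(ridge.coef_ == 0.0))
```

The zero weights in the L1 model need not be stored at all, which is the RAM saving the notes describe.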