13. Regularization: Sparsity

Regularization: Sparsity


Topic: Sparsity

Course: GMLC

Date: 17 March 2019  

Professor: Not specified


Resources


Key Points


  • Models with many dimensions (such as long feature vectors) consume a lot of RAM

  • Zeroing out weights that are close to 0 removes the corresponding features from the model and saves RAM

  • L1 vs. L2 (see the sketch after this list)

    • L2 penalizes weight squared

    • L1 penalizes |weight|

    • Derivative of L2 is 2 * weight

    • Derivative of L1 is k, a constant whose magnitude is independent of the weight (its sign follows the weight's sign)

  • Convex optimization

    • Using techniques such as gradient descent to find the minimum of a convex function
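
Below is a minimal NumPy sketch contrasting the two penalties and their derivatives (my own illustration, not part of the original notes; the weight values are arbitrary):

```python
import numpy as np

# Arbitrary weight vector for illustration.
weights = np.array([-0.5, 0.01, 0.0, 2.0])

# L2 penalizes the square of each weight; its derivative is 2 * weight,
# so the push toward 0 weakens as a weight approaches 0.
l2_penalty = np.sum(weights ** 2)
l2_gradient = 2 * weights            # [-1.0, 0.02, 0.0, 4.0]

# L1 penalizes |weight|; its derivative has constant magnitude k
# (here k = 1) whose sign follows the weight's sign. |w| is not
# differentiable at 0; np.sign uses the convention sign(0) = 0.
l1_penalty = np.sum(np.abs(weights))
l1_gradient = np.sign(weights)       # [-1.0, 1.0, 0.0, 1.0]

print("L2 penalty:", l2_penalty, "gradient:", l2_gradient)
print("L1 penalty:", l1_penalty, "gradient:", l1_gradient)
```

Because the L1 gradient keeps the same constant magnitude no matter how small a weight gets, it can push weights all the way to exactly 0, whereas the L2 gradient fades out and only shrinks them.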

Check your understanding


  • Know when to use L1 and L2 based on how each encourages the model's weights to behave

Summary of Notes


  • L1 regularization is used to save RAM: it encourages weights that are already close to 0 to become exactly 0, which removes the corresponding features from the model

  • L2 regularization is used to bring weight values close to 0, but not exactly to 0 (both behaviors are illustrated in the sketch below)
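
To make the two summary points concrete, here is a minimal sketch (my own illustration, not from the notes) that regularizes a simple per-weight quadratic loss, 0.5 * (w - target)^2, using plain gradient steps for L2 and a soft-thresholding (proximal) step for L1:

```python
import numpy as np

# Hypothetical "ideal" (unregularized) weights; values chosen only to
# show one small weight, one medium weight, and one large weight.
targets = np.array([0.05, 0.3, 2.0])
lam, lr, steps = 0.1, 0.1, 1000

w_l2 = np.zeros_like(targets)
w_l1 = np.zeros_like(targets)
for _ in range(steps):
    # L2: the penalty's gradient, 2 * lam * w, shrinks weights but
    # vanishes as w approaches 0, so weights never reach exactly 0.
    w_l2 -= lr * ((w_l2 - targets) + 2 * lam * w_l2)

    # L1: take a plain loss step, then a soft-thresholding (proximal)
    # step that snaps any weight within lr * lam of 0 to exactly 0.
    w_l1 -= lr * (w_l1 - targets)
    w_l1 = np.sign(w_l1) * np.maximum(np.abs(w_l1) - lr * lam, 0.0)

print("L2 weights:", w_l2)  # all nonzero, just shrunk toward 0
print("L1 weights:", w_l1)  # the smallest weight is exactly 0.0
```

Running this prints L2 weights of roughly [0.042, 0.25, 1.67] (shrunk, but all nonzero) and L1 weights of roughly [0.0, 0.2, 1.9]: L1 zeroes out the small weight exactly, while L2 only pulls it toward 0.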