aka least absolute shrinkage and selection operator
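Like Ridge regression, but penalizes the l1 norm of the coefficients instead of the squared l2 norm. The l1 penalty drives some coefficients exactly to zero, which gives sparse models, so it acts as a form of Feature selection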
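Written out, the comparison with Ridge looks roughly like this. The notation is an assumption introduced here, not from the note: X is the n-by-d design matrix, y the targets, beta the coefficients, lambda >= 0 the penalty strength, and the 1/(2n) scaling is just one common convention.

```latex
\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1,
\qquad \lVert \beta \rVert_1 = \sum_{j=1}^{d} \lvert \beta_j \rvert

\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
```

Larger lambda means more coefficients at exactly zero; lambda = 0 recovers ordinary least squares.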
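The l1 penalty is not differentiable at zero, so plain gradient descent does not apply directly. Lasso can instead be optimized using Sub-gradient descent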
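A minimal sketch of Sub-gradient descent on the Lasso objective, assuming NumPy, a 1/(2n)-scaled squared error, and a diminishing step size step0 / sqrt(t). The function names and the parameters `lam`, `n_iters`, and `step0` are made up for illustration, not taken from any library.

```python
import numpy as np

def lasso_objective(X, y, beta, lam):
    """(1/2n) * ||X beta - y||^2 + lam * ||beta||_1 (assumed scaling)."""
    n = X.shape[0]
    residual = X @ beta - y
    return 0.5 / n * (residual @ residual) + lam * np.abs(beta).sum()

def lasso_subgradient_descent(X, y, lam=0.1, n_iters=2000, step0=0.1):
    """Sub-gradient descent for Lasso; returns the best iterate seen."""
    n, d = X.shape
    beta = np.zeros(d)
    best_beta = beta.copy()
    best_obj = lasso_objective(X, y, beta, lam)
    for t in range(1, n_iters + 1):
        # Gradient of the smooth squared-error term.
        grad_smooth = X.T @ (X @ beta - y) / n
        # np.sign(0) == 0, which is a valid subgradient of |b| at b = 0.
        subgrad = grad_smooth + lam * np.sign(beta)
        # Diminishing step size (an assumed schedule).
        beta = beta - (step0 / np.sqrt(t)) * subgrad
        # The objective is not guaranteed to decrease every step,
        # so keep track of the best iterate.
        obj = lasso_objective(X, y, beta, lam)
        if obj < best_obj:
            best_obj, best_beta = obj, beta.copy()
    return best_beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    true_beta = np.array([3.0, 0.0, 0.0, -2.0, 0.0])  # synthetic sparse signal
    y = X @ true_beta + 0.1 * rng.standard_normal(200)
    print(lasso_subgradient_descent(X, y, lam=0.1))
```

Note that the iterates of Sub-gradient descent generally end up very close to zero on the irrelevant coefficients rather than exactly zero, so a small threshold is needed to read off the selected features.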