# regularized learning algorithm

Several problems can arise when training a machine learning algorithm.

`underfitting problem`

The feature polynomial degree is too low to fit the target training set, so the model performs poorly even on the training set, let alone on future data.

`overfitting problem`

The feature polynomial degree is too high, so the model fits the training set too well but is unable to predict future data.

Regularization can ameliorate or reduce the overfitting problem.

### premise

- $m$ is the number of training set records.
- $n$ is the number of features.
- $\lambda$ is the penalty value that reduces the effect of high-order polynomial features: the larger $\lambda$ is, the smaller their effect becomes, but the learning algorithm will underfit if $\lambda$ is too high.

## regularized linear regression

Cost function:

\[J(\theta)=\frac{1}{2m}\left[ \sum^{m}_{i=1} (h_{\theta}(x^{(i)})-y^{(i)})^2 + \lambda \sum^{n}_{j=1} \theta^2_j \right]\]

Note that the regularization sum starts at $j=1$, so $\theta_0$ is not penalized.
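The cost function above can be sketched in NumPy. This is a minimal illustration, assuming a design matrix `X` whose first column is all ones (the $x_0 = 1$ intercept term); the function name and toy data are hypothetical.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear regression cost J(theta).

    X: (m, n+1) design matrix, first column all ones.
    theta: (n+1,) parameter vector; theta[0] is not penalized.
    """
    m = len(y)
    residuals = X @ theta - y                  # h_theta(x^(i)) - y^(i)
    penalty = lam * np.sum(theta[1:] ** 2)     # skip theta_0
    return (np.sum(residuals ** 2) + penalty) / (2 * m)

# tiny example: theta fits y = 2*x exactly, so only the penalty term remains
X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([2.0, 4.0])
theta = np.array([0.0, 2.0])
print(regularized_cost(theta, X, y, lam=1.0))  # (0 + 1*2^2) / (2*2) = 1.0
```

With `lam=0.0` the penalty vanishes and this reduces to the ordinary least-squares cost.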

Gradient descent:

\[\begin{aligned} \theta_0 &:=\theta_0-\alpha \frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_0 \\ \theta_j &:=\theta_j-\alpha \left[\frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_j + \frac{\lambda}{m}\theta_j \right] \quad \text{(j=1,2,...,n)} \end{aligned}\]
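The two update rules can be vectorized into a single step, zeroing the regularization term for $\theta_0$. This is a sketch under the same design-matrix assumption as above; the toy data ($y = 2x$) and hyperparameters are hypothetical.

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update for linear regression.

    Matches the update rules above: theta_0 gets no (lambda/m)*theta term.
    """
    m = len(y)
    errors = X @ theta - y            # shape (m,)
    grad = (X.T @ errors) / m         # unregularized gradient, shape (n+1,)
    reg = (lam / m) * theta
    reg[0] = 0.0                      # do not regularize theta_0
    return theta - alpha * (grad + reg)

# toy data for y = 2*x; theta should converge near [0, 2]
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 2.0, 4.0])
theta = np.zeros(2)
for _ in range(2000):
    theta = gradient_descent_step(theta, X, y, alpha=0.1, lam=0.01)
print(theta)
```

Because of the penalty term, the learned $\theta_1$ comes out slightly smaller than the unregularized value of 2 — this shrinkage is exactly how regularization tames high-order features.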

## regularized logistic regression

Cost function:

\[J(\theta)=-\left[ \frac{1}{m} \sum^{m}_{i=1} y^{(i)} \log h_\theta(x^{(i)})+(1-y^{(i)}) \log(1-h_\theta(x^{(i)})) \right] + \frac{\lambda}{2m}\sum^{n}_{j=1} \theta^2_j\]

Gradient descent:

\[\begin{aligned} \theta_0 &:=\theta_0-\alpha \frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_0 \\ \theta_j &:=\theta_j-\alpha \left[\frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_j + \frac{\lambda}{m}\theta_j \right] \quad \text{(j=1,2,...,n)} \end{aligned}\]

The update rules look identical to the linear regression case, but here the hypothesis is the sigmoid function:

\[h_{\theta}(x)=\frac{1}{1+e^{-\theta^{T}x}}\]
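Regularized logistic regression can be sketched the same way, swapping in the sigmoid hypothesis. This is a minimal illustration with hypothetical toy data (a 1-D separable classification problem); the function names are my own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic regression cost J(theta)."""
    m = len(y)
    h = sigmoid(X @ theta)
    ce = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    return ce + lam * np.sum(theta[1:] ** 2) / (2 * m)

def logistic_step(theta, X, y, alpha, lam):
    """One gradient-descent update; same form as the linear case,
    but with the sigmoid hypothesis."""
    m = len(y)
    errors = sigmoid(X @ theta) - y
    grad = (X.T @ errors) / m
    reg = (lam / m) * theta
    reg[0] = 0.0                       # theta_0 is not regularized
    return theta - alpha * (grad + reg)

# toy 1-D classification: label is 1 when x >= 2
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros(2)
for _ in range(5000):
    theta = logistic_step(theta, X, y, alpha=0.5, lam=0.1)
preds = (sigmoid(X @ theta) >= 0.5).astype(float)
print(preds)
```

On this separable toy set the learned boundary lands between $x=1$ and $x=2$, so the predictions match the labels; a larger $\lambda$ would shrink $\theta$ further and flatten the sigmoid, eventually underfitting.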