Regularized learning algorithm
Several problems can arise when training a machine learning algorithm.
Underfitting problem
The polynomial degree of the features is too low to fit the target training set: the model cannot even perform well on the training set, let alone on future data.
Overfitting problem
The polynomial degree of the features is too high: the model fits the training set very well but fails to predict future data.
Regularization can ameliorate or reduce the overfitting problem.
Notation
- $m$ is the number of training examples.
- $n$ is the number of features.
- $\lambda$ is the regularization (penalty) parameter that shrinks the effect of high-degree polynomial features: the larger $\lambda$ is, the smaller their effect, but the learning algorithm underfits if $\lambda$ is too high.
Regularized linear regression
Cost function
$$
J(\theta)=\frac{1}{2m}\left[ \sum^{m}_{i=1} (h_{\theta}(x^{(i)})-y^{(i)})^2 + \lambda \sum^{n}_{j=1} \theta^2_j \right]
$$
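As a minimal NumPy sketch (the names `linear_cost` and `lam`, and the convention that `X` is an $m \times (n+1)$ design matrix whose first column is all ones, are my assumptions, not from the notes), this cost can be computed as:

```python
import numpy as np

def linear_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X : (m, n+1) design matrix, first column all ones (x_0 = 1)
    y : (m,) target vector
    lam : regularization parameter lambda; theta_0 is not penalized.
    """
    m = len(y)
    errors = X @ theta - y                  # h_theta(x^(i)) - y^(i) for every i
    penalty = lam * np.sum(theta[1:] ** 2)  # lambda * sum_j theta_j^2, j >= 1
    return (errors @ errors + penalty) / (2 * m)
```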
Gradient descent
$$
\begin{aligned}
\theta_0 &:=\theta_0-\alpha \frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_0 \\
\theta_j &:=\theta_j-\alpha \left[\frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_j + \frac{\lambda}{m}\theta_j \right] \quad (j=1,2,\dots,n)
\end{aligned}
$$
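A sketch of one update step under the same assumptions as above; computing the full gradient vector first and only then updating `theta` applies both rules (for $j=0$ and $j \ge 1$) simultaneously, as gradient descent requires:

```python
import numpy as np

def linear_gd_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update for linear regression."""
    m = len(y)
    errors = X @ theta - y              # residuals h_theta(x^(i)) - y^(i)
    grad = (X.T @ errors) / m           # (1/m) * sum_i errors_i * x^(i)_j, all j
    grad[1:] += (lam / m) * theta[1:]   # add (lambda/m) * theta_j for j >= 1 only
    return theta - alpha * grad
```

Repeating `linear_gd_step` for a fixed number of iterations, or until $J(\theta)$ stops decreasing, yields the fitted parameters.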
Regularized logistic regression
Cost function
$$
J(\theta)=-\left[ \frac{1}{m} \sum^{m}_{i=1} y^{(i)} \log h_\theta(x^{(i)})+(1-y^{(i)}) \log(1-h_\theta(x^{(i)})) \right] + \frac{\lambda}{2m}\sum^{n}_{j=1} \theta^2_j
$$
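A matching sketch under the same assumed design-matrix convention; the hypothesis is the sigmoid $h_\theta$ given at the end of this section:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost; y holds 0/1 labels."""
    m = len(y)
    h = sigmoid(X @ theta)  # h_theta(x^(i)) for every example
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)  # theta_0 excluded
    return cross_entropy + penalty
```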
Gradient descent
$$
\begin{aligned}
\theta_0 &:=\theta_0-\alpha \frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_0 \\
\theta_j &:=\theta_j-\alpha \left[\frac{1}{m}\sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_j + \frac{\lambda}{m}\theta_j \right] \quad (j=1,2,\dots,n)
\end{aligned}
$$

where the hypothesis is the sigmoid function:

$$
h_{\theta}(x)=\frac{1}{1+e^{-\theta^{T}x}}
$$
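One update step, again as a hedged sketch: it is textually identical to the linear-regression step, because only the hypothesis $h_\theta$ changes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update for logistic regression."""
    m = len(y)
    errors = sigmoid(X @ theta) - y     # h_theta(x^(i)) - y^(i), now with sigmoid h
    grad = (X.T @ errors) / m
    grad[1:] += (lam / m) * theta[1:]   # theta_0 stays unregularized
    return theta - alpha * grad
```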