# logistic regression

## premise

- $X$ is an $m \times n$ `matrix` (`m` rows and `n` columns) that represents the training set.
- $\theta$ is a $1 \times n$ `vector` that holds the hypothesis parameters.
- $y$ is an $m \times 1$ `vector` that holds the true values (labels) of the training set.
- $\alpha$ is the `learning rate`, which controls the learning (descent) speed.

# 1. Hypothesis

Define the hypothesis that models the pattern in the data.

Since the output of a classification problem ranges from 0 to 1, we use the `sigmoid` function, which maps any real number into that range.
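The formula itself is not shown in these notes. A common form, assumed here, is $h_\theta(x) = g(\theta \cdot x)$ with $g(z) = \frac{1}{1 + e^{-z}}$. The sketch below uses plain Python lists; the names `sigmoid` and `hypothesis` are illustrative only.

```python
import math


def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    # Branch on the sign of z so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """Return h_theta(x) = sigmoid(theta . x) for one training row x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

For example, `hypothesis([0.0, 0.0], [1.0, 2.0])` evaluates the sigmoid at 0 and returns 0.5, the point where the model is undecided between the two classes.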

# 2. Cost

Calculate the cost for a single training point.
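The per-point cost is not written out here. The usual choice for logistic regression, assumed in this sketch, is the cross-entropy cost $-y \log(h) - (1 - y) \log(1 - h)$, where $h$ is the hypothesis output for that point. The function name `point_cost` and the clamping constant `eps` are illustrative assumptions.

```python
import math


def point_cost(h, y, eps=1e-12):
    """Cross-entropy cost for one prediction h in [0, 1] and one label y in {0, 1}."""
    # Clamp h away from 0 and 1 so that math.log never receives 0.
    h = min(max(h, eps), 1.0 - eps)
    return -y * math.log(h) - (1 - y) * math.log(1.0 - h)
```

A confident correct prediction costs little (`point_cost(0.9, 1)` is small), while a confident wrong one is penalized heavily (`point_cost(0.1, 1)` is much larger).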

# 3. Cost function

Define the cost function by summing the cost over the whole training set.
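Assuming the standard averaged cross-entropy form $J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$, a minimal sketch could look like the following. `X` is a list of `m` rows with `n` values each, `y` is a list of `m` labels, and the name `cost_function` is hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cost_function(theta, X, y, eps=1e-12):
    """Average cross-entropy cost J(theta) over all m training rows."""
    m = len(X)
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, row)))
        h = min(max(h, eps), 1.0 - eps)  # keep log() away from 0
        total += -label * math.log(h) - (1 - label) * math.log(1.0 - h)
    return total / m
```

With all parameters at zero every prediction is 0.5, so $J(\theta)$ equals $\log 2$ regardless of the data, which is a handy sanity check.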

# 4. Get optimized parameter

Learn from the training set to get the optimized parameters for the proposed algorithm.

### Gradient Descent
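The update rule is not written out in these notes. The sketch below assumes the usual batch rule $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$, applied to every $\theta_j$ simultaneously. The function name, the zero initialization, and the fixed iteration count are illustrative assumptions, not a tuned implementation.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def gradient_descent(X, y, alpha=0.1, iterations=500):
    """Fit theta with batch gradient descent, starting from all zeros."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # Prediction error h_theta(x) - y for every training row.
        errors = [
            sigmoid(sum(t * xi for t, xi in zip(theta, row))) - label
            for row, label in zip(X, y)
        ]
        # Partial derivative of the cost with respect to each theta_j.
        gradient = [
            sum(e * row[j] for e, row in zip(errors, X)) / m for j in range(n)
        ]
        # Simultaneous update of all parameters.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta


# Toy data: the first column is a constant 1 (bias term).
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]
theta = gradient_descent(X, y)
predictions = [
    1 if sigmoid(sum(t * xi for t, xi in zip(theta, row))) >= 0.5 else 0
    for row in X
]
```

A larger $\alpha$ takes bigger steps and can overshoot the minimum, while a smaller one converges more slowly, so the defaults above are only a starting point.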

### Others

- Conjugate gradient
- BFGS
- L-BFGS