The sigmoid function is a mathematical function with an S-shaped curve. It is widely used in logistic regression and artificial neural networks. Its mathematical form is:
$$f(x) = \frac{1}{1 + e^{-x}}$$
[Figure: graph of the sigmoid function]
The sigmoid function is continuous, smooth, strictly monotonic, and symmetric about the point (0, 0.5), which makes it a very good threshold function.
As x approaches negative infinity, y approaches 0; as x approaches positive infinity, y approaches 1; at x = 0, y = 0.5. Outside the interval [-6, 6] the function is nearly flat, staying within about 0.0025 of 0 or 1, so inputs beyond that range are generally not considered in applications.
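The limiting behaviour described above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name `sigmoid` and the branch on the sign of x (used to avoid overflow in `math.exp`) are choices made for this illustration, not part of the original text.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the equivalent form e^x / (1 + e^x),
    # so that exp() is never called with a large positive argument.
    z = math.exp(x)
    return z / (1.0 + z)

# Evaluate the function at the centre and near the edges of [-6, 6].
for x in (-6, -2, 0, 2, 6):
    print(f"f({x:+d}) = {sigmoid(x):.4f}")
```

At x = -6 and x = 6 the values come out at roughly 0.0025 and 0.9975, which is why the function is treated as saturated outside that interval.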
The range of the sigmoid function is the open interval (0, 1). Since probabilities take values in [0, 1], the output of the sigmoid function can be interpreted as a probability.
The derivative of the sigmoid function can be expressed in terms of the function itself, that is
$$f'(x) = f(x)\,(1 - f(x))$$
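This makes the derivative cheap to compute once f(x) is known. The derivation is as follows.
Writing $f(x) = (1 + e^{-x})^{-1}$ and applying the power rule together with the chain rule, we obtain
$$f'(x) = (-1)(1 + e^{-x})^{-2}\,\bigl(0 + (-1)e^{-x}\bigr) = \frac{e^{-x}}{(1 + e^{-x})^{2}} = \frac{e^{-x}}{1 + e^{-x}} \cdot \frac{1}{1 + e^{-x}}$$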
And:
$$1 - f(x) = 1 - \frac{1}{1 + e^{-x}} = \frac{e^{-x}}{1 + e^{-x}}$$
Therefore,
$$f'(x) = f(x)\,(1 - f(x)).$$
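As a sanity check on this identity, the closed form f(x)(1 - f(x)) can be compared with a numerical derivative. The sketch below is illustrative only: the helper names `sigmoid_grad` and `central_difference` and the step size `h` are assumptions introduced here.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x)); fine for moderate |x|."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative from the identity f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the analytic and numerical derivatives at a few points.
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    analytic = sigmoid_grad(x)
    numeric = central_difference(sigmoid, x)
    print(f"x={x:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two columns should agree to many decimal places. Note that `sigmoid_grad` reuses the value of f(x), so in a neural network the backward pass needs no extra exponential.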
The sigmoid function has good properties and can be used in classification problems, for example as the classifier in a logistic regression model. But why choose this function in particular? Besides being mathematically easy to handle, it also arises naturally from a derivation of its own.
For a classification problem, especially binary classification, the label is assumed to follow a Bernoulli distribution. The PMF of the Bernoulli distribution is:
$$f(x \mid p) = p^{x}\,(1 - p)^{1 - x}$$
The general form of a distribution in the exponential family is:
$$f(x \mid \theta) = h(x)\,\exp\{\eta(\theta)\,T(x) - A(\theta)\}$$
The Bernoulli distribution can be rewritten as:
$$f(x \mid p) = \exp\left\{ x \ln\frac{p}{1 - p} + \ln(1 - p) \right\}$$
where $\theta = p$, $h(x) = 1$, $T(x) = x$, $\eta(\theta) = \ln\frac{p}{1 - p}$, and $A(\theta) = -\ln(1 - p)$. Therefore, the Bernoulli distribution also belongs to the exponential family.
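To confirm that this rewriting is only a change of form, one can evaluate both expressions for x in {0, 1} and a few values of p. This is a minimal sketch; the function names `bernoulli_pmf` and `bernoulli_exp_family` are hypothetical and chosen for this illustration.

```python
import math

def bernoulli_pmf(x: int, p: float) -> float:
    """Direct form: p^x * (1 - p)^(1 - x), for x in {0, 1}."""
    return p ** x * (1.0 - p) ** (1 - x)

def bernoulli_exp_family(x: int, p: float) -> float:
    """Exponential-family form: h(x) * exp(eta * T(x) - A)."""
    h = 1.0                        # h(x) = 1
    t = x                          # T(x) = x
    eta = math.log(p / (1.0 - p))  # eta(theta) = ln(p / (1 - p))
    a = -math.log(1.0 - p)         # A(theta) = -ln(1 - p)
    return h * math.exp(eta * t - a)

# Both forms should give the same probability for every (p, x) pair.
for p in (0.1, 0.5, 0.9):
    for x in (0, 1):
        direct = bernoulli_pmf(x, p)
        expfam = bernoulli_exp_family(x, p)
        print(f"p={p}  x={x}  direct={direct:.6f}  exp-family={expfam:.6f}")
```

The exponential-family form requires 0 < p < 1, since the logarithms are undefined at the endpoints.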
We can now derive the relationship between $p$ and $\eta(\theta)$. Starting from
$$\eta(\theta) = \ln\frac{p}{1 - p}$$
Then:
$$-\eta(\theta) = -\ln\frac{p}{1 - p} = \ln\frac{1 - p}{p} = \ln\left(\frac{1}{p} - 1\right)$$
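Exponentiating both sides and solving for $p$ gives:
$$e^{-\eta(\theta)} = \frac{1}{p} - 1$$
$$1 + e^{-\eta(\theta)} = \frac{1}{p}$$
$$p = \frac{1}{1 + e^{-\eta(\theta)}}$$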
This is exactly the form of the sigmoid function.
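In other words, the sigmoid function is the inverse of the log-odds map p ↦ ln(p / (1 - p)). A short round-trip check illustrates this; the name `logit` for the log-odds function is a common convention but is an assumption here, since the original text does not name it.

```python
import math

def logit(p: float) -> float:
    """Natural parameter eta = ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def sigmoid(eta: float) -> float:
    """Recover p from eta via p = 1 / (1 + e^(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Map p to eta and back again; the recovered value should match p.
for p in (0.05, 0.3, 0.5, 0.8):
    eta = logit(p)
    recovered = sigmoid(eta)
    print(f"p={p:.2f}  eta={eta:+.4f}  sigmoid(eta)={recovered:.4f}")
```

For p = 0.5 the log-odds are 0, matching the earlier observation that the sigmoid equals 0.5 at x = 0.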