#costfunction

The Fourier transform package is highly efficient for analyzing, maintaining, and managing large databases. Check our info: www.incegna.com Reg link for programs: http://www.incegna.com/contact-us Follow us on Facebook: www.facebook.com/INCEGNA/? Follow us on Instagram: https://www.instagram.com/_incegna/ For queries: [email protected] #deeplearning,#fouriertransformation,#specialtportrayal,#databases,#datascience,#datascientists,#costfunction,#neuralnetworks,#gradientdescent,#backpropagation,#datanormalization https://www.instagram.com/p/B-ZJH4XAbmT/?igshid=1ve79hloe8pel

Andrew Ng killing it with his sketching while talking about saddle-points in high dimensional spaces on the awesome Deep Learning course on Coursera!! 😂😂😂 . . #djav #djlife #geekbyday #coderbyday #djbynight #machinelearning #artificialintelligence #deeplearning #neuralnetworks #costfunction #localoptima #globaloptima #highdimspace #saddlepoints #sketch #drawing #baddrawing #goodlearning #funlearning #epic #epicsketch #thatsahorse #lookslikeakomododragon (at Coursera)
Artificial Neural Networks Part-II
My new blog post is finally out! Tell me in the comment section how you like it; I would appreciate hearing from you. Cheers! #machinelearning #nanotechnology #neuralnetworks #ai #activationenergy #gradientdescent #neuron #costfunction
In the previous post, we learned how networks learn, how we arrived at computers with brains, and why neural networks matter; the key takeaway was that machines learn from training data. In this post, we will dig into the heart of neural networks, which involves some fascinating multivariate functions. We will learn about gradient descent, backpropagation, and a bit of math. We will…
#Activation Energy#AI#Alan Turing#artificial intelligence#Backpropagation#Cost Function#Gradient Descent#machine learning#MNIST#Neural Networks
Summary
20170904
Revised the code `e .^ (-z)` to `exp(-z)`.
Revised the code

```matlab
gradients = 1 / m * X' * (H - y) + lambda / m * theta;
gradients(1) = 1 / m * X(:,1)' * (H - y);
```

to

```matlab
temp = theta;
temp(1) = 0;
gradients = 1 / m * X' * (H - y) + lambda / m * temp;
```
1. Hypothesis Function
1.1. Linear Regression
$$
\begin{split}
h_{\theta}\left( x\right) &= \theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} + \ldots + \theta_{n}x_{n} \\
&= \sum_{j=0}^{n}\theta_{j}x_{j} \qquad \left( x_{0}=1\right) \\
&= \theta^{T}x \qquad \left(\theta = \left[ \begin{matrix} \theta_{0}\\ \theta_{1}\\ \vdots \\ \theta_{n} \end{matrix} \right] \in \mathbb{R}^{n+1},\ x = \left[ \begin{matrix} x_{0}\\ x_{1}\\ \vdots \\ x_{n} \end{matrix} \right] \in \mathbb{R}^{n+1},\ x_{0}=1 \right)
\end{split}
$$
```matlab
function [y_test] = hypothesisFunction(X_test, theta)
  y_test = X_test * theta;
end
```
1.2. Logistic Regression
We need $0\leq h_{\theta}\left( x\right) \leq 1$, so we introduce the sigmoid function $g\left( z\right) = \dfrac {1} {1+e^{-z}}$:

$$
\begin{split}
h_{\theta}\left( x\right) &= g\left(\theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} + \ldots + \theta_{n}x_{n}\right) \\
&= g\left(\sum_{j=0}^{n}\theta_{j}x_{j}\right) \qquad \left( x_{0}=1\right) \\
&= g\left(\theta^{T}x\right) \qquad \left(\theta = \left[ \begin{matrix} \theta_{0}\\ \theta_{1}\\ \vdots \\ \theta_{n} \end{matrix} \right] \in \mathbb{R}^{n+1},\ x = \left[ \begin{matrix} x_{0}\\ x_{1}\\ \vdots \\ x_{n} \end{matrix} \right] \in \mathbb{R}^{n+1},\ x_{0}=1 \right) \\
\Rightarrow h_{\theta}\left( x\right) &= \dfrac {1} {1+e^{-\theta^{T}x}}
\end{split}
$$
```matlab
function [y_test] = hypothesisFunction(X_test, theta)
  y_test = 1 ./ (1 + exp(-X_test * theta));
end
```
```matlab
function p = predict(theta, X)
  % Used to check prediction accuracy
  p = round(1 ./ (1 + exp(-X * theta)));
end
```
2. Cost Function
2.1. Linear Regression Cost Function
$$
J\left( \theta \right) = \dfrac {1} {2m}\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)^{2}
$$
```matlab
function J = costFunction(X, y, theta)
  m = length(y);
  J = sum((X * theta - y) .^ 2) / (2 * m);
end
```
2.2. Logistic Regression Cost Function
$$
J\left( \theta \right) = -\dfrac {1} {m}\sum_{i=1}^{m}\left[ y^{\left( i\right)}\log h_{\theta}\left( x^{\left( i\right)}\right) + \left( 1-y^{\left( i\right)}\right)\log\left( 1-h_{\theta}\left( x^{\left( i\right)}\right)\right)\right]
$$
```matlab
function J = costFunction(X, y, theta)
  m = length(y);
  H = 1 ./ (1 + exp(-X * theta));
  J = -1 / m * sum(y .* log(H) + (1 - y) .* log(1 - H));
end
```
3. Regularized Cost Function
3.1. Linear Regression Regularized Cost Function
$$
J\left( \theta \right) = \dfrac {1} {2m}\left[\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)^{2} + \lambda\sum_{j=1}^{n}\theta_{j}^{2}\right]
$$
```matlab
function J = linearRegCostFunction(X, y, theta, lambda)
  m = length(y);
  J = sum((X * theta - y) .^ 2) / (2 * m) + lambda / (2 * m) * sum(theta(2:end,:) .^ 2);
end
```
3.2. Logistic Regression Regularized Cost Function
$$
J\left( \theta \right) = -\dfrac {1} {m}\sum_{i=1}^{m}\left[ y^{\left( i\right)}\log h_{\theta}\left( x^{\left( i\right)}\right) + \left( 1-y^{\left( i\right)}\right)\log\left( 1-h_{\theta}\left( x^{\left( i\right)}\right)\right)\right] + \dfrac {\lambda} {2m}\sum_{j=1}^{n}\theta_{j}^{2}
$$
```matlab
function J = costFunctionRegularized(X, y, theta, lambda)
  m = length(y);
  H = 1 ./ (1 + exp(-X * theta));
  J = -1 / m * sum(y .* log(H) + (1 - y) .* log(1 - H)) + lambda / (2 * m) * sum(theta(2:end,:) .^ 2);
end
```
4. Gradient Descent
The complexity is $O\left( kn^{2}\right)$.

$$
\begin{split}
\text{Repeat}\ \{\quad & \\
\theta_{j} &:= \theta_{j} - \alpha\dfrac {\partial} {\partial\theta_{j}}J\left( \theta\right) \qquad \left( j=0,1,\cdots ,n\right) \\
&:= \theta_{j} - \alpha\dfrac {1} {m}\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)x_{j}^{\left( i\right)} \\
\}\quad &
\end{split}
$$
```matlab
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iterations)
  % Not needed if you use fminunc instead.
  m = length(y);
  J_history = zeros(num_iterations, 1);
  for iter = 1:num_iterations
    theta = theta - alpha / m * X' * (X * theta - y);
    % computeCostMulti computes the linear-regression cost J from section 2.1 (not defined in this post).
    J_history(iter) = computeCostMulti(X, y, theta);
  end
end
```
4.1. Linear Regression Gradient Descent: Partial Derivatives
```matlab
function gradients = gradientsFunction(X, y, theta)
  % When using fminunc, merge this code into the corresponding cost function.
  m = length(y);
  H = X * theta;
  gradients = 1 / m * X' * (H - y);
end
```
4.2. Logistic Regression Gradient Descent: Partial Derivatives
```matlab
function gradients = gradientsFunction(X, y, theta)
  % When using fminunc, merge this code into the corresponding cost function.
  m = length(y);
  H = 1 ./ (1 + exp(-X * theta));
  gradients = 1 / m * X' * (H - y);
end
```
5. Regularized Gradient Descent
$$
\begin{split}
\text{Repeat}\ \{\quad & \\
\theta_{0} &:= \theta_{0} - \alpha\dfrac {\partial} {\partial\theta_{0}}J\left( \theta\right) \\
&:= \theta_{0} - \alpha\dfrac {1} {m}\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)x_{0}^{\left( i\right)} \\
\theta_{j} &:= \theta_{j} - \alpha\dfrac {\partial} {\partial\theta_{j}}J\left( \theta\right) \qquad \left( j=1,\cdots ,n\right) \\
&:= \theta_{j} - \alpha\left[ \dfrac {1} {m}\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)x_{j}^{\left( i\right)} + \dfrac {\lambda} {m}\theta_{j}\right] \\
&:= \theta_{j}\left( 1-\alpha\dfrac {\lambda} {m}\right) - \alpha\dfrac {1} {m}\sum_{i=1}^{m}\left( h_{\theta}\left( x^{\left( i\right)}\right) - y^{\left( i\right)}\right)x_{j}^{\left( i\right)} \\
\}\quad &
\end{split}
$$
Here $\theta_{0}$ is handled separately and is not regularized.
Note: since $1-\alpha \dfrac {\lambda } {m} < 1$, we have $\theta_{j}\left( 1-\alpha \dfrac {\lambda } {m}\right) < \theta_{j}$, so regularization shrinks $\theta_{j}$ at every update.
5.1. Linear Regression Regularized Gradient Descent: Partial Derivatives
```matlab
function gradients = gradientsFunctionRegularized(X, y, theta, lambda)
  % When using fminunc, merge this code into the corresponding cost function.
  m = length(y);
  H = X * theta;
  temp = theta;
  temp(1) = 0;  % do not regularize theta_0
  gradients = 1 / m * X' * (H - y) + lambda / m * temp;
end
```
5.2. Logistic Regression Regularized Gradient Descent: Partial Derivatives
```matlab
function gradients = gradientsFunctionRegularized(X, y, theta, lambda)
  % When using fminunc, merge this code into the corresponding cost function.
  m = length(y);
  H = 1 ./ (1 + exp(-X * theta));
  temp = theta;
  temp(1) = 0;  % do not regularize theta_0
  gradients = 1 / m * X' * (H - y) + lambda / m * temp;
end
```
6. Feature Scaling
Constraining the values of $x_{i}$ to a common range makes gradient descent converge more efficiently.
This range is usually taken as $-1\leq x_{i}\leq 1$, $-3\leq x_{i}\leq 3$, or $-\dfrac {1} {3}\leq x_{i}\leq \dfrac {1} {3}$.

$$
x_{i} := \dfrac {x_{i}-\mu_{i}} {s_{i}}
$$

Here $\mu_{i}$ is the mean of $x_{i}$, and $s_{i}$ is the range of $x_{i}$ (max − min); in code, the standard deviation $\sigma$ is commonly used instead.
```matlab
function [X_normalize, mu, sigma] = featureNormalize(X)
  % Do not include the x0 column in X.
  n = size(X, 2);
  mu = mean(X);
  sigma = std(X);
  % (ones(n, 1) * (1 ./ sigma)) .* eye(n) builds diag(1 ./ sigma)
  X_normalize = (X - mu) * (ones(n, 1) * (1 ./ sigma) .* eye(n));
end
```
7. Normal Equation
The complexity is $O\left( n^{3}\right)$.
Conditions:

- $n \leq 10000$ (roughly; the more powerful the machine, the larger this threshold can be).
- $m > n$; otherwise $X^{T}X$ has no inverse.
- No redundant (linearly dependent) features.
$$
\theta = \left( X^{T}X\right)^{-1}X^{T}\overrightarrow {y}
$$
```matlab
function [theta] = normalEquation(X, y)
  theta = pinv(X' * X) * X' * y;  % or use inv
end
```
8. Regularized Normal Equation
Conditions:

- $n \leq 10000$ (roughly; the more powerful the machine, the larger this threshold can be).
- No redundant (linearly dependent) features.
$$
\begin{split}
\theta &= \left( X^{T}X + \lambda\cdot L\right)^{-1}X^{T}\overrightarrow {y} \\
L &= \left[ \begin{matrix} 0 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{matrix} \right] \qquad \left( L\in \mathbb{R}^{\left( n+1\right)\times\left( n+1\right)}\right)
\end{split}
$$
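The original notes give no code for this case; the following is a minimal Octave/MATLAB sketch that follows the formula above (the function name `normalEquationRegularized` is my own, and it assumes `X` already includes the x0 column):

```matlab
function [theta] = normalEquationRegularized(X, y, lambda)
  % Regularized normal equation: theta = (X'X + lambda*L)^(-1) * X' * y
  n = size(X, 2);   % X already includes the x0 column, so n = features + 1
  L = eye(n);
  L(1, 1) = 0;      % do not regularize theta_0
  theta = pinv(X' * X + lambda * L) * X' * y;
end
```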
9. Miscellaneous
9.1. Adding the x0 Column to X
```matlab
function [X] = addx0(X)
  m = size(X, 1);
  X = [ones(m, 1) X];
end
```
9.2. Feature Scaling for X_test
```matlab
function [X_test_normalize] = featureNormalizeForTest(X_test, mu, sigma)
  % Do not include the x0 column; mu and sigma come from featureNormalize on the training set.
  n = size(X_test, 2);
  X_test_normalize = (X_test - mu) * (ones(n, 1) * (1 ./ sigma) .* eye(n));
end
```
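As a rough sketch of how these helpers fit together (the variable names `X_train`, `X_test`, `y_train` and the learning-rate/iteration values are placeholders, not from the original; the functions from the earlier sections are assumed to be on the path): the training set is normalized first, the same `mu`/`sigma` are reused for the test set, and the x0 column is added afterwards.

```matlab
% Sketch: scale training data, reuse mu/sigma on test data, then add x0.
[X_train_norm, mu, sigma] = featureNormalize(X_train);
X_test_norm = featureNormalizeForTest(X_test, mu, sigma);

X_train_norm = addx0(X_train_norm);
X_test_norm  = addx0(X_test_norm);

% Fit with gradient descent (section 4) and predict with the linear hypothesis (section 1.1).
theta = zeros(size(X_train_norm, 2), 1);
[theta, J_history] = gradientDescent(X_train_norm, y_train, theta, 0.01, 400);
y_pred = hypothesisFunction(X_test_norm, theta);
```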
9.3. Using fminunc
Cost function code:
```matlab
function [J, gradients] = costFunction(X, y, theta, lambda)
  % Regularized logistic regression: cost and gradient in one function.
  m = length(y);
  H = 1 ./ (1 + exp(-X * theta));
  J = -1 / m * sum(y .* log(H) + (1 - y) .* log(1 - H)) + lambda / (2 * m) * sum(theta(2:end,:) .^ 2);
  gradients = 1 / m * X' * (H - y) + lambda / m * theta;
  gradients(1) = 1 / m * X(:,1)' * (H - y);  % theta_0 is not regularized
end
```
Calling code:
```matlab
initial_theta = zeros(size(X, 2), 1);
lambda = 1;
options = optimset('GradObj', 'on', 'MaxIter', 100);  % 100 iterations
[theta, J, exit_flag] = fminunc(@(t)(costFunction(X, y, t, lambda)), initial_theta, options);
```
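To check the fitted model, the `predict` helper from section 1.2 can be reused; a small sketch (it assumes `X` already contains the x0 column, as in the call above):

```matlab
% Sketch: training-set accuracy using the predict function from section 1.2.
p = predict(theta, X);
fprintf('Training accuracy: %.2f%%\n', mean(double(p == y)) * 100);
```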
Written with StackEdit.