Partial derivative of MSE cost function in Linear Regression?
I'm confused by multiple representations of the partial derivatives of the Linear Regression cost function.

This is the MSE cost function of Linear Regression, where $h_\theta(x) = \theta_0 + \theta_1 x$:

$$\begin{aligned}
J(\theta_0,\theta_1) &= \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2\\
&= \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2
\end{aligned}$$

Are these the correct partial derivatives of the above MSE cost function with respect to $\theta_1$ and $\theta_0$? If there is any mistake, please correct me.

$$\begin{aligned}
\frac{\partial J}{\partial\theta_1} &= \frac{-2}{m}\sum_{i=1}^{m} x^{(i)}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)\\
\frac{\partial J}{\partial\theta_0} &= \frac{-2}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)
\end{aligned}$$

calculus partial-derivative linear-regression

asked Mar 17 at 23:36 by user214 (edited Mar 18 at 11:58 by MarianD)
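One way to double-check derivative formulas like these is a finite-difference comparison. The following is a minimal NumPy sketch on made-up data; it uses the $+\frac{2}{m}$ sign (the correction pointed out in the comments below), so the analytic and numerical gradients should agree closely.

```python
import numpy as np

# Made-up data, only for checking the gradient formulas numerically.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 3.0 + 2.0 * x + rng.normal(scale=0.1, size=20)
m = len(x)

def J(t0, t1):
    """MSE cost J(theta_0, theta_1) = (1/m) * sum (theta_0 + theta_1*x_i - y_i)^2."""
    return np.mean((t0 + t1 * x - y) ** 2)

def analytic_grad(t0, t1):
    """Partial derivatives with the +2/m factor."""
    r = t0 + t1 * x - y                    # residuals
    return (2.0 / m) * np.sum(r), (2.0 / m) * np.sum(r * x)

def numeric_grad(t0, t1, eps=1e-6):
    """Central finite differences, for comparison."""
    d0 = (J(t0 + eps, t1) - J(t0 - eps, t1)) / (2 * eps)
    d1 = (J(t0, t1 + eps) - J(t0, t1 - eps)) / (2 * eps)
    return d0, d1

print(analytic_grad(0.5, -1.0))
print(numeric_grad(0.5, -1.0))   # should agree to several decimal places
```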
Yes, except the minus sign. But this is not important since you set them equal to $0$. – callculus, Mar 17 at 23:40

@callculus So it's $\frac{2}{m}$ rather than $\frac{-2}{m}$ for both the cases. I couldn't get what you meant by "you set them equal to 0". – user214, Mar 17 at 23:45

Are you still interested in an answer? – callculus, Mar 18 at 2:15

@callculus Yes. I'd appreciate it. – user214, Mar 18 at 2:29

@user214: Great, love it. – callculus, Mar 18 at 15:38
1 Answer
The derivatives are almost correct, but instead of a minus sign you should have a plus sign. The minus sign appears if we differentiate

$$J = \dfrac{1}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]^2.$$

If we calculate the partial derivatives we obtain

$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-1\right],$$

$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-x_i\right].$$

In order to find the extremum of the cost function $J$ (we seek to minimize it), we set these partial derivatives equal to $0$:

$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-1\right]=0
\implies \sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]=0,$$

$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-x_i\right]=0
\implies \sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot x_i = 0.$$

Since we divide by $-2/m$ in both cases, we obtain the same result. If you had $+2/m$, you would divide by $2/m$ and still obtain the same equations as stated above. If the equations that we need to solve are identical, the solutions will also be identical.

answered Mar 17 at 23:41 by MachineLearner (edited Mar 18 at 13:02 by user214)
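To see this concretely, the sketch below (made-up data, NumPy assumed) builds the $2\times 2$ system that comes from setting both partial derivatives to zero, scales it by an arbitrary nonzero constant such as $-2/m$ or $+2/m$, and solves it; the solution is the same either way and matches an ordinary least-squares fit.

```python
import numpy as np

# Made-up data, only to illustrate that the constant factor drops out.
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.5 - 0.7 * x + rng.normal(scale=0.2, size=50)
m = len(x)

def solve_normal_equations(c):
    """Solve c*sum(y - t0 - t1*x) = 0 and c*sum((y - t0 - t1*x)*x) = 0 for (t0, t1).
    Any nonzero c (e.g. -2/m or +2/m) scales both sides and cancels out."""
    A = c * np.array([[m,         np.sum(x)],
                      [np.sum(x), np.sum(x ** 2)]])
    b = c * np.array([np.sum(y), np.sum(x * y)])
    return np.linalg.solve(A, b)

print(solve_normal_equations(-2.0 / m))    # from the -2/m version of the gradients
print(solve_normal_equations(+2.0 / m))    # from the +2/m version: same (t0, t1)
print(np.polyfit(x, y, 1)[::-1])           # least-squares fit: [intercept, slope]
```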
Can you please include the corrected formula in your answer? – user214, Mar 17 at 23:53

You just have to multiply your partial derivatives by $(-1)$. Both ways lead to the same result. – MachineLearner, Mar 18 at 6:22

I'm trying to build a Stochastic Gradient Descent. So can I use $2/m$ instead of $-2/m$ and calculate the gradients right? – user214, Mar 18 at 10:00

@user214: In the end, the plus or minus does not make a difference, because you set the derivatives equal to zero. But your code could irritate other people. That is why you should use $2/m$ instead of the wrong $-2/m$ (which nevertheless leads to the same correct result) as a factor. – MachineLearner, Mar 18 at 10:04

@user214: I added more details. You can try it on your own for the correct version and for the wrong version. You will see that you obtain the same result if you solve for $\theta_0$ and $\theta_1$. – MachineLearner, Mar 18 at 11:12
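Regarding the stochastic gradient descent mentioned in the comments: a minimal per-sample update sketch (synthetic data and an assumed learning rate) would use the positive-sign gradient and subtract it. Note that for the descent step itself, unlike for setting the gradient to zero, the sign of the gradient does matter.

```python
import numpy as np

# Made-up data and an assumed learning rate; a bare-bones SGD loop for
# h(x) = theta_0 + theta_1 * x with per-sample squared error.
rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 4.0 + 0.5 * x + rng.normal(scale=0.3, size=200)

theta0, theta1 = 0.0, 0.0
lr = 0.05

for epoch in range(50):
    for i in rng.permutation(len(x)):
        residual = theta0 + theta1 * x[i] - y[i]
        grad0 = 2.0 * residual            # d/dtheta_0 of (theta_0 + theta_1*x_i - y_i)^2
        grad1 = 2.0 * residual * x[i]     # d/dtheta_1 of the same term
        theta0 -= lr * grad0              # descend: subtract the correctly signed gradient
        theta1 -= lr * grad1

print(theta0, theta1)   # should be close to the generating intercept 4.0 and slope 0.5
```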