
Partial derivative of MSE cost function in Linear Regression?


I'm confused by the multiple representations of the partial derivatives of the linear regression cost function.



This is the MSE cost function of linear regression. Here $h_\theta(x) = \theta_0+\theta_1x$.



$$\begin{aligned}J(\theta_0,\theta_1) &= \frac{1}{m}\displaystyle\sum_{i=1}^{m}(h_\theta(x^{(i)}) - y^{(i)})^2\\ J(\theta_0,\theta_1) &= \frac{1}{m}\displaystyle\sum_{i=1}^{m}(\theta_0 + \theta_1x^{(i)} - y^{(i)})^2\end{aligned}$$



Are these the correct partial derivatives of the above MSE cost function of linear regression with respect to $\theta_1$ and $\theta_0$? If there's any mistake, please correct me.



$$\begin{aligned}\frac{dJ}{d\theta_1} &= \frac{-2}{m}\displaystyle\sum_{i=1}^{m}(x^{(i)})\cdot(\theta_0 + \theta_1x^{(i)} - y^{(i)})\\
\frac{dJ}{d\theta_0} &= \frac{-2}{m}\displaystyle\sum_{i=1}^{m}(\theta_0 + \theta_1x^{(i)} - y^{(i)})\end{aligned}$$
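As a quick sanity check (a minimal NumPy sketch; the data and parameter values below are made up purely for illustration and are not part of the original post), one can compare the formulas above against a central finite-difference approximation of $J$:

    import numpy as np

    # Made-up data and parameter values, only to test the formulas numerically.
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.0, 4.1, 6.2, 7.9])
    theta0, theta1 = 0.5, 1.5
    m = len(x)

    def J(t0, t1):
        return np.mean((t0 + t1 * x - y) ** 2)

    # Derivatives exactly as written above (with the -2/m factor).
    d0_formula = (-2.0 / m) * np.sum(theta0 + theta1 * x - y)
    d1_formula = (-2.0 / m) * np.sum(x * (theta0 + theta1 * x - y))

    # Central finite differences of J as a reference.
    eps = 1e-6
    d0_numeric = (J(theta0 + eps, theta1) - J(theta0 - eps, theta1)) / (2 * eps)
    d1_numeric = (J(theta0, theta1 + eps) - J(theta0, theta1 - eps)) / (2 * eps)

    # Compare analytic and numerical values (sign included).
    print(d0_formula, d0_numeric)
    print(d1_formula, d1_numeric)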










  • Yes, except the minus sign. But this is not important since you set them equal to $0$.
    – callculus, Mar 17 at 23:40










  • @callculus So it's $\frac{2}{m}$ rather than $-\frac{2}{m}$ in both cases. I couldn't get what you meant by "you set them equal to 0".
    – user214, Mar 17 at 23:45











  • Are you still interested in an answer?
    – callculus, Mar 18 at 2:15










  • @callculus Yes. I'd appreciate it.
    – user214, Mar 18 at 2:29






  • @user214: Great, love it.
    – callculus, Mar 18 at 15:38















calculus · partial-derivative · linear-regression

asked Mar 17 at 23:36 by user214 · edited Mar 18 at 11:58 by MarianD











1 Answer
The derivatives are almost correct, but instead of a minus sign, you should have a plus sign. The minus sign is there if we differentiate



$$J = \dfrac{1}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]^2$$



If we calculate the partial derivatives, we obtain



$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]\cdot\left[-1\right]$$
$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]\cdot\left[-x_i\right]$$



In order to find the extremum of the cost function $J$ (we seek to minimize it), we need to set these partial derivatives equal to $0$:
$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]\cdot\left[-1\right]=0$$
$$\implies \sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]=0$$
$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]\cdot\left[-x_i\right]=0$$
$$\implies \sum_{i=1}^{m}[y_i-\theta_0-\theta_1x_i]\cdot\left[x_i\right] = 0.$$



As we divide by $-2/m$ in both cases, we obtain the same result. If you had $+2/m$, then you would divide by $2/m$ and still obtain the same equations as stated above. If the equations that we need to solve are identical, the solutions will also be identical.
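For completeness, here is a minimal NumPy sketch (the synthetic data, learning rate, and iteration count are arbitrary illustrative choices, not part of the original answer) of plain gradient descent using the $+2/m$ form of the gradients, written with the residual $\theta_0+\theta_1 x_i - y_i$ as in the question:

    import numpy as np

    # Synthetic data: y = 3 + 2x + noise (illustrative only).
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 10.0, size=100)
    y = 3.0 + 2.0 * x + rng.normal(0.0, 0.5, size=100)

    def gradients(theta0, theta1):
        """Partial derivatives of J = (1/m) * sum((theta0 + theta1*x - y)^2)."""
        m = len(x)
        residual = theta0 + theta1 * x - y            # h_theta(x) - y
        d_theta0 = (2.0 / m) * np.sum(residual)       # dJ/dtheta0
        d_theta1 = (2.0 / m) * np.sum(residual * x)   # dJ/dtheta1
        return d_theta0, d_theta1

    # Plain (batch) gradient descent.
    theta0, theta1 = 0.0, 0.0
    alpha = 0.01                                      # learning rate
    for _ in range(5000):
        d0, d1 = gradients(theta0, theta1)
        theta0 -= alpha * d0
        theta1 -= alpha * d1

    print(theta0, theta1)   # should land close to (3, 2) for this data

In the descent update above the sign does matter (the gradient is subtracted), which is another reason to write the factor as $+2/m$.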






  • Can you please include the corrected formula in your answer?
    – user214, Mar 17 at 23:53










  • You just have to multiply your partial derivatives by $(-1)$. Both ways lead to the same result.
    – MachineLearner, Mar 18 at 6:22










  • I'm trying to build a stochastic gradient descent. So can I use $2/m$ instead of $-2/m$ and still calculate the gradients correctly?
    – user214, Mar 18 at 10:00






  • @user214: In the end, the plus or minus does not make a difference, because you set the derivatives equal to zero. But your code could irritate other people. That is why you should use $2/m$ instead of the wrong $-2/m$ (which leads to the same correct result) as a factor.
    – MachineLearner, Mar 18 at 10:04






  • @user214: I added more details. You can try it on your own for the correct version and for the wrong version. You will see that we obtain the same result if you solve for $\theta_0$ and $\theta_1$.
    – MachineLearner, Mar 18 at 11:12
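Following up on the stochastic gradient descent mentioned in the comments, here is a minimal per-sample sketch (the synthetic data, learning rate, and epoch count are arbitrary illustrative choices). For a single sample the cost is $(\theta_0+\theta_1x_i-y_i)^2$, so the factor becomes $+2$ rather than $+2/m$:

    import numpy as np

    # Synthetic data: y = 3 + 2x + noise (illustrative only).
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 10.0, size=200)
    y = 3.0 + 2.0 * x + rng.normal(0.0, 0.5, size=200)

    theta0, theta1 = 0.0, 0.0
    alpha = 0.001                        # learning rate

    for epoch in range(50):
        for i in rng.permutation(len(x)):
            # Gradient of the single-sample cost (theta0 + theta1*x[i] - y[i])^2,
            # using the positive sign convention discussed above.
            residual = theta0 + theta1 * x[i] - y[i]
            theta0 -= alpha * 2.0 * residual
            theta1 -= alpha * 2.0 * residual * x[i]

    print(theta0, theta1)   # should approach (3, 2) for this data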









