Partial derivative of MSE cost function in Linear Regression?
I'm confused by multiple representations of the partial derivatives of the Linear Regression cost function.

This is the MSE cost function of Linear Regression, where $h_\theta(x) = \theta_0 + \theta_1 x$:

$$\begin{aligned}
J(\theta_0,\theta_1) &= \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2\\
&= \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2
\end{aligned}$$

Are these the correct partial derivatives of the above MSE cost function with respect to $\theta_1$ and $\theta_0$? If there is any mistake, please correct me.

$$\begin{aligned}
\frac{\partial J}{\partial\theta_1} &= \frac{-2}{m}\sum_{i=1}^{m} x^{(i)}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)\\
\frac{\partial J}{\partial\theta_0} &= \frac{-2}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)
\end{aligned}$$

calculus partial-derivative linear-regression

asked Mar 17 at 23:36 by user214 (edited Mar 18 at 11:58 by MarianD)
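One way to double-check derivative formulas like these is a finite-difference comparison. The following is a minimal NumPy sketch on made-up data; it uses the $+\frac{2}{m}$ sign (the correction pointed out in the comments below), so the analytic and numerical gradients should agree closely.

```python
import numpy as np

# Made-up data, only for checking the gradient formulas numerically.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 3.0 + 2.0 * x + rng.normal(scale=0.1, size=20)
m = len(x)

def J(t0, t1):
    """MSE cost J(theta_0, theta_1) = (1/m) * sum (theta_0 + theta_1*x_i - y_i)^2."""
    return np.mean((t0 + t1 * x - y) ** 2)

def analytic_grad(t0, t1):
    """Partial derivatives with the +2/m factor."""
    r = t0 + t1 * x - y                    # residuals
    return (2.0 / m) * np.sum(r), (2.0 / m) * np.sum(r * x)

def numeric_grad(t0, t1, eps=1e-6):
    """Central finite differences, for comparison."""
    d0 = (J(t0 + eps, t1) - J(t0 - eps, t1)) / (2 * eps)
    d1 = (J(t0, t1 + eps) - J(t0, t1 - eps)) / (2 * eps)
    return d0, d1

print(analytic_grad(0.5, -1.0))
print(numeric_grad(0.5, -1.0))   # should agree to several decimal places
```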
Yes, except the minus sign. But this is not important since you set them equal to $0$. – callculus, Mar 17 at 23:40

@callculus So it's $\frac{2}{m}$ rather than $\frac{-2}{m}$ for both the cases. I couldn't get what you meant by "you set them equal to 0". – user214, Mar 17 at 23:45

Are you still interested in an answer? – callculus, Mar 18 at 2:15

@callculus Yes. I'd appreciate it. – user214, Mar 18 at 2:29

@user214: Great, love it. – callculus, Mar 18 at 15:38
1 Answer
The derivatives are almost correct, but instead of a minus sign you should have a plus sign. The minus sign appears if we differentiate

$$J = \dfrac{1}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]^2.$$

If we calculate the partial derivatives we obtain

$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-1\right],$$

$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-x_i\right].$$

In order to find the extremum of the cost function $J$ (we seek to minimize it), we set these partial derivatives equal to $0$:

$$\dfrac{\partial J}{\partial \theta_0}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-1\right]=0
\implies \sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]=0,$$

$$\dfrac{\partial J}{\partial \theta_1}=\frac{2}{m}\sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot\left[-x_i\right]=0
\implies \sum_{i=1}^{m}\left[y_i-\theta_0-\theta_1 x_i\right]\cdot x_i = 0.$$

Since we divide by $-2/m$ in both cases, we obtain the same result. If you had $+2/m$, you would divide by $2/m$ and still obtain the same equations as stated above. If the equations that we need to solve are identical, the solutions will also be identical.

answered Mar 17 at 23:41 by MachineLearner (edited Mar 18 at 13:02 by user214)
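To see this concretely, the sketch below (made-up data, NumPy assumed) builds the $2\times 2$ system that comes from setting both partial derivatives to zero, scales it by an arbitrary nonzero constant such as $-2/m$ or $+2/m$, and solves it; the solution is the same either way and matches an ordinary least-squares fit.

```python
import numpy as np

# Made-up data, only to illustrate that the constant factor drops out.
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.5 - 0.7 * x + rng.normal(scale=0.2, size=50)
m = len(x)

def solve_normal_equations(c):
    """Solve c*sum(y - t0 - t1*x) = 0 and c*sum((y - t0 - t1*x)*x) = 0 for (t0, t1).
    Any nonzero c (e.g. -2/m or +2/m) scales both sides and cancels out."""
    A = c * np.array([[m,         np.sum(x)],
                      [np.sum(x), np.sum(x ** 2)]])
    b = c * np.array([np.sum(y), np.sum(x * y)])
    return np.linalg.solve(A, b)

print(solve_normal_equations(-2.0 / m))    # from the -2/m version of the gradients
print(solve_normal_equations(+2.0 / m))    # from the +2/m version: same (t0, t1)
print(np.polyfit(x, y, 1)[::-1])           # least-squares fit: [intercept, slope]
```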
Can you please include the corrected formula in your answer? – user214, Mar 17 at 23:53

You just have to multiply your partial derivatives by $(-1)$. Both ways lead to the same result. – MachineLearner, Mar 18 at 6:22

I'm trying to build a Stochastic Gradient Descent. So can I use $2/m$ instead of $-2/m$ and calculate the gradients right? – user214, Mar 18 at 10:00

@user214: In the end, the plus or minus does not make a difference, because you set the derivatives equal to zero. But your code could irritate other people. That is why you should use $2/m$ instead of the wrong $-2/m$ (which nevertheless leads to the same correct result) as a factor. – MachineLearner, Mar 18 at 10:04

@user214: I added more details. You can try it on your own for the correct version and for the wrong version. You will see that you obtain the same result if you solve for $\theta_0$ and $\theta_1$. – MachineLearner, Mar 18 at 11:12
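Regarding the stochastic gradient descent mentioned in the comments: a minimal per-sample update sketch (synthetic data and an assumed learning rate) would use the positive-sign gradient and subtract it. Note that for the descent step itself, unlike for setting the gradient to zero, the sign of the gradient does matter.

```python
import numpy as np

# Made-up data and an assumed learning rate; a bare-bones SGD loop for
# h(x) = theta_0 + theta_1 * x with per-sample squared error.
rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 4.0 + 0.5 * x + rng.normal(scale=0.3, size=200)

theta0, theta1 = 0.0, 0.0
lr = 0.05

for epoch in range(50):
    for i in rng.permutation(len(x)):
        residual = theta0 + theta1 * x[i] - y[i]
        grad0 = 2.0 * residual            # d/dtheta_0 of (theta_0 + theta_1*x_i - y_i)^2
        grad1 = 2.0 * residual * x[i]     # d/dtheta_1 of the same term
        theta0 -= lr * grad0              # descend: subtract the correctly signed gradient
        theta1 -= lr * grad1

print(theta0, theta1)   # should be close to the generating intercept 4.0 and slope 0.5
```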