Relative Entropy and the Wasserstein distance
Can anyone give an informative example of two distributions which have a low Wasserstein distance but high relative entropy (or the other way around)? I find the Wasserstein distance, defined (for some $p$) as
$$ W_p(\mu,\nu)=\left(\inf_{\pi\in\Pi(\mu,\nu)}\int d^p(x,y)\,\pi(dx,dy)\right)^{1/p}, $$
an intuitive, reasonable way to measure the distance (or displacement) between two probability measures. However, I'm struggling to see what relative entropy really tells us. I know from Sanov's theorem that it can be used to control the exponential rate of decay of the probability of a rare event, but I still don't have an intuitive feel for how it works, and I would really appreciate a concrete example to compare against the Wasserstein distance. I have heard that relative entropy controls the fluctuations of one distribution with respect to another, but I haven't yet understood what this means.
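(For reference, and not stated in the question itself: the standard definition of the relative entropy, or Kullback-Leibler divergence, of $\mu$ with respect to $\nu$ is)
$$ D(\mu\,\|\,\nu)=\begin{cases}\displaystyle\int \log\frac{d\mu}{d\nu}\,d\mu, & \mu\ll\nu,\\[1ex] +\infty, & \text{otherwise.}\end{cases} $$
Unlike $W_p$, this is not a metric: it is asymmetric, and it is infinite as soon as $\mu$ fails to be absolutely continuous with respect to $\nu$, no matter how close the two measures are geometrically.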
probability-theory measure-theory concentration-of-measure optimal-transport
asked Mar 16 at 21:49
Monty
1 Answer
For an example, look at the point masses $\delta_0$ and $\delta_h$, supported at $0$ and $h$ respectively. The Wasserstein distance between these is $O(h)$, which is small if $h$ is small. But for $h\ne 0$ the relative entropy is infinite, as the two measures are mutually singular.
For an example the other way around, let $f$ be the density of a $U[0,N]$ random variable, and let $g(x)=(1-\epsilon)f(x)$ for $x\in[0,N/2]$ and $g(x)=(1+\epsilon)f(x)$ otherwise. The Wasserstein distance is something like $O(N\epsilon)$ (because roughly an $\epsilon$ fraction of the mass has to be transferred over a distance of order $N/2$), but the relative entropy is something like $O(\epsilon)$, because $\log\big(f(x)/g(x)\big)=O(\epsilon)$. By a proper choice of $N$ and $\epsilon$, we can make the Wasserstein distance big but the relative entropy small.
The intuitive picture I have in mind is this: looking at the superimposed graphs of the densities of the two measures (pretending that they have densities), the relative entropy measures how much they differ in a vertical sense only, while the Wasserstein metric also allows for sideways nudging of the two graphs.
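As a quick numerical sanity check of the second example (not part of the original answer), the sketch below discretizes $f$ and $g$ on a grid over $[0,N]$ and compares the 1-Wasserstein distance with the relative entropy $D(f\,\|\,g)$, using `scipy.stats.wasserstein_distance` and `scipy.special.rel_entr`. The grid size and the values of $N$ and $\epsilon$ are arbitrary choices for illustration, and the direction of the divergence is one of the two possible choices.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.special import rel_entr

# Illustrative parameters (arbitrary choices).
N, eps, n_grid = 100.0, 0.01, 10_000

# Grid of cell midpoints on [0, N] and the two discretized densities,
# represented as probability weights on the grid points.
x = (np.arange(n_grid) + 0.5) * (N / n_grid)
f = np.full(n_grid, 1.0 / n_grid)                       # uniform on [0, N]
g = np.where(x <= N / 2, (1 - eps) * f, (1 + eps) * f)  # perturbed version
g /= g.sum()                                            # guard against rounding

# 1-Wasserstein distance between the two discrete measures on the same grid.
w1 = wasserstein_distance(x, x, u_weights=f, v_weights=g)

# Relative entropy D(f || g) = sum_i f_i * log(f_i / g_i), in nats.
kl = rel_entr(f, g).sum()

print(f"W_1 distance     ≈ {w1:.4f}")    # grows like N * eps
print(f"relative entropy ≈ {kl:.2e}")    # stays small for small eps
```

With these (assumed) parameters the Wasserstein distance is of order $N\epsilon$ while the relative entropy is tiny, matching the answer's point that the two quantities can be made to disagree by stretching the support.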
edited Mar 17 at 0:37 by Ankitp
answered Mar 16 at 22:17 by kimchi lover