Fitting user-specified PDF, e.g. power spectral density #62
Thanks, Ondrej!
For your needs, this is the relevant paper:
https://projecteuclid.org/euclid.aoas/1396966280
I haven't implemented it, but there may be an implementation somewhere here:
http://tuvalu.santafe.edu/~aaronc/powerlaws/
If a good implementation were created for powerlaw, I'd happily bring it on
board.
…On Tue, Oct 2, 2018 at 10:20 AM Ondrej Grover ***@***.***> wrote:
I really like your software, it makes it easier to judge the hype of
powerlaws in datasets.
However, right now it focuses on fitting full datasets, creating their PDF
and CDF on the fly. I'd like to use it in situations where I already have a
PDF (defined at several points) - or generally a distribution function of
some sort - and fit its shape in some range.
An example is the power spectral density of turbulent plasmas, where there
is an ongoing discussion
<https://dx.doi.org/10.1103/PhysRevLett.107.185003> whether they are
powerlaws or exponentials.
I'd be willing to contribute modifications to powerlaw which would make
this optional use case possible. But I would greatly appreciate it if you
could point out how best to approach this issue.
Thank you for the reply.
I also found out that their implementation of the operations on binned data is available at http://tuvalu.santafe.edu/~aaronc/powerlaws/bins/
The methods currently in powerlaw do not do fitting based on the binned
data; they work directly on the data points themselves. Binning is done
only for visualizing PDFs (in a sense there is no binning for CDFs, which
is actually a major reason to use them for visualization, as they do less
damage to the data in presentation).
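To make the distinction concrete, here is a minimal standard-library sketch (not powerlaw's own code) of two operations that work on the raw points without any binning: the standard continuous maximum-likelihood estimate of the exponent, and the empirical CDF. The function names `fit_alpha` and `empirical_cdf` and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def fit_alpha(data, xmin):
    """Continuous power-law MLE, computed from the raw points (no binning)."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def empirical_cdf(data):
    """Return the sorted values and P(X <= x) at each of them; no bins involved."""
    xs = sorted(data)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


# Synthetic power-law sample via inverse-transform sampling (illustrative only).
rng = random.Random(0)
alpha_true, xmin = 2.5, 1.0
data = [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha_true - 1.0))
        for _ in range(20000)]

alpha_hat = fit_alpha(data, xmin)
xs, cdf = empirical_cdf(data)
```

Both functions need the individual observations, which is why a PDF that is only known at a few points cannot be passed through this kind of estimator directly.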
…On Wed, Oct 3, 2018 at 2:44 AM Ondrej Grover ***@***.***> wrote:
Thank you for the reply.
My naive hope was that it would suffice to simply enable the user to
specify the CDF and bins directly, i.e. to set self.fitting_cdf_bins and
self.fitting_cdf without the actual data, instead of having them created
from the data. Then I would probably have to change later operations to
operate on the CDF instead of the data itself.
Perhaps a reasonable approach would be to wrap the data in some object
which would expose methods such as cdf. This would separate the source of
the information on the data distribution from the actual calculation with
the distribution.
But perhaps I have missed some part where access to actual data is
necessary.
What do you think about this approach?
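A rough sketch of what such a wrapper could look like, using only the standard library. None of these names exist in powerlaw: `RawDataSource`, `TabulatedCDFSource` and `evaluate_cdf` are hypothetical, and the only assumption about the consumer is that it asks the source for `cdf()` and never touches the data itself.

```python
from bisect import bisect_right


class RawDataSource:
    """Hypothetical wrapper: builds the empirical CDF from data points."""

    def __init__(self, data):
        self._xs = sorted(data)

    def cdf(self):
        n = len(self._xs)
        return list(self._xs), [(i + 1) / n for i in range(n)]


class TabulatedCDFSource:
    """Hypothetical wrapper: the user supplies the CDF directly, no data."""

    def __init__(self, bins, values):
        if len(bins) != len(values):
            raise ValueError("bins and values must have the same length")
        self._bins = list(bins)
        self._values = list(values)

    def cdf(self):
        return list(self._bins), list(self._values)


def evaluate_cdf(source, x):
    """Step-function CDF at x; works with any object exposing cdf()."""
    bins, values = source.cdf()
    i = bisect_right(bins, x)
    return values[i - 1] if i > 0 else 0.0


raw = RawDataSource([3, 1, 2, 4])
tab = TabulatedCDFSource([1, 2, 3, 4], [0.25, 0.5, 0.75, 1.0])
```

Anything that still needs the individual points (for example a likelihood sum) would not fit behind this interface, which is the open question raised above.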
I've been reading that article and I began to realize that it may not be directly applicable to the PSD case. The reason is that most algorithms (FFT or wavelet) do not give the PSD as a histogram, but rather as actual point-wise estimates, i.e.

A dirty (probably not completely wrong, but not right either) workaround would be to generate surrogate datasets based on the PDF given by the PSD. I've seen it done e.g. here. Perhaps I should get in touch with Clauset and ask him for guidance on this.
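For reference, the surrogate-data workaround could be sketched as follows with the standard library only. This is an illustration, not a recommendation: `surrogate_samples` is a hypothetical helper, the frequency grid is assumed to be uniformly spaced (so the PSD values can be used directly as weights), and the example PSD is synthetic.

```python
import random


def surrogate_samples(freqs, psd, n, seed=0):
    """Draw n frequencies with probability proportional to the PSD value.

    Assumes a uniformly spaced grid, so psd[i] is proportional to the
    probability mass at freqs[i].
    """
    rng = random.Random(seed)
    return rng.choices(freqs, weights=psd, k=n)


# Synthetic PSD ~ f^-2 on a uniform grid (illustrative only).
freqs = [float(k) for k in range(1, 51)]
psd = [f ** -2.0 for f in freqs]

samples = surrogate_samples(freqs, psd, 5000)
```

The resulting `samples` could then be handed to a data-based fitting routine. Note that the sample size `n` is arbitrary here, and any goodness-of-fit statement derived from the surrogates depends on that choice.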
Clauset seems to be on sabbatical. I had another idea: perhaps I could simply use the Kolmogorov-Smirnov test to determine the "distance" between the PSD and a given distribution. Chi^2 might be an alternative. But that would mean determining the fitted parameters an
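A minimal sketch of the Kolmogorov-Smirnov distance idea, standard library only. The helper names (`psd_to_cdf`, `truncated_cdf`, `ks_distance`) are hypothetical, the PSD is synthetic, and the model parameters (exponent 2, rate 0.5) are set by hand rather than fitted, which is an assumption made purely for illustration.

```python
import math


def psd_to_cdf(freqs, psd):
    """Normalise a point-wise PSD into a CDF on the same grid (trapezoid rule)."""
    cum = [0.0]
    for i in range(1, len(freqs)):
        step = 0.5 * (psd[i] + psd[i - 1]) * (freqs[i] - freqs[i - 1])
        cum.append(cum[-1] + step)
    return [c / cum[-1] for c in cum]


def truncated_cdf(model_cdf, lo, hi):
    """Restrict a model CDF to [lo, hi] so it covers the same range as the PSD."""
    a, b = model_cdf(lo), model_cdf(hi)
    return lambda x: (model_cdf(x) - a) / (b - a)


def ks_distance(freqs, empirical, model):
    """Largest absolute gap between the PSD-derived CDF and the model CDF."""
    return max(abs(e - model(f)) for f, e in zip(freqs, empirical))


# Synthetic PSD ~ f^-2 on [1, 10] (illustrative only).
freqs = [1.0 + 0.01 * i for i in range(901)]
psd = [f ** -2.0 for f in freqs]
emp = psd_to_cdf(freqs, psd)

# Candidate models with hand-picked parameters (not fitted).
power_law = truncated_cdf(lambda f: 1.0 - f ** -1.0, freqs[0], freqs[-1])
exponential = truncated_cdf(lambda f: 1.0 - math.exp(-0.5 * f), freqs[0], freqs[-1])

d_pl = ks_distance(freqs, emp, power_law)
d_exp = ks_distance(freqs, emp, exponential)
```

Here `d_pl` comes out much smaller than `d_exp`, as expected for a PSD constructed as a power law. Turning the distance into a significance statement would still require an effective sample size, which a point-wise PSD does not provide by itself.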
Mentioning directly @aaronclauset in case you have time (and interest) to comment.
I really like your software, it makes it easier to judge the hype of powerlaws in datasets.
However, right now it focuses on fitting full datasets, creating their PDF and CDF on the fly. I'd like to use it in situations where I already have a PDF (defined at several points) - or generally a distribution function of some sort - and fit its shape in some range.
An example is the power spectral density of fluctuations in turbulent plasmas, where there is an ongoing discussion whether they are powerlaws or exponentials.
I'd be willing to contribute modifications to powerlaw which would make this optional use case possible. But I would greatly appreciate it if you could point out how best to approach this issue.