-
Notifications
You must be signed in to change notification settings - Fork 222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap histogram #1072
Wrap histogram #1072
Changes from 32 commits
c042b0a
4409760
fe086b3
da2539f
f1cbad2
c2ec478
ac99e08
1e0e9aa
6ec0a66
fb2b4ce
8899a48
d66de75
d30037c
b8040ab
2439d6a
05c005f
34af05f
1cfac35
72c022c
14aa960
c84b008
b5cf339
780b820
d9b8e39
151d90b
5ab0e5d
1bc9300
6bc7fa2
69f5d4d
55c3fbc
1d86dc7
c914954
cb9e25e
30e2004
85201e5
9980315
f1575f9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -411,6 +411,7 @@ def _repr_html_(self): | |
grdcontour, | ||
grdimage, | ||
grdview, | ||
histogram, | ||
image, | ||
inset, | ||
legend, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
""" | ||
Histogram - Create a histogram | ||
""" | ||
from pygmt.clib import Session | ||
from pygmt.helpers import build_arg_string, fmt_docstring, kwargs_to_strings, use_alias | ||
|
||
|
||
@fmt_docstring | ||
@use_alias( | ||
A="horizontal", | ||
B="frame", | ||
C="cmap", | ||
G="fill", | ||
J="projection", | ||
R="region", | ||
T="series", | ||
W="pen", | ||
c="panel", | ||
l="label", | ||
p="perspective", | ||
willschlitzer marked this conversation as resolved.
Show resolved
Hide resolved
|
||
) | ||
@kwargs_to_strings(R="sequence", T="sequence") | ||
def histogram(self, table, **kwargs): | ||
weiji14 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
r""" | ||
Plots a histogram, and can read data from a file or | ||
list, array, or dataframe. | ||
|
||
Full option list at :gmt-docs:`histogram.html` | ||
|
||
{aliases} | ||
|
||
Parameters | ||
---------- | ||
table : str, list, or 1d array | ||
A data file name, list, or 1d numpy array. This is a required argument. | ||
{J} | ||
{R} | ||
{B} | ||
{CPT} | ||
{G} | ||
{W} | ||
{c} | ||
label : str | ||
Add a legend entry for the symbol or line being plotted. | ||
{p} | ||
willschlitzer marked this conversation as resolved.
Show resolved
Hide resolved
|
||
horizontal : bool | ||
Plot the histogram using horizonal bars instead of the | ||
default vertical bars. | ||
interval : int or str | ||
willschlitzer marked this conversation as resolved.
Show resolved
Hide resolved
|
||
[*min*\ /*max*\ /]\ *inc*\ [**+n**\ ] | ||
Set the interval for the width of each bar in the histogram. | ||
|
||
""" | ||
willschlitzer marked this conversation as resolved.
Show resolved
Hide resolved
|
||
kwargs = self._preprocess(**kwargs) # pylint: disable=protected-access | ||
with Session() as lib: | ||
file_context = lib.virtualfile_from_data(check_kind="vector", data=table) | ||
with file_context as infile: | ||
arg_str = " ".join([infile, build_arg_string(kwargs)]) | ||
lib.call_module("histogram", arg_str) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
outs: | ||
- md5: 314634815787d56a557f9646dcba7526 | ||
size: 21674 | ||
path: test_histogram.png |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
# pylint: disable=redefined-outer-name | ||
""" | ||
Tests histogram. | ||
""" | ||
import pytest | ||
from pygmt import Figure | ||
|
||
|
||
@pytest.fixture(scope="module") | ||
def table(): | ||
""" | ||
Returns a list of integers to be used in the histogram. | ||
""" | ||
return [1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 8] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @seisman I just tried it on both GMT 6.1.1 and 6.2.0rc1, with the figure coming out the same way. Any idea what might be causing the miscount for 8? Also, changing the projection to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like an upstream bug. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll try and recreate it in GMT and open an upstream issue There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. GMT automatically determines the x range from input data. In this example, xmin=1, xmax=8, but the vertical bar for x=8 is outside of the [1, 8] range, so it's not plotted. You should be able to see the x=8 bar, if you manually set There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
BTW, the large vertical scale makes the image difficult to read, why not just use "X10c/10c" for a square basemap? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @seisman I made your suggested changes |
||
|
||
|
||
@pytest.mark.mpl_image_compare | ||
def test_histogram(table): | ||
""" | ||
Tests plotting a histogram using a list of integers. | ||
""" | ||
fig = Figure() | ||
fig.histogram(table=table, projection="X10c/25c", series=1, frame="a", fill="green") | ||
return fig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
series
a good name? Other Python histogram functions (e.g.,numpy.histogram
,plt.histogram
usebins
instead.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment on #1072 (comment). Probably not a good idea to use
bins
as the input for-T
is structured quite differently to thebins
argument innp.histogram
andplt.histogram
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it depends on how we want to implement the -T option in PyGMT.
histogram -T have following usages:
The current implementation doesn't work well for usage 4, since users have to pass a string "0,3,4,6,10".
Instead, if we adopt the syntax of
np.histogram
, we can use more Pythonic syntaxThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we're using
@kwargs_to_strings(T="sequence")
which implies Option 2 (-T0/200/5) and I think this works well formakecpt
/grd2cpt
where regular sized bins are the norm, and alsohistogram
to some extent. Doing Option 4 would mean using@kwargs_to_strings(T="sequence_comma")
and I guess this would be fine for people wanting irregular sized bins forhistogram
.I don't often plot histograms, so I'll leave it up to you all to decide which one is better. Ideally we would standardize the
-T
option across all modules, but we could also useseries
(with T="sequence") formakecpt
/grd2cpt
, andbins
(with T="sequence_comma") forhistogram
if that makes more sense for new users.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note here: we don't have to use
@use_alias
and@kwargs_to_strings
for -T.Instead, we can check if the
bins
(or any other parameter name) argument is a string, one value, three values, or a list of values. I think it provides the most Pythonic syntax and also the greatest flexibility.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. For a single histogram plot, it would be be much easier to use something like the built-in pandas hist function (e.g. just do
df.hist()
). But for say, mixing agrdview
plot with ahistogram
as an 'inset', then using PyGMT would be better 😉Let's get another opinion. Ping @core-man since he seemed interested in
histogram
at #563 (comment). What syntax do you think PyGMT'shistogram
should follow, GMT's or Matplotlib's?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During my 10-year seismological studies, I only use GMT to plot figures and sometimes use AI/3D max to plot cartoons. I began to use Python and PyGMT at the same time last year, so I have never learned matplotlib. So for my study, GMT/PyGMT is enough right now. For the PyGMT scope I understand now, we'd like to provide a good Python interface for GMT. So as we go to v1.0.0, I prefer more Pythonic syntax instead of strict GMT's (e.g., #1117). At that time, I think (also hope) most GMT users will learn GMT by using PyGMT.
For this issue, I think the current implementation is okay and most users can use it well (I never use option 4). However, it's better we could refractor all the similar parameters to be more Pythonic syntax as we go to v1.0.0, e.g.,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should move to a more Pythonic syntax overall. Assuming that is the eventual plan, it's creating more work for our future selves by wrapping new features with a GMT syntax. That being said, I think there are more commonly-used features (setting map projections, updating pen/fill settings, adding a frame) that should be prioritized if the plan is to begin incorporating Pythonic syntax. This is certainly an issue outside of the scope of wrapping
histogram
, but I think if we are changing new features to be in-line with a more Pythonic syntax we should be also updating our old features ASAP.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GenericMappingTools/pygmt-contributors Maybe I am missing it, but is there a consensus on how this function should be modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of time and consistency with other PyGMT modules like makecpt/grd2cpt, I say we stick with using
series
(-T) instead ofbins
. For better or worse, we are still heavily skewed towards the style of GMT core rather than NumPy, and decisions of what is 'Pythonic' should go in a separate issue. It is better to be consistently like GMT, rather than confuse users with a mix of NumPy-like and GMT command-line-like syntax. So @willschlitzer, let's keep things as it is for now :)