Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plotting updates from #106 #110

Merged
merged 8 commits into from
Mar 27, 2018

Conversation

kreczko
Copy link
Member

@kreczko kreczko commented Mar 14, 2018

This PR adds all plotting related changes from #106.

@kreczko
Copy link
Member Author

kreczko commented Mar 14, 2018

@benkrikler These modules definitely needs some more work, I've created an issue regarding the style information (#109).

Since you wrote most of this, could you please have a quick look at @professor-calculus' and @bundocka's work?

I've tidied things up here and there.

@kreczko kreczko requested a review from benkrikler March 16, 2018 11:24
Copy link
Contributor

@benkrikler benkrikler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing particularly serious, but many places where we could improve.

for threshold in all_pileup_effs.iter_all():
if not isinstance(threshold, int):
continue
hist = all_pileup_effs.get_bin_contents(threshold)
hist.drawstyle = "EP"
hist.drawstyle = "P"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be hist.drawstyle = EfficiencyPlot.drawstyle ?

canvas = draw(hists, draw_args={"xtitle": self.offline_title,
"ytitle": "Efficiency"})
draw_args = {"xtitle": self.offline_title, "ytitle": "Efficiency"}
# TODO: special case should not be implemented here!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this comment! I'd be tempted to say we take this out of the PR so we don't let this sort of issue creep into the codebase, and then we work out a more long-term solution. Need finer control over plotting directly from the yaml config file, I suppose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean like # TODO: Remove when we update rootpy to >0.9.2 :)

I will put it in as it will force me to create the issue. Miss the gitlab feature "resolve with issue"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've no objection to TODO comments in the code, I meant let's take out the actual code below this comment that the comment was referring to. That's the sort of issue I want to avoid creeping in, not an issue of having TODO comments in there!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes more sense ;).

However, at the moment we do not have a better solution, do we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we could take it out of the PR, then let Aaron or Alex put it back in on their branch alone, or uncommitted until we come up with a better solution. If we do let it in, we'd want a very specific issue to be opened, I suppose

Copy link
Member Author

@kreczko kreczko Mar 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I had in mind here is to inject (like analyzers & filters in the new model) Styles into plotting routines.
This is something that needs to be done on our side and I am not sure about the exact form yet.
But having this code in master will force us to do something about it (and serves as a good example for the issue).

Let me create the issue and then we can merge this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, issue already exists :)
#109


def get_cumulative_hist(hist):
h = hist.clone(hist.name + '_cumul')
arr = np.cumsum(_reverse([bin.value for bin in hist]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already have root_numpy in the requirements for this framework? If so we could make use of it here instead, perhaps? Probably not worth it 2bh.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR at least ;).

Copy link
Member Author

@kreczko kreczko Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def cumulative_sum_and_error(hist):
    '''
        Takes a histogram and returns an array of cumulative sums of it with
        the total first.
        E.g.
        histogram entries: [1, 2, 3, 4]
        histogram errors: [1, 1, 2, 2]
        Output: [10, 9, 7, 4], [3.16227766, 3, 2.82842712, 2]
    '''
    hist_values = [b.value for b in hist]
    reversed_cumsum = _reversed_cumulative_sum(hist_values)

    errors_squared = np.square([b.error for b in hist])
    reversed_cumsum_errors = np.sqrt(_reversed_cumulative_sum(errors_squared))

    return reversed_cumsum, reversed_cumsum_errors


def _reversed_cumulative_sum(values):
    reversed_values = np.flipud(values)
    cumsum = np.cumsum(reversed_values)
    reversed_cumsum = np.flipud(cumsum)
    return reversed_cumsum

and the previous code becomes

values, errors = cumulative_sum_and_error(hist)
h.set_content(values)
h.set_error(errors)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this looks good to me. Might even be worth putting this function in a utility module since it could be useful in multiple places. Or even submit a PR to rootpy itself for a new Hist1D method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmsl1t.utils.hist

bin1 = h.get_bin_content(1)
if bin1 != 0:
h.GetSumw2()
h.Scale(4.0e7 / bin1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this magic 4.0e7 come from? Is it something that depends on the run conditions, eg, valid for Run-2 but not for another era, etc? Either way, can we assign a variable with some descriptive name to this value first, then use the variable within this calculation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

40MHz pp collision rate, so it's chosen to just scale everything in reference to that I think.

return h


def _reverse(a):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, this function is only used on 1D lists? If so, you could probably just use the built-in reverse method, reversed(), which would simplify the code and make it easier to read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the code would then be

arr = np.cumsum(list(reversed([bin.value for bin in hist])))

not convinced that this is better.

Copy link
Contributor

@benkrikler benkrikler Mar 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true reversed would return an iterator object, so you'd need the list. Still, I'm not sure wrapping this simple operation in a function is worthwhile. Wouldn't this work:

arr = np.cumsum(np.flipud([bin.value for bin in hist])))

or even

arr = np.cumsum([bin.value for bin in reversed(hist)]))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or

arr = cumulative_sum(hist)

# if with_fits:
# fits.append(self.fits.get_bin_contents([pile_up]))
if with_fits:
fits.append(self.fits.get_bin_contents([pile_up]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a left-over from copying this file from the Efficiency plots, but we don't actually seem to assign to self.fits, so I'm pretty sure this would crash. I might actually be responsible for this, I honestly don't remember anymore, but either way, it does look like this class hasn't been used with any fitting. Might be worth to comment out again or delete, unless you can be bothered to add the fitting yourself. As it's a resolution, the fit is simple as a Gaussian (symmetric), or an exponentiallly modified Gaussian (asymmetric), both of which ROOT's TMath has native support for, I believe.

for hist in normed_hists:
if hist.integral != 0:
hist = hist / hist.integral()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something strange here in that integral seems to be working as both a function and an attribute on hist. If it's really a function, then I suppose line 81 will always return true. I think rootpy might be defining integral as a getter property, in which case I dont think calling it as a function will work. Not sure, but it's a bit confusing to me at least.

normed_hists = [hist.Clone() for hist in hists]
for hist in normed_hists:
if hist.integral != 0:
hist = hist / hist.integral()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above: is hist.integral a function or just a value (possibly returned as an @property)? Even if both are valid code, its nicer to stick to just one.


def __make_overlay(self, hists, fits, labels, ytitle, suffix=""):
with preserve_current_style():
# Draw each resolution (with fit)
# TODO: this feels like it does not belong here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree. The TitleOffset is less serious, but setting the range for all plots seems a bit too specific. As well as making it an option in the config file (somehow) we could at least put it as a parameter passed into the init method of this class.

for hist, label in zip(hists, labels):
legend.AddEntry(hist, label)
legend.SetBorderSize(0)
legend.Draw()

ymax = 1.2 * hists[-1].GetMaximum()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is hists sorted? Are we sure that the last one in the list will have the largest value? Might need to do something like:

ymax = 1.2 * max([hist.GetMaximum() for hist in hists])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@kreczko
Copy link
Member Author

kreczko commented Mar 23, 2018

Added cmsl1t.math and cmsl1t.utils.hist to remove duplicated code from various analysers and plotting modules.

@kreczko kreczko force-pushed the kreczko-aaron-alex-plotting branch from 26eca1f to 5e83b85 Compare March 23, 2018 13:49
@kreczko kreczko merged commit dcf0276 into cms-l1t-offline:master Mar 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants