-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added framework to easily translate between Data and other classes #2049
Added framework to easily translate between Data and other classes #2049
Conversation
73e9665
to
5fa2e91
Compare
Codecov Report
@@ Coverage Diff @@
## master #2049 +/- ##
==========================================
+ Coverage 89.63% 89.72% +0.09%
==========================================
Files 399 401 +2
Lines 38694 39095 +401
==========================================
+ Hits 34684 35079 +395
- Misses 4010 4016 +6
Continue to review full report at Codecov.
|
I've now also added the ability to get translate the selection definitions - for example if I define the following translation class for astropy regions: from glue.config import subset_definition_translator
from glue.core.subset import RoiSubsetState
from glue.core.roi import RectangularROI
from regions import RectanglePixelRegion, PixCoord
@subset_definition_translator('astropy-regions')
class AstropyRegionsHandler:
def to_object(self, subset):
data = subset.data
if data.ndim != 2:
raise NotImplementedError("Can only handle 2-d datasets at this time")
subset_state = subset.subset_state
if isinstance(subset_state, RoiSubsetState):
if subset_state.xatt != data.pixel_component_ids[1]:
raise ValueError('subset state xatt should be x pixel coordinate')
if subset_state.yatt != data.pixel_component_ids[0]:
raise ValueError('subset state yatt should be y pixel coordinate')
roi = subset_state.roi
if isinstance(roi, RectangularROI):
xcen = 0.5 * (roi.xmin + roi.xmax)
ycen = 0.5 * (roi.ymin + roi.ymax)
width = roi.xmax - roi.xmin
height = roi.ymax - roi.ymin
return RectanglePixelRegion(PixCoord(xcen, ycen), width, height)
else:
raise NotImplementedError("ROIs of type {0} are not yet supported".format(roi.__class__.__name__))
else:
raise NotImplementedError("Subset states of type {0} are not yet supported".format(subset_state.__class__.__name__)) I can then do the following: In [8]: from glue.core.roi import RectangularROI
In [9]: from glue.core.subset import RoiSubsetState
In [10]: subset_state = RoiSubsetState(dc['image'].pixel_component_ids[1],
...: dc['image'].pixel_component_ids[0],
...: RectangularROI(1, 3.5, -0.2, 3.3))
In [12]: dc.new_subset_group(subset_state=subset_state, label='spatial selection')
Out[12]: <glue.core.subset_group.SubsetGroup at 0x10bd9e8d0>
In [13]: dc['image'].get_selection_definition(format='astropy-regions')
Out[13]: <RectanglePixelRegion(PixCoord(x=2.25, y=1.5499999999999998), width=2.5, height=3.5, angle=0.0 deg)> Note that by default if there is just one dataset and one subset, as above, you don't need to specify the data label or subset label. In terms of where translators should live for e.g. CCDData, Spectrum1D, and astropy regions, I think it's finally time for me to start a glue-astronomy package to contain astronomy-specific things. If we write translation functions for e.g. pandas data frames and so on, those could live in the core package. Anyway, this is ready to play around with - we should discuss the API once you've had a chance to take a looks! |
a5d3a7b
to
2de4e60
Compare
Just FYI I've started adding translators for CCDData, Spectrum1D, and astropy regions to a new glue-astronomy plugin: https://glue-astronomy.readthedocs.io/en/latest/ |
I played around with this for a bit today w/ glue-jupyter, @astrofrog (not the regions part yet, but just the data part) - you can see my results here: So the first-level thing is: in the case of your example from above, it worked great, so that's promising! However, second-level, the current state is: I struggled a lot with understanding how to make my own translators, which I was intentionally trying to do on my own as sort of a "canary in a coal mine" experiment on the theory that users or downstream devs might have lots of their own native objects they want to carry around in glue. I think my initial take-aways from that are:
2 in particular is worrying me a bit, but also suggests a possible adaptation/alteration of this approach: is it possible for the "internal" representation to actually be the native data object, and essentially have |
@eteq - thanks for testing this out! Here are some thoughts/comments:
the issue is actually with specutils, because I don't think we can easily make glue understand arbitrary third-party coordinate objects, and some effort is required when writing the translator to convert things to what glue understands. But glue could do better type checking and give more explicit errors when doing the things you tried. |
@eteq - with glue-viz/glue-jupyter#127 you should be able to do: app.add_data(galaxies=datadct)
app.scatter2d(x='distance', y='velocity', data='galaxies') instead of: data = app.add_data(galaxies=datadct)[0]
app.scatter2d('distance', 'velocity', data=data) and ccddata = CCDData(np.random.random((3, 3)), unit='mJy')
app.add_data(image=ccddata)
app.imshow(data='image') instead of ccddata = CCDData(np.random.random((3, 3)), unit='mJy')
app.data_collection['image'] = ccddata
app.imshow(data=app.data_collection['image']) Does that help a little in terms of the confusion with the data collection stuff? I could then add convenience methods on |
Note to self: I should also add |
I looked at the code examples, but didn't try to run anything. It seems to me a very good idea. Allows users to stay in their comfort zone by letting them access their objects with their familiar APIs, even though they are also glue Data instances. This ultimately contributes for a higher rate of buy-in by the user community. The only concern that came to mind is that it has the potential to generate help calls, from users that are not familiar with the innards and capabilities of glue Data objects, but venture anyways in developing their own translators. We can imagine that there will be cases where mapping in between the two worlds will be difficult, obscure, or even impossible. This can be of course mitigated by good documentation, including some worked examples and plenty of pointers to the already good glue docs. |
@ibusko - the notebook I referenced above was made with these versions:
Maybe you can try those first just to see if that works for you ? I think to work with the suggestions from #2049 (comment) you'll need to get a more recent glue-jupyter, though, so it's possible something it that might break part of my notebook? Easiest thing is probably to just try it and see... |
I tried |
I started (re-) implementing @eteq's gist into a notebook and found some things...
but it threw an exception
gives me:
For all of these I could be mis-inerpreting things. |
8d40d2e
to
5d34e5b
Compare
I'm working on this again! @brechmos-stsci - thanks for the detailed testing. The first issue should now be fixed here (setting e.g. |
Regarding the issue with the multi-dimensional flux columns, this will require some more thought, because glue data objects haven't previously been used to store data where the data has more dimensions than the number of coordinates. We can definitely get this to work somehow, but let's discuss that in glue-viz/glue-astronomy#2 instead. For now, let's focus on cases where there are no 'vector' attributes for the sake of getting this PR merged. |
…ollection to Data
5d34e5b
to
80257aa
Compare
…iting documentation about data translation
In the interest of moving things forward, I am going to merge this then will try and improve the implementations of various translations in glue-astronomy. Once this is ready, we can review whether the current API is suitable or not, but perfect is the enemy of good so I'd rather get something in :) |
As discussed in glue-viz/glue-jupyter#30, it would be nice to have a way of easily being able to get more 'intuitive' data objects that users are used to rather than glue
Data
objects. This PR is a work in progress to add a new data translator registry and methods on the data collection to be able to set or get data as non-glue data objects, and being able to also get subsets as these kinds of objects.With this infrastructure, one can define a data translator as follows:
with this translator registered (thanks to the
@data_translator
decorator), we can then do: