Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manipulating flags is a pain in the neck #1073

Open
aarchiba opened this issue Jul 15, 2021 · 7 comments
Open

Manipulating flags is a pain in the neck #1073

aarchiba opened this issue Jul 15, 2021 · 7 comments

Comments

@aarchiba
Copy link
Contributor

When users need to work with the flags on TOAs objects, they generally need to write for loops for even basic operations (set all or some to TOAs to have a particular value, for example). #1070 has some discussion that demonstrates this. Some specific headaches:

  • Flags can't be manipulated (read from, written to) in an array-like manner
  • Flags and values are all strings when read from tim files, but users can and do store other things
  • Flags appear in tim files with dashes (-fish) but in the flags dictionary without them (d['fish']) and users are not warned if they make errors in this; maskParameters actually do need the dash
  • Users assigning to flag keys or values can easily produce values that don't survive writing to tim files
@aarchiba
Copy link
Contributor Author

One suggestion for easier access: augment the indexing operator on TOAs objects so that toas['mjd'] or toas['fish'] produces an array that looks like an appropriate column. This would enable:

toas['fish', toas['mjd']>58000] = 'carp'

or

some_toas = toas[toas['fish']=='carp']

@aarchiba
Copy link
Contributor Author

One suggestion for data validation: create a FlagsDict class that only accepts valid keys and values, and store that in the astropy table instead of the current plain dicts.

@scottransom
Copy link
Member

I really like the idea of moving the flag setting/getting loops into the indexing operation. It is still looping (and I don't really see a good way around that given the way tables work), but at least it is nice syntactic sugar.

@aarchiba
Copy link
Contributor Author

Now in #1074; open for comments. But you can do toas[10:20, "gui_jump"] = "yes".

We could make all flags into columns in the table. This would allow very uniform handling with some caveats:

  • We'd need to specify types for each flag (e.g. -pp_dm is a Quantity in pc/cm3) and corresponding ways to read and write them as strings,
  • we'd need a well-defined "no value" placeholder, since Astropy tables do not provide this (the empty string is natural for string types),
  • we'd have to make adding a flag easier than constructing a new column for addition to the TOAs object, and
  • this would really break how people access flags now, and it would take some serious witchcraft to provide a workalike API.

@aarchiba
Copy link
Contributor Author

Just to confuse the issue, if you ever did a select_toas_mask based on a flag, that function added a column to toas.table named after the flag and with its values. No attempt was made to keep this column up to date.

@scottransom
Copy link
Member

I'm way behind on reading discussion for this and the other related issues and PRs, but I've seen questions/comments about not keeping columns up to date. I don't think there is any desire for that at all. One of the reasons why we liked using Astropy tables is that we could do temporary calculations and save them in the table in memory -- with no intention of ever writing them out. To me it seems like it should be to the user to keep the tables updated if they are changing things on purpose.

@aarchiba
Copy link
Contributor Author

My concern is when PINT has the same information stored in two places, and users start changing things, how do we keep those two copies of the information in sync?

My answer to that is, if at all possible, don't duplicate information like that. If duplication is necessary, use python's magic properties and the like to prevent changes; failing that, to keep the versions in sync.

If you're going to quietly cache something you need to solve the cache invalidation problem.

Currently we have a situation where users occasionally need to run .setup() on models after making some kinds of change in order to update various internal structures. It's a pain; I've been bitten by it several times in the last week. But I think we're stuck with it.

Can we please not add to this problem? Specific to this issue, if we're going to make flags into columns, we must replace the flag dictionaries, not duplicate their values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants