problems with duplicate measurements, D.precedence, etc... #160

Open
ValentineHerr opened this issue Dec 6, 2018 · 9 comments

@ValentineHerr
Member

@teixeirak, if you can look at a few of the records in the D.groups below and tell me how you think we should handle them, that would be great.

  1. There are 10 D.groups that have multiple "1" in D.precedence.

    • for 4 of them (10, 11, 130, 291), I think it is because some records belong to more than one group of duplicates... --> maybe the answer is as simple as ignoring the one record that belongs to more than one D.group?
    • for 6 of them (1167, 1168, 1170, 1171, 1172, 1173), it is because there is one record that also belongs to a Replicate group... Maybe it is a mistake here?
  2. There are 41 D.groups that do not have a "1" in D.precedence:
    (387, 432, 433, 458, 475, 476, 489, 490, 493, 556, 580, 607,
    628, 629, 660, 667, 683, 684, 697, 698, 720, 731, 756, 757, 758,
    787, 793, 852, 940, 941, 956, 965, 982, 983, 996, 997, 1000,
    1002, 1003, 1059, 1106)
    I think again it is mostly because there are some records that belong to more than one group, or that are also part of a different group.

I'll try to look into it more closely but you might be faster than me at giving/removing precedence.
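For reference, the two checks above can be sketched roughly as follows. This is only an illustration in plain Python, not the actual ForC script: it assumes each record is a dict with `D.group` and `D.precedence` fields, that records outside any duplicate group have `D.group` set to `None`, and the function name `audit_precedence` and the sample rows are made up.

```python
def audit_precedence(records):
    """Return (groups with more than one "1", groups with no "1")."""
    counts = {}
    for rec in records:
        group = rec["D.group"]
        if group is None:  # assumption: None marks a record outside any D.group
            continue
        counts.setdefault(group, 0)  # register the group even if it never gets a 1
        if str(rec["D.precedence"]) == "1":
            counts[group] += 1
    multiple = sorted(g for g, n in counts.items() if n > 1)
    missing = sorted(g for g, n in counts.items() if n == 0)
    return multiple, missing


# Made-up rows for illustration only, not real ForC records.
rows = [
    {"measurements.ID": 1, "D.group": 10, "D.precedence": 1},
    {"measurements.ID": 2, "D.group": 10, "D.precedence": 1},
    {"measurements.ID": 3, "D.group": 387, "D.precedence": 0},
    {"measurements.ID": 4, "D.group": 387, "D.precedence": 0},
    {"measurements.ID": 5, "D.group": 500, "D.precedence": 1},
    {"measurements.ID": 6, "D.group": 500, "D.precedence": 0},
    {"measurements.ID": 7, "D.group": None, "D.precedence": None},
]
multiple, missing = audit_precedence(rows)
```

With these sample rows, group 10 lands in `multiple` and group 387 in `missing`. The sketch does not handle a record that belongs to more than one D.group, since how that is encoded is not settled here.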

@ValentineHerr changed the title from "problems with D.precedence" to "problems with duplicate measurements, D.precedence, etc..." on Dec 11, 2018
@ValentineHerr
Member Author

@teixeirak,
I believe that when I originally ran the code for duplicated measurements, I was treating stand.age "999" as NA. So 2 measurements of the same variable, at the same plot, with no dates, and with stand.age "999" were considered an S conflict, and both measurements were getting a capital S (which I think deletes them both when creating ForC_simplified).
Now I ran the code treating "999" as a 'known' stand.age, so the 2 records above get conflict type R if the method (citation.ID) is the same (see measurements.ID 14061 and 14062 for an example), or D if not (see measurements.ID 7782 and 7786 for an example; precedence would go to the latest study).
That second solution is better, right?
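To make the difference between the two runs concrete, here is a rough sketch in plain Python, not the actual script. It assumes the two records are already known to be the same variable at the same plot with no dates; the function name `classify_pair`, the `treat_999_as_known` flag, and the sample records are hypothetical, and returning `None` for records with different ages is my simplification.

```python
def classify_pair(a, b, treat_999_as_known=True):
    """Classify two undated records of the same variable at the same plot."""

    def age(rec):
        value = rec["stand.age"]
        if value == 999 and not treat_999_as_known:
            return None  # original run: 999 handled like NA
        return value

    age_a, age_b = age(a), age(b)
    if age_a is None or age_b is None:
        return "S"  # original run: both records end up with a capital S
    if age_a != age_b:
        return None  # simplification: different ages are not compared further
    if a["citation.ID"] == b["citation.ID"]:
        return "R"  # same method (citation.ID): replicates
    return "D"  # different citation.ID: duplicates


# Made-up records for illustration only.
rec_a = {"stand.age": 999, "citation.ID": "study_1"}
rec_b = {"stand.age": 999, "citation.ID": "study_1"}
rec_c = {"stand.age": 999, "citation.ID": "study_2"}
```

Here `classify_pair(rec_a, rec_b)` gives "R" and `classify_pair(rec_a, rec_c)` gives "D", while passing `treat_999_as_known=False` gives "S" in both cases. Assigning D.precedence to the latest study is left out of the sketch.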

@ValentineHerr
Member Author

@teixeirak, can you remind me what "P" is in conflict.type?
I believe you had entered that for some records at BCI.
Is there any way we can get rid of it and maybe keep something that means the same thing in conflict notes?

I am asking this because my code does not produce any "P" so it also erases it when the script is run. Also, now there are new records that might need that "P" but I don't know how it should be attributed.
For example, look at measurements.ID 1207 and 18120... My code identifies them as duplicates with conflict.type "C;M:T" and D.precedence given to 18120. But 1207 originally has a "P" in conflict.type and I don't know what to do with it now.

@teixeirak
Member

P stands for "plot" and indicates a plot duplicate (but not an identical one, at least in this case).

@teixeirak
Member

yes

@teixeirak
Member

Returning here, I'm noticing that my previous response was probably confusing. The "yes" (sent via email) refers to the earlier question ("That second solution is better, right?").

@teixeirak
Member

Regarding this: "for 4 of them (10, 11, 130, 291), I think it is because some records belong to more than one group of duplicates... --> maybe the answer is as simple as ignoring the one record that belongs to more than one D.group?" ... I'm confused. For 10 and 11, I see two records in the D.group, with precedence assigned to only one. For 130 and 291, I see just one record.

@teixeirak
Member

Regarding this: "for 6 of them (1167, 1168, 1170, 1171, 1172, 1173) it is because there is one record that also belongs to a Replicate group... Maybe it is a mistake here?"....
These appear to be assigned to two separate groups of replicates (from separate sites). I suspect that manual editing accidentally re-assigned existing D.group numbers.
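One way to check for accidentally reused group numbers would be to flag any D.group whose records come from more than one site. The following is only a sketch in plain Python under stated assumptions: records are dicts, the site column is called `sites.sitename`, ungrouped records have `D.group` set to `None`, and the function name and sample rows are made up.

```python
def groups_spanning_sites(records, site_field="sites.sitename"):
    """Map each D.group that covers more than one site to its sorted site list."""
    sites_by_group = {}
    for rec in records:
        group = rec["D.group"]
        if group is None:  # assumption: None marks a record outside any D.group
            continue
        sites_by_group.setdefault(group, set()).add(rec[site_field])
    return {g: sorted(s) for g, s in sites_by_group.items() if len(s) > 1}


# Made-up rows for illustration only, not real ForC records.
rows = [
    {"D.group": 1167, "sites.sitename": "site A"},
    {"D.group": 1167, "sites.sitename": "site B"},
    {"D.group": 1168, "sites.sitename": "site A"},
    {"D.group": 1168, "sites.sitename": "site A"},
    {"D.group": None, "sites.sitename": "site C"},
]
suspect = groups_spanning_sites(rows)
```

In this made-up example only group 1167 is flagged, because its records come from two different sites. A flagged group is a candidate for manual review, not proof of an error.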

@teixeirak
Member

Regarding this: "There are 41 D.groups that do not have a "1" in D.precedence:
(387, 432, 433, 458, 475, 476, 489, 490, 493, 556, 580, 607,
628, 629, 660, 667, 683, 684, 697, 698, 720, 731, 756, 757, 758,
787, 793, 852, 940, 941, 956, 965, 982, 983, 996, 997, 1000,
1002, 1003, 1059, 1106)
I think again it is mostly because there are some records that belong to more than one group, or that are also part of a different group."....

I'm not sure what's going on. Perhaps I messed up (many times!) when assigning precedence? I think these will just need to be re-done (unless you figure out what's going on).

@teixeirak
Member

Regarding the "P" code, it should be handled the same as site duplicates/ super.sites.
