-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ssl_taxoname vs. most recent entry in petaxhistory #8
Comments
select count(distinct pedlabsampnum) from site;
count
-------
64153
select count(distinct pedlabsampnum) from nasis_site;
count
-------
80462 Reasons, as suggested by Henry F.
|
How many 3135 rows SELECT DISTINCT site.pedlabsampnum as kssl_plsn, nasis_site.pedlabsampnum as nasis_plsn
FROM kssl.site
LEFT JOIN kssl.nasis_site USING (pedlabsampnum)
WHERE nasis_site.pedlabsampnum IS NULL; 19444 rows SELECT count(DISTINCT nasis_site .pedlabsampnum)
FROM kssl.site
RIGHT JOIN kssl.nasis_site USING (pedlabsampnum)
WHERE site.pedlabsampnum IS NULL; |
How many records in the KSSL snapshot have a about 1500 SELECT pedon_key, nasis_site.pedlabsampnum, nasis_site.peiid, site.taxonname, nasis_taxhistory.taxonname as tn,
kssl_to_nasis_sanity.labdatadescflag, kssl_to_nasis_sanity.objwlupdated
FROM kssl.site
JOIN kssl.kssl_to_nasis_sanity ON site.pedlabsampnum = kssl_to_nasis_sanity.pedlabsampnum
JOIN kssl.nasis_site ON kssl_to_nasis_sanity.peiid = nasis_site.peiid
JOIN kssl.nasis_taxhistory ON kssl_to_nasis_sanity.peiid = nasis_taxhistory.peiid
WHERE nasis_taxhistory.taxonname != site.taxonname; A single example: SELECT pedon_key, nasis_site.pedlabsampnum, nasis_site.peiid, site.taxonname, nasis_taxhistory.taxonname as tn,
kssl_to_nasis_sanity.labdatadescflag, kssl_to_nasis_sanity.objwlupdated
FROM kssl.site
JOIN kssl.kssl_to_nasis_sanity ON site.pedlabsampnum = kssl_to_nasis_sanity.pedlabsampnum
JOIN kssl.nasis_site ON kssl_to_nasis_sanity.peiid = nasis_site.peiid
JOIN kssl.nasis_taxhistory ON kssl_to_nasis_sanity.peiid = nasis_taxhistory.peiid
WHERE nasis_taxhistory.taxonname != site.taxonname
AND site.taxonname = 'rescue';
|
Records in kssl.site that have no correspdonding record in kssl.nasis_site OR kssl.nasis_taxhistory. 3115 rows SELECT pedon_key, site.pedlabsampnum, site.taxonname,
nasis_site.pedlabsampnum, nasis_site.peiid, nasis_taxhistory.taxonname as nasis_tn,
kssl_to_nasis_sanity.labdatadescflag, kssl_to_nasis_sanity.objwlupdated
FROM kssl.site
LEFT JOIN kssl.kssl_to_nasis_sanity ON site.pedlabsampnum = kssl_to_nasis_sanity.pedlabsampnum
LEFT JOIN kssl.nasis_site ON kssl_to_nasis_sanity.peiid = nasis_site.peiid
LEFT JOIN kssl.nasis_taxhistory ON kssl_to_nasis_sanity.peiid = nasis_taxhistory.peiid
WHERE nasis_site.peiid IS NULL
AND nasis_taxhistory.peiid IS NULL; |
Some ideas on how to publish a simpler representation in this related issue. |
Update: There is a new table that makes a reasonable linkage between KSSL -- > NASIS site/pedon/taxa: --
-- enforce sanity when connecting KSSL "site" to NASIS data
-- create a new table that ensures 1:1 between labsampnum:peiid AND excludes bogus pedlabsampnum (e.g. ".", "none", "None")
--
DROP TABLE kssl_to_nasis_sanity;
CREATE TABLE kssl_to_nasis_sanity AS
SELECT DISTINCT ON (pedlabsampnum) pedlabsampnum, peiid, labdatadescflag, objwlupdated
FROM nasis_site
WHERE pedlabsampnum IS NOT NULL
AND pedlabsampnum NOT IN ('.', '-', 'none', 'None', '1', 'Unknown', '5', '6', '3')
-- pedon precidence: lab pedon = YES, latest modification date, highest peiid
ORDER BY pedlabsampnum, labdatadescflag DESC, objwlupdated DESC, peiid DESC; However, this will implicitly exclude records in the KSSL data that:
The orphaned KSSL records still have limited site / spatial / taxa information that is worth retaining. A good example, # fine this record in NASIS site data: OK
select peiid, site_id, pedon_id, pedlabsampnum, labdatadescflag, taxonname from nasis_site join nasis_taxhistory using (peiid) where site_id = 'S2011MI041003';
peiid | site_id | pedon_id | pedlabsampnum | labdatadescflag | taxonname
--------+---------------+---------------+---------------+-----------------+-----------
586039 | S2011MI041003 | S2011MI041003 | | No | Rubicon
# find this record in KSSL site: OK
select pedon_id, taxonname, pedlabsampnum from site where pedon_id = 'S2011MI041003';
pedon_id | taxonname | pedlabsampnum
---------------+-----------+---------------
S2011MI041003 | Rubicon | 12N7863
(1 row)
# this record is not in our sanity-checking table
# because: pedlabsampnum is not populated in NASIS!! (checked 2018-04-19)
select * from kssl_to_nasis_sanity where pedlabsampnum = '12N7863';
pedlabsampnum | peiid | labdatadescflag | objwlupdated
---------------+-------+-----------------+--------------
(0 rows) Next steps:
|
Another example, user pedon / site ID 79CA000003, pedlabsampnum 80P0159, pedon_key 7746. Basic taxonomic data in repository (kssl.site), skeleton site in NASIS with no taxonomic data. |
What is the logic used to generate
ssl_name
in the current snapshot, and when should the most recent record in the petaxhistory table be used instead?The text was updated successfully, but these errors were encountered: