Fully parse all IEEE normtitle entries #4
Comments
@ronaldtse do you have any references to standards or IEEE PubID parsing implementations that could help?
@ronaldtse what do we want to do with the parsed PubIDs? Do we need to convert them back to PubIDs, or to other formats?
Code that is now used for PubID parsing is here:
The source files for these entries are at https://github.com/relaton/ieee-rawbib. There are a few problems:
Regarding pubid, notice that there are multiple types of IEEE PubIDs, and also some jointly-published ones with ISO PubIDs. Since we now have an ISO PubID implementation, it will help us here.
@ronaldtse could you tell me what problem we are trying to solve here? Do we want to convert to another format, distinguish these bibliographic entries from "ieee-rawbib", build a relations graph, or something else?
Right now, Relaton-IEEE is unable to parse all IEEE PubID entries because it parses them with regular expressions. It has the following consequences:
i.e. we must properly parse IEEE PubIDs in order to make the full IEEE dataset available for citation.
Will we use it (pubid-ieee) to replace https://github.com/relaton/relaton-ieee/blob/main/lib/relaton_ieee/rawbib_id_parser.rb ?
Yes.
@ronaldtse should we use pubid-iso to parse identifiers like:
?
The ones that start with ISO, yes. But the rest are IEC identifiers. IEC PubIDs are similar to ISO's, but they have different stages and allow a sub-part (e.g. IEC 1000-1-2). We need to have a pubid-iec.
IEEE Std 1073.1.1.1-2004 (https://standards.ieee.org/ieee/1073.1.1.1/1571/) Example of similar identifier: @ronaldtse I believe IEEE Std 1073.1.1.1-2004 should be "IEEE 1073-10101-2004" or "IEEE 11073-10101-2004", what do you think?
No, we have to keep the original identifier. Its replacement, "ISO/IEEE 11073-10101-2004", probably chose the 10101 part intentionally to keep identity with 1.1.1. Notice that 1073 became 11073 because ISO 1073 is already taken by another standard. This is causality in reverse. "P11073-10101c" means it is the "provisional" (i.e. draft) version of "11073-10101c". The "c" character means it is the 3rd Amendment to "11073-10101". According to the website, "P11073-10101c" was completed in 2020, so it is a "draft amendment". I.e. historically:
@ronaldtse "IEEE 802.15.22.3-2020" - how can I know what 22 and 3 are here?
I'm trying to work out how I should treat these numbers. I had an idea to parse it as {number}.{part}.{subpart}, but there are more than 3 numbers. Maybe I can parse the extra numbers as extra subparts.
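To make the {number}.{part}.{subpart} idea concrete, here is a minimal Ruby sketch that treats every dotted component after the second as an additional subpart. The names `DottedId` and `parse_dotted_id` are hypothetical and are not part of pubid-ieee or relaton-ieee; the regular expression only covers the plain "IEEE [Std] N.N.N-YYYY" shape seen in the examples in this thread, and it does not claim that IEEE assigns any meaning to the positions.

```ruby
# Hypothetical value object; field names are assumptions for illustration only.
DottedId = Struct.new(:publisher, :number, :part, :subparts, :year, keyword_init: true)

# Matches identifiers shaped like "IEEE 802.15.22.3-2020" or "IEEE Std 1073.1.1.1-2004".
DOTTED_ID_RE = /\A(?<publisher>IEEE)\s+(?:Std\s+)?(?<digits>\d+(?:\.\d+)*)(?:-(?<year>\d{4}))?\z/

def parse_dotted_id(text)
  match = text.match(DOTTED_ID_RE)
  return nil unless match

  # First component is the number, second the part, anything left is a subpart.
  number, part, *subparts = match[:digits].split(".")
  DottedId.new(
    publisher: match[:publisher],
    number: number,
    part: part,
    subparts: subparts,
    year: match[:year]
  )
end

id = parse_dotted_id("IEEE 802.15.22.3-2020")
puts id.to_h if id
```

Keeping the components as strings avoids losing leading zeros, and keeping `subparts` as an array means identifiers with four or more components need no special case.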
"IEEE 802.15.22.3-2020" "IEEE Standard for Spectrum Characterization and Occupancy Sensing":
I am not sure whether there is a proper structure in IEEE identifiers. Some patterns are somewhat arbitrary (e.g. there exists 802.15.22.3 but not 802.15.22.1 or 802.15.22.2). This is a topic we will need to investigate and analyse.
@ronaldtse I believe we finished with this issue
@mico we have 886 identifiers that are not yet being parsed, but I will make that into a new issue. |
Full PubIDs from IEEE:
pubid-sorted.txt.zip
Please also look at #2 and #3 for resolved details.
Method of generating this list:
Some observed rules (#2 (comment)):
"/D{N}" or "/D.{N}" or "_D{N}" means draft N.
Joint publications:
"P844.3/C22.2 293.3/D0, Aug 2018" is an "IEEE P844.3" joint standard with "CSA C22.2 No. 293.3", Draft 0.
"309/N42.3-1999" is "IEEE 309" joint with "ANSI N42.3".
"529-1980/Cor 1-2017" means it is a correction of "IEEE 592-1980" issued in 2017, the first corrigendum for the standard.
"P1062/D.19, March 2015" means it is Draft 19 of "IEEE P1062".
D:
"C37.60/62271-111-2018" is joint IEC 62271-111 and IEEE Std C37.60-2018.
"IEC/IEEE P60079-30-2/D4A, Jul 2013"
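The observed rules above could be sketched in Ruby roughly as follows. This is only an illustration built from the examples listed in this issue: the constant and method names (`DRAFT_RE`, `COR_RE`, `draft_number`, `corrigendum`, `joint_parts`) are hypothetical, not the pubid-ieee API, and accepting a trailing letter on the draft number (as in "D4A") is an assumption taken from the last example. `joint_parts` only handles bare forms such as "309/N42.3-1999"; it does not understand publisher prefixes such as "IEC/IEEE".

```ruby
# Draft marker: "/D{N}", "/D.{N}" or "_D{N}", optionally followed by a letter.
DRAFT_RE = %r{[/_]D\.?(?<draft>\d+[A-Z]?)}
# Everything from the draft marker to the end (e.g. "/D0, Aug 2018").
DRAFT_SUFFIX_RE = %r{[/_]D\.?\d+[A-Z]?.*\z}
# Corrigendum marker such as "/Cor 1-2017".
COR_RE = %r{/Cor\s*(?<cor>\d+)-(?<year>\d{4})}

# Returns the draft label as a string, or nil when no draft marker is present.
def draft_number(id)
  m = id.match(DRAFT_RE)
  m && m[:draft]
end

# Returns the corrigendum number and year, or nil when there is none.
def corrigendum(id)
  m = id.match(COR_RE)
  m && { number: m[:cor].to_i, year: m[:year].to_i }
end

# Splits a bare joint identifier on "/" after removing draft and corrigendum suffixes.
def joint_parts(id)
  base = id.sub(DRAFT_SUFFIX_RE, "").sub(COR_RE, "")
  base.split("/").map(&:strip)
end

puts draft_number("P1062/D.19, March 2015")
puts corrigendum("529-1980/Cor 1-2017").inspect
puts joint_parts("P844.3/C22.2 293.3/D0, Aug 2018").inspect
```

Regular expressions like these are enough to illustrate the rules, but as noted earlier in the thread they are also the reason the current parser misses entries, so a grammar-based parser would still be needed for full coverage.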