Moving code over to sdg-build #1

brockfanning opened this issue Apr 26, 2019 · 6 comments
Comments

@brockfanning

Hi @LucyGwilliamAdmin, I'd like to start looking at getting your code over to sdg-build as an "Input". First I want to make sure I understand the basic architecture:

  • Step 1 is to convert the DSD into a mapping of codes to human-readable labels
  • Step 2 is to parse the SDMX file and use the code-mapping to create a CSV from the human-readable labels

My first attempt at an SDMX import was with the SDMX-JSON format, so your SDMX-ML parser will make a nice complement. Here is an example of what I think we should try to do with your code, by making it into an "Input" in sdg-build: open-sdg/sdg-build#31
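The two steps above can be sketched roughly as follows. This is only an illustration, not the existing SDMX-ML code: the function names (`build_code_mapping`, `sdmx_to_csv`), the `Year`/`Value` column names, and the two XML fragments are made up for the example, and real SDMX-ML files are namespaced and considerably richer. It also assumes a series-based layout where dimension codes appear as attributes on each `Series` element.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Simplified, made-up fragments (assumption): not a real DSD or data file.
SAMPLE_DSD = """
<Structure>
  <Codelist id="CL_SEX">
    <Code id="F"><Name>Female</Name></Code>
    <Code id="M"><Name>Male</Name></Code>
  </Codelist>
</Structure>
"""

SAMPLE_DATA = """
<DataSet>
  <Series SEX="F">
    <Obs TIME_PERIOD="2018" OBS_VALUE="10.5"/>
  </Series>
  <Series SEX="M">
    <Obs TIME_PERIOD="2018" OBS_VALUE="9.8"/>
  </Series>
</DataSet>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix so matching works on namespaced XML too."""
    return tag.rsplit("}", 1)[-1]


def build_code_mapping(dsd_xml):
    """Step 1: map each code id in the DSD to its human-readable label.

    Assumption: a flat code -> label dict is enough. If two codelists reuse
    the same code id, the mapping would need to be keyed per codelist.
    """
    mapping = {}
    for element in ET.fromstring(dsd_xml).iter():
        if local_name(element.tag) != "Code":
            continue
        for child in element:
            if local_name(child.tag) == "Name" and child.text:
                mapping[element.get("id")] = child.text
    return mapping


def sdmx_to_csv(data_xml, mapping):
    """Step 2: flatten the series into CSV text, swapping codes for labels."""
    rows = []
    for series in ET.fromstring(data_xml).iter():
        if local_name(series.tag) != "Series":
            continue
        # Fall back to the raw code when the DSD has no label for it.
        dims = {name: mapping.get(code, code) for name, code in series.attrib.items()}
        for obs in series:
            if local_name(obs.tag) == "Obs":
                rows.append({"Year": obs.get("TIME_PERIOD"), **dims,
                             "Value": obs.get("OBS_VALUE")})
    out = io.StringIO()
    # Assumes every series carries the same dimensions as the first one.
    fieldnames = list(rows[0].keys()) if rows else []
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(sdmx_to_csv(SAMPLE_DATA, build_code_mapping(SAMPLE_DSD)))
```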

@LucyGwilliamAdmin
Owner

LucyGwilliamAdmin commented Apr 26, 2019

Hi @brockfanning, so this would mean that users could upload their SDMX-ML files to the sdg-repo and they'd be converted to SDMX?

Yes, that's correct.

Step 1: since there's a script to convert the DSD to CSV, would this involve uploading the DSD to the country's data or indicators repo, with the mapping then created by sdg-build? Would that mean a mapping is created every time the site is built? Or could it be created the first time the DSD is uploaded and then saved as well? Say we use UK_DSDv1.2: the DSD file would be called UK_DSDv1.2.xml and we'd get sdg-build to save the mapping as UK_mappingv1.2.csv, so that the next time it's built it can check the versions of both files and, if they are the same, skip creating the mapping. I guess it depends on how long creating the mapping takes each time.

You might have already thought about this, but just some thoughts I had.

@LucyGwilliamAdmin
Owner

Also, I can see that the code you linked to for the SDMX-JSON input is written in a different style (OOP?) to what I've done with the SDMX-ML.

Let me know if you want me to change anything in the SDMX-ML code to make it easier to build into sdg-build - I'm happy to help.

@brockfanning
Author

If we implement your code as an "Input" in sdg-build, this would mean that users could use existing SDMX-ML as a data source for their Open SDG site, instead of the normal approach with CSV files. It might be called "InputSdmxML" for example, and would probably take 2 parameters: the URL of an SDMX-ML file, and the URL of the DSD file. Then it would perform the input, parsing, and so on, and sdg-build would output everything in the usual JSON needed by Open SDG. So in a nutshell, it would allow using SDMX-ML instead of CSV files.
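To make the idea concrete, here is a rough sketch of what such a class could look like. Everything in it is hypothetical: the class name, the `source` and `dsd` parameter names, the `execute` method, and the injectable `fetch` hook are assumptions for illustration and are not taken from the actual sdg-build code. The parsing step is left as a placeholder.

```python
from urllib.request import urlopen


class InputSdmxMl:
    """Hypothetical input taking the two parameters described above."""

    def __init__(self, source, dsd, fetch=None):
        self.source = source  # URL of the SDMX-ML data file
        self.dsd = dsd        # URL of the DSD file
        # `fetch` is an assumed hook so the class can be tried without a network.
        self.fetch = fetch or self._download

    @staticmethod
    def _download(url):
        """Download a URL and return its body as text."""
        with urlopen(url) as response:
            return response.read().decode("utf-8")

    def execute(self):
        """Fetch both files on every run and hand them on for parsing."""
        dsd_xml = self.fetch(self.dsd)
        data_xml = self.fetch(self.source)
        # Placeholder: the DSD-to-mapping and SDMX-to-rows parsing would go here.
        return {"dsd": dsd_xml, "data": data_xml}
```

With a stand-in `fetch` function, for example `InputSdmxMl("data.xml", "dsd.xml", fetch=lambda url: "<xml/>").execute()`, the class can be exercised without any real URLs.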

To the question of caching the mapping of the DSD -- it may be possible to cache this map, but I would recommend against it. It might seem wasteful to repeat the parsing of the DSD on each import, but since this will be automated, we won't notice the extra work. :) And if there is caching then we have to worry about how to recognize changes in the DSD.
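For illustration only, if the mapping were cached after all, one way to recognize changes in the DSD would be to compare a hash of its contents rather than a version number in the filename. The function names below are made up for this sketch. Note that the DSD still has to be downloaded on every build to compute the hash, so the only work saved is the parsing itself.

```python
import hashlib


def dsd_fingerprint(dsd_bytes):
    """Return a SHA-256 hex digest of the raw DSD contents."""
    return hashlib.sha256(dsd_bytes).hexdigest()


def mapping_is_stale(dsd_bytes, cached_fingerprint):
    """True when the DSD differs from the one the cached mapping was built from."""
    return cached_fingerprint != dsd_fingerprint(dsd_bytes)
```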

So yes the next step I think is to submit your code in the OOP format of an Input class in sdg-build. We could devote a tech call to this process, or have a hangout/hackathon to get it done. I also plan to update my PR for SDMX-JSON to include more documentation and examples.

@brockfanning
Author

@LucyGwilliamAdmin Could you point me to the code that generated this file?

@LucyGwilliamAdmin
Owner

Hi @brockfanning, that file was generated using ILOSTAT's SMART tool. Some more info about creating the mapping is in this file - I think that is something the country would have to do manually.

There is also some more information here

@brockfanning
Author

Very helpful, thanks @LucyGwilliamAdmin !
