Update convert-form action to include a hash which maps report back to the source form #395
Comments
Not sure whether to add this as a comment on medic/cht-core#5977 or if it is its own feature request (?) It seems fairly easy, but the value would be very high in SQL-land. |
This is a great addition. I don't think the hash should be in the Secondly I think it would make life easier to use a sequential version rather than a random looking hash. That would mean queries could be written to compare with |
Both seem excellent. I think the "GitHub hash search" is pretty nice for investigations. Not sure if we can get both somehow. |
Would this be somewhat addressed by medic/cht-core#7310? |
Could we do a hash and the Unix time of the file (Unix modification time would work)? I like the sequence approach mentioned above, but a few concerns
|
@aidan-plenert-macdonald The changes needed for this are...
The code changes are quite minor, but it's a complicated area, so once you've had a chance to read through, feel free to reach out and I can help with how to make the actual change. |
An alternative might be to append a The |
Yes that would mean this could be rolled out quicker but it's a less correct place for the version to be stored, and we'll be stuck with it forever more. Depending on the field name it also has a slight risk of colliding with a user defined field. It's safer and more correct to store this at the root of the doc so I think we stick with that, and hopefully it's another great reason for people to upgrade cht-core!
The _rev isn't reliably sequential so it's not ideal unfortunately. I think if we just use the file last modified date it satisfies the sequential and (almost) guaranteed unique criteria, with the possible downside of ending up with spurious versions where nothing actually changed. This shouldn't be an issue so long as Another way would be to require a user to enter a specific version, or have cht-conf attempt to detect when something has changed materially, but I prefer the timestamp as it's foolproof. |
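As a rough illustration of the timestamp idea discussed above, combined with the earlier suggestion of pairing a hash with the Unix time of the file, the sketch below builds a small version record from a form file. It is written in Python for brevity rather than in the language cht-conf uses, and the function name `form_version` and the keys `time` and `sha256` are hypothetical names chosen for this example, not fields confirmed anywhere in this thread.

```python
import hashlib
import os


def form_version(path: str) -> dict:
    """Build a hypothetical version record for a form file.

    'time' is the file's last-modified time in whole milliseconds since
    the Unix epoch, so later edits sort after earlier ones.
    'sha256' is a content digest, so two uploads of identical content
    can be recognised even when their timestamps differ.
    """
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    # st_mtime_ns is an integer, which avoids float rounding in st_mtime.
    mtime_ms = os.stat(path).st_mtime_ns // 1_000_000

    return {"time": mtime_ms, "sha256": digest}
```

Comparing two records by `time` gives the ordering, while comparing `sha256` shows whether anything actually changed, which is one way to spot the spurious versions mentioned above. One caveat: a fresh Git checkout sets each file's modification time to the checkout time, so the timestamp reflects when the file was last written locally, not when it was committed.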
If I'm understanding correctly, the flow is,
Then I could resolve this by,
I'll make a draft PR soon |
Draft PR's I'm not sure Git will preserve the file modified time correctly. What timestamp would you recommend? Also, please check this medic/cht-core#7319. I think this is the right spot. |
Love this ticket/idea!! Should we be concerned about forms getting uploaded via the admin GUI ( |
Yes I think so, just to preserve backwards compatibility. We probably just need to store the date the form was uploaded as the version.
Comment added here: https://github.com/medic/cht-conf/pull/431/files#r711800800 |
I think this requires me to upgrade medic-conf to cht-conf. I started that here medic/cht-core#7340 because I'm having trouble otherwise. |
Tested with
|
Is your feature request related to a problem? Please describe.
Over the life of a CHT application, many different versions of a form will run in production. The resulting report data is schemaless, so it is difficult to map a report to the version of the form that was filled in. Typically, people write SQL heuristics to determine which version they are analysing, but this isn't always possible (for example, when the wording of a question changes but nothing else does).
Describe the solution you'd like
When we run convert-forms via medic-conf, could we hash the xlsx file and automatically inject that hash into the form xml, so that the completed report will contain fields.formHash (or alike)? If the hashing algorithm we use is the same as GitHub's, this would let users do a GitHub hash search for the exact version of the form that produced the report.
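For reference, a minimal sketch of what "the same hashing algorithm as GitHub" could mean in practice: in a SHA-1 repository, Git identifies a file's content (a blob) by the SHA-1 of the header `blob <size>\0` followed by the raw bytes. The sketch is in Python for brevity, and the helper names `git_blob_hash` and `git_blob_hash_of_file` are made up for this example. Whether GitHub's search accepts such blob IDs is an assumption this sketch does not establish.

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    # Git hashes a short header ("blob", the byte length, a NUL) plus the content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def git_blob_hash_of_file(path: str) -> str:
    """Hash a file's raw bytes the way `git hash-object` does for an unfiltered file."""
    with open(path, "rb") as f:
        return git_blob_hash(f.read())
```

The result should match `git hash-object <file>` for the same bytes, provided no line-ending or other content filters apply to the file, which is normally the case for a binary xlsx.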
Describe alternatives you've considered
The form version is in the telemetry documents. Queries are complex and you lose some fidelity when relying on the aggregated info in those docs.