BulkUpload: Support for Batch Uploads of Historical HealthKit Data #26
Included Standard constraint for BulkUpload, defaulted bulkSize in class initializer, added prefixes for the anchors
Add BulkUpload class + Standard constraint
Expose Progress Object in SpeziHealthKit
Add tests for BulkUpload
Thank you for adding this @mjoerke!
We will probably let #27 be merged first as it improves a decent amount of functionality around HealthKit that would build a good foundation for this contribution in the next step.
@lukaskollmer: @mjoerke and team have been working on a Spezi-based app to conduct studies around health coaching & LLMs. Very cool work and just got accepted for a paper. The motivation for the bulk upload feature was to create a mechanism that allows us to run data donation studies where HealthKit data is grabbed all at once and uploaded in smaller chunks as a large bulk upload. The chunking mechanism is there to ensure that servers are not overloaded & uploads are verified as they progress to avoid any data loss.
Once we have merged #27, it would be nice to get your input & suggestions on the best way to incorporate this into the improved structure we will have in SpeziHealthKit 👍
Thanks for the update @PSchmiedmayer! #27 definitely seems like it will add useful functionality for BulkUpload. Our study will be launching in mid-February – I'm not sure if this PR will be incorporated into main before then, but it's not blocking for us since we can still use my fork.
Sounds great; thanks for the update. I will sync with @lukaskollmer this week 👍
BulkUpload
♻️ Current situation & Problem
This PR adds support for batched uploads of historical HealthKit data via a new `BulkUpload` class. The relevant issue is described in:

This code is several months old now and was used to support our GPTCoach study. It has not been modified since its use in the GPTCoach study, so I know it is functional. I am hoping we can merge these changes into SpeziHealthKit before our next study (mid-Feb) and before my fork goes stale.
However, additional testing and bug fixes are needed before a stable merge.
⚙️ Release Notes
- `BulkUpload` class and `BulkUploadSampleDataSource` implement most of the relevant functionality. We use an anchored query that sequentially collects `batchSize` samples.
- `BulkUploadConstraint` requires conforming standards to provide a `processBulk` method. This can be used to, for example, initialize a batched upload to Firebase.
- `Progress` object in the `HealthKit` class that aggregates the progress across all bulk upload data sources and can be used to render a progress bar.

📚 Documentation
The code has some documentation, but it is incomplete. Additional documentation is required for a release.
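As one possible starting point for that documentation, the batching flow described in the release notes can be sketched without HealthKit. This is an illustration under stated assumptions, not the code in this PR: `Sample`, `BulkProcessor`, `uploadInBatches`, and `RecordingProcessor` are hypothetical names, the array index stands in for the anchored query's anchor, and the real `processBulk` signature may differ. Only Foundation's `Progress` is an existing API here.

```swift
import Foundation

// Hypothetical stand-in for a HealthKit sample; the real code works with HealthKit types.
struct Sample: Equatable {
    let id: Int
}

// Hypothetical mirror of the constraint described above: a conforming type
// receives one batch at a time. The actual `processBulk` signature may differ.
protocol BulkProcessor {
    mutating func processBulk(_ batch: [Sample]) throws
}

// Walks `samples` in consecutive batches of at most `batchSize` elements.
// `start` plays the role of the query anchor: it only moves forward after a
// batch has been processed successfully, so a thrown error leaves the
// remaining samples untouched and `progress` reflects only confirmed work.
func uploadInBatches<P: BulkProcessor>(
    _ samples: [Sample],
    batchSize: Int,
    processor: inout P,
    progress: Progress
) throws {
    precondition(batchSize > 0, "batchSize must be positive")
    progress.totalUnitCount = Int64(samples.count)

    var start = 0
    while start < samples.count {
        let end = min(start + batchSize, samples.count)
        let batch = Array(samples[start..<end])

        try processor.processBulk(batch)

        progress.completedUnitCount += Int64(batch.count)
        start = end
    }
}

// Example processor that only records the size of each batch it receives.
// A real implementation might instead start a batched write to a backend.
struct RecordingProcessor: BulkProcessor {
    var batchSizes: [Int] = []

    mutating func processBulk(_ batch: [Sample]) throws {
        batchSizes.append(batch.count)
    }
}
```

Calling `uploadInBatches` with 25 samples and a `batchSize` of 10 hands the processor three batches of 10, 10, and 5 samples. Advancing the progress count only after `processBulk` returns keeps a progress bar in step with work that has actually completed.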
✅ Testing
Initial test cases have been added by @bryant-jimenez, but additional testing is needed.
I am aware of one bug that arose during our study:
📝 Code of Conduct & Contributing Guidelines
By creating this pull request, you agree to follow our Code of Conduct and Contributing Guidelines: