BulkUpload: Support for Batch Uploads of Historical HealthKit Data #26

Open
wants to merge 25 commits into
base: main
from


@mjoerke mjoerke commented Jan 8, 2025

BulkUpload

♻️ Current situation & Problem

This PR adds support for batched uploads of historical HealthKit data via a new BulkUpload class. The relevant issue is described in:

This code is several months old and was used to support our GPTCoach study. It has not been modified since then, so I know it is functional. I am hoping we can merge these changes into SpeziHealthKit before our next study (mid-Feb) and before my fork goes stale.

However, additional testing and bug fixes are needed before a stable merge.

⚙️ Release Notes

  • The BulkUpload class and BulkUploadSampleDataSource implement most of the relevant functionality. We use an anchored query that sequentially collects batchSize samples.
  • The anchor is stored in UserDefaults so that the bulk upload can be resumed if the app leaves the foreground.
  • The BulkUploadConstraint requires conforming standards to provide a processBulk method. This can be used to, for example, initialize a batched upload to Firebase.
  • We expose a Progress object in the HealthKit class that aggregates the progress across all bulk upload data sources and can be used to render a progress bar.
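
The batching and resume behavior described in these notes can be sketched roughly as follows. This is a Foundation-only illustration, not the code in this PR: `Sample`, `BulkProcessing`, `BatchedUploader`, `Recorder`, and the key `bulkUploadSketch.anchor` are hypothetical names, an integer offset stands in for the HealthKit query anchor, and an in-memory array stands in for the HealthKit store.

```swift
import Foundation

// Hypothetical stand-in for a HealthKit sample; not a SpeziHealthKit type.
struct Sample: Equatable {
    let id: Int
}

// Hypothetical counterpart to the constraint described above:
// the conforming type receives each batch as a whole.
protocol BulkProcessing {
    mutating func processBulk(_ batch: [Sample])
}

struct BatchedUploader {
    let store: [Sample]        // assumption: stands in for the HealthKit store
    let batchSize: Int
    let anchorKey: String      // UserDefaults key under which the anchor is kept
    let defaults: UserDefaults

    // The persisted anchor; integer(forKey:) returns 0 when nothing is stored yet.
    var anchor: Int { defaults.integer(forKey: anchorKey) }

    // Hands the next batch to the processor and persists the new anchor.
    // Returns false once no samples remain after the anchor.
    func uploadNextBatch<P: BulkProcessing>(to processor: inout P) -> Bool {
        let start = anchor
        guard start < store.count else { return false }
        let end = min(start + batchSize, store.count)
        processor.processBulk(Array(store[start..<end]))
        // Persist only after the batch was handed over, so an interrupted
        // batch is retried on resume instead of being skipped.
        defaults.set(end, forKey: anchorKey)
        return true
    }
}

// Records the batches it receives; a real conformance might start a batched write.
struct Recorder: BulkProcessing {
    var batches: [[Sample]] = []
    mutating func processBulk(_ batch: [Sample]) { batches.append(batch) }
}

let key = "bulkUploadSketch.anchor"
UserDefaults.standard.removeObject(forKey: key)
let store = (0..<7).map { Sample(id: $0) }
var recorder = Recorder()

// First session: upload one batch, then the app "leaves the foreground".
let first = BatchedUploader(store: store, batchSize: 3, anchorKey: key, defaults: .standard)
_ = first.uploadNextBatch(to: &recorder)

// Second session: a fresh uploader resumes from the persisted anchor.
let resumed = BatchedUploader(store: store, batchSize: 3, anchorKey: key, defaults: .standard)
while resumed.uploadNextBatch(to: &recorder) {}
```

In this sketch the anchor is written only after `processBulk` returns, which trades a possible duplicate batch after an interruption for never skipping one. Whether the PR makes the same trade-off is not stated above.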

📚 Documentation

The code has some documentation, but it is incomplete. Additional documentation is required for a release.

✅ Testing

Initial test cases have been added by @bryant-jimenez, but additional testing is needed.

I am aware of one bug that arose during our study:

  • If the user exits the app while a BulkUpload is active (which pauses the upload as the app leaves the foreground), additional HealthKit data arrives in the meantime, and the user then reopens the app (resuming the upload), the reported progress can exceed 100%.
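
One possible mitigation, assuming the overshoot comes from a total that was counted once before the pause and never updated, is to recompute the total whenever the upload resumes. The sketch below uses Foundation's `Progress` directly; `refreshTotal(of:remainingSamples:)` and the sample counts are hypothetical, and the actual cause in this PR may be different.

```swift
import Foundation

// Hypothetical helper, not part of SpeziHealthKit: on resume, derive the total
// from what has already been uploaded plus what is still pending.
func refreshTotal(of progress: Progress, remainingSamples: Int) {
    progress.totalUnitCount = progress.completedUnitCount + Int64(remainingSamples)
}

// The upload starts with 100 samples and pauses after 60 of them.
let progress = Progress(totalUnitCount: 100)
progress.completedUnitCount = 60

// While the app is in the background, 50 new samples arrive,
// so 90 samples are pending on resume instead of 40.
// Without a refresh, completedUnitCount would end at 150 of 100.
refreshTotal(of: progress, remainingSamples: 90)

// The resumed upload processes the remaining samples.
progress.completedUnitCount += 90
```

With the refresh, the completed count never exceeds the total, at the cost of the progress bar moving backwards once at the moment of resume.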

📝 Code of Conduct & Contributing Guidelines

By creating this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

mjoerke and others added 25 commits March 27, 2024 21:36
Included Standard constraint for BulkUpload, defaulted bulkSize in class initializer, added prefixes for the anchors
Add BulkUpload class + Standard constraint
Expose Progress Object in SpeziHealthKit
Member

@PSchmiedmayer PSchmiedmayer left a comment

Thank you for adding this @mjoerke!

We will probably let #27 be merged first as it improves a decent amount of functionality around HealthKit that would build a good foundation for this contribution in the next step.

@lukaskollmer: @mjoerke and team have been working on a Spezi-based app to conduct studies around health coaching & LLMs. Very cool work, and a paper on it was just accepted. The motivation for the bulk upload feature was to create a mechanism that allows us to run data donation studies where HealthKit data is collected all at once and uploaded in smaller chunks as one large bulk upload. The chunking mechanism is there to ensure that servers are not overloaded and that uploads are verified as they progress to avoid any data loss.

Once we have merged #27, it would be nice to get your input and suggestions on the best way to incorporate this into the improved structure we will have in SpeziHealthKit 👍

@PSchmiedmayer PSchmiedmayer added the enhancement New feature or request label Jan 20, 2025
Author

mjoerke commented Jan 20, 2025

Thanks for the update @PSchmiedmayer! #27 definitely seems like it will add useful functionality for BulkUpload.

Our study will be launching in mid-February – I'm not sure if this PR will be incorporated into main before then, but it's not blocking for us since we can still use my fork.

@PSchmiedmayer
Member

Sounds great; thanks for the update. I will sync with @lukaskollmer this week 👍

Labels
enhancement New feature or request
Projects
Status: In Progress
3 participants