Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to V1 #437

Merged
merged 383 commits into from
Feb 5, 2025
Merged

Upgrade to V1 #437

merged 383 commits into from
Feb 5, 2025

Conversation

rettigl
Copy link
Member

@rettigl rettigl commented Jun 23, 2024

Collection PR for the update to V1.
This contains breaking changes to current behavior and config file layouts.

Items to be addressed here:

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9635884851

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9635820675: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9635881058

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9635820675: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9635919510

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9635820675: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636056013

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636055471: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@rettigl
Copy link
Member Author

rettigl commented Jun 23, 2024

Flash-loader is consistently slower with the new packages. Not sure why.

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636181156

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636055471: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636181338

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636055471: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636221019

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 10 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.814%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/binning/numba_bin.py 3 87.5%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636055471: -0.1%
Covered Lines: 6427
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636258361

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.08%) to 91.886%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 4 87.03%
Totals Coverage Status
Change from base Build 9636055471: -0.08%
Covered Lines: 6432
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636290716

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636055471: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636630541

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636624947: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636663837

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 10 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.814%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/binning/numba_bin.py 3 87.5%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636624947: -0.1%
Covered Lines: 6427
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636664048

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636624947: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636700237

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636624947: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9636700470

Details

  • 182 of 183 (99.45%) changed or added relevant lines in 41 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.857%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sed/calibrator/momentum.py 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
sed/core/metadata.py 1 93.22%
sed/calibrator/momentum.py 6 86.78%
Totals Coverage Status
Change from base Build 9636624947: -0.1%
Covered Lines: 6430
Relevant Lines: 7000

💛 - Coveralls

@rettigl
Copy link
Member Author

rettigl commented Jun 24, 2024

Flash-loader is consistently slower with the new packages. Not sure why.

This directly comes from the update of dask from 2023.5.0 to 2023.12.0, tested on python 3.9.
Surprisingly, the binning in the Flash example notebook however becomes faster with this new configuration (44.5s vs. 26.9 s).
As we need to look into dask anyways still, I would not consider this a problem for now.

@zain-sohail
Copy link
Member

zain-sohail commented Jun 24, 2024

Flash-loader is consistently slower with the new packages. Not sure why.

This directly comes from the update of dask from 2023.5.0 to 2023.12.0, tested on python 3.9. Surprisingly, the binning in the Flash example notebook however becomes faster with this new configuration (44.5s vs. 26.9 s). As we need to look into dask anyways still, I would not consider this a problem for now.

Are you using the refactored branch or the one on main?

Considering that the creation of buffer files has is independant of dask, I assume it's another package that causes this issue (maybe pandas?)
Dask is only used when loading the buffer files and subsequent processing like sector ID seperation, forward filling, etc. and eventually binning. So it seems to be that the problem is not from dask.
At least this is the case in refactored branch. In main, the sector ID seperation takes place with pandas and not dask.

@rettigl
Copy link
Member Author

rettigl commented Jun 24, 2024

Dask is only used when loading the buffer files and subsequent processing like sector ID seperation, forward filling, etc. and eventually binning. So it seems to be that the problem is not from dask.

It is from dask, the performance changes if I only change the version of dask (from 2024.12.0 to 2024.5.0). There seem to be some other issues with the map_overlap stuff after loading the dataframe. see #448

@rettigl
Copy link
Member Author

rettigl commented Jan 20, 2025

@zain-sohail I made a pre-release for v1 from this now. If the package looks good, and the docs build also looks fine, I suggest you approve this PR, we make a final pre-v1 release and then merge and release v1, what do you think? Reviewing the whole thing again seems unfeasible to me.

@rettigl
Copy link
Member Author

rettigl commented Jan 29, 2025

@zain-sohail I made a pre-release for v1 from this now. If the package looks good, and the docs build also looks fine, I suggest you approve this PR, we make a final pre-v1 release and then merge and release v1, what do you think? Reviewing the whole thing again seems unfeasible to me.

@zain-sohail Did you test the package? What is your plan on how and when we can merge this?

@zain-sohail
Copy link
Member

@zain-sohail Did you test the package? What is your plan on how and when we can merge this?

Hi Laurenz. I have tested the package in a fresh env and also looked at the docs and everything seems fine.

Dima just wanted #534 to be merged before we put out v1. I'd hence suggest we just merge it for now. Since the lab loader doesnt have integration tests right now, I can't promise no bugs but the flash loader should run bug free. And I can fix anything later as well.

@rettigl
Copy link
Member Author

rettigl commented Jan 30, 2025

But what is the advantage with adding #534 still here, compared to first merging this, releasing 1.0, and then add it in 1.1? This is not a breaking change, but adds functionality. So it should be a minor release, from my point of view. This pr is already way too large, and adding here will make it more and more difficult to locate potential problems.

@zain-sohail
Copy link
Member

But what is the advantage with adding #534 still here, compared to first merging this, releasing 1.0, and then add it in 1.1? This is not a breaking change, but adds functionality. So it should be a minor release, from my point of view. This pr is already way too large, and adding here will make it more and more difficult to locate potential problems.

That's also true but I'd actually say it is a breaking change due to me moving some hard coded values to the config files, which the users definitely need to have for the loader to work.

@rettigl
Copy link
Member Author

rettigl commented Jan 30, 2025

I see, I was not aware of this. But then let's complete #534 with tests (mostly dataframe module missing) and careful reviews. This might take some time though from my side, maybe @kutnyakhov also can add a review.

@zain-sohail
Copy link
Member

I see, I was not aware of this. But then let's complete #534 with tests (mostly dataframe module missing) and careful reviews. This might take some time though from my side, maybe @kutnyakhov also can add a review.

Yes, I am also waiting on Dima's review but currently he is on holiday. I'll try to add the tests today or latest this weekend.

@rettigl
Copy link
Member Author

rettigl commented Feb 1, 2025

I see, I was not aware of this. But then let's complete #534 with tests (mostly dataframe module missing) and careful reviews. This might take some time though from my side, maybe @kutnyakhov also can add a review.

Having had a brief look at the config changes in #534, they don't appear to be breaking changes to me, I. E. A config pre-534 should work with the code there (or we can make it work). Thus I would still argue to merge this one now, and then build further on that release. We should not extend this pr any further if not absolutely necessary.

@zain-sohail
Copy link
Member

Having had a brief look at the config changes in #534, they don't appear to be breaking changes to me, I. E. A config pre-534 should work with the code there (or we can make it work). Thus I would still argue to merge this one now, and then build further on that release. We should not extend this pr any further if not absolutely necessary.

I'd suggest that we merge the config model and config file changes to v1 at least before merging with main.

@rettigl
Copy link
Member Author

rettigl commented Feb 4, 2025

Why? I don't see the need for this. Adding optional config elements is not a breaking change. #551 does something similar. Let's not complicate things unnecessarily.

@zain-sohail
Copy link
Member

Why? I don't see the need for this. Adding optional config elements is not a breaking change. #551 does something similar. Let's not complicate things unnecessarily.

Because these are not optional changes to flash loaders, only for general sed they are optional. Earlier, these three fields were hard coded and I moved them out of flash.utils.get_channels and into the config.

index: Optional[Union[Sequence[str], str]] = None
formats: Optional[Union[Sequence[str], str]] = None
fill_formats: Optional[Union[Sequence[str], str]] = None

This is definitely a breaking change in the schema at least for flash data.

@rettigl
Copy link
Member Author

rettigl commented Feb 4, 2025

Why don't you provide default values for these in #534 ? Then it would not break anything.

@zain-sohail
Copy link
Member

Why don't you provide default values for these in #534 ? Then it would not break anything.

I thought we weren't including loader specific defaults in default.yaml but if that's the case then you are right.

@rettigl
Copy link
Member Author

rettigl commented Feb 5, 2025

Why don't you provide default values for these in #534 ? Then it would not break anything.

I thought we weren't including loader specific defaults in default.yaml but if that's the case then you are right.

Yes, we used to handle it like this, true. I don't completely oversee why you want to make these parameters part of the config, but I suggest that you keep the code in a shape where it would also work without. That should be the case anyways for all optional parameters, I think. Maybe we should make sure that the code itself contains defaults for all optional parameters, and not the default config, which contains defaults for all required parameters.
Anyways, we can still add this before publishing v1. I would then make a last release pre-v1, merge here, and make another release candidate for v1 from main.

@rettigl rettigl merged commit 47b979b into main Feb 5, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants