add methods to export results in tabular format #280

Merged
merged 45 commits into from
Dec 4, 2024

liannette
Contributor

  • add method to nplinker class for exporting links, as well as genomic and metabolomic data
  • add test for generating the tabular data for the links

@CunliangGeng
Member

CunliangGeng commented Oct 17, 2024

Please assign me to review it when it's ready ;-) If you're still working on that, it's better to change it to a draft PR

@liannette
Contributor Author

I thought I was finished, but I realized that it still needs a bit of work. I will assign you as soon as I'm happy with it!

@liannette liannette marked this pull request as draft October 17, 2024 13:29
@liannette liannette marked this pull request as ready for review October 18, 2024 15:43
@liannette
Contributor Author

@CunliangGeng It's ready for review :) I just cannot request a review explicitly, because I only have read permission for the repository.

@CunliangGeng CunliangGeng self-requested a review October 22, 2024 07:28
Member

@CunliangGeng CunliangGeng left a comment

Let's check the code step by step. The first thing to check is code format and style:

  1. Please check the static typing errors and correct the wrong type annotations. You could use mypy to check them locally before committing.
  2. It is not explicitly mentioned, but we do follow some rules for the order of methods/functions in a class/file. The order of methods/functions is:
  • __init__ method
  • other magic methods, e.g. __str__
  • property methods (using @property)
  • regular methods, class methods (using @classmethod), static methods (using @staticmethod)
  • private methods (_func)
  • deprecated methods (using @deprecated)

Within the same level of methods, e.g. regular methods, it's recommended to order them alphabetically.

Please check the new methods/functions you added and put them in the right place.
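
As a rough illustration of the ordering described above, here is a minimal sketch. The class `ExampleRecord` and all of its members are hypothetical and are not taken from the NPLinker code base; the `@deprecated` group is only indicated by a comment, since the thread does not say which decorator is used.

```python
from __future__ import annotations


class ExampleRecord:
    """Hypothetical class, used only to illustrate the method order."""

    # 1. __init__ method
    def __init__(self, name: str, values: list[float]) -> None:
        self.name = name
        self._values = values

    # 2. other magic methods
    def __str__(self) -> str:
        return f"{self.name} ({self.size} values)"

    # 3. property methods
    @property
    def size(self) -> int:
        return len(self._values)

    # 4. regular, class and static methods, ordered alphabetically
    def add_value(self, value: float) -> None:
        self._values.append(value)

    @classmethod
    def empty(cls, name: str) -> ExampleRecord:
        return cls(name, [])

    def to_dict(self) -> dict[str, object]:
        return {"name": self.name, "total": self._total()}

    # 5. private methods
    def _total(self) -> float:
        return sum(self._values)

    # 6. deprecated methods would come last (omitted in this sketch)
```

Running mypy locally on a file like this would flag, for example, a call such as `ExampleRecord.empty(3)`, because `name` is annotated as `str`.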

@liannette
Contributor Author

Done! I've integrated your suggestions or explained my reasoning for any differences. I hope we're good to merge now.

Member

@CunliangGeng CunliangGeng left a comment

One more comment ;-)

src/nplinker/genomics/bgc.py (resolved)
src/nplinker/nplinker.py (outdated, resolved)
@liannette
Contributor Author

Okay, it's implemented!

Member

@CunliangGeng CunliangGeng left a comment

A few more comments, almost there!

src/nplinker/genomics/bgc.py (outdated, resolved)
src/nplinker/genomics/bgc.py (outdated, resolved)
# Convert dict to comma-separated string
elif isinstance(value, dict):
    value = ", ".join([f"{k}:{v}" for k, v in value.items()])
# Convert anything else to string
Member

Do you really want to convert None to ""? Does that make sense for the BGC attributes?

Contributor Author

If the value is None, the corresponding field in the tabular output file should be left empty. This ensures that when the file is opened in Excel, numeric fields are correctly recognized as numbers rather than text, allowing the columns to be sorted properly. For text fields, leaving them empty is also preferable to displaying None, as it is cleaner and more intuitive.
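
The behaviour described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the helper names `to_tabular_value` and `rows_to_tsv`, the list/tuple handling, and the column names `bgc_id` and `score` are all hypothetical. Only the dict formatting and the final conversion to string mirror the diff excerpt above.

```python
from __future__ import annotations

import csv
import io


def to_tabular_value(value: object) -> str:
    """Convert a value to the string written into one table cell."""
    # None becomes an empty cell, so spreadsheet tools still treat
    # the rest of the column as numeric where applicable.
    if value is None:
        return ""
    # Assumption: sequences are joined into a comma-separated string.
    if isinstance(value, (list, tuple)):
        return ", ".join(str(v) for v in value)
    # Convert dict to comma-separated "key:value" string.
    if isinstance(value, dict):
        return ", ".join(f"{k}:{v}" for k, v in value.items())
    # Convert anything else to string.
    return str(value)


def rows_to_tsv(rows: list[dict[str, object]]) -> str:
    """Serialise non-empty rows (same keys in each) to tab-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({k: to_tabular_value(v) for k, v in row.items()})
    return buffer.getvalue()
```

With this sketch, a row whose `score` is None yields an empty cell after the tab rather than the text `None`.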

src/nplinker/genomics/bgc.py (outdated, resolved)
src/nplinker/metabolomics/spectrum.py (outdated, resolved)
src/nplinker/nplinker.py (outdated, resolved)
@liannette
Contributor Author

Alright, I really hope that we can merge this now 👀

Member

@CunliangGeng CunliangGeng left a comment

I also hope we can merge it now :shipit: However, we must adhere to some important quality requirements, such as making docstrings and comments accurate and clear, and adding unit tests for all public methods. These quality requirements will benefit everyone over time, rather than accumulating more and more technical debt.

You could choose what to do next:
1 🍎 Continue this PR and resolve the remaining comments
2 🍐 Merge this PR and open some new and small PRs to resolve the remaining comments

Let me know which one you prefer.

src/nplinker/genomics/bgc.py (outdated, resolved)
src/nplinker/genomics/bgc.py (outdated, resolved)
src/nplinker/genomics/bgc.py (outdated, resolved)
src/nplinker/genomics/bgc.py (outdated, resolved)
tests/unit/genomics/test_bgc.py (outdated, resolved)
tests/unit/metabolomics/test_spectrum.py (resolved)
src/nplinker/metabolomics/spectrum.py (resolved)
tests/unit/scoring/test_link_graph.py (outdated, resolved)
src/nplinker/nplinker.py (resolved)
@liannette
Contributor Author

Fair enough! I added the unit tests for the public methods/functions and updated the comments and docstrings. Thanks for the detailed suggestions. I learned a lot, even if it was a lot of effort.

If you have any more comments, I would prefer to 🍐 merge this PR and open smaller PRs to resolve them, if possible. You'll need to merge, as I don't have write access.

assert tabular_repr["gnps_annotations"] == ""

# Test with molecular family
class MockMolecularFamily:
Member

Why not use MolecularFamily directly?

Member

@CunliangGeng CunliangGeng left a comment

🎉 It's ready to merge!!!

One minor comment: why do you use MockMolecularFamily in the tests instead of the real MolecularFamily? You could address that in a new PR.

@CunliangGeng CunliangGeng merged commit 481a068 into NPLinker:dev Dec 4, 2024
2 of 3 checks passed