-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add methods to export results in tabular format #280
Conversation
liannette
commented
Oct 16, 2024
- add method to nplinker class for exporting links, as well as genomic and metabolomic data
- add test for generating the tabular data for the links
… file, improve docstrings
Please assign me to review it when it's ready ;-) If you're still working on that, it's better to change it to a draft PR |
I thought I was finished, but I realized that it still needs a bit of work. I will assign you as soon as I'm happy with it! |
@CunliangGeng It's ready for review :) I just can not request a review explicitly, because I have only read permission for the repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check the code step by step, and first thing to check is about code format and style:
- Please check the errors in the static typing, and correct the incorrect typings. You could use
mypy
to check them locally before committing. - It is not explicitly mentioned, but we do follow some rules for the order of methods/functions in a class/file. The order of methods/functions are:
__init__
method- other magic methods, e.g.
__str__
- property methods (using
@property
) - regular methods, class methods (using
@classmethod
), static methods (using@staticmethod
) - private methods (
_func
) - deprecated methods (using
@deprecated
)
For the same level of methods, e.g. regular methods, it's recommended to order them in alphabetical order.
Please check the new methods/functions you added and put them in the right place.
Apply suggestions from code review Co-authored-by: Cunliang Geng <[email protected]>
…into output_files
Done! I've integrated your suggestions or explained my reasoning for any differences. I hope we're good to merge now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more comment ;-)
Okay, it's implemented! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments, almost there!
# Convert dict to comma-separated string | ||
elif isinstance(value, dict): | ||
value = ", ".join([f"{k}:{v}" for k, v in value.items()]) | ||
# Convert anything else to string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really want to covert None
to ""? Does it make sense to the BGC attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the value is None
, the corresponding field in the tabular output file should be left empty. This ensures that when the file is opened in Excel, numeric fields are correctly recognized as numbers rather than text, allowing the columns to be sorted properly. For text fields, leaving them empty is also preferable to displaying None, as it is cleaner and more intuitive.
…, tabs are replaced in to_tabular
Alright, I really hope that we can merge this now 👀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also hope we can merge it now However, we must adhere to some important quality requirements, such as making docstrings and comments accurate and clear; and adding unit tests for all public unit tests. These quality requirements will benefit everyone over time, rather than accumulating more and more technical debt
You could choose what to do next:
1 🍎 Continue this PR and resolve the remaining comments
2 🍐 Merge this PR and open some new and small PRs to resolve the remaining comments
Let me know which one you prefer.
…nstead of numerical
Fair enough! I added the unit tests for the public methods/functions and updated the comments and doc strings. Thanks for the detailed suggestions, I learned a lot, even if it's a lot of effort. If you have any more comments, I would prefer to 🍐 merge this PR and open smaller PRs to resolve them, if possible. You'll need to merge, as I have write access. |
assert tabular_repr["gnps_annotations"] == "" | ||
|
||
# Test with molecular family | ||
class MockMolecularFamily: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using MolecularFamily
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 It's ready to merge!!!
One minor comment is that why you use MockMolecularFamily
in the tests instead of the real MolecularFamily
. You could solve it in a new PR.