-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate get_most_severe_consequence_for_summary
in favor of more flexible get_most_severe_csq_from_multiple_csq_lists
#714
base: main
Are you sure you want to change the base?
Conversation
- Allow lof_flags to be missing in addition to lof_flags == "" for lof == "HC" to have no flag penalty - Pass `csq_order` to `add_most_severe_consequence_to_consequence`
…ble to use in `default_generate_gene_lof_matrix`
… into jg/fix_process_consequences
…nsequence_to_consequence more flexible
…sq_lists and make use of it in process_consequences
… into jg/fix_process_consequences # Conflicts: # gnomad/utils/vep.py
…s` to `filter_to_most_severe_consequences` and clean them up
…f `get_most_severe_consequence_for_summary`
…ttps://github.com/broadinstitute/gnomad_methods into jg/deprecate_get_most_severe_consequence_for_summary
…ttps://github.com/broadinstitute/gnomad_methods into jg/deprecate_get_most_severe_consequence_for_summary
… if it's empty after filtering
|
||
- most_severe_consequence: Most severe consequence for variant. | ||
- lof: Whether the variant is a loss-of-function variant. | ||
- no_lof_flags: Whether the variant has any LOFTEE flags (True if no flags). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this returns either True or None right? why not True or False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes, so I'm not sure what the result was before, but now in my test on 2 partitions of the exomes result HT, I get {False: 21, True: 580, None: 78785}
. Let me know if you are still seeing no False
tests/utils/test_vep.py
Outdated
) | ||
|
||
# Test the function | ||
result = get_most_severe_csq_from_multiple_csq_lists(vep_expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also tested with setting prioritize_loftee_no_flags to True or False -- does it make sense that got the same result regardless of how set this param was set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current vep_expr
, yes. I added some changes to it, and a test for it
) | ||
|
||
# Build the case expression to determine the most severe consequence. | ||
ms_csq_expr = hl.case(missing_false=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why use case expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to? If you have a clearer solution, I'm happy to make modifications, just let me know what your thinking
gnomad/utils/vep.py
Outdated
) | ||
|
||
# Initialize the lof struct with missing values. | ||
lof_expr = hl.struct(lof=hl.missing(hl.tstr), no_lof_flags=hl.missing(hl.tbool)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so lof and no_lof_flags will always be None if prioritize_loftee/prioritize_loftee_no_flags are False, even if lof flags are present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified this a bit, let me know if the changes make sense
… into jg/deprecate_get_most_severe_consequence_for_summary
@pytest.mark.parametrize( | ||
"prioritize_protein_coding, prioritize_loftee, prioritize_loftee_no_flags, additional_order_field, additional_order, expected_most_severe_csq, expected_polyphen_prediction", | ||
[ | ||
(False, False, False, None, None, None, None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why only set expected_most_severe_csq if it's "stop_gained"?
else polyphen_prediction | ||
), | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Define csq, protein_coding, lof, no_lof_flags, and polyphen_prediction. |
additional_order=additional_order, | ||
) | ||
|
||
expected_dict = hl.Struct( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how were the values for the expected dict decided?
+ (["lof"] if prioritize_loftee else []) | ||
+ ( | ||
["no_lof_flags"] | ||
if prioritize_loftee_no_flags or prioritize_loftee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i find it confusing when no_lof_flags is True in cases where prioritize_loftee is True and prioritize_loftee_no_flags is False and there are flags present
@@ -777,86 +780,96 @@ def filter_to_most_severe_consequences( | |||
:return: ArrayExpression with of the consequences that match the most severe | |||
consequence. | |||
""" | |||
# Get the dtype of the csq_expr ArrayExpression elements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Get the dtype of the csq_expr ArrayExpression elements | |
# Get the dtype of the csq_expr ArrayExpression elements. |
(True, True, False, *polyphen_params, None, "possibly_damaging"), | ||
(True, False, True, *polyphen_params, None, "possibly_damaging"), | ||
(False, True, True, *polyphen_params, "stop_gained", None), | ||
# Need to figure out class too large error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also got this error
Also adds:
filter_to_most_severe_consequences
, which is used byget_most_severe_csq_from_multiple_csq_lists
loftee_labels
andno_lof_flags
parameters tofilter_vep_transcript_csqs_expr
for filtering by loftee labels and flags.Depends on #713
updates_to_get_most_severe_consequence_for_summary.html.zip testing to make sure the same results are returned for
get_most_severe_consequence_for_summary