Improve unreachable code analysis #302
Codecov Report
@@ Coverage Diff @@
## main #302 +/- ##
==========================================
+ Coverage 98.96% 99.03% +0.06%
==========================================
Files 21 21
Lines 679 727 +48
==========================================
+ Hits 672 720 +48
Misses 7 7
... and 2 files with indirect coverage changes
Thanks for raising this PR! I like the algorithm and that it nicely generalizes the earlier solution. However, I'd like to propose a different code organization, so that the core module doesn't grow too large:
- Add a `reachability.py` module with a `Reachability` class.
- This class stores the no-fall-through nodes, allows resetting them, and has a `visit(node)` method that checks the node's type and, depending on the type, either marks the node as no-fall-through or checks whether its body is fully executable.
- The `Vulture` class has a `Reachability` member, calls its `visit(node)` method, and retrieves the unreachable code from `Reachability` before switching to the next module.
This goes in the direction of adding a second NodeVisitor class, but avoids traversing the AST twice. Or do you think a second NodeVisitor class (with all of these `visit_*` functions) would be a better alternative?
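A minimal sketch of the organization proposed above, not the code that was merged. Several things here are assumptions made for illustration: the `Scanner` class is a hypothetical stand-in for `Vulture`, the `report` callback signature is invented, and only `return`, `raise`, `continue`, `break`, `if`, function and module bodies are handled (loops, `try`, `with` and `match` are left out on purpose).

```python
import ast

# Statements that never let control continue to the next statement.
_TERMINATORS = (ast.Return, ast.Raise, ast.Continue, ast.Break)
# Nodes whose body is checked, but which themselves always fall through.
_BODY_ONLY = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef)


class Reachability:
    """Hypothetical sketch of a reachability helper."""

    def __init__(self, report):
        self._report = report  # assumed callback: report(unreachable_stmt)
        self._no_fall_through_nodes = set()

    def reset(self):
        # Forget nodes of the previous module to reduce memory usage.
        self._no_fall_through_nodes.clear()

    def visit(self, node):
        # All children of the node have already been visited.
        if isinstance(node, _TERMINATORS):
            self._no_fall_through_nodes.add(node)
        elif isinstance(node, _BODY_ONLY):
            self._can_fall_through(node.body)
        elif isinstance(node, ast.If):
            body_falls = self._can_fall_through(node.body)
            # An empty `orelse` always falls through.
            else_falls = self._can_fall_through(node.orelse)
            if not body_falls and not else_falls:
                self._no_fall_through_nodes.add(node)

    def _can_fall_through(self, statements):
        """Return True if the full list of statements can be executed."""
        for index, statement in enumerate(statements):
            if statement in self._no_fall_through_nodes:
                if index + 1 < len(statements):
                    self._report(statements[index + 1])
                return False
        return True


class Scanner(ast.NodeVisitor):
    """Hypothetical stand-in for the Vulture class."""

    def __init__(self):
        self.unreachable_lines = []
        self.reachability = Reachability(
            report=lambda stmt: self.unreachable_lines.append(stmt.lineno)
        )

    def scan(self, code):
        self.reachability.reset()
        self.visit(ast.parse(code))

    def visit(self, node):
        # Visit children first, then hand the node to Reachability,
        # so the AST is traversed only once.
        self.generic_visit(node)
        self.reachability.visit(node)


source = '''\
def f(x):
    if x:
        return 1
    else:
        raise ValueError(x)
    print("never")
'''
scanner = Scanner()
scanner.scan(source)
print(scanner.unreachable_lines)  # → [6]
```

Because both branches of the `if` end in a terminator, the `if` node itself is marked as no-fall-through, which is what lets the enclosing function body flag the `print` on line 6.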
vulture/core.py
Outdated
@@ -225,6 +227,10 @@ def scan(self, code, filename=""):
        self.noqa_lines = noqa.parse_noqa(self.code)
        self.filename = filename

        # We can reset the fall_through_nodes for every module to reduce
Suggested change:
- # We can reset the fall_through_nodes for every module to reduce
+ # Reset the no-fall-through-nodes for every module to reduce memory usage.
vulture/core.py
Outdated

    def _can_fall_through_statements_analysis(self, statements):
        """Report unreachable statements.
        Returns True if we cannot fall though the list of statements
Suggested change:
- Returns True if we cannot fall though the list of statements
+ Return True if we can execute the full list of statements.
I do not see a clear advantage of either solution, and it probably does not matter too much right now (if the reachability is in a second module, it should be really easy to switch between them anyway). I will update the code according to your proposed code organization.
I am not sure if passing
I had a look at it again, and I think it is cleaner to also move the I will propose the change as soon as I find time for it.
I updated
For example.
I could handle this in another PR.
This flew under my radar, sorry. I had three minor comments, but took care of them myself now. Thanks for your work on this, it not only generalizes the code, but also makes it nicer!
@@ -11,6 +12,7 @@
from vulture import utils
from vulture.config import InputError, make_config
from vulture.utils import ExitCode
from vulture.reachability import Reachability
Let's keep these in alphabetical order.
tests/__init__.py
Outdated
    assert len(v.unreachable_code) == 1
    item = v.unreachable_code[0]
    assert item.first_lineno == lineno
def check_unreachable(v, lineno, size, name, multiple=False):
Passing a Boolean in this way is a common code smell suggesting that we rather want two separate functions, `check_single_unreachable` and `check_multiple_unreachables`.
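The split suggested here could look roughly like the sketch below. The two function names come from the review comment; everything else is an assumption for illustration: the `checks` parameter (a list of `(lineno, size, name)` tuples), the `size` and `name` attributes on the reported items, and the `SimpleNamespace` objects standing in for a scanned `Vulture` instance.

```python
from types import SimpleNamespace


def check_single_unreachable(v, lineno, size, name):
    """Assert that exactly one unreachable item was reported."""
    assert len(v.unreachable_code) == 1
    item = v.unreachable_code[0]
    assert item.first_lineno == lineno
    assert item.size == size
    assert item.name == name


def check_multiple_unreachables(v, checks):
    """Assert that the reported items match (lineno, size, name) tuples."""
    assert len(v.unreachable_code) == len(checks)
    for item, (lineno, size, name) in zip(v.unreachable_code, checks):
        assert item.first_lineno == lineno
        assert item.size == size
        assert item.name == name


# Hypothetical stand-ins for scan results, only to exercise the helpers.
one = SimpleNamespace(
    unreachable_code=[SimpleNamespace(first_lineno=3, size=1, name="return")]
)
two = SimpleNamespace(
    unreachable_code=[
        SimpleNamespace(first_lineno=3, size=1, name="return"),
        SimpleNamespace(first_lineno=7, size=2, name="raise"),
    ]
)

check_single_unreachable(one, lineno=3, size=1, name="return")
check_multiple_unreachables(two, [(3, 1, "return"), (7, 2, "raise")])
```

With two functions, each call site states which case it expects, and the single-item helper keeps its strict `len(...) == 1` check without a flag to switch it off.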
        self._report = report
        self._no_fall_through_nodes = set()

    def visit(self, node):
Add comment: # All children of the node have already been visited.
Feel free to do so, if you're still interested :)
I actually thought recently about this PR again and if I should rebase and give it another go 😅
Sure, I will have a look again 👍
Motivation
At the moment, the `print` statement is not identified as unreachable.
Implementation
I added a data structure (`no_fall_through_nodes`) that stores AST nodes that do not allow a fall-through. During the AST traversal, `continue`, `break`, `raise` and `return` are added to this data structure. For every control flow statement / object (`for`, `try`, `while`, `with`, but also `module` and `functiondef`), we check if any of the statements in the body is in `no_fall_through_nodes`. If that is the case, we report the next statement (if there is any) and also add the current node to `no_fall_through_nodes`. To make this work, `generic_visit` (which visits the children) is now executed before visiting the current node.
The algorithm was added into the existing `Vulture` object with the `generic_visit` (which handles recursion). I think a cleaner implementation could implement a separate node visitor (that handles its own recursion), but I was not sure if the project is open for that.
The old error reporting message is reused (e.g. `unreachable code after 'try'`). I am not sure if this is the best message or if all unreachable messages should be simplified to something like `unreachable code`.
Limitations
- `match` statements
- `with` statements (context manager can suppress exceptions)
Related Issue
This PR is addressing the issue #270.
Checklist:
- I ran `tox -e fix-style` to format my code and checked the result with `tox -e style`.
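The `with` limitation listed above can be seen in a small standalone example. It uses `contextlib.suppress` from the standard library; the function name `demo` is only for illustration. A purely syntactic analysis that treats a `with` body ending in `raise` as never falling through would wrongly flag the `return` here.

```python
import contextlib


def demo():
    with contextlib.suppress(ValueError):
        # The context manager swallows this exception on exit ...
        raise ValueError("swallowed by the context manager")
    # ... so execution continues here even though the body ended in `raise`.
    return "reached"


print(demo())  # → reached
```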