Improve unreachable code analysis #302

kreathon · 2023-02-17T10:14:04Z

Motivation

def f():
    try:
        return
    except:
        raise Exception()
    print("Unreachable")

At the moment, the print statement is not identified as unreachable.

Implementation

I added a data structure (no_fall_through_nodes) that stores AST nodes that do not allow a fall-through. During the AST traversal, continue, break, raise and return nodes are added to this data structure. For every control flow statement / object (for, try, while, with, but also module and functiondef), we check whether any statement in the body is in no_fall_through_nodes. If so, we report the next statement (if there is one) and also add the current node to no_fall_through_nodes. To make this work, generic_visit (i.e., visiting the children) is now executed before visiting the current node.
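
The traversal described above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: FallThroughSketch, find_unreachable and unreachable_lines are hypothetical names, only module, function and try bodies are checked, and a try statement is treated as blocking only if its body and all of its handlers block.

```python
import ast

# Statements that never hand control to the statement that follows them.
EXITS = (ast.Continue, ast.Break, ast.Raise, ast.Return)


class FallThroughSketch(ast.NodeVisitor):
    """Hypothetical, simplified visitor; not Vulture's implementation."""

    def __init__(self):
        self.no_fall_through_nodes = set()
        self.unreachable_lines = []

    def generic_visit(self, node):
        # Visit the children first, so their status is known for the parent.
        super().generic_visit(node)
        if isinstance(node, EXITS):
            self.no_fall_through_nodes.add(node)
        elif isinstance(node, (ast.Module, ast.FunctionDef)):
            # Report only: a def statement itself always falls through.
            self._blocks(node.body)
        elif isinstance(node, ast.Try):
            body_blocks = self._blocks(node.body)
            # Build a list so that every handler body gets checked.
            handlers_block = all([self._blocks(h.body) for h in node.handlers])
            if body_blocks and handlers_block:
                self.no_fall_through_nodes.add(node)

    def _blocks(self, statements):
        """Return True if control cannot reach the end of the statements."""
        for index, stmt in enumerate(statements):
            if stmt in self.no_fall_through_nodes:
                if index + 1 < len(statements):
                    self.unreachable_lines.append(statements[index + 1].lineno)
                return True
        return False


def find_unreachable(source):
    visitor = FallThroughSketch()
    visitor.visit(ast.parse(source))
    return visitor.unreachable_lines


EXAMPLE = (
    "def f():\n"
    "    try:\n"
    "        return\n"
    "    except:\n"
    "        raise Exception()\n"
    '    print("Unreachable")\n'
)
print(find_unreachable(EXAMPLE))  # [6], the line of the print call
```

Because the children are visited first, the try node is already marked when the function body is checked, which is what lets the print call be reported.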

The algorithm was added to the existing Vulture object and its generic_visit (which handles recursion). A cleaner implementation might use a separate node visitor (that handles its own recursion), but I was not sure if the project is open to that.

The old error reporting message is reused (e.g. unreachable code after 'try'). I am not sure if this is the best message or if all unreachable messages should be simplified to something like unreachable code.

Limitations

does not support match statements
does not support with statements (context manager can suppress exceptions)
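
To illustrate the second limitation: a statement after a with block can still be reachable even if the block always raises, because the context manager may swallow the exception. A minimal example using the standard library's contextlib.suppress (the function name after_with is made up for this illustration):

```python
import contextlib


def after_with():
    with contextlib.suppress(ValueError):
        raise ValueError("swallowed by the context manager")
    # Reached, because suppress() discards the ValueError on exit.
    return "reached"


print(after_with())  # reached
```

Treating the with body like a plain block would wrongly report the return statement here as unreachable.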

Related Issue

This PR addresses issue #270.

Checklist:

I have updated the documentation in the README.md file or my changes don't require an update.
I have added an entry in CHANGELOG.md.
I have added or adapted tests to cover my changes.
I have run tox -e fix-style to format my code and checked the result with tox -e style.

codecov-commenter · 2023-02-17T10:16:22Z

Codecov Report

Merging #302 (a59eac4) into main (fc23f33) will increase coverage by 0.06%.
Report is 13 commits behind head on main.
The diff coverage is 100.00%.

❗ Current head a59eac4 differs from pull request most recent head 8d9fcf4. Consider uploading reports for the commit 8d9fcf4 to get more accurate results

@@            Coverage Diff             @@
##             main     #302      +/-   ##
==========================================
+ Coverage   98.96%   99.03%   +0.06%     
==========================================
  Files          21       21              
  Lines         679      727      +48     
==========================================
+ Hits          672      720      +48     
  Misses          7        7

Files Changed                          Coverage Δ
vulture/core.py                        98.75% <100.00%> (+0.21%)
vulture/whitelists/ast_whitelist.py    100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

jendrikseipp

Thanks for raising this PR! I like the algorithm and that it nicely generalizes the earlier solution. However, I'd like to propose a different code organization, so that the core module doesn't grow too large:

Add a reachability.py module with a Reachability class.
This class stores the no-fall-through nodes, allows resetting them, and has a visit(node) method that checks the node's type and, depending on the type, either marks the node as no-fall-through or checks whether its body is fully executable.
The Vulture class has a Reachability member and calls its visit(node) function, and retrieves the unreachable code from Reachability before switching to the next module.

This goes in the direction of adding a second NodeVisitor class, but avoids traversing the AST twice. Or do you think a second NodeVisitor class (with all of these visitor_* functions) would be a better alternative?
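
A rough sketch of how this organization could look, under several assumptions: Scanner stands in for the Vulture class, the report callback is just one way to hand results back (retrieving them from Reachability afterwards would work as well), and only module and function bodies are checked. None of these names or signatures are taken from the final code.

```python
import ast


class Reachability:
    """Hypothetical helper that owns the no-fall-through bookkeeping."""

    def __init__(self, report):
        self._report = report
        self._no_fall_through_nodes = set()

    def reset(self):
        # Called once per module to keep memory usage low.
        self._no_fall_through_nodes.clear()

    def visit(self, node):
        # Assumes that all children of the node have already been visited.
        if isinstance(node, (ast.Break, ast.Continue, ast.Raise, ast.Return)):
            self._no_fall_through_nodes.add(node)
        elif isinstance(node, (ast.Module, ast.FunctionDef)):
            self._check_body(node.body)

    def _check_body(self, statements):
        # The last statement has no successor, so it is skipped here.
        for index, stmt in enumerate(statements[:-1]):
            if stmt in self._no_fall_through_nodes:
                self._report(statements[index + 1])
                return


class Scanner(ast.NodeVisitor):
    """Stand-in for the Vulture class; it traverses the AST only once."""

    def __init__(self):
        self.unreachable_lines = []
        self.reachability = Reachability(
            report=lambda node: self.unreachable_lines.append(node.lineno)
        )

    def scan(self, code):
        self.reachability.reset()
        self.visit(ast.parse(code))

    def generic_visit(self, node):
        super().generic_visit(node)  # children first
        self.reachability.visit(node)


scanner = Scanner()
scanner.scan("def f():\n    return 1\n    print('x')\n")
print(scanner.unreachable_lines)  # [3]
```

The single traversal stays in the main visitor, while the reachability state and logic live in their own class.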

jendrikseipp · 2023-08-20T11:24:34Z

vulture/core.py

@@ -225,6 +227,10 @@ def scan(self, code, filename=""):
        self.noqa_lines = noqa.parse_noqa(self.code)
        self.filename = filename

+        # We can reset the fall_through_nodes for every module to reduce


Suggested change

# We can reset the fall_through_nodes for every module to reduce

# Reset the no-fall-through-nodes for every module to reduce memory usage.

vulture/core.py

jendrikseipp · 2023-08-20T11:43:19Z

vulture/core.py

+
+    def _can_fall_through_statements_analysis(self, statements):
+        """Report unreachable statements.
+        Returns True if we cannot fall though the list of statements


Suggested change

Returns True if we cannot fall though the list of statements

Return True if we can execute the full list of statements.

kreathon · 2023-09-03T10:35:01Z

This goes in the direction of adding a second NodeVisitor class, but avoids traversing the AST twice. Or do you think a second NodeVisitor class (with all of these visitor_* functions) would be a better alternative?

I do not see a clear advantage of either solution, and it probably does not matter too much right now (if the reachability analysis lives in a second module, it should be really easy to switch between them anyway).

I will update the code according to your proposed code organization.

kreathon · 2023-09-03T10:48:58Z

I am not sure if passing _define as report into Reachability is the cleanest solution, but I did not want to create a new data structure that passes the information back to the Vulture object. What do you think?

kreathon · 2023-09-05T17:31:23Z

I had a look at it again, and I think it is cleaner to also move _handle_conditional_node into reachability.py (such that the module handles the entire self.unreachable_code LoggingList data structure).

I will propose the change as soon as I find time for it.

kreathon · 2023-09-09T09:15:20Z

I updated the following:

check_unreachable now supports checking multiple unreachable code segments (the alternative would be to add another method for that, or to assert separately that there is only a single unreachable code segment)
moved some tests from test_conditions.py into test_reachability.py
merged handle_condition_node() into Reachability to get more powerful analysis results. There are still cases that are not handled (we need to distinguish between raise/return and break/continue for that).

For example:

while True:
    raise Exception()
print(b)

I could handle this in another PR.
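
The distinction matters because break only leaves the loop, while raise and return leave the whole function. A small runnable illustration (the function names are invented for this example):

```python
def loop_with_raise():
    while True:
        raise RuntimeError("leaves the function")
    return "after loop"  # never reached: the loop has no break


def loop_with_break():
    while True:
        break
    return "after loop"  # reached: break only leaves the loop


print(loop_with_break())  # after loop
```

So an analysis that treats all four statements alike cannot mark a while True loop as blocking without risking false positives for loops that contain a break.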

jendrikseipp

This flew under my radar, sorry. I had three minor comments, but took care of them myself now. Thanks for your work on this, it not only generalizes the code, but also makes it nicer!

jendrikseipp · 2023-09-30T13:04:20Z

vulture/core.py

@@ -11,6 +12,7 @@
 from vulture import utils
 from vulture.config import InputError, make_config
 from vulture.utils import ExitCode
+from vulture.reachability import Reachability


Let's keep these in alphabetical order.

jendrikseipp · 2023-11-26T21:02:10Z

tests/__init__.py

-    assert len(v.unreachable_code) == 1
-    item = v.unreachable_code[0]
-    assert item.first_lineno == lineno
+def check_unreachable(v, lineno, size, name, multiple=False):


Passing a Boolean in this way is a common code smell suggesting that we rather want two separate functions check_single_unreachable and check_multiple_unreachables.
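
One possible shape for the split, as a sketch only: the function names follow the comment above, while the checks parameter and the size and name attributes on the reported items are assumptions made for this illustration. SimpleNamespace objects stand in for a scanned Vulture instance.

```python
from types import SimpleNamespace


def _check_item(item, lineno, size, name):
    assert item.first_lineno == lineno
    assert item.size == size  # attribute assumed for this sketch
    assert item.name == name  # attribute assumed for this sketch


def check_single_unreachable(v, lineno, size, name):
    assert len(v.unreachable_code) == 1
    _check_item(v.unreachable_code[0], lineno, size, name)


def check_multiple_unreachables(v, checks):
    # checks: one (lineno, size, name) tuple per expected segment, in order.
    assert len(v.unreachable_code) == len(checks)
    for item, (lineno, size, name) in zip(v.unreachable_code, checks):
        _check_item(item, lineno, size, name)


# Stand-in for a Vulture object that reported two unreachable segments.
fake_v = SimpleNamespace(
    unreachable_code=[
        SimpleNamespace(first_lineno=3, size=1, name="return"),
        SimpleNamespace(first_lineno=7, size=2, name="raise"),
    ]
)
check_multiple_unreachables(fake_v, [(3, 1, "return"), (7, 2, "raise")])
```

With two functions, each call site states its intent directly instead of toggling behavior through a flag.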

jendrikseipp · 2024-11-24T18:54:03Z

vulture/reachability.py

+        self._report = report
+        self._no_fall_through_nodes = set()
+
+    def visit(self, node):


Add comment: # All children of the node have already been visited.

jendrikseipp · 2024-11-24T19:38:32Z

There are still cases that are not handled (we need to distinguish between raise/return and break/continue for that).

For example.
while True:
    raise Exception()
print(b)
I could handle this in another PR.

Feel free to do so, if you're still interested :)

kreathon · 2024-11-24T20:15:22Z

This flew under my radar, sorry. I had three minor comments, but took care of them myself now. Thanks for your work on this, it not only generalizes the code, but also makes it nicer!

I actually thought recently about this PR again and if I should rebase and give it another go 😅

Feel free to do so, if you're still interested :)

Sure, I will have a look again 👍

jendrikseipp requested changes Aug 20, 2023

kreathon requested a review from jendrikseipp September 3, 2023 10:57

kreathon and others added 9 commits November 24, 2024 20:15

Improve unreachable code analysis.

a412bfd

Add tests for async for and async with statements

f808240

Move code into reachability.py

4c7f922

Rename reporter to report

45302f1

Simplify reachablitly.py

5edbcf0

Minor code readability improvement

47ee73a

Merge into reachablity analysis

0c60097

Make private

05ff4a9

Fix format.

4e93229

jendrikseipp force-pushed the master branch from 1ddda42 to 4e93229 Compare November 24, 2024 19:15

jendrikseipp added 2 commits November 24, 2024 20:21

Update changelog.

c1e474f

Split into two functions.

980088f

jendrikseipp approved these changes Nov 24, 2024

jendrikseipp merged commit 609f5f2 into jendrikseipp:main Nov 24, 2024
19 checks passed
