Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provisional function on imports #89

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
8b8446a
-create the function that searches for imports from the notebook ast …
Felice644 May 9, 2022
5930e7b
-improved the find functions of the python core libraries and reduced…
Felice644 May 11, 2022
527c709
- Fixed the function for parsing of .txt, .toml, .yaml file
Felice644 May 19, 2022
a501026
- just a little fix on the name
Felice644 May 19, 2022
c928c12
- First attempt to correct the error on the tests
Felice644 May 20, 2022
88b1a91
- Fixed error on the tests
Felice644 May 21, 2022
c638c58
- Fixed previous comments in notebook class
Felice644 May 21, 2022
d6beae7
Minor improvements in `notebook.py`
louieQ May 22, 2022
9cbed73
- Added test and test class for notebook functions.
Felice644 May 24, 2022
14f9c86
Merge remote-tracking branch 'origin/modules/packeges_imported' into …
Felice644 May 24, 2022
39ff970
- Update project to the last versions
Felice644 May 24, 2022
4a94a5f
- Fixed failing repository tests.
Felice644 May 25, 2022
6163d3b
Integrate the latest refactoring (`core_models.py`)
louieQ May 27, 2022
69cfee8
Improve the parsing strategy for `setup.py` files
louieQ May 27, 2022
c7a9359
Repo requirements parsing fixtures
louieQ May 27, 2022
3c68e02
Refactor requirement file parsers
louieQ May 27, 2022
e3f90a8
Perser for `requirements.txt` ready and tested
louieQ May 27, 2022
4823f1f
Parser for `pyproject.toml` ready and tested
louieQ May 27, 2022
c623078
Parser for `setup.py` ready and tested
louieQ May 27, 2022
28093f3
Parser for `Pipfile` ready and tested
louieQ May 27, 2022
2138192
Handle exceptions while parsing requirement files
louieQ May 30, 2022
de8a6a1
Improve the management of magic functions
louieQ May 30, 2022
03f7185
Check the presence of undeclared dependencies
louieQ May 30, 2022
f24380f
Merge pull request #93 from collab-uniba/merge_PR_89
Felice644 May 31, 2022
4808cd3
- Added tests for dependencies_unmanaged and test_undeclared_dependen…
Felice644 Jun 1, 2022
3c569d2
- Partial solution for the test case.
Felice644 Jun 6, 2022
2f61f4a
Fixed `test_undeclared_dependencies`
louieQ Jun 10, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 57 additions & 0 deletions kk.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
alabaster==0.7.12
attrs==21.4.0
Babel==2.9.1
bleach==4.1.0
certifi==2021.10.8
charset-normalizer==2.0.11
commonmark==0.9.1
defusedxml==0.7.1
docutils==0.17.1
entrypoints==0.3
gitdb==4.0.9
GitPython==3.1.26
idna==3.3
imagesize==1.3.0
importlib-metadata==4.10.1
importlib-resources==5.4.0
ipython-genutils==0.2.0
Jinja2==3.0.3
jsonschema==4.4.0
jupyter-client==7.1.2
jupyter-core==4.9.1
jupyterlab-pygments==0.1.2
MarkupSafe==2.0.1
mistune==0.8.4
nbclient==0.5.10
nbconvert==6.4.1
nbformat==5.1.3
nest-asyncio==1.5.4
packaging==21.3
pandocfilters==1.5.0
pydantic==1.9.0
Pygments==2.11.2
pyparsing==3.0.7
pyrsistent==0.18.1
python-dateutil==2.8.2
pytz==2021.3
pyzmq==22.3.0
requests==2.27.1
rich==12.0.1
six==1.16.0
smmap==5.0.0
snowballstemmer==2.2.0
Sphinx==4.4.0
sphinx-rtd-theme==1.0.0
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
testpath==0.5.0
tornado==6.1
traitlets==5.1.1
typing_extensions==4.0.1
urllib3==1.26.8
webencodings==0.5.1
zipp==3.7.0
2 changes: 1 addition & 1 deletion pynblint/nb_linting.py
Original file line number Diff line number Diff line change
Expand Up @@ -404,7 +404,7 @@ def long_multiline_python_comment(notebook: Notebook) -> List[Cell]:
show_details=False,
),
LintDefinition(
slug="long_multiline_python_comment",
slug="long-multiline-python-comment",
description="One or more code cells in this notebook contain Python comments "
f"of {settings.max_multiline_python_comment} or more consecutive lines.",
recommendation="For improved notebook readability, prefer using Markdown "
Expand Down
20 changes: 20 additions & 0 deletions pynblint/notebook.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,9 @@ class Notebook(RichRenderable):
"""

def __init__(self, path: Path):
self.imported_packages = None
self.path: Path = path
self.missing_requiremet: set
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This information is not supposed to be stored here. It will be computed as the result of a linting function dedicated to checking the presence of import statements that do not map to the declared project requirements.


# Read the notebook with nbformat
with open(self.path) as f:
Expand All @@ -44,6 +46,10 @@ def __init__(self, path: Path):
self.ast = ast.parse(self.script)
except SyntaxError:
self.has_invalid_python_syntax = True
self.imported_packages = self._get_import_package_set()
# non posso accedere ai set creati nella classe Repository,
# come dovrei procedere?
# self.missing_requiremet = self.imported_package.difference(Repository.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should be performed within a dedicated linting function (inside nb_linting.py, I guess).
Here we are just defining notebook properties; the notebook property that we need to add in order to make the check feasible is the set of imported packages, as you already do at line 49.


@property
def code_cells(self) -> List[Cell]:
Expand All @@ -63,6 +69,20 @@ def initial_cells(self) -> List[Cell]:
def final_cells(self) -> List[Cell]:
return self.cells[-settings.final_cells :] # noqa: E203

def _get_import_package_set(self):
import_set = set()
for node in ast.walk(self.ast):
if isinstance(node, ast.Import):
for name in node.names:
import_set.add(name.name.split(".")[0])
elif isinstance(node, ast.ImportFrom):
if node.level > 0:
# Relative imports always refer to the current package.
continue
# assert node.module
import_set.add(node.module.split(".")[0])
return import_set

def __len__(self) -> int:
return len(self.cells)

Expand Down
43 changes: 41 additions & 2 deletions pynblint/repository.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
import os
import re
import sys
import tempfile
import zipfile
from abc import ABC
from pathlib import Path
from typing import List, Optional
from typing import List, Optional, Pattern

import git

Expand All @@ -20,9 +22,12 @@ def __init__(self, path: Path):

# Repository info
self.path = path
self.dependencies_module: set

# Extracted content
self.notebooks: List[Notebook] = [] # List of Notebook objects
self.dependencies_module = self._get_dependencies()
self.core_library: set = self._get_core_dependecies()

def retrieve_notebooks(self):

Expand Down Expand Up @@ -67,6 +72,41 @@ def large_file_paths(self) -> List[Path]:
large_files.append(file_path)
return large_files

def _get_dependencies(self) -> set:
dependencies = set()
path = os.path.dirname(os.path.abspath(__file__))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to dynamically fetch this information: the path of a repository is already present as a property in the class Repository. Use self.path to get it.

path = os.path.dirname(path)
# siccome il file di requirement si toverà nella root di
# progetto ho creato un file fittizio con le stesse
# informazioni per non spostare quello pre-esistente
file = "kk.txt"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the requirement.txt file that is present in the repository is located inside the "docs" folder and not in the project root where it should actually be. I wrote this passage in anticipation of what I believe is the future location of "requirement.txt". If, on the other hand, it should remain there where it is, I can change the route without problems.

final_path = os.path.join(path, file)
with open(final_path, "r") as fi:
file_row = fi.read()
tmp = file_row.split("\n")
for item in tmp:
item.strip()
dependencies.add(tuple(item.split("==")))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saving dependencies as tuples is unnecessary. You can save them as simple strings, stripping the version info (which we cannot deduce from the notebook).

So the certifi==2021.10.8 entry in requirements.txt would be stored as certifi in Pynblint.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have inserted the step of the split to make it easier to check with the versions of the libraries to look for. Now I fix this step.


# print(dependencies)

return dependencies

# momentaneamente la lascio qui ma quando ma dovo la sposterò
# perchè ovviamente non è una funzione che riguarda il repository
def _get_core_dependecies(self) -> set:
coredependecies = set()
modules = sys.modules
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that this solution is appropriate? From this SO thread I learn that:

sys.modules - only shows modules that have already been loaded

The solution proposed in one of the answers to the same SO question seems to be a lot cleaner to me and might be the way to go: i.e., using isort's place_module function or getting the list of core modules like this:

from isort.stdlibs.py39 import stdlib
for name in sorted(stdlib): print(name)

pattern: Pattern[str] = re.compile(r".*\\\\Python\\\\Python37\\\\lib\\\\.*.py")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regex is tied to your installation of Python. It would not work with a different Python version (e.g., Python 3.8) or a different virtual environment name.

At least, I would generalize the Python version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, it was an oversight. I correct it immediately.

for i in modules.items():
# print(" PPOSIZIONE 0:" + i[0])
# print(" PPOSIZIONE 1:" + str(i))
if pattern.match(str(i)):
coredependecies.add(i[0])
print(coredependecies)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these are debug print statements that will be removed.


return coredependecies


class LocalRepository(Repository):
"""
Expand Down Expand Up @@ -114,7 +154,6 @@ class GitHubRepository(Repository):
"""

def __init__(self, github_url: str):

self.url = github_url

# Clone the repo in a temp directory
Expand Down