
Commit

Remove use of from_numpy and misc updates/cleanup (python-graphblas…
eriknw authored Feb 27, 2023
1 parent 9a4808a commit 7be68cf
Showing 15 changed files with 78 additions and 39 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/test_and_build.yml
@@ -142,7 +142,7 @@ jobs:
# Consider removing old versions when they become problematic or very old (>=2 years).
nxver=$(python -c 'import random ; print(random.choice(["=2.7", "=2.8", "=3.0", ""]))')
yamlver=$(python -c 'import random ; print(random.choice(["=5.4", "=6.0", ""]))')
sparsever=$(python -c 'import random ; print(random.choice(["=0.12", "=0.13", ""]))')
sparsever=$(python -c 'import random ; print(random.choice(["=0.12", "=0.13", "=0.14", ""]))')
if [[ ${{ steps.pyver.outputs.selected }} == "3.8" ]]; then
npver=$(python -c 'import random ; print(random.choice(["=1.21", "=1.22", "=1.23", ""]))')
spver=$(python -c 'import random ; print(random.choice(["=1.8", "=1.9", "=1.10", ""]))')
@@ -175,7 +175,7 @@ jobs:
# That is, we don't need to support versions of it that are two years old.
# But, it's still useful for us to test with different versions!
if [[ ${{ steps.sourcetype.outputs.selected}} == "conda-forge" ]] ; then
psgver=$(python -c 'import random ; print(random.choice(["=7.4.0", "=7.4.1", "=7.4.2", ""]))')
psgver=$(python -c 'import random ; print(random.choice(["=7.4.0", "=7.4.1", "=7.4.2", "=7.4.3.0", "=7.4.3.1", ""]))')
else
psgver=""
fi
@@ -187,7 +187,7 @@
echo "versions: np${npver} sp${spver} pd${pdver} ak${akver} nx${nxver} numba${numbaver} yaml${yamlver} sparse${sparsever} psgver${psgver}"
# Once we have wheels for all OSes, we can delete the last two lines.
mamba install pytest coverage coveralls=3.3.1 pytest-randomly cffi donfig pyyaml${yamlver} sparse${sparsever} \
mamba install packaging pytest coverage coveralls=3.3.1 pytest-randomly cffi donfig pyyaml${yamlver} sparse${sparsever} \
pandas${pdver} scipy${spver} numpy${npver} awkward${akver} networkx${nxver} numba${numbaver} \
${{ matrix.slowtask == 'pytest_bizarro' && 'black' || '' }} \
${{ matrix.slowtask == 'notebooks' && 'matplotlib nbconvert jupyter "ipython>=7"' || '' }} \
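For context, a minimal Python sketch of the random version-pin pattern this workflow step uses (package names and pins taken from the lines above; the real step runs such one-liners in bash via `$(python -c ...)`):

```python
import random

# "" means "no pin", i.e. install whatever version is currently available.
yamlver = random.choice(["=5.4", "=6.0", ""])
sparsever = random.choice(["=0.12", "=0.13", "=0.14", ""])

# The chosen pins are appended to package names in the `mamba install` line above.
print(f"mamba install pyyaml{yamlver} sparse{sparsever}")
```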
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -71,7 +71,7 @@ repos:
additional_dependencies: [tomli]
files: ^(graphblas|docs)/
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.0.249
rev: v0.0.252
hooks:
- id: ruff
- repo: https://github.com/sphinx-contrib/sphinx-lint
12 changes: 7 additions & 5 deletions README.md
@@ -186,16 +186,18 @@ Similar methods exist for BinaryOp, Monoid, and Semiring.
```python
import graphblas as gb

# numpy arrays
# 1-D array becomes Vector, 2-D array becomes Matrix
A = gb.io.from_numpy(m)
m = gb.io.to_numpy(A)

# scipy.sparse matrices
A = gb.io.from_scipy_sparse(m)
m = gb.io.to_scipy_sparse(m, format='csr')

# networkx graphs
A = gb.io.from_networkx(g)
g = gb.io.to_networkx(A)

# numpy arrays can use `from_dense` and `to_dense` on Vector and Matrix
v = gb.Vector.from_dense(m)
m = v.to_dense()

A = gb.Matrix.from_dense(m, missing_value=0)
m = A.to_dense(fill_value=0)
```
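For readers comparing the removed `gb.io.from_numpy`/`to_numpy` calls with the new methods, a minimal round-trip sketch (the 2×2 array is illustrative):

```python
import numpy as np
import graphblas as gb

m = np.array([[0, 2],
              [3, 0]])

# Treat zeros as missing values when building the Matrix.
A = gb.Matrix.from_dense(m, missing_value=0)   # 2 stored values
assert A.nvals == 2

# Fill the missing values back in to get a dense numpy array again.
assert (A.to_dense(fill_value=0) == m).all()
```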
14 changes: 7 additions & 7 deletions docs/user_guide/io.rst
@@ -14,7 +14,7 @@ Matrix and Vector, instead, have a ``.from_coo()`` and a ``.to_coo()`` method.
``.from_coo()`` takes index(es) and values as either:

- Python lists
- Numpy arrays
- NumPy arrays

If no dtype is provided, the data type is inferred from the values.
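A short sketch of `.from_coo()` with plain Python lists (the indexes and values here are illustrative):

```python
import graphblas as gb

# Vector: indices and values; the dtype is inferred (INT64 here).
v = gb.Vector.from_coo([0, 2, 4], [10, 20, 30], size=5)

# Matrix: row indices, column indices, and values.
A = gb.Matrix.from_coo([0, 1, 1], [1, 0, 1], [1.5, 2.5, 3.5], nrows=2, ncols=2)
```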

@@ -77,17 +77,17 @@ A python-graphblas Matrix can be created from a 2-D (PyData) sparse array or mat
``gb.io.to_pydata_sparse()`` will output a 2-D (PyData) sparse array given a python-graphblas Matrix.
The sparse format can be specified. It defaults to "coo".

Numpy (Dense)
NumPy (Dense)
-------------

While not useful for very large graphs, converting to and from small dense numpy arrays can be useful.

``gb.io.from_numpy()`` will convert a 1-D array into a Vector and a 2-D array into a Matrix. When converting
from numpy, zeros are treated as missing values.

``gb.io.to_numpy()`` will convert a Vector or Matrix into the dense equivalent in numpy, filling missing
values with zero.
``Vector.from_dense()`` converts a 1-D array into a Vector and
``Matrix.from_dense()`` a 2-D array into a Matrix. When converting from numpy, a value may be
chosen to become a missing value, such as ``Matrix.from_dense(a, missing_value=0)``.

``.to_dense()`` converts a Vector or Matrix into a numpy array. If there are missing values, a fill
value should be given such as ``.to_dense(fill_value=0)``.

SuiteSparse Export/Import
-------------------------
1 change: 1 addition & 0 deletions environment.yml
@@ -34,6 +34,7 @@ dependencies:
# For linting
- pre-commit
# For testing
- packaging
- pytest-cov
# For debugging
- icecream
9 changes: 4 additions & 5 deletions graphblas/core/operator.py
@@ -941,8 +941,7 @@ def __contains__(self, type_):
self[type_]
except (TypeError, KeyError, numba.NumbaError):
return False
else:
return True
return True

@classmethod
def _remove_nesting(cls, funcname, *, module=None, modname=None, strict=True):
@@ -2740,9 +2739,9 @@ def _initialize(cls):
cur_op._typed_ops[dtype] = bool_op

# Builtin monoids that are idempotent; i.e., `op(x, x) == x` for any x
for name in {"any", "band", "bor", "land", "lor", "max", "min"}:
for name in ["any", "band", "bor", "land", "lor", "max", "min"]:
getattr(monoid, name)._is_idempotent = True
for name in {
for name in [
"bitwise_and",
"bitwise_or",
"fmax",
@@ -2752,7 +2751,7 @@
"logical_or",
"maximum",
"minimum",
}:
]:
getattr(monoid.numpy, name)._is_idempotent = True

# Allow some functions to work on UDTs
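As a quick illustration of the idempotent property noted in the comment above (`op(x, x) == x`), using NumPy ufuncs that correspond to some of the listed monoids rather than the graphblas operators themselves:

```python
import numpy as np

x = np.array([3, -1, 7])

# For an idempotent binary op, combining an element with itself returns it unchanged.
assert (np.maximum(x, x) == x).all()      # "maximum" / max monoid
assert (np.bitwise_and(x, x) == x).all()  # "bitwise_and" / band monoid
```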
3 changes: 1 addition & 2 deletions graphblas/dtypes.py
@@ -44,8 +44,7 @@ def __lt__(self, other):
t2 = lookup_dtype(other).np_type
except ValueError:
raise TypeError(f"Invalid or unknown datatype: {other}") from None
else:
return (t1.kind, t1.itemsize, t1.name) < (t2.kind, t2.itemsize, t2.name)
return (t1.kind, t1.itemsize, t1.name) < (t2.kind, t2.itemsize, t2.name)

def __reduce__(self):
if self._is_udt:
17 changes: 10 additions & 7 deletions graphblas/io.py
@@ -550,14 +550,17 @@ def to_pydata_sparse(A, format="coo"):
if format not in {"coo", "dok", "gcxs"}:
raise ValueError(f"Invalid format: {format}")

if format == "gcxs":
B = to_scipy_sparse(A, format="csr")
if _output_type(A) is _Vector:
indices, values = A.to_coo(sort=False)
s = COO(indices, values, shape=A.shape)
else:
# obtain an intermediate conversion via hardcoded 'coo' intermediate object
B = to_scipy_sparse(A, format="coo")

# convert to pydata.sparse
s = COO.from_scipy_sparse(B)
if format == "gcxs":
B = to_scipy_sparse(A, format="csr")
else:
# obtain an intermediate conversion via hardcoded 'coo' intermediate object
B = to_scipy_sparse(A, format="coo")
# convert to pydata.sparse
s = COO.from_scipy_sparse(B)

# express in the desired format
return s.asformat(format)
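A rough usage sketch of the updated conversion (values are illustrative): a Vector is now turned into a 1-D `sparse.COO` directly from its COO data, while a Matrix still goes through a scipy.sparse intermediate.

```python
import numpy as np
import graphblas as gb

v = gb.Vector.from_coo([0, 3], [1.5, 2.5], size=5)
s = gb.io.to_pydata_sparse(v)                  # 1-D COO, default format "coo"
assert s.shape == (5,)
assert (s.todense() == np.array([1.5, 0.0, 0.0, 2.5, 0.0])).all()

A = gb.Matrix.from_coo([0, 1], [1, 0], [1, 2], nrows=2, ncols=2)
g = gb.io.to_pydata_sparse(A, format="gcxs")   # converted via scipy CSR, then GCXS
assert g.shape == (2, 2)
```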
4 changes: 3 additions & 1 deletion graphblas/tests/test_core.py
@@ -54,4 +54,6 @@ class bad:


def test_version():
assert gb.__version__ > "2022.11.0"
from packaging.version import parse

assert parse(gb.__version__) > parse("2022.11.0")
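For context, a tiny sketch of why the comparison now goes through `packaging.version.parse` (version strings here are illustrative): CalVer strings do not compare correctly as plain strings once a component gains a digit.

```python
from packaging.version import parse

# String comparison is lexicographic: "1" < "9", so "2022.10.0" sorts before "2022.9.0".
assert "2022.10.0" < "2022.9.0"
# Parsed versions compare numerically, component by component.
assert parse("2022.10.0") > parse("2022.9.0")
```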
23 changes: 23 additions & 0 deletions graphblas/tests/test_dtype.py
@@ -1,14 +1,19 @@
import itertools
import pickle
import string
import sys

import numpy as np
import pytest

import graphblas as gb
from graphblas import dtypes
from graphblas.core import lib
from graphblas.dtypes import lookup_dtype

suitesparse = gb.backend == "suitesparse"
is_win = sys.platform.startswith("win")

all_dtypes = [
dtypes.BOOL,
dtypes.INT8,
@@ -233,3 +238,21 @@ def test_dtype_to_from_string():
lookup_dtype(dtype)
else:
assert dtype == dtype2


def test_has_complex():
"""Only SuiteSparse has complex (with Windows support in Python after v7.4.3.1)"""
if not suitesparse:
assert not dtypes._supports_complex
return
if not is_win:
assert dtypes._supports_complex
return

import suitesparse_graphblas as ssgb
from packaging.version import parse

if parse(ssgb.__version__) < parse("7.4.3.1"):
assert not dtypes._supports_complex
else:
assert dtypes._supports_complex
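As a hedged sketch of how the flag exercised by this test might be used (the guard and the `FC64` complex dtype are illustrative, not part of this change):

```python
import graphblas as gb
from graphblas import dtypes

# Complex dtypes are only available on the SuiteSparse backend, and on Windows
# only with a sufficiently new suitesparse-graphblas (7.4.3.1 or newer per the
# test above), so guard before using them.
if dtypes._supports_complex:
    v = gb.Vector.from_coo([0, 2], [1 + 2j, 3 - 1j], dtype=dtypes.FC64, size=3)
```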
9 changes: 6 additions & 3 deletions graphblas/tests/test_io.py
@@ -406,7 +406,8 @@ def test_vector_to_from_pydata_sparse():
assert v.isequal(gb.Vector.from_coo(coords, data, dtype=dtypes.INT64), check_dtype=True)

t = gb.io.to_pydata_sparse(v)
assert t == s
assert t.shape == s.shape
assert (t == s).all()
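For context, a small sketch (with a toy array) of why the assertion changed: comparing PyData sparse arrays with `==` is element-wise and returns an array rather than a single boolean, so the test now checks the shape and reduces with `.all()`.

```python
import numpy as np
import sparse

a = sparse.COO.from_numpy(np.array([0, 1, 0, 2]))
b = sparse.COO.from_numpy(np.array([0, 1, 0, 2]))

eq = a == b              # element-wise -> sparse boolean array, not True/False
assert eq.shape == a.shape
assert eq.all()          # reduce to a single truth value
```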


@pytest.mark.skipif("not sparse")
@@ -418,7 +419,8 @@ def test_matrix_to_from_pydata_sparse():
assert v.isequal(gb.Matrix.from_coo(*coords, data, dtype=dtypes.INT64), check_dtype=False)

t = gb.io.to_pydata_sparse(v)
assert t == s
assert t.shape == s.shape
assert (t == s).all()

# test ndim
e = sparse.random(shape=(5, 5, 5), density=0)
@@ -437,7 +439,8 @@ def test_matrix_to_from_pydata_sparse():
assert w.isequal(gb.Matrix.from_coo(*coords, data, dtype=dtypes.INT64), check_dtype=False)

r = gb.io.to_pydata_sparse(w, format="gcxs")
assert r == g
assert r.shape == g.shape
assert (r == g).all()
with pytest.raises(ValueError, match="format"):
gb.io.to_pydata_sparse(w, format="badformat")
with pytest.raises(TypeError, match="sparse.pydata"):
2 changes: 1 addition & 1 deletion graphblas/viz.py
@@ -182,7 +182,7 @@ def datashade(M, agg="count", *, width=None, height=None, opts_kwargs=None, **kw
images.extend(image_row)
return hv.Layout(images).cols(ncols)

kwds = dict( # noqa: C408
kwds = dict( # noqa: C408 pylint: disable=use-dict-literal
x="col",
y="row",
c="val",
6 changes: 3 additions & 3 deletions notebooks/Louvain.ipynb
@@ -270,7 +270,7 @@
" [0, 0, 1, 0, 0, 1, 0],\n",
" ]\n",
")\n",
"g = gb.io.from_numpy(m)"
"g = Matrix.from_dense(m, missing_value=0)"
]
},
{
@@ -370,7 +370,7 @@
" [0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n",
" ]\n",
")\n",
"g = gb.io.from_numpy(m)"
"g = Matrix.from_dense(m, missing_value=0)"
]
},
{
@@ -469,7 +469,7 @@
" [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],\n",
" ]\n",
")\n",
"g = gb.io.from_numpy(m)"
"g = Matrix.from_dense(m, missing_value=0)"
]
},
{
6 changes: 6 additions & 0 deletions pyproject.toml
@@ -83,6 +83,7 @@ viz = [
]
test = [
"pytest",
"packaging",
"pandas >=1.2",
"scipy >=1.8",
]
@@ -94,6 +95,7 @@ complete = [
"sparse >=0.12",
"matplotlib >=3.5",
"pytest",
"packaging",
]

[tool.setuptools]
@@ -147,6 +149,9 @@ filterwarnings = [
"error",
# MAINT: we can drop support for sparse <0.13 at any time
"ignore:`np.bool` is a deprecated alias:DeprecationWarning:sparse._umath", # sparse <0.13
# sparse 0.14.0 (2023-02-24) began raising this warning; it has been reported and fixed upstream.
"ignore:coords should be an ndarray. This will raise a ValueError:DeprecationWarning:sparse._coo.core",

# setuptools v67.3.0 deprecated `pkg_resources.declare_namespace` on 13 Feb 2023. See:
# https://setuptools.pypa.io/en/latest/history.html#v67-3-0
# MAINT: check if this is still necessary in 2025
@@ -271,6 +276,7 @@ ignore = [
"PLR0913", # Too many arguments to function call
"PLR0915", # Too many statements
"PLR2004", # Magic number used in comparison, consider replacing magic with a constant variable
"PLW2901", # Outer for loop variable ... overwritten by inner assignment target (Note: good advice, but too strict)
"RET502", # Do not implicitly `return None` in function able to return non-`None` value
"RET503", # Missing explicit `return` at the end of function able to return non-`None` value
"RET504", # Unnecessary variable assignment before `return` statement
3 changes: 2 additions & 1 deletion scripts/check_versions.sh
@@ -5,9 +5,10 @@
# Tip: add `--json` for more information.
conda search 'numpy[channel=conda-forge]>=1.24.2'
conda search 'pandas[channel=conda-forge]>=1.5.3'
conda search 'scipy[channel=conda-forge]>=1.10.0'
conda search 'scipy[channel=conda-forge]>=1.10.1'
conda search 'networkx[channel=conda-forge]>=3.0'
conda search 'awkward[channel=conda-forge]>=2.0.8'
conda search 'sparse[channel=conda-forge]>=0.14.0'
conda search 'numba[channel=conda-forge]>=0.56.4'
conda search 'pyyaml[channel=conda-forge]>=6.0'
conda search 'flake8-comprehensions[channel=conda-forge]>=3.10.1'
