Failed job: spark-rapids-ml_nightly/612
Failed case:
FAILED tests/test_approximate_nearest_neighbors.py::test_ivfflat[float32-combo0] - assert (0.5233380000000001 > 0.5335820000000001 or 0.01024400000000003 <= 0.01)
Detailed log:
[2025-01-24T04:59:48.166Z] =================================== FAILURES ===================================
[2025-01-24T04:59:48.166Z] _________________________ test_ivfflat[float32-combo0] _________________________
[2025-01-24T04:59:48.166Z] [gw0] linux -- Python 3.10.16 /root/miniconda3/bin/python3.10
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z] combo = ('ivfflat', 'array', 10000, None, 'euclidean')
[2025-01-24T04:59:48.166Z] data_type = <class 'numpy.float32'>
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z]     @pytest.mark.parametrize(
[2025-01-24T04:59:48.166Z]         "combo",
[2025-01-24T04:59:48.166Z]         [
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "array",
[2025-01-24T04:59:48.166Z]                 10000,
[2025-01-24T04:59:48.166Z]                 None,
[2025-01-24T04:59:48.166Z]                 "euclidean",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "vector",
[2025-01-24T04:59:48.166Z]                 2000,
[2025-01-24T04:59:48.166Z]                 {"nlist": 10, "nprobe": 2},
[2025-01-24T04:59:48.166Z]                 "euclidean",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "multi_cols",
[2025-01-24T04:59:48.166Z]                 5000,
[2025-01-24T04:59:48.166Z]                 {"nlist": 20, "nprobe": 4},
[2025-01-24T04:59:48.166Z]                 "euclidean",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "array",
[2025-01-24T04:59:48.166Z]                 2000,
[2025-01-24T04:59:48.166Z]                 {"nlist": 10, "nprobe": 2},
[2025-01-24T04:59:48.166Z]                 "sqeuclidean",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]             ("ivfflat", "vector", 5000, {"nlist": 20, "nprobe": 4}, "l2"),
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "multi_cols",
[2025-01-24T04:59:48.166Z]                 2000,
[2025-01-24T04:59:48.166Z]                 {"nlist": 10, "nprobe": 2},
[2025-01-24T04:59:48.166Z]                 "inner_product",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]             (
[2025-01-24T04:59:48.166Z]                 "ivfflat",
[2025-01-24T04:59:48.166Z]                 "array",
[2025-01-24T04:59:48.166Z]                 2000,
[2025-01-24T04:59:48.166Z]                 {"nlist": 10, "nprobe": 2},
[2025-01-24T04:59:48.166Z]                 "cosine",
[2025-01-24T04:59:48.166Z]             ),
[2025-01-24T04:59:48.166Z]         ],
[2025-01-24T04:59:48.166Z]     )  # vector feature type will be converted to float32 to be compatible with cuml single-GPU NearestNeighbors Class
[2025-01-24T04:59:48.166Z]     @pytest.mark.parametrize("data_type", [np.float32])
[2025-01-24T04:59:48.166Z]     def test_ivfflat(
[2025-01-24T04:59:48.166Z]         combo: Tuple[str, str, int, Optional[Dict[str, Any]], str],
[2025-01-24T04:59:48.166Z]         data_type: np.dtype,
[2025-01-24T04:59:48.166Z]     ) -> None:
[2025-01-24T04:59:48.166Z]         algoParams = combo[3]
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z]         # cuvs ivf_flat None sets nlist to 1000 and nprobe to 20, leading to unstable results when run multiple times
[2025-01-24T04:59:48.166Z]         expected_avg_recall: float = 0.95 if algoParams != None else 0.5
[2025-01-24T04:59:48.166Z]         expected_avg_dist_gap: float = 1e-4 if algoParams != None else 1e-2
[2025-01-24T04:59:48.166Z]         tolerance: float = 1e-4 if algoParams != None else 1e-2
[2025-01-24T04:59:48.166Z]         data_shape: Tuple[int, int] = (10000, 50)
[2025-01-24T04:59:48.166Z] >       ann_algorithm_test_func(
[2025-01-24T04:59:48.166Z]             combo=combo,
[2025-01-24T04:59:48.166Z]             data_shape=data_shape,
[2025-01-24T04:59:48.166Z]             data_type=data_type,
[2025-01-24T04:59:48.166Z]             expected_avg_recall=expected_avg_recall,
[2025-01-24T04:59:48.166Z]             expected_avg_dist_gap=expected_avg_dist_gap,
[2025-01-24T04:59:48.166Z]             tolerance=tolerance,
[2025-01-24T04:59:48.166Z]         )
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z] tests/test_approximate_nearest_neighbors.py:632:
[2025-01-24T04:59:48.166Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-01-24T04:59:48.166Z] tests/test_approximate_nearest_neighbors.py:506: in ann_algorithm_test_func
[2025-01-24T04:59:48.166Z]     ann_evaluator.compare_with_cuml_or_cuvs_sg(
[2025-01-24T04:59:48.166Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z] self = <tests.test_approximate_nearest_neighbors.ANNEvaluator object at 0x7ff92393b4c0>
[2025-01-24T04:59:48.166Z] algorithm = 'ivfflat', algoParams = None
[2025-01-24T04:59:48.166Z] given_indices = array([[ 0, 4709, 9361, ..., 3312, 7312, 5266],
[2025-01-24T04:59:48.166Z]        [ 1, 8804, 1531, ..., 7962, 705, 2092],
[2025-01-24T04:59:48.166Z]        [ 2, 5018...5482, 6680, 9051],
[2025-01-24T04:59:48.166Z]        [9998, 1102, 9694, ..., 1317, 2800, 17],
[2025-01-24T04:59:48.166Z]        [9999, 6308, 7655, ..., 8746, 3210, 8370]])
[2025-01-24T04:59:48.166Z] given_distances = array([[0. , 0.17746431, 0.17917138, ..., 0.20973857, 0.21014556,
[2025-01-24T04:59:48.166Z]         0.21043234],
[2025-01-24T04:59:48.166Z]        [0. , 0.17...84,
[2025-01-24T04:59:48.166Z]         0.22275288],
[2025-01-24T04:59:48.166Z]        [0. , 0.15567206, 0.17591041, ..., 0.20936921, 0.20940953,
[2025-01-24T04:59:48.166Z]         0.20962319]])
[2025-01-24T04:59:48.166Z] tolerance = 0.01
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z]     def compare_with_cuml_or_cuvs_sg(
[2025-01-24T04:59:48.166Z]         self,
[2025-01-24T04:59:48.166Z]         algorithm: str,
[2025-01-24T04:59:48.166Z]         algoParams: Optional[Dict[str, Any]],
[2025-01-24T04:59:48.166Z]         given_indices: np.ndarray,
[2025-01-24T04:59:48.166Z]         given_distances: np.ndarray,
[2025-01-24T04:59:48.166Z]         tolerance: float,
[2025-01-24T04:59:48.166Z]     ) -> None:
[2025-01-24T04:59:48.166Z]         # compare with cuml sg ANN on avg_recall and avg_dist_gap
[2025-01-24T04:59:48.166Z]         cuvssg_distances, cuvssg_indices = self.get_cuvs_sg_results(
[2025-01-24T04:59:48.166Z]             algorithm=algorithm, algoParams=algoParams
[2025-01-24T04:59:48.166Z]         )
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z]         # compare cuml sg with given results
[2025-01-24T04:59:48.166Z]         avg_recall_cumlann = self.cal_avg_recall(cuvssg_indices)
[2025-01-24T04:59:48.166Z]         avg_recall = self.cal_avg_recall(given_indices)
[2025-01-24T04:59:48.166Z] >       assert (avg_recall > avg_recall_cumlann) or abs(
[2025-01-24T04:59:48.166Z]             avg_recall - avg_recall_cumlann
[2025-01-24T04:59:48.166Z]         ) <= tolerance
[2025-01-24T04:59:48.166Z] E       assert (0.5233380000000001 > 0.5335820000000001 or 0.01024400000000003 <= 0.01)
[2025-01-24T04:59:48.166Z] E        +  where 0.01024400000000003 = abs((0.5233380000000001 - 0.5335820000000001))
[2025-01-24T04:59:48.166Z]
[2025-01-24T04:59:48.166Z] tests/test_approximate_nearest_neighbors.py:308: AssertionError
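For reference, the failing condition can be replayed outside the test suite with only the numbers from the log. This is a minimal sketch, not the project's code: `recall_check` is a hypothetical helper that mirrors the assertion at line 308, and the three values are copied from the traceback above.

```python
def recall_check(avg_recall: float, avg_recall_ref: float, tolerance: float):
    """Hypothetical stand-in for the assertion in compare_with_cuml_or_cuvs_sg.

    Passes when the given recall beats the single-GPU reference outright,
    or when the two recalls differ by no more than `tolerance`.
    """
    gap = abs(avg_recall - avg_recall_ref)
    passed = (avg_recall > avg_recall_ref) or gap <= tolerance
    return passed, gap


# Values taken verbatim from the failed nightly run.
avg_recall = 0.5233380000000001          # recall of the result under test
avg_recall_cumlann = 0.5335820000000001  # recall of the cuvs single-GPU result
tolerance = 1e-2                         # tolerance used when algoParams is None

passed, gap = recall_check(avg_recall, avg_recall_cumlann, tolerance)
print(f"passed={passed}, gap={gap:.6f}, over tolerance by {gap - tolerance:.6f}")
```

The gap is about 0.010244, so the run misses the 0.01 tolerance by roughly 0.00024. That is consistent with the test's own comment that the default `algoParams = None` configuration gives unstable results across runs.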
The fix has been merged in #828, and the nightly job now passes.