Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DE-584] Refactor Deprecated /_api/simple methods #275

Merged
merged 10 commits into from
Aug 22, 2023

Conversation

aMahanna
Copy link
Member

@aMahanna aMahanna commented Aug 18, 2023

Main changes

  • Re-implements the following methods via AQL:

    • collection.ids
    • collection.keys
    • collection.all
    • collection.find
    • collection.find_in_box (requesting feedback / second opinion on implementation)
    • collection.random
    • collection.update_match
    • collection.replace_match
    • collection.delete_match
  • Implements the collection.get_index method (GET /_api/index/{id}) as it was previously missing

Miscellaneous changes

  • New utility function: build_filter_conditions (now used in the filter-based simple methods)
  • Add allow_dirty_read parameter to the reworked simple methods given that they are now implemented using the _api/cursor endpoint
  • docstring cleanup
  • test case cleanup

@aMahanna aMahanna self-assigned this Aug 18, 2023
@@ -1364,7 +1379,7 @@ def test_document_find_in_box(col, bad_col, geo, cluster):
result = col.find_in_box(
latitude1=0, longitude1=0, latitude2=6, longitude2=3, limit=0, index=geo["id"]
)
assert clean_doc(result) == [doc1, doc3]
assert clean_doc(result) == []
Copy link
Member Author

@aMahanna aMahanna Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on limit = 0, this case should have never returned any data. strange 🤔

regardless, limit = 0 is behaving as expected now

if index is not None:
data["geo"] = self._name + "/" + index
geo_index = self.get_index(f"{self.name}/{index}")
Copy link
Member Author

@aMahanna aMahanna Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In need of a second opinion on the new implementation of collection.find_in_box

Given the deprecation of the WITHIN_RECTANGLE AQL function, we are now opting to use the GEO_CONTAINS method in combination with the GEO_POLYGON method: https://www.arangodb.com/docs/stable/aql/functions-geo.html#within_rectangle

The GEO_CONTAINS method takes two parameters: the polygon, and the coordinates of the document.

Given that the attribute names representing the coordinates of the document can vary on a user-basis, we cannot assume that doc.longitude and doc.latitude will always exist.

In order to prevent any breaking changes to collection.find_in_box, we rely on the existing index parameter to fetch the Geo Index's fields values to use as the document's coordinate attributes.

Therefore, we assume that the Geo Index fields property is an array containing 2 values at most. Otherwise, a ValueError is raised.

Happy to move to another solution

https://github.com/ArangoDB-Community/python-arango/blob/58a67117f1426d0bd95d09309e58e4b7337d6407/arango/collection.py#L905-L999

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!
Trying to create a GEO index with more than 2 attributes will throw an exception. IMO, double checking it here (along with the fact that the index is truly "geo") is a good idea.

Copy link
Member

@apetenchea apetenchea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! We should just touch upon the default behavior of the index attribute in find_in_box.

Comment on lines 947 to 949
# TODO: REVISIT `geo_str` logic
# https://www.arangodb.com/docs/stable/aql/functions-geo.html#within_rectangle
# Initial assumption that attribute names are "latitude" and "longitude"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behavior of whithin_rectangle, if an index is not specified, is to use the first geo index that is found. Assuming that the user wants to use whatever geo index is there has its pros and cons, and while I'm not really a fan of that, for now, I think we should keep the current behavior unchanged.
This would be the equivalent of retrieving self.indexes() and searching for the first one with type: geo. If no such index exists, we should throw.
The error that is returned from the arangodb server is:

"ERROR_QUERY_GEO_INDEX_MISSING" : { "code" : 1570, "message" : "no suitable geo index found for geo restriction on '%s'" }

We could create a new exception for this case (e.g. IndexMissingError).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let me know what you think: 55d559d

if index is not None:
data["geo"] = self._name + "/" + index
geo_index = self.get_index(f"{self.name}/{index}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!
Trying to create a GEO index with more than 2 attributes will throw an exception. IMO, double checking it here (along with the fact that the index is truly "geo") is a good idea.

@codecov-commenter
Copy link

codecov-commenter commented Aug 21, 2023

Codecov Report

Merging #275 (06fa165) into main (744a6e9) will decrease coverage by 0.24%.
Report is 9 commits behind head on main.
The diff coverage is 92.59%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main     #275      +/-   ##
==========================================
- Coverage   99.01%   98.77%   -0.24%     
==========================================
  Files          26       26              
  Lines        3958     4016      +58     
==========================================
+ Hits         3919     3967      +48     
- Misses         39       49      +10     
Files Changed Coverage Δ
arango/client.py 98.41% <ø> (ø)
arango/http.py 100.00% <ø> (ø)
arango/collection.py 97.59% <91.39%> (-0.92%) ⬇️
arango/exceptions.py 100.00% <100.00%> (ø)
arango/request.py 94.44% <100.00%> (ø)
arango/utils.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

Copy link
Member

@apetenchea apetenchea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

adding a geo index only allows 1 or 2 fields, so this case will not happen (unless this changes in the future)
@aMahanna aMahanna added the Ready to Merge Pull request is ready to merge label Aug 22, 2023
@apetenchea apetenchea merged commit 022afc2 into main Aug 22, 2023
6 checks passed
@apetenchea apetenchea deleted the feature/de-584-deprecated-simple-api branch August 22, 2023 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready to Merge Pull request is ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants