1103 Improve arrow function (#1112)
* improve arrow function

* fix tests

* refactor, so we have an `Arrow` function

* add test for nested arrow functions

* skip sqlite

* allow `arrow` to access keys multiple levels deep

* improve the example data in the playground for JSON data

* move arrow function to JSON, as it can be used by JSON or JSONB

* add `arrow` function to Arrow, so it can be called recursively

* change heading levels of JSON docs

* move `Arrow` to operators folder

* update docs

* improve docstring

* add `technicians` to example JSON

* improve docstrings

* allow `QueryString` as an arg type to `Arrow`

* fix docstring error

* make sure integers can be passed in

* add `QueryString` as an arg type to `arrow` method

* added `GetElementFromPath`

* add docs for ``from_path``

* add `__getitem__` as a shortcut for the arrow method

* update the docs to use the square bracket notation

* explain why the method is called `arrow`

* move arrow tests into separate class

* add `test_multiple_levels_deep`

* add tests for `from_path`

* last documentation tweaks

* add basic operator tests
dantownsend authored Oct 23, 2024
1 parent 265b0c2 commit 85ead5a
Showing 10 changed files with 482 additions and 106 deletions.
123 changes: 107 additions & 16 deletions docs/src/piccolo/schema/column_types.rst
@@ -189,18 +189,15 @@ Storing JSON can be useful in certain situations, for example - raw API
responses, data from a Javascript app, and for storing data with an unknown or
changing schema.

====
JSON
====
====================
``JSON`` / ``JSONB``
====================

.. autoclass:: JSON

=====
JSONB
=====

.. autoclass:: JSONB

===========
Serialising
===========

@@ -224,6 +221,7 @@ You can also pass in a JSON string if you prefer:
)
await studio.save()
=============
Deserialising
=============

@@ -257,29 +255,122 @@ With ``objects`` queries, we can modify the returned JSON, and then save it:
studio['facilities']['restaurant'] = False
await studio.save()
arrow
=====
================
Getting elements
================

``JSON`` and ``JSONB`` columns have an ``arrow`` method (representing the
``->`` operator in Postgres), which is useful for retrieving a child element
from the JSON data.

.. note:: Postgres and CockroachDB only.

``JSONB`` columns have an ``arrow`` function, which is useful for retrieving
a subset of the JSON data:
``select`` queries
==================

If we have the following JSON stored in the ``RecordingStudio.facilities``
column:

.. code-block:: json

    {
        "instruments": {
            "drum_kits": 2,
            "electric_guitars": 10
        },
        "restaurant": true,
        "technicians": [
            {
                "name": "Alice Jones"
            },
            {
                "name": "Bob Williams"
            }
        ]
    }

We can retrieve the ``restaurant`` value from the JSON object:

.. code-block:: python
>>> await RecordingStudio.select(
... RecordingStudio.name,
... RecordingStudio.facilities.arrow('mixing_desk').as_alias('mixing_desk')
... RecordingStudio.facilities.arrow('restaurant')
... .as_alias('restaurant')
... ).output(load_json=True)
[{'name': 'Abbey Road', 'mixing_desk': True}]
[{'restaurant': True}, ...]
It can also be used for filtering in a where clause:
As a convenience, you can use square brackets, instead of calling ``arrow``
explicitly:

.. code-block:: python

    >>> await RecordingStudio.select(
    ...     RecordingStudio.facilities['restaurant']
    ...     .as_alias('restaurant')
    ... ).output(load_json=True)
    [{'restaurant': True}, ...]

You can drill multiple levels deep by calling ``arrow`` multiple times (or
alternatively use the :ref:`from_path` method - see below).

Here we fetch the number of drum kits that the recording studio has:

.. code-block:: python

    >>> await RecordingStudio.select(
    ...     RecordingStudio.facilities["instruments"]["drum_kits"]
    ...     .as_alias("drum_kits")
    ... ).output(load_json=True)
    [{'drum_kits': 2}, ...]

If you have a JSON object which consists of arrays and objects, then you can
navigate the array elements by passing in an integer to ``arrow``.

Here we fetch the first technician from the array:

.. code-block:: python

    >>> await RecordingStudio.select(
    ...     RecordingStudio.facilities["technicians"][0]["name"]
    ...     .as_alias("technician_name")
    ... ).output(load_json=True)
    [{'technician_name': 'Alice Jones'}, ...]

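
To make the chained lookups above concrete, here is a minimal pure-Python sketch of what each ``->`` step does to the example document. It does not use Piccolo or a database; ``get_child`` is a hypothetical helper written for illustration, and it assumes that a missing key or out-of-range index yields ``None`` (mirroring SQL ``NULL``).

```python
import json

# Example document copied from the docs above.
FACILITIES = json.loads(
    """
    {
        "instruments": {"drum_kits": 2, "electric_guitars": 10},
        "restaurant": true,
        "technicians": [{"name": "Alice Jones"}, {"name": "Bob Williams"}]
    }
    """
)


def get_child(element, key):
    """Hypothetical helper: mimic a single `->` step on parsed JSON."""
    if isinstance(element, dict) and isinstance(key, str):
        # A string key looks up a field of a JSON object.
        return element.get(key)
    if isinstance(element, list) and isinstance(key, int):
        # An integer key indexes a JSON array (zero-based).
        try:
            return element[key]
        except IndexError:
            return None
    # Any other combination has no child element.
    return None


# Equivalent of facilities["instruments"]["drum_kits"]
drum_kits = get_child(get_child(FACILITIES, "instruments"), "drum_kits")

# Equivalent of facilities["technicians"][0]["name"]
technicians = get_child(FACILITIES, "technicians")
first_technician = get_child(get_child(technicians, 0), "name")

# A key that is not present in the document.
missing = get_child(FACILITIES, "swimming_pool")
```

Each call narrows the document by one level, which is why drilling deeper needs one ``arrow`` call (or one pair of square brackets) per level.
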
``where`` clauses
=================

The ``arrow`` operator can also be used for filtering in a where clause:

.. code-block:: python
>>> await RecordingStudio.select(RecordingStudio.name).where(
... RecordingStudio.facilities.arrow('mixing_desk') == True
... RecordingStudio.facilities['mixing_desk'].eq(True)
... )
[{'name': 'Abbey Road'}]
.. _from_path:

=============
``from_path``
=============

This works the same as ``arrow`` but is more optimised if you need to return
part of a highly nested JSON structure.

.. code-block:: python

    >>> await RecordingStudio.select(
    ...     RecordingStudio.facilities.from_path([
    ...         "technicians",
    ...         0,
    ...         "name"
    ...     ]).as_alias("technician_name")
    ... ).output(load_json=True)
    [{'technician_name': 'Alice Jones'}, ...]

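
The relationship between a path and repeated single-step lookups can be sketched in plain Python. This is an illustration only: ``from_path`` and ``to_pg_path`` below are hypothetical stand-ins, not Piccolo's implementation, and the sketch assumes the path ends up as a Postgres ``text[]`` literal such as the one accepted by the ``#>`` operator.

```python
from functools import reduce

FACILITIES = {
    "instruments": {"drum_kits": 2, "electric_guitars": 10},
    "restaurant": True,
    "technicians": [{"name": "Alice Jones"}, {"name": "Bob Williams"}],
}


def from_path(element, path):
    """Hypothetical helper: follow every key in `path`, one level at a time."""

    def step(current, key):
        try:
            return current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or nothing left to index into.
            return None

    return reduce(step, path, element)


def to_pg_path(path):
    """Render a path as a Postgres array literal (naive, no escaping)."""
    return "{" + ",".join(str(item) for item in path) + "}"


name = from_path(FACILITIES, ["technicians", 0, "name"])
literal = to_pg_path(["technicians", 0, "name"])
```

The whole path is expressed as one value, so the database receives a single path lookup rather than a chain of three separate ``->`` operations.
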
=============
Handling null
=============

9 changes: 9 additions & 0 deletions piccolo/apps/playground/commands/run.py
@@ -233,6 +233,11 @@ def populate():
        RecordingStudio.facilities: {
            "restaurant": True,
            "mixing_desk": True,
            "instruments": {"electric_guitars": 10, "drum_kits": 2},
            "technicians": [
                {"name": "Alice Jones"},
                {"name": "Bob Williams"},
            ],
        },
    }
)
@@ -244,6 +249,10 @@ def populate():
        RecordingStudio.facilities: {
            "restaurant": False,
            "mixing_desk": True,
            "instruments": {"electric_guitars": 6, "drum_kits": 3},
            "technicians": [
                {"name": "Frank Smith"},
            ],
        },
    },
)
117 changes: 78 additions & 39 deletions piccolo/columns/column_types.py
@@ -70,6 +70,10 @@ class Band(Table):

if t.TYPE_CHECKING:  # pragma: no cover
    from piccolo.columns.base import ColumnMeta
    from piccolo.query.operators.json import (
        GetChildElement,
        GetElementFromPath,
    )
    from piccolo.table import Table


@@ -2319,6 +2323,76 @@ def column_type(self):
        else:
            return "JSON"

    ###########################################################################

    def arrow(self, key: t.Union[str, int, QueryString]) -> GetChildElement:
        """
        Allows a child element of the JSON structure to be returned - for
        example::

            >>> await RecordingStudio.select(
            ...     RecordingStudio.facilities.arrow("restaurant")
            ... )

        """
        from piccolo.query.operators.json import GetChildElement

        alias = self._alias or self._meta.get_default_alias()
        return GetChildElement(identifier=self, key=key, alias=alias)

    def __getitem__(
        self, value: t.Union[str, int, QueryString]
    ) -> GetChildElement:
        """
        A shortcut for the ``arrow`` method, used for retrieving a child
        element.

        For example:

        .. code-block:: python

            >>> await RecordingStudio.select(
            ...     RecordingStudio.facilities["restaurant"]
            ... )

        """
        return self.arrow(key=value)

    def from_path(
        self,
        path: t.List[t.Union[str, int]],
    ) -> GetElementFromPath:
        """
        Allows an element of the JSON structure to be returned, which can be
        arbitrarily deep. For example::

            >>> await RecordingStudio.select(
            ...     RecordingStudio.facilities.from_path([
            ...         "technician",
            ...         0,
            ...         "first_name"
            ...     ])
            ... )

        It's the same as calling ``arrow`` multiple times, but is more
        efficient / convenient if extracting highly nested data::

            >>> await RecordingStudio.select(
            ...     RecordingStudio.facilities.arrow(
            ...         "technician"
            ...     ).arrow(
            ...         0
            ...     ).arrow(
            ...         "first_name"
            ...     )
            ... )

        """
        from piccolo.query.operators.json import GetElementFromPath

        alias = self._alias or self._meta.get_default_alias()
        return GetElementFromPath(identifier=self, path=path, alias=alias)
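
The pattern in the methods above, where ``__getitem__`` delegates to ``arrow`` and ``arrow`` returns an object that can be indexed again, can be shown with a small standalone sketch. ``JSONExpr`` is a hypothetical class invented for this example; it is not Piccolo's ``GetChildElement``, and its string-based quoting is deliberately naive (real code should pass keys as bound query parameters).

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class JSONExpr:
    """Hypothetical stand-in for a JSON column or a `->` expression."""

    sql: str

    def arrow(self, key: Union[str, int]) -> "JSONExpr":
        # Integers index arrays; strings look up object keys.
        if isinstance(key, int):
            rendered = str(key)
        else:
            # Naive quoting, for illustration only.
            rendered = "'" + key.replace("'", "''") + "'"
        # Returning a new JSONExpr is what makes chaining possible.
        return JSONExpr(f"{self.sql} -> {rendered}")

    def __getitem__(self, key: Union[str, int]) -> "JSONExpr":
        # Square brackets are just a shortcut for `arrow`.
        return self.arrow(key)


facilities = JSONExpr('"facilities"')
expr = facilities["technicians"][0]["name"]
```

Because every step returns the same kind of object, ``facilities["a"]["b"]`` and ``facilities.arrow("a").arrow("b")`` build identical expressions.
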

###########################################################################
# Descriptors

@@ -2337,10 +2411,10 @@ def __set__(self, obj, value: t.Union[str, t.Dict]):

class JSONB(JSON):
"""
Used for storing JSON strings - Postgres only. The data is stored in a
binary format, and can be queried. Insertion can be slower (as it needs to
be converted to the binary format). The benefits of JSONB generally
outweigh the downsides.
Used for storing JSON strings - Postgres / CockroachDB only. The data is
stored in a binary format, and can be queried more efficiently. Insertion
can be slower (as it needs to be converted to the binary format). The
benefits of JSONB generally outweigh the downsides.
:param default:
Either a JSON string can be provided, or a Python ``dict`` or ``list``
@@ -2352,41 +2426,6 @@ class JSONB(JSON):
def column_type(self):
return "JSONB" # Must be defined, we override column_type() in JSON()

def arrow(self, key: str) -> JSONB:
"""
Allows part of the JSON structure to be returned - for example,
for {"a": 1}, and a key value of "a", then 1 will be returned.
"""
instance = t.cast(JSONB, self.copy())
instance.json_operator = f"-> '{key}'"
return instance

def get_select_string(
self, engine_type: str, with_alias: bool = True
) -> QueryString:
select_string = self._meta.get_full_name(with_alias=False)

if self.json_operator is not None:
select_string += f" {self.json_operator}"

if with_alias:
alias = self._alias or self._meta.get_default_alias()
select_string += f' AS "{alias}"'

return QueryString(select_string)

def eq(self, value) -> Where:
"""
See ``Boolean.eq`` for more details.
"""
return self.__eq__(value)

def ne(self, value) -> Where:
"""
See ``Boolean.ne`` for more details.
"""
return self.__ne__(value)

###########################################################################
# Descriptors

17 changes: 11 additions & 6 deletions piccolo/query/base.py
@@ -6,6 +6,7 @@
from piccolo.columns.column_types import JSON, JSONB
from piccolo.custom_types import QueryResponseType, TableInstance
from piccolo.query.mixins import ColumnsDelegate
from piccolo.query.operators.json import JSONQueryString
from piccolo.querystring import QueryString
from piccolo.utils.encoding import load_json
from piccolo.utils.objects import make_nested_object
@@ -65,16 +66,20 @@ async def _process_results(self, results) -> QueryResponseType:
self, "columns_delegate", None
)

json_column_names: t.List[str] = []

if columns_delegate is not None:
json_columns = [
i
for i in columns_delegate.selected_columns
if isinstance(i, (JSON, JSONB))
]
json_columns: t.List[t.Union[JSON, JSONB]] = []

for column in columns_delegate.selected_columns:
if isinstance(column, (JSON, JSONB)):
json_columns.append(column)
elif isinstance(column, JSONQueryString):
if alias := column._alias:
json_column_names.append(alias)
else:
json_columns = self.table._meta.json_columns

json_column_names = []
for column in json_columns:
if column._alias is not None:
json_column_names.append(column._alias)
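
The hunk above collects the aliases of selected JSON columns and ``JSONQueryString`` expressions into ``json_column_names``. What is done with that list lies outside the visible diff; the sketch below assumes it is used to decode those fields when ``load_json=True`` is requested. ``decode_json_columns`` is a hypothetical function and uses the standard library's ``json.loads`` rather than Piccolo's own ``load_json`` helper.

```python
import json
from typing import Any, Dict, List


def decode_json_columns(
    rows: List[Dict[str, Any]], json_column_names: List[str]
) -> List[Dict[str, Any]]:
    """Return copies of `rows` with the named JSON string fields decoded."""
    decoded = []
    for row in rows:
        new_row = dict(row)  # leave the caller's rows untouched
        for name in json_column_names:
            value = new_row.get(name)
            # Only raw strings need decoding; NULLs stay as None.
            if isinstance(value, str):
                new_row[name] = json.loads(value)
        decoded.append(new_row)
    return decoded


raw_rows = [
    {"name": "Abbey Road", "restaurant": "true"},
    {"name": "Electric Lady", "restaurant": None},
]
result = decode_json_columns(raw_rows, ["restaurant"])
```

Keying the decoding on the alias is what lets an expression such as ``facilities['restaurant'].as_alias('restaurant')`` be decoded even though it is no longer a plain ``JSON`` column.
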
Empty file.