Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: preserve the order of the keys #211

Merged
merged 10 commits into from
Nov 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .credo.exs
Original file line number Diff line number Diff line change
Expand Up @@ -126,7 +126,7 @@
{Credo.Check.Refactor.MapJoin, []},
{Credo.Check.Refactor.NegatedConditionsInUnless, []},
{Credo.Check.Refactor.NegatedConditionsWithElse, []},
{Credo.Check.Refactor.Nesting, []},
{Credo.Check.Refactor.Nesting, [max_nesting: 4]},
{Credo.Check.Refactor.UnlessWithElse, []},
{Credo.Check.Refactor.WithClauses, []},
{Credo.Check.Refactor.FilterFilter, []},
Expand Down
92 changes: 85 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -135,15 +135,18 @@ Mongo.insert_many(top, "users", [

## Data Representation

This driver chooses to accept both maps and lists of key-value tuples when encoding BSON documents (1), but will only decode documents into maps. This has the side effect that document field order is lost when decoding. Maps are convenient to work with, but map keys are not ordered, unlike BSON document fields.
This driver chooses to accept both maps and lists of key-value tuples when encoding BSON documents (1), but will only
decode documents into maps. Maps are convenient to work with, but Elixir map keys are not ordered, unlike BSON document
keys.

Driver users should represent documents using a list of tuples when field order matters, for example when sorting by multiple fields:
That design decision means document key order is lost when encoding Elixir maps to BSON and, conversely, when decoding
BSON documents to Elixir maps. However, see [Preserve Document Key Order](#preserve-document-key-order) to learn how to
preserve key order when it matters.

```elixir
Mongo.find(top, "users", %{}, sort: [last_name: 1, first_name: 1, _id: 1])
```

Additionally, the driver accepts both atoms and strings for document keys, but will only decode them into strings. Creating atoms from arbitrary input (such as database documents) is [discouraged](https://elixir-lang.org/getting-started/mix-otp/genserver.html#:~:text=However%2C%20naming%20dynamic,our%20system%20memory!) because atoms are not garbage collected.
Additionally, the driver accepts both atoms and strings for document keys, but will only decode them into strings.
Creating atoms from arbitrary input (such as database documents) is
[discouraged](https://elixir-lang.org/getting-started/mix-otp/genserver.html#:~:text=However%2C%20naming%20dynamic,our%20system%20memory!)
because atoms are not garbage collected.

[BSON symbols (deprecated)](https://bsonspec.org/spec.html#:~:text=Symbol.%20%E2%80%94%20Deprecated) can only be decoded (2).

Expand All @@ -169,6 +172,81 @@ Additionally, the driver accepts both atoms and strings for document keys, but w
max key :BSON_max
decimal128 Decimal{}

## Preserve Document Key Order

### Encoding from Elixir to BSON

For some MongoDB operations, the order of the keys in a document affect the result. For example, that is the case when
sorting a query by multiple fields.

In those cases, driver users should represent documents using a list of tuples (or a keyword list) to preserve the
order. Example:

```elixir
Mongo.find(top, "users", %{}, sort: [last_name: 1, first_name: 1, _id: 1])
```

The query above will sort users by last name, then by first name and finally by ID. If an Elixir map had been used to
specify `:sort`, query results would end up sorted unexpectedly wrong.

### Decoding from BSON to Elixir

Decoded BSON documents are always represented by Elixir maps because the driver depends on that to implement its
functionality.

If the order of document keys as stored by MongoDB is needed, the driver can be configured to use a BSON decoder module
that puts a list of keys in the original order under the `:__order__` key (and it works recursively).

```elixir
config :mongodb_driver,
decoder: BSON.PreserveOrderDecoder
```

It is possible to customize the key. For example, to use `:original_order` instead of the default `:__order__`:

```elixir
config :mongodb_driver,
decoder: {BSON.PreserveOrderDecoder, key: :original_order}
```

The resulting maps with annotated key order can be recursively transformed into lists of tuples. That allows for
preserving the order again when encoding. Here is an example of how to achieve that:

```elixir
defmodule MapWithOrder do
def to_list(doc, order_key \\ :__order__) do
do_to_list(doc, order_key)
end

defp do_to_list(%{__struct__: _} = elem, _order_key) do
elem
end

defp do_to_list(doc, order_key) when is_map(doc) do
doc
|> Map.get(order_key, Map.keys(doc))
|> Enum.map(fn key -> {key, do_to_list(Map.get(doc, key), order_key)} end)
end

defp do_to_list(xs, order_key) when is_list(xs) do
Enum.map(xs, fn elem -> do_to_list(elem, order_key) end)
end

defp do_to_list(elem, _order_key) do
elem
end
end

# doc = ...
MapWithOrder.to_list(doc)
```

Note that structs are kept as-is, to handle special values such as `BSON.ObjectId`.

The decoder module is defined at compile time. The default decoder is `BSON.Decoder`, which does not preserve document
key order. As it needs to execute fewer operations when decoding data received from MongoDB, it offers improved
performance. Therefore, the default decoder is recommended for most use cases of this driver.

## Writing your own encoding info

If you want to write a custom struct to your mongo collection - you can do that
Expand Down
Loading
Loading