Change package name from msgpack to msgpack-sorted/msgpack_sorted. Implemented the sort_keys option.
Yaakov Belch committed Aug 18, 2023
1 parent 715126c commit 40501f5
Showing 42 changed files with 125 additions and 82 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/black.yaml
@@ -22,4 +22,4 @@ jobs:
- name: Black Code Formatter
run: |
pip install black==22.3.0
black -S --diff --check msgpack/ test/ setup.py
black -S --diff --check msgpack_sorted/ test/ setup.py
5 changes: 3 additions & 2 deletions .gitignore
@@ -6,9 +6,10 @@ dist/*
*.pyo
*.so
*~
msgpack/__version__.py
msgpack/*.cpp
msgpack_sorted/__version__.py
msgpack_sorted/*.cpp
*.egg-info
.eggs/
/venv
/tags
/docs/_build
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,5 +1,5 @@
include setup.py
include COPYING
include README.md
recursive-include msgpack *.h *.c *.pyx *.cpp
recursive-include msgpack_sorted *.h *.c *.pyx *.cpp
recursive-include test *.py
12 changes: 6 additions & 6 deletions Makefile
@@ -1,4 +1,4 @@
PYTHON_SOURCES = msgpack test setup.py
PYTHON_SOURCES = msgpack_sorted test setup.py

.PHONY: all
all: cython
@@ -14,7 +14,7 @@ pyupgrade:

.PHONY: cython
cython:
cython --cplus msgpack/_cmsgpack.pyx
cython --cplus msgpack_sorted/_cmsgpack.pyx

.PHONY: test
test: cython
@@ -29,10 +29,10 @@ serve-doc: all
.PHONY: clean
clean:
rm -rf build
rm -f msgpack/_cmsgpack.cpp
rm -f msgpack/_cmsgpack.*.so
rm -f msgpack/_cmsgpack.*.pyd
rm -rf msgpack/__pycache__
rm -f msgpack_sorted/_cmsgpack.cpp
rm -f msgpack_sorted/_cmsgpack.*.so
rm -f msgpack_sorted/_cmsgpack.*.pyd
rm -rf msgpack_sorted/__pycache__
rm -rf test/__pycache__

.PHONY: update-docker
65 changes: 41 additions & 24 deletions README.md
@@ -1,4 +1,21 @@
# MessagePack for Python
# MessagePack with key-sorted dictionaries for Python

This package, `msgpack_sorted`, is a fork of the `msgpack` Python package.
It adds a single option, `sort_keys` (default: `False`), and its implementation: when
enabled, dictionary keys are sorted with Python's `sorted` function during serialization.

The serialized data format is identical to the msgpack standard, so `msgpack_sorted` and
`msgpack` can correctly parse each other's output.

This forked package is not intended to replace the official `msgpack` package
but to coexist with it. For that purpose its name is changed to `msgpack_sorted`.
You can install it with `pip install msgpack-sorted` and import it with
`import msgpack_sorted as msgpack`.

Please refer to the official documentation of `msgpack` for all features (except the
option `sort_keys` explained above).

Most of the documentation below is retained from the `msgpack` package.
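
To illustrate why sorting affects only the order of map entries and not the wire format, here is a minimal sketch of a map encoder. It is not this package's implementation: `pack_small_map` is a hypothetical helper that handles only the MessagePack fixmap, fixstr and positive fixint encodings, and it assumes short string keys and integer values in the range 0..127.

```python
def pack_small_map(d, sort_keys=False):
    """Hypothetical helper: encode {short str: int in 0..127} as a msgpack fixmap."""
    if len(d) > 15:
        raise ValueError("fixmap holds at most 15 entries")
    # Same ordering step as the one added in this commit.
    items = sorted(d.items()) if sort_keys else d.items()
    out = bytearray([0x80 | len(d)])      # fixmap header: 1000xxxx
    for key, value in items:
        raw = key.encode("utf-8")
        if len(raw) > 31 or not isinstance(value, int) or not 0 <= value <= 127:
            raise ValueError("outside the subset this sketch supports")
        out.append(0xA0 | len(raw))       # fixstr header: 101xxxxx
        out += raw
        out.append(value)                 # positive fixint: 0xxxxxxx
    return bytes(out)


first = {"b": 2, "a": 1}
second = {"a": 1, "b": 2}

# Without sorting, insertion order leaks into the output bytes.
assert pack_small_map(first) != pack_small_map(second)
# With sorting, equal dicts produce identical bytes.
assert pack_small_map(first, sort_keys=True) == pack_small_map(second, sort_keys=True)
```

Both outputs are valid MessagePack maps with the same entries, which is why a standard decoder can read either one.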

[![Build Status](https://github.com/msgpack/msgpack-python/actions/workflows/wheel.yml/badge.svg)](https://github.com/msgpack/msgpack-python/actions/workflows/wheel.yml)
[![Documentation Status](https://readthedocs.org/projects/msgpack-python/badge/?version=latest)](https://msgpack-python.readthedocs.io/en/latest/?badge=latest)
@@ -60,15 +77,15 @@ See note below for detail.
## Install

```
$ pip install msgpack
$ pip install msgpack-sorted
```

### Pure Python implementation

The extension module in msgpack (`msgpack._cmsgpack`) does not support
The extension module in msgpack_sorted (`msgpack_sorted._cmsgpack`) does not support
Python 2 and PyPy.

But msgpack provides a pure Python implementation (`msgpack.fallback`)
But msgpack_sorted provides a pure Python implementation (`msgpack_sorted.fallback`)
for PyPy and Python 2.


@@ -82,31 +99,31 @@ Without extension, using pure Python implementation on CPython runs slowly.

## How to use

NOTE: In examples below, I use `raw=False` and `use_bin_type=True` for users
using msgpack < 1.0. These options are default from msgpack 1.0 so you can omit them.
NOTE: For msgpack_sorted, `raw=False` and `use_bin_type=True` are defaults --- just
as in msgpack >= 1.0.


### One-shot pack & unpack

Use `packb` for packing and `unpackb` for unpacking.
msgpack provides `dumps` and `loads` as an alias for compatibility with
msgpack_sorted provides `dumps` and `loads` as an alias for compatibility with
`json` and `pickle`.

`pack` and `dump` packs to a file-like object.
`unpack` and `load` unpacks from a file-like object.

```pycon
>>> import msgpack
>>> msgpack.packb([1, 2, 3], use_bin_type=True)
>>> import msgpack_sorted as msgpack
>>> msgpack.packb([1, 2, 3])
b'\x93\x01\x02\x03'
>>> msgpack.unpackb(_, raw=False)
>>> msgpack.unpackb(_)
[1, 2, 3]
```

`unpack` unpacks msgpack's array to Python's list, but can also unpack to tuple:

```pycon
>>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False, raw=False)
>>> msgpack.unpackb(b'\x93\x01\x02\x03', use_list=False)
(1, 2, 3)
```

@@ -122,16 +139,16 @@ Read the docstring for other options.
stream (or from bytes provided through its `feed` method).

```py
import msgpack
import msgpack_sorted as msgpack
from io import BytesIO

buf = BytesIO()
for i in range(100):
    buf.write(msgpack.packb(i, use_bin_type=True))
    buf.write(msgpack.packb(i))

buf.seek(0)

unpacker = msgpack.Unpacker(buf, raw=False)
unpacker = msgpack.Unpacker(buf)
for unpacked in unpacker:
    print(unpacked)
```
@@ -144,7 +161,7 @@ It is also possible to pack/unpack custom data types. Here is an example for

```py
import datetime
import msgpack
import msgpack_sorted as msgpack

useful_dict = {
    "id": 1,
@@ -162,8 +179,8 @@ def encode_datetime(obj):
    return obj


packed_dict = msgpack.packb(useful_dict, default=encode_datetime, use_bin_type=True)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime, raw=False)
packed_dict = msgpack.packb(useful_dict, default=encode_datetime)
this_dict_again = msgpack.unpackb(packed_dict, object_hook=decode_datetime)
```

`Unpacker`'s `object_hook` callback receives a dict; the
@@ -176,7 +193,7 @@ key-value pairs.
It is also possible to pack/unpack custom data types using the **ext** type.

```pycon
>>> import msgpack
>>> import msgpack_sorted as msgpack
>>> import array
>>> def default(obj):
...     if isinstance(obj, array.array) and obj.typecode == 'd':
@@ -191,8 +208,8 @@ It is also possible to pack/unpack custom data types using the **ext** type.
...     return ExtType(code, data)
...
>>> data = array.array('d', [1.2, 3.4])
>>> packed = msgpack.packb(data, default=default, use_bin_type=True)
>>> unpacked = msgpack.unpackb(packed, ext_hook=ext_hook, raw=False)
>>> packed = msgpack.packb(data, default=default)
>>> unpacked = msgpack.unpackb(packed, ext_hook=ext_hook)
>>> data == unpacked
True
```
@@ -219,10 +236,10 @@ You can pack into and unpack from this old spec using `use_bin_type=False`
and `raw=True` options.

```pycon
>>> import msgpack
>>> import msgpack_sorted as msgpack
>>> msgpack.unpackb(msgpack.packb([b'spam', 'eggs'], use_bin_type=False), raw=True)
[b'spam', b'eggs']
>>> msgpack.unpackb(msgpack.packb([b'spam', 'eggs'], use_bin_type=True), raw=False)
>>> msgpack.unpackb(msgpack.packb([b'spam', 'eggs']))
[b'spam', 'eggs']
```

@@ -231,7 +248,7 @@ and `raw=True` options.
To use the **ext** type, pass `msgpack.ExtType` object to packer.

```pycon
>>> import msgpack
>>> import msgpack_sorted as msgpack
>>> packed = msgpack.packb(msgpack.ExtType(42, b'xyzzy'))
>>> msgpack.unpackb(packed)
ExtType(code=42, data=b'xyzzy')
@@ -242,7 +259,7 @@ You can use it with `default` and `ext_hook`. See below.

### Security

To unpack data received from an unreliable source, msgpack provides
To unpack data received from an unreliable source, msgpack_sorted provides
two security options.

`max_buffer_size` (default: `100*1024*1024`) limits the internal buffer size.
4 changes: 2 additions & 2 deletions benchmark/benchmark.py
@@ -1,7 +1,7 @@
from msgpack import fallback
from msgpack_sorted import fallback

try:
from msgpack import _cmsgpack
from msgpack_sorted import _cmsgpack

has_ext = True
except ImportError:
4 changes: 2 additions & 2 deletions docker/runtests.sh
@@ -9,9 +9,9 @@ for V in "${PYTHON_VERSIONS[@]}"; do
$PYBIN/python setup.py install
rm -rf build/ # Avoid lib build by narrow Python is used by wide python
$PYBIN/pip install pytest
pushd test # prevent importing msgpack package in current directory.
pushd test # prevent importing msgpack_sorted package in current directory.
$PYBIN/python -c 'import sys; print(hex(sys.maxsize))'
$PYBIN/python -c 'from msgpack import _cmsgpack' # Ensure extension is available
$PYBIN/python -c 'from msgpack_sorted import _cmsgpack' # Ensure extension is available
$PYBIN/pytest -v .
popd
done
9 changes: 5 additions & 4 deletions docs/api.rst
@@ -1,7 +1,7 @@
API reference
=============

.. module:: msgpack
.. module:: msgpack_sorted

.. autofunction:: pack

@@ -34,10 +34,11 @@ API reference
exceptions
----------

These exceptions are accessible via `msgpack` package.
(For example, `msgpack.OutOfData` is shortcut for `msgpack.exceptions.OutOfData`)
These exceptions are accessible via `msgpack_sorted` package.
(For example, `msgpack_sorted.OutOfData` is shortcut
for `msgpack_sorted.exceptions.OutOfData`)

.. automodule:: msgpack.exceptions
.. automodule:: msgpack_sorted.exceptions
:members:
:undoc-members:
:show-inheritance:
File renamed without changes.
File renamed without changes.
15 changes: 12 additions & 3 deletions msgpack/_packer.pyx → msgpack_sorted/_packer.pyx
@@ -102,6 +102,9 @@ cdef class Packer(object):
:param str unicode_errors:
The error handler for encoding unicode. (default: 'strict')
DO NOT USE THIS!! This option is kept for very specific usage.
:param bool sort_keys:
Sort output dictionaries by key. (default: False)
"""
cdef msgpack_packer pk
cdef object _default
@@ -111,6 +114,7 @@ cdef class Packer(object):
cdef bint use_float
cdef bint autoreset
cdef bint datetime
cdef bool sort_keys

def __cinit__(self):
cdef int buf_size = 1024*1024
@@ -122,7 +126,8 @@ cdef class Packer(object):

def __init__(self, *, default=None,
bint use_single_float=False, bint autoreset=True, bint use_bin_type=True,
bint strict_types=False, bint datetime=False, unicode_errors=None):
bint strict_types=False, bint datetime=False, unicode_errors=None,
bool sort_keys=False):
self.use_float = use_single_float
self.strict_types = strict_types
self.autoreset = autoreset
@@ -139,6 +144,8 @@ cdef class Packer(object):
else:
self.unicode_errors = self._berrors

self.sort_keys = sort_keys

def __dealloc__(self):
PyMem_Free(self.pk.buf)
self.pk.buf = NULL
@@ -224,7 +231,8 @@ cdef class Packer(object):
raise ValueError("dict is too large")
ret = msgpack_pack_map(&self.pk, L)
if ret == 0:
for k, v in d.items():
_items = sorted(d.items()) if self.sort_keys else d.items()
for k, v in _items:
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
@@ -235,7 +243,8 @@ cdef class Packer(object):
raise ValueError("dict is too large")
ret = msgpack_pack_map(&self.pk, L)
if ret == 0:
for k, v in o.items():
_items = sorted(o.items()) if self.sort_keys else o.items()
for k, v in _items:
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
8 changes: 7 additions & 1 deletion msgpack/fallback.py → msgpack_sorted/fallback.py
@@ -648,6 +648,9 @@ class Packer:
The error handler for encoding unicode. (default: 'strict')
DO NOT USE THIS!! This option is kept for very specific usage.
:param bool sort_keys:
Sort output dictionaries by key. (default: False)
Example of streaming deserialize from file-like object::
unpacker = Unpacker(file_like)
@@ -681,6 +684,7 @@ def __init__(
strict_types=False,
datetime=False,
unicode_errors=None,
sort_keys=False,
):
self._strict_types = strict_types
self._use_float = use_single_float
@@ -689,6 +693,7 @@ def __init__(
self._buffer = StringIO()
self._datetime = bool(datetime)
self._unicode_errors = unicode_errors or "strict"
self._sort_keys = sort_keys
if default is not None:
if not callable(default):
raise TypeError("default must be callable")
@@ -801,7 +806,8 @@ def _pack(
self._pack(obj[i], nest_limit - 1)
return
if check(obj, dict):
return self._pack_map_pairs(len(obj), obj.items(), nest_limit - 1)
_items = sorted(obj.items()) if self._sort_keys else obj.items()
return self._pack_map_pairs(len(obj), _items, nest_limit - 1)

if self._datetime and check(obj, _DateTime) and obj.tzinfo is not None:
obj = Timestamp.from_datetime(obj)
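
The ordering step in this hunk relies on `sorted(obj.items())`, so it inherits Python's comparison rules. The sketch below isolates that step in a hypothetical helper, `ordered_items`, which is not part of the package. It shows two consequences: keys of types that cannot be compared with each other raise `TypeError`, and values are not compared when the keys already differ.

```python
def ordered_items(d, sort_keys=False):
    """Hypothetical helper mirroring the ordering step added to the packer."""
    return sorted(d.items()) if sort_keys else list(d.items())


# Keys of one comparable type are put in ascending order.
assert ordered_items({2: "x", 1: "y"}, sort_keys=True) == [(1, "y"), (2, "x")]

# Values that cannot be ordered (here: dicts) are fine, because tuple
# comparison is decided by the first differing element, which is the key.
assert ordered_items({"b": {}, "a": {"n": 1}}, sort_keys=True) == [
    ("a", {"n": 1}),
    ("b", {}),
]

# Mixing key types that Python 3 cannot compare fails during sorting.
try:
    ordered_items({1: "a", "b": 2}, sort_keys=True)
    mixed_keys_ok = True
except TypeError:
    mixed_keys_ok = False
assert mixed_keys_ok is False
```

With `sort_keys=False` the original insertion order is kept, so mixed key types are only a concern when sorting is enabled.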
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
4 changes: 2 additions & 2 deletions msgpack/unpack_define.h → msgpack_sorted/unpack_define.h
@@ -18,7 +18,7 @@
#ifndef MSGPACK_UNPACK_DEFINE_H__
#define MSGPACK_UNPACK_DEFINE_H__

#include "msgpack/sysdep.h"
#include "msgpack_sorted/sysdep.h"
#include <stdlib.h>
#include <string.h>
#include <assert.h>
@@ -92,4 +92,4 @@ typedef enum {
}
#endif

#endif /* msgpack/unpack_define.h */
#endif /* msgpack_sorted/unpack_define.h */
File renamed without changes.