zcbor.py: Performance improvements in DataTranslator #479

Draft: wants to merge 3 commits into base: main
16 changes: 13 additions & 3 deletions ARCHITECTURE.md
@@ -16,6 +16,7 @@ The functionality is spread across 5 classes:
1. CddlParser
2. CddlXcoder (inherits from CddlParser)
3. DataTranslator (inherits from CddlXcoder)
4. DataDecoder (inherits from DataTranslator)
4. CodeGenerator (inherits from CddlXcoder)
5. CodeRenderer

@@ -100,7 +101,7 @@ Most of the functionality falls into one of two categories:
- is_unambiguous(): Whether the type is completely specified, i.e. whether we know beforehand exactly how the encoding will look (e.g. `Foo = 5`).

DataTranslator
-----------
--------------

DataTranslator is for handling and manipulating CBOR on the "host".
For example, the user can compose data in YAML or JSON files and have them converted to CBOR and validated against the CDDL.
@@ -127,15 +128,23 @@ One caveat is that CBOR supports more features than YAML/JSON, namely:

zcbor allows creating bespoke representations via `--yaml-compatibility`, see the README or CLI docs for more info.

Finally, DataTranslator can also generate a separate internal representation using `namedtuple`s to allow browsing CBOR data by the names given in the CDDL.
DataTranslator functionality is tested in [tests/scripts/test_zcbor.py](tests/scripts/test_zcbor.py)

DataDecoder
-----------

DataDecoder contains functions for generating a separate internal representation using `namedtuple`s to allow browsing CBOR data by the names given in the CDDL.
(This is more analogous to how the data is accessed in the C code.)

DataTranslator functionality is tested in [tests/scripts](tests/scripts)
This functionality was originally part of DataTranslator, but was moved because the internal representation was always created but seldom used, and the namedtuples caused a noticeable performance hit.

DataDecoder functionality is tested in [tests/scripts/test_zcbor.py](tests/scripts/test_zcbor.py)
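
The cost that motivated the split can be illustrated without zcbor at all. The following is a minimal sketch, not zcbor code: `Pet` and the two builder functions are hypothetical stand-ins that compare building plain lists with building `namedtuple` instances holding the same values. Actual timings depend on the machine and the Python version.

```python
from collections import namedtuple
from timeit import timeit

# Hypothetical record type, standing in for a type whose field names come from CDDL.
Pet = namedtuple("Pet", ["name", "age"])


def build_plain(n):
    """Build n records as plain lists (no field names)."""
    return [["pet", i] for i in range(n)]


def build_named(n):
    """Build the same n records as namedtuple instances."""
    return [Pet(name="pet", age=i) for i in range(n)]


if __name__ == "__main__":
    n = 1000
    t_plain = timeit(lambda: build_plain(n), number=200)
    t_named = timeit(lambda: build_named(n), number=200)
    print(f"plain lists: {t_plain:.3f}s, namedtuples: {t_named:.3f}s")
```

Both builders hold the same data; the named variant additionally allows access such as `record.age`, which is the convenience DataDecoder keeps for the cases that need it.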

CodeGenerator
-------------

CodeGenerator, like DataTranslator, inherits from CddlXcoder.
It is used to generate C code.
Its primary purpose is to construct the individual decoding/encoding functions for the types specified in the given CDDL document.
It also constructs struct definitions used to hold the decoded data/data to be encoded.

@@ -158,6 +167,7 @@ repeated_foo() concerns itself with the individual value, while foo() concerns i

When invoking CodeGenerator, the user must decide which types it will need direct access to decode/encode.
These types are called "entry types" and they are typically the "outermost" types, or the types it is expected that the data will have.
CodeGenerator will generate a public function for each entry type.

The user can also use entry types when there are `"BSTR"`s that are CBOR encoded, specified as `Foo = bstr .cbor Bar`.
Usually such strings are automatically decoded/encoded by the generated code, and the objects part of the encompassing struct.
9 changes: 9 additions & 0 deletions MIGRATION_GUIDE.md
@@ -1,5 +1,14 @@
# zcbor v. 0.9.99

* The following `DataTranslator` functions have been moved to a separate class `DataDecoder`:

* `decode_obj()`
* `decode_str_yaml()`
* `decode_str()`

The split was done for performance reasons (namedtuple objects are slow to create).
The `DataDecoder` class is a subclass of `DataTranslator` and can do all the same things, just a bit slower.
This functionality is only relevant when zcbor is imported, so all CLI usage is unaffected.
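
For code that imports zcbor and calls one of the moved functions, the change might look like the sketch below. `decode_by_name` is a hypothetical helper, not part of zcbor. The `from_cddl(cddl_contents, 3)` call and the `my_types` lookup are copied from the test script in this PR, and `decode_str()` is assumed to take the raw CBOR bytes.

```python
def decode_by_name(cddl_contents, type_name, payload):
    """Decode payload into the namedtuple-based representation (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without zcbor installed.
    import zcbor

    # Before the split, this line would have used zcbor.DataTranslator.from_cddl(...).
    types = zcbor.DataDecoder.from_cddl(cddl_contents, 3).my_types
    # Assumption: decode_str() accepts the raw CBOR bytes.
    return types[type_name].decode_str(payload)


# CDDL taken from the performance test script in this PR.
EXAMPLE_CDDL = "perf_int = [0*1000(int/bool)]"
# CBOR encoding of the array [1, 2, 3], written out by hand.
EXAMPLE_PAYLOAD = bytes([0x83, 0x01, 0x02, 0x03])
```

With zcbor installed, `decode_by_name(EXAMPLE_CDDL, "perf_int", EXAMPLE_PAYLOAD)` would be the call to try. Code that only uses the translation functions, such as `str_to_json()`, can stay on `DataTranslator`.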

# zcbor v. 0.9.0

6 changes: 1 addition & 5 deletions __init__.py
@@ -7,8 +7,4 @@

from pathlib import Path

from .zcbor.zcbor import (
CddlValidationError,
DataTranslator,
main
)
from .zcbor.zcbor import CddlValidationError, DataTranslator, DataDecoder, main
8 changes: 4 additions & 4 deletions scripts/add_helptext.py
@@ -11,7 +11,7 @@
from sys import argv

p_root = Path(__file__).absolute().parents[1]
p_README = Path(p_root, 'README.md')
p_README = Path(p_root, "README.md")

pattern = r"""
Command line documentation
@@ -42,13 +42,13 @@
```
"""

with open(p_README, 'r', encoding="utf-8") as f:
with open(p_README, "r", encoding="utf-8") as f:
readme_contents = f.read()
new_readme_contents = sub(pattern + r'.*', output, readme_contents, flags=S)
new_readme_contents = sub(pattern + r".*", output, readme_contents, flags=S)
if len(argv) > 1 and argv[1] == "--check":
if new_readme_contents != readme_contents:
print("Check failed")
exit(9)
else:
with open(p_README, 'w', encoding="utf-8") as f:
with open(p_README, "w", encoding="utf-8") as f:
f.write(new_readme_contents)
1 change: 1 addition & 0 deletions scripts/black.sh
@@ -0,0 +1 @@
black $(dirname "$0")/.. -l 100
18 changes: 10 additions & 8 deletions scripts/regenerate_samples.py
@@ -13,22 +13,24 @@
from tempfile import mkdtemp

p_root = Path(__file__).absolute().parents[1]
p_build = p_root / 'build'
p_pet_sample = p_root / 'samples' / 'pet'
p_pet_cmake = p_pet_sample / 'pet.cmake'
p_pet_include = p_pet_sample / 'include'
p_pet_src = p_pet_sample / 'src'
p_build = p_root / "build"
p_pet_sample = p_root / "samples" / "pet"
p_pet_cmake = p_pet_sample / "pet.cmake"
p_pet_include = p_pet_sample / "include"
p_pet_src = p_pet_sample / "src"


def regenerate():
tmpdir = Path(mkdtemp())
run(['cmake', p_pet_sample, "-DREGENERATE_ZCBOR=Y", "-DCMAKE_MESSAGE_LOG_LEVEL=WARNING"],
cwd=tmpdir)
run(
["cmake", p_pet_sample, "-DREGENERATE_ZCBOR=Y", "-DCMAKE_MESSAGE_LOG_LEVEL=WARNING"],
cwd=tmpdir,
)
rmtree(tmpdir)


def check():
files = (list(p_pet_include.iterdir()) + list(p_pet_src.iterdir()) + [p_pet_cmake])
files = list(p_pet_include.iterdir()) + list(p_pet_src.iterdir()) + [p_pet_cmake]
contents = "".join(p.read_text(encoding="utf-8") for p in files)
tmpdir = Path(mkdtemp())
list(makedirs(tmpdir / f.relative_to(p_pet_sample).parent, exist_ok=True) for f in files)
2 changes: 1 addition & 1 deletion scripts/requirements-test.txt
@@ -1,4 +1,4 @@
pyelftools
pycodestyle
black
west
ecdsa
26 changes: 16 additions & 10 deletions scripts/update_version.py
@@ -10,10 +10,10 @@
from datetime import datetime

p_root = Path(__file__).absolute().parents[1]
p_VERSION = Path(p_root, 'zcbor', 'VERSION')
p_RELEASE_NOTES = Path(p_root, 'RELEASE_NOTES.md')
p_MIGRATION_GUIDE = Path(p_root, 'MIGRATION_GUIDE.md')
p_common_h = Path(p_root, 'include', 'zcbor_common.h')
p_VERSION = Path(p_root, "zcbor", "VERSION")
p_RELEASE_NOTES = Path(p_root, "RELEASE_NOTES.md")
p_MIGRATION_GUIDE = Path(p_root, "MIGRATION_GUIDE.md")
p_common_h = Path(p_root, "include", "zcbor_common.h")

RELEASE_NOTES_boilerplate = """
Any new bugs, requests, or missing features should be reported as [Github issues](https://github.com/NordicSemiconductor/zcbor/issues).
@@ -31,23 +31,29 @@ def update_relnotes(p_relnotes, version, boilerplate="", include_date=True):
new_date = f" ({datetime.today().strftime('%Y-%m-%d')})" if include_date else ""
relnotes_new_header = f"# zcbor v. {version}{new_date}\n"
if ".99" not in relnotes_lines[0]:
relnotes_contents = relnotes_new_header + boilerplate + '\n\n' + relnotes_contents
relnotes_contents = relnotes_new_header + boilerplate + "\n\n" + relnotes_contents
relnotes_contents = sub(r".*?\n", relnotes_new_header, relnotes_contents, count=1)
p_relnotes.write_text(relnotes_contents, encoding="utf-8")


if __name__ == "__main__":
if len(argv) != 2 or match(r'\d+\.\d+\.\d+', argv[1]) is None:
if len(argv) != 2 or match(r"\d+\.\d+\.\d+", argv[1]) is None:
print(f"Usage: {argv[0]} <new zcbor version>")
exit(1)
version = argv[1]
(major, minor, bugfix) = version.split('.')
(major, minor, bugfix) = version.split(".")

p_VERSION.write_text(version, encoding="utf-8")
update_relnotes(p_RELEASE_NOTES, version, boilerplate=RELEASE_NOTES_boilerplate)
update_relnotes(p_MIGRATION_GUIDE, version, include_date=False)
p_common_h_contents = p_common_h.read_text(encoding="utf-8")
common_h_new_contents = sub(r"(#define ZCBOR_VERSION_MAJOR )\d+", f"\\g<1>{major}", p_common_h_contents)
common_h_new_contents = sub(r"(#define ZCBOR_VERSION_MINOR )\d+", f"\\g<1>{minor}", common_h_new_contents)
common_h_new_contents = sub(r"(#define ZCBOR_VERSION_BUGFIX )\d+", f"\\g<1>{bugfix}", common_h_new_contents)
common_h_new_contents = sub(
r"(#define ZCBOR_VERSION_MAJOR )\d+", f"\\g<1>{major}", p_common_h_contents
)
common_h_new_contents = sub(
r"(#define ZCBOR_VERSION_MINOR )\d+", f"\\g<1>{minor}", common_h_new_contents
)
common_h_new_contents = sub(
r"(#define ZCBOR_VERSION_BUGFIX )\d+", f"\\g<1>{bugfix}", common_h_new_contents
)
p_common_h.write_text(common_h_new_contents, encoding="utf-8")
39 changes: 23 additions & 16 deletions tests/decode/test5_corner_cases/floats.py
@@ -12,28 +12,35 @@
import math
import cbor2


def print_var(val1, val2, bytestr):
var_str = ""
for b in bytestr:
var_str += hex(b) + ", "
print(str(val1) + ":", val2, bytestr.hex(), var_str)

def print_32_64(str_val, val = None):
val = val or float(str_val)
print_var(str_val, val, struct.pack("!e", numpy.float16(val)))
print_var(str_val, val, struct.pack("!f", struct.unpack("!e", struct.pack("!e", numpy.float16(val)))[0]))
print (numpy.float32(struct.unpack("!e", struct.pack("!e", numpy.float16(val)))[0]))
print_var(str_val, val, struct.pack("!f", val))
print_var(str_val, val, struct.pack("!d", val))
print_var(str_val, val, struct.pack("!d", struct.unpack("!f", struct.pack("!f", val))[0]))
print()
var_str = ""
for b in bytestr:
var_str += hex(b) + ", "
print(str(val1) + ":", val2, bytestr.hex(), var_str)


def print_32_64(str_val, val=None):
val = val or float(str_val)
print_var(str_val, val, struct.pack("!e", numpy.float16(val)))
print_var(
str_val,
val,
struct.pack("!f", struct.unpack("!e", struct.pack("!e", numpy.float16(val)))[0]),
)
print(numpy.float32(struct.unpack("!e", struct.pack("!e", numpy.float16(val)))[0]))
print_var(str_val, val, struct.pack("!f", val))
print_var(str_val, val, struct.pack("!d", val))
print_var(str_val, val, struct.pack("!d", struct.unpack("!f", struct.pack("!f", val))[0]))
print()


print_32_64("3.1415")
print_32_64("2.71828")
print_32_64("1234567.89")
print_32_64("-98765.4321")
print_32_64("123/456789", 123/456789)
print_32_64("-2^(-42)", -1/(2**(42)))
print_32_64("123/456789", 123 / 456789)
print_32_64("-2^(-42)", -1 / (2 ** (42)))
print_32_64("1.0")
print_32_64("-10000.0")
print_32_64("0.0")
30 changes: 30 additions & 0 deletions tests/scripts/test_performance.py
@@ -0,0 +1,30 @@
import cbor2
import cProfile, pstats


# zcbor is imported only inside the try block so that the ImportError handler can trigger.
try:
    import zcbor
except ImportError:
    print(
        """
The zcbor package must be installed to run these tests.
During development, install with `pip3 install -e .` to install in a way
that picks up changes in the files without having to reinstall.
"""
    )
    exit(1)

cddl_contents = """
perf_int = [0*1000(int/bool)]
"""
raw_message = cbor2.dumps(list(range(1000)))
cmd_spec = zcbor.DataTranslator.from_cddl(cddl_contents, 3).my_types["perf_int"]
# cmd_spec = zcbor.DataDecoder.from_cddl(cddl_contents, 3).my_types["perf_int"]

profiler = cProfile.Profile()
profiler.enable()
json_obj = cmd_spec.str_to_json(raw_message)
profiler.disable()

profiler.print_stats()
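
The script imports `pstats` but only calls `profiler.print_stats()`. A minimal sketch of how the collected profile could instead be sorted and truncated with `pstats` follows. `work()` is a hypothetical stand-in for `cmd_spec.str_to_json(raw_message)`, so that the sketch runs without zcbor.

```python
import cProfile
import io
import pstats


def work():
    # Stand-in workload; the script above would call cmd_spec.str_to_json(raw_message).
    return sum(i * i for i in range(10000))


profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Collect the report in a string buffer, sorted by cumulative time, top 10 entries only.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the call paths that dominate a run, which should make it easier to compare `DataTranslator` and `DataDecoder` on the same input.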