Using simdjson as an SAX tokenizer #64
Comments
It's an interesting direction to explore. In terms of performance, we do alright on x86 platforms, but we have no optimizations for ARM platforms, which turn out to be the majority of our uses. It would be interesting to see if we could use some or all of simdjson in our parser. I don't know if the SAX parser slots easily into our code, but just replacing the string and number parsers with the ones from simdjson might be an easy performance win, and would allow us to remove some of our own code.
simdjson seems to be the gold standard in terms of JSON-parsing performance. It's always being updated with state-of-the-art algorithms for parsing, makes excellent use of intrinsics, and supports both ARM and x86_64. It's also in use by many different organizations and has extensive testing via fuzzing etc. I don't know what the performance needs are for JSON parsing here at Spotify, but if there's any desire for more speed, simdjson would be a great choice. It could be used as an SAX tokenizer, or simply forked to have spotify-json's high-level API built on top of it.
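For readers unfamiliar with the term, here is a rough sketch of what a SAX-style (event-driven) tokenizer interface could look like. It does not use simdjson or spotify-json at all: `Handler`, `RecordingHandler`, and `tokenize` are hypothetical names invented for illustration, and the toy scanner only understands arrays of integers and unescaped strings, with no checking of comma placement or nesting. The point is only the shape of the boundary: a fast tokenizer emits events, and a higher-level API consumes them.

```cpp
#include <cctype>
#include <charconv>
#include <cstdint>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical SAX-style handler: the tokenizer calls these as it scans.
struct Handler {
    virtual ~Handler() = default;
    virtual void on_array_begin() = 0;
    virtual void on_array_end() = 0;
    virtual void on_integer(std::int64_t value) = 0;
    virtual void on_string(std::string_view value) = 0;
};

// Toy stand-in for a real tokenizer (NOT simdjson). Handles only '[', ']',
// integers, and strings without escapes. Returns false on malformed input.
bool tokenize(std::string_view json, Handler& handler) {
    std::size_t i = 0;
    while (i < json.size()) {
        const unsigned char c = static_cast<unsigned char>(json[i]);
        if (std::isspace(c) || c == ',') {
            ++i;  // separators carry no event in this sketch
        } else if (c == '[') {
            handler.on_array_begin();
            ++i;
        } else if (c == ']') {
            handler.on_array_end();
            ++i;
        } else if (c == '"') {
            const std::size_t end = json.find('"', i + 1);
            if (end == std::string_view::npos) return false;
            handler.on_string(json.substr(i + 1, end - i - 1));
            i = end + 1;
        } else if (c == '-' || std::isdigit(c)) {
            std::int64_t value = 0;
            const char* first = json.data() + i;
            const char* last = json.data() + json.size();
            const auto result = std::from_chars(first, last, value);
            if (result.ec != std::errc()) return false;
            handler.on_integer(value);
            i = static_cast<std::size_t>(result.ptr - json.data());
        } else {
            return false;  // anything else is out of scope for the sketch
        }
    }
    return true;
}

// Example consumer: records each event as text so the stream can be inspected.
struct RecordingHandler : Handler {
    std::vector<std::string> events;
    void on_array_begin() override { events.push_back("["); }
    void on_array_end() override { events.push_back("]"); }
    void on_integer(std::int64_t value) override {
        events.push_back("int:" + std::to_string(value));
    }
    void on_string(std::string_view value) override {
        events.push_back("str:" + std::string(value));
    }
};
```

Passing a `RecordingHandler` to `tokenize` with the input `[1, -20, "ab"]` records one event per token, in order. In a real integration the scanning loop would be replaced by the faster tokenizer, and the handler would be whatever the existing high-level API needs to build its objects.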