Smaller URL : binary gzip compressed data storage, base64 wrapped in the url #25

Closed
1000i100 opened this issue Mar 19, 2020 · 1 comment

Comments

@1000i100

Right now, loopy data is stored as JSON through a serialize / deserialize process.
I think we can do better and fit big systems into small URLs (even without a URL shortener).

Here is the binary data scheme:

  • ThisVersionStartWith0: 1 bit, always 0 in this version, so that future formats can be told apart by starting with 1
  • nodeNumber: 16 bits (max 65535)
  • edgeNumber: 16 bits (max 65535)
  • labelNumber: 16 bits (max 65535)
  • globalOptions: 32 bits (currently 2 booleans, so 2 bits used; soon 5 booleans, so 5 bits used out of 32)
  • bitForStringIndex: 5 bits (how many bits a string index takes; the maximum is a 31-bit index, so about 2 000 000 000 strings)
  • bitByNode: 16 bits (how many bits each node takes, so the maximum is 2^16-1 bits)
  • bitByEdge: 16 bits (how many bits each edge takes, so the maximum is 2^16-1 bits)
  • bitByLabel: 16 bits (how many bits each label takes, so the maximum is 2^16-1 bits)
  • for each node:
    • each property, stored on a number of bits that depends on its number of options or its numeric limit
    • an index into the string array, bitForStringIndex bits wide.
  • for each edge:
    • each property, stored on a number of bits that depends on its number of options or its numeric limit
    • an index into the string array, bitForStringIndex bits wide.
  • for each label:
    • each property, stored on a number of bits that depends on its number of options or its numeric limit
    • an index into the string array, bitForStringIndex bits wide.
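
To make the header layout above concrete, here is a minimal Python sketch of packing those fields bit by bit. The `BitWriter` class, the `pack_header` function and the most-significant-bit-first ordering are assumptions made for illustration; the proposal does not fix a bit order or an API, and loopy itself is not written in Python.

```python
class BitWriter:
    """Accumulates unsigned integers of arbitrary bit width, most significant bit first."""

    def __init__(self):
        self._value = 0   # every bit written so far, held as one big integer
        self._length = 0  # number of bits written so far

    def write(self, value, width):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        self._value = (self._value << width) | value
        self._length += width

    def to_bytes(self):
        # Pad with zero bits on the right up to a whole number of bytes.
        padding = (-self._length) % 8
        return (self._value << padding).to_bytes((self._length + padding) // 8, "big")


def pack_header(node_number, edge_number, label_number, global_options,
                bit_for_string_index, bit_by_node, bit_by_edge, bit_by_label):
    """Pack the fixed-size header fields in the order listed in the scheme."""
    writer = BitWriter()
    writer.write(0, 1)                      # ThisVersionStartWith0
    writer.write(node_number, 16)
    writer.write(edge_number, 16)
    writer.write(label_number, 16)
    writer.write(global_options, 32)
    writer.write(bit_for_string_index, 5)
    writer.write(bit_by_node, 16)
    writer.write(bit_by_edge, 16)
    writer.write(bit_by_label, 16)
    return writer.to_bytes()                # 134 bits, padded to 17 bytes


header = pack_header(3, 2, 1, 0b11, 4, 40, 20, 30)
print(len(header))
```

The per-node, per-edge and per-label records would be appended with the same `write` calls before `to_bytes()`; their exact property widths are left out here because the issue does not list them.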

This is stored in binary, then gzipped (or compressed with LZMA), then base64-encoded for the URL.
As for the string part:
strings are deduplicated and sorted by size in ascending order, then stored in a JSON array escaped for the URL.
The indexes of this array are used in the binary part to reference strings.
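
As a rough illustration of the two parts described above, the following Python sketch builds the deduplicated string array and wraps a binary payload for the URL. All function names are hypothetical. Breaking ties alphabetically, using the URL-safe base64 alphabet and dropping the `=` padding are assumptions of this sketch, not part of the proposal.

```python
import base64
import gzip
import json
from urllib.parse import quote


def build_string_table(strings):
    """Deduplicate, then sort by length ascending (alphabetical tie-break is an assumption)."""
    table = sorted(set(strings), key=lambda s: (len(s), s))
    index = {s: i for i, s in enumerate(table)}
    return table, index


def bits_for_index(table_size):
    """Smallest bit width able to hold the largest index (at least 1 bit)."""
    return max(1, (table_size - 1).bit_length())


def encode_binary_part(raw):
    """gzip the packed bytes, then base64 them with the URL-safe alphabet, no padding."""
    return base64.urlsafe_b64encode(gzip.compress(raw)).rstrip(b"=").decode("ascii")


def decode_binary_part(text):
    """Reverse of encode_binary_part: restore the padding, decode, gunzip."""
    padded = text + "=" * (-len(text) % 4)
    return gzip.decompress(base64.urlsafe_b64decode(padded))


def encode_string_part(table):
    """Serialize the string array as compact JSON and percent-escape it for the URL."""
    return quote(json.dumps(table, ensure_ascii=False, separators=(",", ":")), safe="")


table, index = build_string_table(["rabbits", "foxes", "rabbits", "a"])
print(table, index["foxes"], bits_for_index(len(table)))
print(encode_string_part(table))
```

Sorting shortest first means that if the URL gets cut off, the strings lost at the end of the array are the longest ones, while as many short labels as possible survive.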

The parser tries to rebuild the graph and tolerates an erroneous or partial JSON string array: in that case, it appends "] or ] at the end and retries parsing. Strings missing from the index are replaced by ?, so the graph has less text but is still usable.
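
The retry strategy described above could look like the following Python sketch. The names `parse_string_table` and `lookup` are hypothetical, and returning an empty array when every attempt fails is an assumption of this sketch.

```python
import json


def parse_string_table(text):
    """Parse the string array, retrying with '"]' and then ']' appended if it was cut off."""
    for suffix in ("", '"]', "]"):
        try:
            parsed = json.loads(text + suffix)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, list):
            return parsed
    return []  # nothing recoverable: every lookup will fall back to "?"


def lookup(table, index):
    """Return the string at index, or "?" when the array was truncated before it."""
    return table[index] if 0 <= index < len(table) else "?"


table = parse_string_table('["a","foxes","rab')  # URL cut in the middle of a string
print(table)
print(lookup(table, 1), lookup(table, 5))
```

Only the two repairs named in the issue are attempted here, so an array cut right after a comma or in the middle of an escape sequence is not recovered by this sketch and every string then falls back to ?.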

What do you think about this?

PS: this change can stay compatible with the current JSON storage by detecting the URL scheme after the ? character.

PS2: the embed setting can be stored in globalOptions, and so can autoplay if it is not handled as a special node.
