yeah, LuaJIT is one of the use cases I had in mind working on this. JSON is pretty fast in modern JS engines, but in Lua land, JSON kinda sucks and doesn't really match the language without using virtual tables.
JSON has `null` values with string keyds, but lua doesn't have `null`. It has `nil`, but you can't have a key with a nil value. Setting nil deletes the key
Lua tables are unordered. But JS and JSON are often ordered and order often matters.
RX, however matches Lua/LuaJIT extremely well and should out-perform the JS Proxy based decoder using metatables. Since it's using metatables anyway do to the lazy parsing, it's trivial to do things like preserve order when calling `pairs` and `ipairs` and even including keys with associated null values.
You can round trip safely in Lua, which is not easy with most JSON implementations.
Thanks for the feedback. I've improved the framing to make the purpose/value more clear. What do you think about "RX is a read-only embedded store for JSON-shaped data"?
That benchmark is a fair comparison for a real-world production workload and use case. Sadly I can't share the details. But suffice it to say that the dataset is a huge object with tens of thousands of paths as keys and moderately large objects as values (averaging around 3KB of JSON each) all with slightly different shapes. The use is reading just a few entries by path an then looking up some properties within those entries.
The benchmark (or is supposed to) measures end-to-end parse + lookup.
JSON: 92 MB
RX: 5.1 MB
Request-path lookup: ~47,000x faster
Time to decode a manifest and look up one URL path:
the project framing needs some help perhaps. JSON is really good at a lot of use cases that this will never replace. But there are cases where JSON is currently used where this is much better. In particular large unstructured datasets where you only need to read a tiny subset of the data in a single request.
I'm happy to hear suggestions. This format was actually the internal .rexc bytecode for Rex (routing expressions), but when I realized it was actually a pretty good standalone format, I renamed it `.rx` for short. I am aware of RxJS, but I think that `rx-format` is different enough and `.rx` file extensions are unique enough, it's not too confusing.
sick is binary, rx is textual (this matters for tooling)
sick has size limits (65534 max keys for example. I have real-world rx datasets reaching this size already)
rx uses arbitrary precision variable-length b64 integers. There are no size limits anywhere inherit in the format, just in implementations.
sick does not preserve object key order
rx preserves object key order, but still implements O(log2 N) lookups for object keys.
Yes, the format allows for objects to be stored with a pointer to a shared schema (either an array of keys or another object that has the desired keys)
The current implementation is pretty close to ideal when deciding to use this encoding.
The current format version is the exact same feature set as JSON. I even encode numbers as arbitrary precision decimals (which JSON also does). This is quite different from CBOR which stores floats in binary as powers of 2.
I could technically add binary to the format, but then it would lose the nice copy-paste property. But with the byte-aware length prefixes, it would just work otherwise.
it's not really possible to stay human readable and get the compression levels and random access properties I was going for. But it is as human tooling friendly as possible given the constraints.
I find it obvious that your first attempt failed. Try again, you have not even remotely failed enough if you are making the argument that this is kinda readable. Yes, ascii words are easy to pick out, you didn’t do that, you did the part that makes it all harder.
JSON has `null` values with string keyds, but lua doesn't have `null`. It has `nil`, but you can't have a key with a nil value. Setting nil deletes the key
Lua tables are unordered. But JS and JSON are often ordered and order often matters.
RX, however matches Lua/LuaJIT extremely well and should out-perform the JS Proxy based decoder using metatables. Since it's using metatables anyway do to the lazy parsing, it's trivial to do things like preserve order when calling `pairs` and `ipairs` and even including keys with associated null values.
You can round trip safely in Lua, which is not easy with most JSON implementations.
reply