Show HN: Jesth – Next-level human-readable data serialization format

chatmasta · on May 18, 2023

This looks surprisingly nice, which is a high bar to clear when it comes to my opinion of new config file formats.

Am I understanding correctly that this basically adds a type system to TOML? Have you considered calling it TypeTOML and making it a superset of TOML? (Maybe this is covered in the readme which I only skimmed.)

It would be cool if I could rename `config.toml` to `config.ttoml` and add types as I need them, similarly to how I can rename `script.js` to `script.ts` for iterative adoption of TypeScript. Although obviously this would require the consuming code to implement the Jesth (TypeTOML?:)) parsing, which would maybe defeat the point of iteratively adopting it (why bother with partial compatibility then?). Perhaps you could make it a _compatible_ superset, with types implemented using TOML comments so that existing TOML parsers can parse an (untyped) structure from a Jesth file by ignoring the comments, while Jesth can parse a typed structure from the same file.

alexrustic · on May 19, 2023

Thank you for your kind words!

Note that any comparison with TOML, JSON, or YAML concerns only one of Jesth's capabilities: converting a compatible section into a dictionary data structure.

A Jesth document may not have a section intended to be converted into a dictionary data structure. Therefore Jesth can be used e.g. as a markup language for docstrings (My closed-source documentation generator parses the source code to populate the 'docs' folder of my projects with Markdown files [1])

The remarks below apply only to Jesth sections intended to be converted to a dictionary data structure.

There is currently no type system, that is, a mechanism to ensure that values assigned to a certain key always conform to a specific data type. I'm thinking about it. Think about how we create relational database tables with SQL.

Jesth is not going to be a TOML superset because the two have incompatible underlying philosophies. For example, the design decisions behind Jesth accidentally created an unlimited pool of reserved words (headers with double square brackets on either side are reserved words), from which I used [[END]] to mark the end of a Jesth stream. TOML has no such thing, since it already uses double square brackets on each side for something that is trivially done in Jesth.

[1] https://github.com/pyrustic/jesth/tree/master/docs/modules

bath_ · on May 18, 2023

How is the name of this project meant to be pronounced? "jest h" or "jezzith" are the first two things that come to mind (I'm not sure how to write out the second one phonetically)

alexrustic · on May 18, 2023

I think it should be pronounced /dʒest/ [0]

[0] https://dictionary.cambridge.org/pronunciation/english/jest

Edit: I just updated the project's README to include this detail. Thanks for making me think about it !

ultrablack · on May 18, 2023

This looks remarkably like the dosbox conf format. I'm sure you have reinvented something from the 90s?

  [sdl]

  # fullscreen -- Start dosbox directly in fullscreen.
  # fulldouble -- Use double buffering in fullscreen.
  # fullresolution -- What resolution to use for fullscreen: original or fixed size (e.g. 1024x768).
  # windowresolution -- Scale the window to this size IF the output device supports hardware scaling.
  # output -- What to use for output: surface,overlay,opengl,openglnb,ddraw.
  # autolock -- Mouse will automatically lock, if you click on the screen.
  # sensitivity -- Mouse sensitivity.
  # waitonerror -- Wait before closing the console if dosbox has an error.
  # priority -- Priority levels for dosbox: lowest,lower,normal,higher,highest,pause (when not focussed).
  #             Second entry behind the comma is for when dosbox is not focused/minimized.
  # mapperfile -- File used to load/save the key/event mappings from.
  # usescancodes -- Avoid usage of symkeys, might not work on all operating systems.

  fullscreen=false
  fulldouble=false
  fullresolution=
  windowresolution=2048x1536
  output=ddraw
  autolock=false
  sensitivity=100
  waitonerror=true
  priority=higher,normal
  mapperfile=mapper.txt
  usescancodes=true

alexrustic · on May 18, 2023

Thanks for your comment, but I'm still not convinced, at least until I see how you nest collections with this format.

dang · on May 18, 2023

(Mod here - I added two spaces to the beginning of most of those lines so HN's software would format it like code. I hope that's ok!)

alfalfasprout · on May 18, 2023

First off, this is very cool! So don't let my comments dissuade you...

But IME serialization tends to fall into two boats:

1. I want something akin to a config that's human readable but very simple, and I don't want to think about strong typing.

2. I'm serializing data and I want schemas, efficiency, etc.

I feel like (1) is pretty well handled by TOML and JSON and (2) is pretty well handled by FlatBuffers/Protocol Buffers, Thrift/Avro, Cap'n Proto, etc.

I guess I'm wondering where you see this being used.

alexrustic · on May 18, 2023

Thank you for your comment !

Jesth belongs to the first boat but on one condition: if you wish.

If you need a section to represent a dictionary data structure, you should use the syntax designed for that, so you can later call the section's "make_dict" or "get_dict" methods to convert the raw lines (a list of strings) into a dictionary object.

You are free to create your own hacks to convert a raw section into an object that suits your needs.

About boat 2 schemas, I'm thinking of designing a type validation schema, much like what we do when creating tables with SQL.

billconan · on May 18, 2023

I'm looking for a good human readable format for my notebook app.

I don't want to use markdown, because markdown is very difficult to parse and I can't use existing parsers, because I want to extend markdown's syntax to support extended content types and revisions.

(I'm currently using djot)

I also looked at structured formats, like TOML and Hjson. What I don't like about them is that when the content contains a deeply nested structure, the document becomes unreadable.

For example, a blockquote can contain a paragraph, and a paragraph can contain another blockquote and paragraph, etc.

I can't tell if your format is practical for deeply nested structures.

candiddevmike · on May 18, 2023

Why is markdown difficult to parse or extend? I am using Markdown-it for instance and use custom parsers to extend it with custom shortlinks.

billconan · on May 18, 2023

some good reads on this topic

https://johnmacfarlane.net/beyond-markdown.html

https://clehaxze.tw/gemlog/2022/03-31-markdown-is-not-contex...

https://blog.codinghorror.com/the-future-of-markdown/

alexrustic · on May 18, 2023

Hi ! Thank you for your reply ! I think the best way to be sure that Jesth will meet your needs is to try jesth-demo [0].

I think Jesth does a better job than TOML, YAML and JSON when it comes to nested structures or readability in general. JSON remains the boss of machine-to-machine communication, though !

It is also possible that my other project Exn [1] meets your needs.

[0] https://github.com/pyrustic/jesth-demo#readme

[1] https://news.ycombinator.com/item?id=34947927

billconan · on May 18, 2023

thank you very much, here is my notebook app https://github.com/shi-yan/Epiphany

Exn does look very related.

alexrustic · on May 18, 2023

Epiphany looks cool ! Exn does not yet support mathematical expressions and deliberately promotes editing of raw exonote text files.

It looks like under the hood you're using Webkit-like technology...

Could Epiphany embed programs like Exn does ?

billconan · on May 18, 2023

My first version was something like jupyter notebook https://www.youtube.com/watch?v=rQjBhsC3oi0

but I don't run it anymore. The one on github is a rewrite and it is closer to notion.

I removed the programming part, because I feel that a notebook is not the best environment for writing code. It may be ok for ad hoc programming, but for serious coding, I want an IDE. I may create multiple files instead of mixing everything together in a single page.

alexrustic · on May 18, 2023

> My first version was something like jupyter notebook https://www.youtube.com/watch?v=rQjBhsC3oi0

It's a nice job you've done. I hope you know about the existence of Bartosz Ciechanowski's interactive articles [1][2].

> I removed the programming part, because I feel that a notebook is not the best environment for writing code. ... I may create multiple files instead of mixing everything together in a single page.

Exn does not mix source code with prose as in literate programming. For example, you can embed in an exonote a program (developed with an IDE) that is available in your current virtual environment! [3]

[1] https://news.ycombinator.com/item?id=31261533

[2] https://news.ycombinator.com/item?id=33249215

[3] https://news.ycombinator.com/item?id=34965910

dukoid · on May 18, 2023

I had a similar problem recently and in the end I went for TOML, encoding the path in the section headers (square brackets), avoiding any indentation.
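
A minimal sketch of that approach in Python, assuming the content is a tree of nested dicts. The function name `to_flat_sections` and the sample data are hypothetical. It only covers bare keys with string, number, or boolean values, for which JSON and TOML literals happen to look the same.

```python
import json


def to_flat_sections(tree, path=()):
    """Yield TOML-style lines; each nested dict becomes a [dotted.path] header."""
    if path:
        yield "[" + ".".join(path) + "]"
    # Emit scalar keys first so they stay attached to their own header.
    for key, value in tree.items():
        if not isinstance(value, dict):
            yield f"{key} = {json.dumps(value)}"
    # Then recurse; depth goes into the header text, not into indentation.
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from to_flat_sections(value, path + (key,))


note = {
    "title": "Example",
    "blockquote": {
        "style": "plain",
        "paragraph": {"text": "hello"},
    },
}

lines = list(to_flat_sections(note))
print("\n".join(lines))
```

Every line starts at column zero regardless of depth, at the cost of headers that grow longer as the nesting gets deeper.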

michaelteter · on May 18, 2023

I'm not being negative, but I find it funny that this human-readable serialization format has a name which is much less human readable than many other names.

robotvert · on May 19, 2023

What about CUE? (https://cuetorials.com/) It feels like it'd solve your problem and more.

alexrustic · on May 19, 2023

Jesth is like a broken INI file parser that can only split a document into sections (each section consists of a header and a body which is just a list of strings).
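
To illustrate the "sections only" idea, here is a minimal sketch in Python. It is not the Jesth library's actual code: the name `split_sections`, the single-bracket header rule, and the anonymous leading section are assumptions made for the example.

```python
import re

# Assumption: a header is a line made only of "[name]".
HEADER = re.compile(r"^\[(.*)\]\s*$")


def split_sections(text):
    """Split a document into (header, body) pairs; each body is a list of raw lines."""
    sections = [("", [])]  # lines before the first header land in an unnamed section
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            sections.append((match.group(1), []))
        else:
            sections[-1][1].append(line)  # body lines are kept as-is, uninterpreted
    return sections


DOC = """[poem]
Roses are red
[config]
name = "demo"
"""

sections = split_sections(DOC)
```

Nothing here interprets the body lines, so one section can hold a poem while another holds key/value pairs for a later conversion step.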

Now, on top of that, I can write a hack to convert an arbitrary section to a dictionary data structure (provided the body of that section is written with a specific syntax designed for my hack).

I made this hack and included it in the Jesth library, so people can use it, much like the Python standard library is just there to help people not waste time rewriting the same algorithms for common tasks.

Jesth would be like JSON which is only about data. CUE, Dhall and Jsonnet jump on top of JSON to add some cool stuff.

I used Jesth for example to design a docstring markup language (consumed [1] by a documentation generator), as well as a scripting language [2].

I will soon publish a simple data validation mechanism for Jesth dict-sections (sections intended to be converted into a dictionary data structure). It might inspire people to create a more complex data validation or data constraint language on top of Jesth. This could be more readable than what is done elsewhere.

[1] https://github.com/pyrustic/jesth/tree/master/docs/modules

[2] https://github.com/pyrustic/backstage

midnitewarrior · on May 18, 2023

I can't even pronounce the name, how am I going to read it?

alexrustic · on May 18, 2023

It should be pronounced /dʒest/ [0]

[0] https://dictionary.cambridge.org/pronunciation/english/jest

miohtama · on May 18, 2023

After being frustrated with

- INI

- XML

- JSON

- YAML

- TOML

... I welcome our new Jesth overlords. Progress is unstoppable even if it happens in baby steps.

riku_iki · on May 18, 2023

I always thought prettified json looks very human readable.

alexrustic · on May 18, 2023

A Jesth dictionary section is natively prettified. Additionally, anything you can encode in JSON can be embedded in a Jesth section, sharing the same document with another section containing, for example, a poem or a ChatGPT prompt. What JSON does is just one of the things Jesth can do. For example, you won't use JSON as a markup language for, say, docstrings.

When it comes to machine-to-machine communication, JSON is more relevant.

dusted · on May 19, 2023

Yes. On a similar note, the best thing about YAML is that you can just write JSON in your YAML files and it will work just fine.

Villain5875 · on May 19, 2023

Does it feel like updated ini?

alexrustic · on May 19, 2023

It looks more like a broken INI file parser. And I could say "it's not a bug, it's a feature!", because it could be used to design a markup language for docstrings (consumed [1] by a documentation generator), which INI cannot do as well.

[1] https://github.com/pyrustic/jesth/tree/master/docs/modules

ramesh31 · on May 18, 2023

Cool, so I see you've rediscovered S-Expressions.

asalahli · on May 18, 2023

Your snark is unwarranted IMO. How is this any more similar to s-expressions than, say, TOML or YAML is?