Interesting stuff on E

Interesting things you may want to know about E, EFL and development in general.

Eet compared with JSON - Eet comes out on top

Given the mainstream can't figure out what to use (XML, JSON... what's next?), Eet manages to lead hands-down when compared to JSON.

As time goes on, and more things pass me by, I realize that Eet is an insanely cool little project and library. It doesn't get the attention it deserves. Eet is like the universal solution bucket for "I have data I need to get in and out of my program", and it does it elegantly, efficiently and easily (if you use C or C++).

As it happens, I had the opportunity to look at libjson today. I had never used it before, and I spent all of maybe 10 minutes reading some sample code to figure out how to use it. So I wrote a benchmarking tool. It is very simple. It encodes a large blob of in-memory structure data as either JSON or as an Eet file, and then decodes it. The code is here for anyone who cares: config.c. Let me know if the libjson usage could be massively faster, but for the purposes of this comparison, I think it's good enough.

So the results. Size is in bytes. Read and write values are time in seconds (so smaller is better).

Test                    File Size   Uncached Write  Uncached Read  Cached Write  Cached Read
JSON (libjson)          138957360   1.31            3.56           1.22          2.32
EET (no compression)    31650637    0.73            0.35           0.72          0.32
EET (zlib level 9)      218780      1.00            0.41           1.00          0.40
EET (lz4hc)             159785      0.93            0.34           0.89          0.33
EET (lz4)               261732      0.73            0.36           0.70          0.34

Summary

  • Data files can be between 4 TIMES (uncompressed Eet) and roughly 870 TIMES (Eet with lz4hc) larger with JSON.
  • Cold read time for JSON is 10 TIMES slower than Eet.
  • Cold writes take 40% longer with JSON than with Eet.
  • Hot reads are 7 TIMES slower with JSON than with Eet.
  • Hot writes take 37% longer with JSON than with Eet.
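For anyone who wants to recheck these figures against the table, here is a minimal sketch in C. The function names (times_factor, percent_longer, print_summary) are made up for this illustration and are not part of config.c. The inputs are the JSON row and the Eet rows for "no compression" and "lz4hc", on the assumption that the time comparisons above are against the lz4hc row.

```c
#include <stdio.h>

/* How many times larger (or slower) a is compared to b. */
double times_factor(double a, double b)
{
    return a / b;
}

/* How much longer a takes than b, as a percentage of b. */
double percent_longer(double a, double b)
{
    return (a - b) / b * 100.0;
}

/* Print the summary ratios using values copied from the results table. */
void print_summary(void)
{
    const double json_size = 138957360.0; /* JSON (libjson), bytes */
    const double eet_raw   = 31650637.0;  /* EET (no compression), bytes */
    const double eet_lz4hc = 159785.0;    /* EET (lz4hc), bytes */

    printf("size vs uncompressed Eet: %.1fx\n", times_factor(json_size, eet_raw));
    printf("size vs lz4hc Eet:        %.1fx\n", times_factor(json_size, eet_lz4hc));
    printf("cold read:  %.1fx slower\n", times_factor(3.56, 0.34));
    printf("cold write: %.0f%% longer\n", percent_longer(1.31, 0.93));
    printf("hot read:   %.1fx slower\n", times_factor(2.32, 0.33));
    printf("hot write:  %.0f%% longer\n", percent_longer(1.22, 0.89));
}
```

Call print_summary() from a main() to see the ratios. Swapping in another Eet row (say lz4 or zlib) shows how much the picture shifts with the compression choice.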

That's pretty amazing for a small pokey library lurking under the covers of EFL. It hasn't been trumpeted around the world as the go-to data encode/decode library. It hasn't had a lot of fine-grained optimization efforts by the "big boys", but it clearly does some amazing stuff.

Yes. The comparison is a bit artificial. It's a very large amount of data. The POINT was to give a large amount of data so it can be usefully compared, rather than benchmark timings being hidden in system noise. This ensures our benchmark is pretty much ONLY looking at cost of decode (or encode), and associated I/O, rather than measuring other things.
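To make the "big workload, simple timing" idea concrete, here is a rough sketch of a wall-clock timing helper in C. This is NOT the code from config.c. The names now_seconds, time_runs, Workload and fake_decode are hypothetical, and fake_decode is only a stand-in (it sums a buffer) for a real encode or decode pass. It assumes a C11 compiler, since it relies on timespec_get().

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Wall-clock time in seconds, using the C11 timespec_get() call. */
double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run fn(arg) `runs` times and return the average seconds per run.
 * `runs` must be at least 1. */
double time_runs(void (*fn)(void *), void *arg, int runs)
{
    double start = now_seconds();
    for (int i = 0; i < runs; i++)
        fn(arg);
    return (now_seconds() - start) / runs;
}

/* Hypothetical workload: a buffer to walk and a place for the result. */
typedef struct {
    const unsigned char *data;
    size_t len;
    unsigned long sum;
} Workload;

/* Stand-in for a decode pass: touches every byte of the buffer. */
void fake_decode(void *arg)
{
    Workload *w = arg;
    unsigned long sum = 0;
    for (size_t i = 0; i < w->len; i++)
        sum += w->data[i];
    w->sum = sum;
}
```

With a tiny buffer the time returned by time_runs(fake_decode, &w, 1) is mostly clock granularity and scheduler noise. With a buffer in the tens or hundreds of megabytes, the work itself dominates, which is the reason for benchmarking with such a large data set.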

In some research I found the following: PSYC. It seems interesting, but it is mostly useful here because PSYC compared itself to libXML (SAX and DOM mode). Since Eet is a DOM-style parser, this lets us then take a GUESS at Eet vs XML numbers.

Let me use the user profile numbers as that seems more general. XML (DOM) parsing would be 2.6 TIMES slower than libjson. That would (for reads - which are the most common things when it comes to data storage like configuration, data files etc.) make XML 18 to 26 TIMES slower than Eet (2.6 times the 7 TIMES and 10 TIMES read figures above). PSYC itself comes in at 13% the time needed for JSON. So Eet would still beat PSYC. Not too shabby.

So if you need your data (configuration, protocol, state or anything else) stored and retrieved quickly, Eet is for you. If you want to send it around from process to process or machine to machine, Eet is also for you. It's done and ready, and that doesn't even begin to look at the features Eet has like built-in compression (to keep transfer sizes down), built-in encryption (to keep your data secure), and multi-key file handling to boot.