September 10, 2009

JESON: Enhancing JSON Syntax

When I work in Swing/Java, I don't write out all the code to build the components to construct the GUI. How tedious! Instead, I make use of a technology based on SwiXml to describe my GUI layouts via an external description file in XML.

Since working on the iPhone, I find myself doing similar tedious coding. Even with my own TableFormEditor, I am doing lots of boring setup coding that can best be done in a string-based description. I like to call this technique "configuration-based coding". In short, this technique replaces code with a terser description that achieves the same thing.

The benefit of using such a technique is that you can tweak your "configuration file" or "script file" without changing your code. This also has implications for the iPhone's 3.0 support for downloading software updates inside your app. Apple doesn't prescribe what these downloads are, except that they aren't "code". So configuration files seem like a logical choice here.

So my desire was something like SwiXml to use on the iPhone. However, I don't want XML; it is a pig to parse and not fun at all to hand-edit. Certainly on the iPhone plists can achieve my requirements. Plists can be persisted to a file and read back in quite easily. Plists also support basic constructs like dictionaries, arrays, strings, etc. But I wanted something more friendly for hand editing. I don't like the clunky interface of the plist editor. So I started looking at JSON.

JSON is one of the latest developments to gain a lot of hype. The data interchange format JSON (JavaScript Object Notation) started life during the AJAX craze. Many think JSON spells the end for XML. JSON can do many of the things that XML can, but in a simpler, cleaner way. I'm all for anything that would put XML to rest for good. XML is an abomination; the idea is great, the implementation is awful. There are even several decent Objective-C implementations for JSON reading and writing. I'm on the JSON bandwagon.

Almost.

As I worked with JSON a little bit, I started to realize that JSON is only marginally better than XML from a syntactical standpoint. I understand JSON is a subset of JavaScript expression syntax, and thus its specification is limited to the confines of what JavaScript can handle. But this argument is becoming more and more moot given the dangers of actually evaling JSON. People are saying you should never, ever do this. (Check out this blog article.) I don't use JavaScript and probably never will. So why should I suffer through its syntax?

To address the bad taste JSON leaves in my mouth, I have come up with a set of enhancements to the JSON syntax. I call this "JESON" (I stuck the "E" where I did because "JES" happens to be my initials). JESON can read and write original JSON just fine, but it also supports the syntax enhancements described below. The implementation I have modified is Stig Brautaset's wonderful json-framework package available at http://code.google.com/p/json-framework/. This package has a simple enough parser to allow these changes to be made quite easily.

So, what's my beef with JSON syntax? There are several minor annoyances, but annoyances nonetheless. JSON incurs plenty of useless noise which I would prefer not to see or type. Let me demonstrate with an example of JSON syntax.

This example was swiped from the JSON website example page:

{
  "menu": {
     "id": "file",
     "value": "File",
     "popup": {
         "menuitem": [
            {"value": "New", "onclick": "CreateNewDoc()"},
            {"value": "Open", "onclick": "OpenDoc()"},
            {"value": "Close", "onclick": "CloseDoc()"}
         ]
     }
  }
}

JSON syntax violates several of the syntax rules I apply to good languages. The first violation is of this rule: "Don't require syntactical elements that serve no purpose".

I am a minimalist when it comes to syntax. Less is better. You have to admit when looking at the above example, JSON has lots of punctuation. But is it all necessary?

According to the spec, a dictionary key must always be a quoted string. Why? Unless the key can be something other than a string, what is the point of the quotes? For most needs, the key is always an identifier, an alphanumeric word. Requiring the quotes here serves no purpose. And the busy little tick marks make it hard to quickly tell the keys from the values.

To fix this, JESON makes the quotes on the keys optional:

{
  menu: {
     id: "file",
     value: "File",
     popup: {
         menuitem: [
            {value: "New", onclick: "CreateNewDoc()"},
            {value: "Open", onclick: "OpenDoc()"},
            {value: "Close", onclick: "CloseDoc()"}
         ]
     }
  }
}

To me, this one little change makes a big difference. I find this cleaner and easier to read. If your key happens to have spaces or weird characters in it, then you will have to keep the quotes there.
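To make the rule concrete, here is a minimal Python sketch of how a parser might read a key that is either quoted or bare. This is an illustration only, not the json-framework code: the name `read_key` and the pattern `BARE_KEY` are made up for this example, and escape sequences inside quoted keys are ignored.

```python
import re

# Assumption: a bare key is a run of alphanumeric characters plus '_'.
BARE_KEY = re.compile(r"[A-Za-z0-9_]+")

def read_key(text, pos=0):
    """Return (key, next_pos) for the quoted or bare key starting at pos."""
    if text[pos] == '"':
        # Quoted key: everything up to the next double quote (no escapes here).
        end = text.index('"', pos + 1)
        return text[pos + 1:end], end + 1
    match = BARE_KEY.match(text, pos)
    if not match:
        raise ValueError(f"expected a key at position {pos}")
    return match.group(), match.end()

print(read_key('menu: {'))        # ('menu', 4)
print(read_key('"my key": 1'))    # ('my key', 8)
```

The bare form stops at the first character that is not alphanumeric or '_', which is why keys with spaces or odd characters still need their quotes.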

Here is another one of my rules of a good language: "Minimize shifted characters as much as possible".

I type a lot. A LOT. By the end of the day, my wrists and fingers are pretty stressed. Thus this rule is designed to minimize moving the fingers over to that shift key. Typing CamelCase code all day long can really kill me. So in the context of JSON, I have two problems: the colons and the double quotes.

Using a colon for assignment is certainly non-intuitive. The predominant operator for assignment in most modern languages is the equal sign. The '=' character has more "width", and therefore it is easier to see than the tiny colon. And best of all, the '=' requires no shift to type. So JESON accepts '=' wherever JSON uses ':':

{
  menu= {
     id= "file",
     value= "File",
     popup= {
         menuitem= [
            {value= "New", onclick= "CreateNewDoc()"},
            {value= "Open", onclick= "OpenDoc()"},
            {value= "Close", onclick= "CloseDoc()"}
         ]
     }
  }
}

Now let's look at the quotes. Since JSON has no distinction between characters and strings, we can optionally support single quotes. Some may like the bolder style of double quotes, and even I prefer them sometimes when I want them to stand out more, but there is no reason not to support single quotes. And single quotes require no shift.

{
  menu= {
     id= 'file',
     value= 'File',
     popup= {
         menuitem= [
            {value= 'New', onclick= "CreateNewDoc()"},
            {value= 'Open', onclick= "OpenDoc()"},
            {value= 'Close', onclick= "CloseDoc()"}
         ]
     }
  }
}

I left some double quotes in the example to show they can be mixed. You can't mix them on the same string, though.

Hey, now it's looking better. But I'm looking at rule 1 again (no useless characters) and I see 2 more useless characters. The = character, which I so triumphantly argued for, now looks superfluous. But you might like them, you might not. I prefer to not use them for assigning a dictionary or array, but to use them for simple key/value pairs inside a dictionary.

{
  menu {
     id='file',
     value='File',
     popup {
         menuitem [
            {value='New', onclick="CreateNewDoc()"},
            {value='Open', onclick="OpenDoc()"},
            {value='Close', onclick="CloseDoc()"}
         ]
     }
  }
}


The other useless characters are the commas. They add no value to the parser, so they are strictly visual. Thus, JESON makes them optional. To me, they help separate key/value pairs when they are all on the same line, but commas are worthless when elements are separated across lines. My preference is to remove the commas between elements that are on their own line, but use commas between elements that share a line:


{
  menu {
     id='file'
     value='File'
     popup {
         menuitem[
            {value='New', onclick="CreateNewDoc()"}
            {value='Open', onclick="OpenDoc()"}
            {value='Close', onclick="CloseDoc()"}
         ]
     }
  }
}
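Why can a parser live without commas? Every value already marks its own end: a string ends at its closing quote, and a number or word ends at whitespace. A small Python sketch of that idea follows. The pattern `ITEM` and the function `split_items` are invented for this illustration, and it only handles flat lists of strings, numbers, and words (no nesting, no escapes).

```python
import re

# A single-quoted string, a double-quoted string, or a run of characters
# that contains neither whitespace nor commas.
ITEM = re.compile(r"""'[^']*'|"[^"]*"|[^\s,]+""")

def split_items(text):
    """Split a flat list body into items, with or without commas."""
    return ITEM.findall(text)

with_commas = split_items("'New', 'Open', 'Close'")
without_commas = split_items("'New' 'Open'\n'Close'")
print(with_commas == without_commas)   # True
```

Commas and whitespace both fall between matches, so the two spellings produce the same items.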

One change I made to Stig's JSON implementation was to remove support for "fragments", which are not strict JSON anyway; comments in the code indicated that the feature was deprecated. With fragments removed, JESON text must begin with either a dictionary or an array, nothing else. The majority of the JSON/JESON text I use has a dictionary as the outer object. I wanted that to be the default so I didn't need the outer {} characters. So I enhanced the parser to assume the outer object is a dictionary if the first non-whitespace character is neither '{' nor '['. Thus, the following is exactly the same as the above, only a little cleaner:

menu {
    id='file'
    value='File'
    popup {
        menuitem[
           {value='New', onclick="CreateNewDoc()"}
           {value='Open', onclick="OpenDoc()"}
           {value='Close', onclick="CloseDoc()"}
        ]
    }
}

Man, that is looking sweet. It is far better than the noisy original, which I'll put up again just so you can compare:

{
  "menu": {
     "id": "file",
     "value": "File",
     "popup": {
         "menuitem": [
            {"value": "New", "onclick": "CreateNewDoc()"},
            {"value": "Open", "onclick": "OpenDoc()"},
            {"value": "Close", "onclick": "CloseDoc()"}
         ]
     }
  }
}

One other enhancement in JESON is to make () synonymous with {}:

menu(
  id='file'
  value='File'
  popup(
      menuitem[
         {value='New', onclick="CreateNewDoc()"}
         {value='Open', onclick="OpenDoc()"}
         {value='Close', onclick="CloseDoc()"}
      ]
  )
)

Now some of those dictionaries look like function calls. Why would I want this? I have a vision of a lightweight scripting language based on the JESON syntax. Consider the following:

tableViewController {
   if [x == true] then {
       type='grouped'
       doSomething(var='value')
   }
   else {
       type = 'ungrouped'
       doSomethingElse(var1='value1', var2='value2')
   }
}

Kinda sorta looks like code, doesn't it? But it is actually valid JESON. The original JSON would look like this:

{
"tableViewController" : {
    "if" : ["x" == true], "then" : { "type" : "grouped", "doSomething" : { "var":"value"} },
    "else" : { "type" : "ungrouped", "doSomethingElse" : { "var1":"value1", "var2":"value2"} }
}
}

The usefulness of this may seem dubious, but I will discuss this in part 3.

One other thing that JSON doesn't support is comments. For any hand-edited configuration language, comments are mandatory, so JESON adds support for the basic // and /**/ formats:

// this is a menu
menu(
    id='file'
    value='File'
    popup(
        menuitem[
           // removed this one for now
           /*{value='New', onclick="CreateNewDoc()"}*/
           {value='Open', onclick="OpenDoc()"}
           {value='Close', onclick="CloseDoc()"}
        ]
    )
)
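One detail any comment support has to get right is that "//" or "/*" inside a quoted string is data, not a comment. Here is a hedged Python sketch of a pre-pass that strips both comment styles while leaving strings alone. It is not how the modified json-framework does it (that code isn't shown here); `strip_comments` is a name made up for this example.

```python
def strip_comments(text):
    """Remove // and /* */ comments, but never inside quoted strings."""
    out = []
    i, n = 0, len(text)
    quote = None  # the quote character of the string we are inside, if any
    while i < n:
        ch = text[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" or \' does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            end = text.find("\n", i)
            i = n if end == -1 else end
            continue
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated /* comment")
            i = end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)

print(strip_comments("id='file' // the id\nurl='http://example.com'"))
```

In the usage line, the first "//" starts a comment, while the one inside the URL string survives.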

So that's it for JESON. Not much, really, but to me it makes it much easier to write and read. Here is a summary of all the enhancements JESON makes to the JSON syntax:

  1. keys in dictionaries do not need quotes if the word is all alphanumeric characters + '_'
  2. single quotes are interchangeable with double quotes
  3. the '=' character is interchangeable with the ':' character
  4. the ':' or '=' characters are optional
  5. the outer object is assumed to be a dictionary if '{' or '[' does not start the text
  6. "()" characters are interchangeable with "{}"
  7. commas separating array items and dictionary key/value pairs are optional
  8. comments via "//" and "/**/" are supported
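To show how little machinery these eight rules need, here is a compact recursive-descent sketch in Python. It is an illustration under stated assumptions, not the modified json-framework parser: all names (`TOKEN`, `tokenize`, `parse_jeson`, and so on) are invented here, commas are simply treated like whitespace, only the `\n` and `\t` escapes are decoded, a bare word is accepted as a value only if it is true, false, or null, and a key made purely of digits still needs quotes.

```python
import re

# One alternative per token kind. Whitespace, commas, and comments all land
# in "skip", which is how this sketch makes commas optional (rules 7 and 8).
TOKEN = re.compile(r"""
    (?P<skip>\s+|,|//[^\n]*|/\*.*?\*/)
  | (?P<punct>[{}()\[\]])
  | (?P<assign>[:=])
  | (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<number>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?(?![A-Za-z0-9_]))
  | (?P<word>[A-Za-z0-9_]+)
""", re.VERBOSE | re.DOTALL)

CLOSERS = {"{": "}", "(": ")"}       # rule 6: () works like {}
WORDS = {"true": True, "false": False, "null": None}
ESCAPES = {"n": "\n", "t": "\t"}     # other escapes just drop the backslash

def tokenize(text):
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens + [("end", "")]    # sentinel so lookups never run off the end

def unquote(s):
    # Rule 2: either quote style; the regex guarantees both ends match.
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)),
                  s[1:-1], flags=re.DOTALL)

def parse_value(tokens, i):
    kind, text = tokens[i]
    if kind == "string":
        return unquote(text), i + 1
    if kind == "number":
        return (float(text) if any(c in text for c in ".eE") else int(text)), i + 1
    if kind == "word" and text in WORDS:
        return WORDS[text], i + 1
    if kind == "punct" and text in CLOSERS:
        return parse_dict(tokens, i + 1, CLOSERS[text])
    if kind == "punct" and text == "[":
        items, i = [], i + 1
        while tokens[i] != ("punct", "]"):
            item, i = parse_value(tokens, i)
            items.append(item)
        return items, i + 1
    raise ValueError(f"unexpected token {text!r}")

def parse_dict(tokens, i, closer):
    result = {}
    while tokens[i][1] != closer:
        kind, text = tokens[i]
        if kind == "string":
            key = unquote(text)
        elif kind == "word":         # rule 1: bare keys
            key = text
        else:
            raise ValueError(f"expected a key, got {text!r}")
        i += 1
        if tokens[i][0] == "assign": # rules 3 and 4: ':' or '=', both optional
            i += 1
        result[key], i = parse_value(tokens, i)
    return result, i + 1

def parse_jeson(text):
    tokens = tokenize(text)
    if tokens[0][1] in ("{", "["):
        value, i = parse_value(tokens, 0)
        if tokens[i][0] != "end":
            raise ValueError("trailing text after the top-level value")
        return value
    # Rule 5: anything else is the body of an implicit outer dictionary.
    value, _ = parse_dict(tokens, 0, "")
    return value

print(parse_jeson("menu { id='file' }"))   # {'menu': {'id': 'file'}}
```

Because plain JSON is just the case where every optional piece is present, the same function reads the original menu example unchanged.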

In part 2 of this series, I'll discuss where to get the JESON parser and how to use it.






