| # clarinet | |
| `clarinet` is a sax-like streaming parser for JSON. it works in the browser and node.js. `clarinet` is inspired by (and forked from) [sax-js][saxjs]. just like you shouldn't use `sax` when you need `dom`, you shouldn't use `clarinet` when you need `JSON.parse`. for a more detailed introduction and a performance study please refer to this [article][blog]. | |
| # design goals | |
| `clarinet` is very much like [yajl] but written in javascript: | |
| * written in javascript | |
| * portable | |
| * robust (~110 tests pass before even announcing the project) | |
| * data representation independent | |
| * fast | |
| * generates verbose, useful error messages including context of where | |
| the error occurs in the input text. | |
| * can parse json data off a stream, incrementally | |
| * simple to use | |
| * tiny | |
| # motivation | |
| the reason behind this work was to create better full text support in node. creating indexes out of large (or many) json files doesn't require a full understanding of the json file, but it does require something like `clarinet`. | |
| # installation | |
| ## node.js | |
| 1. install [npm] | |
| 2. `npm install clarinet` | |
| 3. `var clarinet = require('clarinet');` | |
| ## browser | |
| 1. minify clarinet.js | |
| 2. load it into your webpage | |
| # usage | |
| ## basics | |
| ``` js | |
| var clarinet = require("clarinet") | |
| , parser = clarinet.parser() | |
| ; | |
| parser.onerror = function (e) { | |
| // an error happened. e is the error. | |
| }; | |
| parser.onvalue = function (v) { | |
| // got some value. v is the value. can be string, double, bool, or null. | |
| }; | |
| parser.onopenobject = function (key) { | |
| // opened an object. key is the first key. | |
| }; | |
| parser.onkey = function (key) { | |
| // got a key in an object. | |
| }; | |
| parser.oncloseobject = function () { | |
| // closed an object. | |
| }; | |
| parser.onopenarray = function () { | |
| // opened an array. | |
| }; | |
| parser.onclosearray = function () { | |
| // closed an array. | |
| }; | |
| parser.onend = function () { | |
| // parser stream is done, and ready to have more stuff written to it. | |
| }; | |
| parser.write('{"foo": "bar"}').close(); | |
| ``` | |
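as a small illustration of the handlers above, here is a sketch that collects every key seen in a document. `collectKeys` is a hypothetical helper, not part of clarinet; it only assumes the `onopenobject` and `onkey` handlers behave as described above, and it takes the parser as an argument so it can be tried without clarinet installed.

```javascript
// hypothetical helper, not part of clarinet: gathers every object key
// by assigning the onopenobject and onkey handlers described above.
function collectKeys(parser) {
  var keys = [];
  parser.onopenobject = function (key) {
    // key is the first key of the object, if the object has one
    if (typeof key === "string") keys.push(key);
  };
  parser.onkey = function (key) {
    // every key after the first arrives here
    keys.push(key);
  };
  return keys;
}

// usage with a real parser (assumes clarinet is installed):
//   var parser = require("clarinet").parser();
//   var keys = collectKeys(parser);
//   parser.write('{"foo": 1, "bar": {"baz": 2}}').close();
//   // keys now holds the keys in document order
```

note that assigning handlers this way replaces any `onopenobject` or `onkey` handler already set on the parser.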
| ``` js | |
| // stream usage | |
| // takes the same options as the parser (see arguments below) | |
| var fs = require("fs"); | |
| var options = {}; | |
| var stream = require("clarinet").createStream(options); | |
| stream.on("error", function (e) { | |
| // unhandled errors will throw, since this is a proper node | |
| // event emitter. | |
| console.error("error!", e) | |
| // clear the error | |
| this._parser.error = null | |
| this._parser.resume() | |
| }) | |
| stream.on("openobject", function (key) { | |
| // same argument as parser.onopenobject above: the first key | |
| }) | |
| // pipe is supported, and it's readable/writable | |
| // same chunks coming in also go out. | |
| fs.createReadStream("file.json") | |
| .pipe(stream) | |
| .pipe(fs.createWriteStream("file-altered.json")) | |
| ``` | |
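since the stream is a regular node event emitter, handlers can also be attached in a loop. the sketch below counts how often each event fires. `countEvents` is a hypothetical helper, not part of clarinet, and it works on any `EventEmitter`.

```javascript
// hypothetical helper: counts how often each named event fires on a
// node EventEmitter, such as the stream returned by createStream().
function countEvents(emitter, names) {
  var counts = {};
  names.forEach(function (name) {
    counts[name] = 0;
    emitter.on(name, function () {
      counts[name] += 1;
    });
  });
  return counts;
}

// usage (assumes clarinet is installed and file.json exists):
//   var stream = require("clarinet").createStream();
//   var counts = countEvents(stream, ["openobject", "key", "value"]);
//   require("fs").createReadStream("file.json").pipe(stream);
//   stream.on("end", function () { console.log(counts); });
```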
| ## arguments | |
| pass the following arguments to the parser function. all are optional. | |
| `opt` - object bag of settings regarding string formatting. all default to `false`. | |
| settings supported: | |
| * `trim` - boolean. whether or not to trim text and comment nodes. | |
| * `normalize` - boolean. if true, then turn any whitespace into a single | |
| space. | |
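to make the two settings concrete, the sketch below mimics their documented effect on a piece of text. `applyTextOptions` is a hypothetical function written for illustration; it is not clarinet's own code, and the exact whitespace handling inside the library may differ.

```javascript
// illustration only: mimics the documented effect of the `trim` and
// `normalize` settings on a piece of text. not clarinet's own code.
function applyTextOptions(text, opt) {
  opt = opt || {};
  if (opt.trim) text = text.trim();
  if (opt.normalize) text = text.replace(/\s+/g, " ");
  return text;
}

applyTextOptions("  a \n\t b  ", { trim: true, normalize: true }); // "a b"
applyTextOptions("  a  b  ", { normalize: true });                 // " a b "
applyTextOptions("  a  b  ", {});                                  // unchanged

// passing the settings to a real parser (assumes clarinet is installed):
//   var parser = require("clarinet").parser({ trim: true, normalize: true });
```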
| ## methods | |
| `write` - write bytes onto the stream. you don't have to do this all at | |
| once. you can keep writing as much as you want. | |
| `close` - close the stream. once closed, no more data may be written until | |
| it is done processing the buffer, which is signaled by the `end` event. | |
| `resume` - to gracefully handle errors, assign a listener to the `error` | |
| event. then, when the error is taken care of, you can call `resume` to | |
| continue parsing. otherwise, the parser will not continue while in an error | |
| state. | |
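because `write` can be called as often as you like, input can be fed in pieces. the sketch below splits a string into fixed-size chunks, writes each one, and then closes the parser. `writeInChunks` is a hypothetical helper, not part of clarinet; it relies only on `write` and `close`.

```javascript
// hypothetical helper: feeds `text` to a parser in pieces of at most
// `size` characters, then closes it. relies only on write() and close().
function writeInChunks(parser, text, size) {
  if (!(size > 0)) throw new RangeError("size must be a positive number");
  for (var i = 0; i < text.length; i += size) {
    parser.write(text.slice(i, i + size));
  }
  parser.close();
}

// usage (assumes clarinet is installed):
//   var parser = require("clarinet").parser();
//   parser.onvalue = function (v) { console.log(v); };
//   writeInChunks(parser, '{"foo": "bar"}', 5);
```

in real use the chunks would usually come from a file or a socket rather than from one string already held in memory.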
| ## members | |
| at all times, the parser object will have the following members: | |
| `line`, `column`, `position` - indications of the position in the json | |
| document where the parser currently is looking. | |
| `closed` - boolean indicating whether or not the parser can be written to. | |
| if it's `true`, then wait for the `ready` event to write again. | |
| `opt` - any options passed into the constructor. | |
| and a bunch of other stuff that you probably shouldn't touch. | |
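the position members are handy when reporting errors. the sketch below records each error together with `line` and `column`, then clears `parser.error` and calls `resume`. `recoverFromErrors` is a hypothetical helper, not part of clarinet, and whether it makes sense to keep parsing after an error depends on your input.

```javascript
// hypothetical helper: records each error with the parser's position,
// then clears parser.error and calls resume() so parsing can continue.
function recoverFromErrors(parser, log) {
  parser.onerror = function (e) {
    // handlers run with the parser as `this`
    log.push(e.message + " (line " + this.line + ", column " + this.column + ")");
    this.error = null;
    this.resume();
  };
}

// usage (assumes clarinet is installed):
//   var parser = require("clarinet").parser();
//   var log = [];
//   recoverFromErrors(parser, log);
//   parser.write('{"foo": bar}').close();
//   console.log(log);
```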
| ## events | |
| all events emit with a single argument. to listen to an event, assign a | |
| function to `on<eventname>`. functions get executed in the this-context of | |
| the parser object. the list of supported events is also available in the exported | |
| `EVENTS` array. | |
| when using the stream interface, assign handlers using the `EventEmitter` | |
| `on` function in the normal fashion. | |
| `error` - indication that something bad happened. the error will be hanging | |
| out on `parser.error`, and must be deleted before parsing can continue. by | |
| listening to this event, you can keep an eye on that kind of stuff. note: | |
| this happens *much* more in strict mode. argument: instance of `Error`. | |
| `value` - a json value. argument: value, can be a bool, null, string, or number | |
| `openobject` - object was opened. argument: key, a string with the first key of the object (if any) | |
| `key` - an object key. argument: key, a string with the current key | |
| `closeobject` - indication that an object was closed | |
| `openarray` - indication that an array was opened | |
| `closearray` - indication that an array was closed | |
| `end` - indication that the closed stream has ended. | |
| `ready` - indication that the stream has reset, and is ready to be written | |
| to. | |
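to see how the events fit together, the sketch below rebuilds a plain javascript value from a list of `[eventName, argument]` pairs. `buildFromEvents` is a hypothetical helper, not part of clarinet, and the sample event list is written by hand from the descriptions above rather than recorded from a real parser run.

```javascript
// hypothetical helper: rebuilds a plain javascript value from a list of
// [eventName, argument] pairs in the order a parser would emit them.
function buildFromEvents(events) {
  var root;
  var stack = []; // open containers, innermost last; each is { node, key }

  // store a finished value in the innermost open container
  function add(value) {
    if (stack.length === 0) { root = value; return; }
    var top = stack[stack.length - 1];
    if (Array.isArray(top.node)) top.node.push(value);
    else top.node[top.key] = value;
  }

  // attach a new container to its parent, then make it the innermost one
  function open(node, key) {
    add(node);
    stack.push({ node: node, key: key });
  }

  events.forEach(function (event) {
    var name = event[0], arg = event[1];
    if (name === "openobject") open({}, arg); // arg is the first key, if any
    else if (name === "openarray") open([], undefined);
    else if (name === "key") stack[stack.length - 1].key = arg;
    else if (name === "value") add(arg);
    else if (name === "closeobject" || name === "closearray") stack.pop();
  });
  return root;
}

// events for {"foo": "bar", "list": [1, {"a": null}]}, written by hand
var sampleEvents = [
  ["openobject", "foo"], ["value", "bar"],
  ["key", "list"], ["openarray"],
  ["value", 1],
  ["openobject", "a"], ["value", null], ["closeobject"],
  ["closearray"], ["closeobject"]
];
var rebuilt = buildFromEvents(sampleEvents);
```

if all you need is the full value, `JSON.parse` is the better tool; the point here is only to show which event carries which piece of the document.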
| ## samples | |
| some [samples] are available to help you get started. one creates a list of top npm contributors, and another gets a bunch of data from twitter and generates valid json. | |
| # roadmap | |
| check [issues] | |
| # contribute | |
| everyone is welcome to contribute: patches, bug fixes, new features. | |
| 1. create an [issue][issues] so the community can comment on your idea | |
| 2. fork `clarinet` | |
| 3. create a new branch `git checkout -b my_branch` | |
| 4. create tests for the changes you made | |
| 5. make sure you pass both existing and newly inserted tests | |
| 6. commit your changes | |
| 7. push to your branch `git push origin my_branch` | |
| 8. create a pull request | |
| helpful tips: | |
| check `index.html`. there are two env vars you can set, `CRECORD` and `CDEBUG`. | |
| * `CRECORD` allows you to `record` the event sequence from a new json test so you don't have to write everything. | |
| * `CDEBUG` can be set to `info` or `debug`. `info` will `console.log` all emits, `debug` will `console.log` what happens to each char. | |
| in `test/clarinet.js` there are two lines you might want to change. the first is `#8`, where you define `seps`: if you are isolating a test you probably just want to run one sep, so change this array to `[undefined]`. the second is `#718`, which says `for (var key in docs) {`, where you can change the docs you want to run. e.g. to run `foobar` i would do something like `for (var key in {foobar:''}) {`. | |
| # meta | |
| * code: `git clone git://github.com/dscape/clarinet.git` | |
| * home: <http://github.com/dscape/clarinet> | |
| * bugs: <http://github.com/dscape/clarinet/issues> | |
| * build: <http://travis-ci.org/dscape/clarinet> | |
| `(oO)--',-` in [caos] | |
| [npm]: http://npmjs.org | |
| [issues]: http://github.com/dscape/clarinet/issues | |
| [caos]: http://caos.di.uminho.pt/ | |
| [saxjs]: http://github.com/isaacs/sax-js | |
| [yajl]: https://github.com/lloyd/yajl | |
| [samples]: https://github.com/dscape/clarinet/tree/master/samples | |
| [blog]: http://writings.nunojob.com/2011/12/clarinet-sax-based-evented-streaming-json-parser-in-javascript-for-the-browser-and-nodejs.html |