docs: add README — file layout, standalone Python usage, tokenizer source
cf333db (verified) · mlboydaisuke committed on Apr 27