# NanoChat-Carlyle-d34 (Experimental)

⚠️ Note: This is a purely experimental "for fun" model. It was an attempt to style-transfer a specific Victorian persona onto a small base model. It is not fully successful and should be treated as a curiosity or a proof-of-concept rather than a functional chatbot.

NanoChat-Carlyle-d34 is a fine-tune of karpathy/nanochat-d34. The goal was to see whether a small 2.2B-parameter model could be taught to adopt the "Fire and Strength" persona of the Victorian essayist Thomas Carlyle.

## The Experiment

The idea was to blend modern instruction following with the archaic, complex sentence structures of Carlyle.

The model was trained on a mix of:

1. Smol-Smoltalk: for basic chat capabilities.
2. The Carlyle Corpus: major works (*Sartor Resartus*, *The French Revolution*, etc.) chunked into assistant responses, with synthetic questions generated by an LLM to match the text.
3. Synthetic Q&A: to bridge the gap between the two styles.
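The corpus-to-chat conversion in step 2 can be sketched roughly as follows. The chunk size, message schema, and the stand-in question generator are illustrative assumptions, not the exact pipeline used for this model; in the real run an LLM wrote a question that each chunk answers.

```python
# Hypothetical sketch: chunk a Carlyle text into assistant turns and
# pair each chunk with a synthetic question. All parameters here
# (max_words, the messages format) are illustrative assumptions.

def chunk_text(text, max_words=300):
    """Greedily pack paragraphs into chunks of at most max_words words."""
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        # Flush the current chunk if adding this paragraph would overflow it.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

def make_examples(text, question_fn):
    """Turn each chunk into a chat-style (user, assistant) training example."""
    return [
        {"messages": [
            {"role": "user", "content": question_fn(chunk)},
            {"role": "assistant", "content": chunk},
        ]}
        for chunk in chunk_text(text)
    ]

# A trivial stand-in question generator keeps the sketch self-contained;
# the actual pipeline used an LLM here.
examples = make_examples(
    "Paragraph one of Sartor Resartus...\n\nParagraph two...",
    question_fn=lambda chunk: "What does Carlyle say here?",
)
```

The greedy packer never splits a single paragraph, so an oversized paragraph simply becomes its own chunk.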

## Results & Limitations

While the model sometimes outputs text that sounds like Carlyle, the fine-tuning had mixed results:

- "Foreign language" over-sensitivity: the synthetic training data included examples meant to help the model handle non-English inputs, and the model over-indexed on them. Very brief prompts (e.g., "Hi", "Hello", or "What is this?") may be misidentified as a foreign language, prompting the model to suggest, usually in a Carlylean manner, that you keep the conversation in English.
- Inconsistent persona: the model struggles to balance the Victorian corpus against the modern Smol-Smoltalk data, so some outputs are far more 19th-century than others.
- Hallucination: given the complexity of Carlyle's prose and the model's small size, it is prone to rambling or to high-perplexity "word salad" that mimics Carlyle's cadence without his meaning.
- Knowledge: as a derivative of nanochat, its knowledge base is limited.

Bottom line: It's a fun artifact of a specific training run, but don't expect it to write your history thesis!

## Acknowledgments

- Thanks to Andrej Karpathy for the nanochat repository and the d34 checkpoint.
- Thanks to Hugging Face for the Smol-Smoltalk dataset.
- Thanks to Project Gutenberg for making Carlyle's works available in raw UTF-8 format.