---
license: mit
pipeline_tag: text-generation
library_name: transformers
---

Chain-of-Thought (CoT) transformer model trained to do multi-step integer arithmetic.

Model details:

- **Vocabulary Size**: 40 (character-level tokenization)
- **Layer Count**: 8
- **Attention Head Count**: 4
- **Residual Stream Size**: 256
- **Context Length**: 256
- **Tokens Trained On**: 419,612,160

Score during training:

![score.png](score.png)

```py
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="bart1259/MiniCOTMath"
)

print(pipe("Input: (5 + 5)\n", max_new_tokens=100)[0]["generated_text"])
```

Outputs:

```
Input: (5 + 5)
Step 1: (5 + 5)
(5 + 5) = 10
Step 2: 10
Final Result: 10

Input: (3 * 8)
Step 1: (3 * 8)
(3
```

To stop generation at the end-of-sequence token, you can set up streaming with a custom stopping criterion:

```py
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    TextStreamer,
    StoppingCriteria,
    StoppingCriteriaList,
)

prompt = "Input: (5 + 5)\n"

tokenizer = AutoTokenizer.from_pretrained("bart1259/MiniCOTMath")
model = AutoModelForCausalLM.from_pretrained("bart1259/MiniCOTMath").cuda()

class StopCriteria(StoppingCriteria):
    """Stop generation once the tokenizer's end-of-sequence token appears in the decoded output."""
    def __call__(self, input_ids, scores, **kwargs):
        generated_text = tokenizer.decode(input_ids[0])
        return tokenizer.eos_token in generated_text

encoded_input = tokenizer(prompt, return_tensors="pt")
input_ids = encoded_input["input_ids"].cuda()

streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=False)
_ = model.generate(
    input_ids,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.25,
    max_new_tokens=256,
    stopping_criteria=StoppingCriteriaList([StopCriteria()])
)
```

Outputs:

```
Input: (5 + 5)
Step 1: (5 + 5)
(5 + 5) = 10
Step 2: 10
Final Result: 10
```
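
Because the model always emits a `Final Result:` line, the numeric answer can be parsed out of the generated text. Below is a minimal sketch of that, assuming the output format shown above; the `final_result` helper and the regex are illustrative additions, not part of the model or its tokenizer.

```py
import re
from transformers import pipeline

pipe = pipeline("text-generation", model="bart1259/MiniCOTMath")

def final_result(prompt):
    """Generate a chain-of-thought trace and parse out the first 'Final Result:' line.

    Illustrative helper; returns None if no result line is produced.
    """
    text = pipe(prompt, max_new_tokens=100)[0]["generated_text"]
    match = re.search(r"Final Result:\s*(-?\d+)", text)
    return int(match.group(1)) if match else None

print(final_result("Input: (5 + 5)\n"))  # expected: 10
```

Since the model may keep generating new problems after the answer (as in the first output above), only the first `Final Result:` match is taken.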