---
license: mit
pipeline_tag: text-generation
library_name: transformers
---

Chain of Thought (CoT) transformer model trained to do multi-step integer arithmetic.

Model details:
- **Vocabulary Size**: 40 (character-level tokenization)
- **Layer Count**: 8
- **Attention Head Count**: 4
- **Residual Stream Size**: 256
- **Context Length**: 256
- **Tokens Trained On**: 419,612,160

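As a rough sanity check on model size, the figures above can be turned into a parameter estimate. The sketch below assumes a GPT-2-style decoder block (query, key, value and output projections, an MLP with 4x expansion, learned position embeddings, tied input/output embeddings, biases and LayerNorm weights ignored). The model card does not state any of these details, and `estimate_params` is a hypothetical helper, so treat the result as an order-of-magnitude estimate only.

```python
def estimate_params(n_layers, d_model, vocab_size, context_len, mlp_ratio=4):
    """Approximate weight count for an ASSUMED GPT-2-style decoder."""
    attn = 4 * d_model * d_model             # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d_model * d_model  # up- and down-projection
    blocks = n_layers * (attn + mlp)
    # Token embeddings plus learned position embeddings (both assumed).
    embeddings = (vocab_size + context_len) * d_model
    return blocks + embeddings


total = estimate_params(n_layers=8, d_model=256, vocab_size=40, context_len=256)
print(f"{total:,}")  # 6,367,232
```

Under these assumptions the model has roughly 6.4M parameters, almost all of them in the transformer blocks rather than the embeddings.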
Training score over the course of training:

![score.png](score.png)

```py
from transformers import pipeline

pipe = pipeline(
    "text-generation", model="bart1259/MiniCOTMath"
)
print(pipe("Input: (5 + 5)\n", max_new_tokens=100)[0]["generated_text"])
```

Outputs (with no stop condition, generation continues past `<end>` until `max_new_tokens` is reached):
```
Input: (5 + 5)

Step 1:
(5 + 5)
(5 + 5) = 10

Step 2:
10
Final Result: 10
<end>

Input: (3 * 8)

Step 1:
(3 * 8)
(3
```

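Because the pipeline call has no stop condition, the model keeps generating after `<end>` and begins a new problem. One simple option is to cut the returned text at the first `<end>` as a post-processing step. This is a minimal sketch: `truncate_at_end` is a hypothetical helper name, not part of the model or of `transformers`.

```python
END_MARKER = "<end>"


def truncate_at_end(generated_text, end_marker=END_MARKER):
    """Return the text up to and including the first end marker.

    If the marker is absent, the text is returned unchanged.
    """
    idx = generated_text.find(end_marker)
    if idx == -1:
        return generated_text
    return generated_text[: idx + len(end_marker)]


sample = (
    "Input: (5 + 5)\n\nStep 1:\n(5 + 5)\n(5 + 5) = 10\n\n"
    "Step 2:\n10\nFinal Result: 10\n<end>\n\n"
    "Input: (3 * 8)\n\nStep 1:\n(3 * 8)\n(3"
)
print(truncate_at_end(sample))
```

This only trims the string after the fact, so the model still spends compute on the discarded tokens.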
To stop generation at the `<end>` token, you can set up streaming with a custom stopping criterion:

```py
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
    TextStreamer,
)

class StopCriteria(StoppingCriteria):
    """Stop generating once the decoded text contains the <end> marker."""

    def __call__(self, input_ids, scores, **kwargs):
        generated_text = tokenizer.decode(input_ids[0])
        return "<end>" in generated_text

prompt = "Input: (5 + 5)\n"

tokenizer = AutoTokenizer.from_pretrained("bart1259/MiniCOTMath")
# .cuda() requires a CUDA-capable GPU; remove both .cuda() calls to run on CPU.
model = AutoModelForCausalLM.from_pretrained("bart1259/MiniCOTMath").cuda()

encoded_input = tokenizer(prompt, return_tensors="pt")
input_ids = encoded_input["input_ids"].cuda()

# Print tokens to stdout as they are generated, including the prompt.
streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=False)
_ = model.generate(
    input_ids,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.25,
    max_new_tokens=256,
    stopping_criteria=StoppingCriteriaList([StopCriteria()]),
)
```

Outputs:
```
Input: (5 + 5)

Step 1:
(5 + 5)
(5 + 5) = 10

Step 2:
10
Final Result: 10
<end>
```

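The output format shown above ends with a `Final Result:` line, so the answer can be pulled out of the generated text with a small parser. This is a minimal sketch under two assumptions: the result is always a base-10 integer on its own `Final Result:` line, and negative results (if the model produces them) are written with a leading `-`. `extract_final_result` is a hypothetical helper, not part of the model.

```python
import re

# Matches a line such as "Final Result: 10" (optional leading minus is an assumption).
FINAL_RESULT_RE = re.compile(r"^Final Result:\s*(-?\d+)\s*$", re.MULTILINE)


def extract_final_result(generated_text):
    """Return the integer on the first 'Final Result:' line, or None if absent."""
    match = FINAL_RESULT_RE.search(generated_text)
    return int(match.group(1)) if match else None


sample = (
    "Input: (5 + 5)\n\nStep 1:\n(5 + 5)\n(5 + 5) = 10\n\n"
    "Step 2:\n10\nFinal Result: 10\n<end>"
)
print(extract_final_result(sample))  # 10
```

Returning `None` for text without a `Final Result:` line makes it easy to detect generations that were cut off before the model finished.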