Ajmalps committed
Commit ec90815 · verified · 1 Parent(s): 443092c

Update README.md

Files changed (1): README.md (+12 -12)

README.md CHANGED
@@ -9,18 +9,18 @@ library_name: transformers

## Introduction 🎉

- In the spirit of open innovation, we're thrilled to share our pioneering work on pretraining with a custom architecture and dataset. **Boomer-4b**, our 3.51-billion-parameter model, represents a significant stride in the AI field. It was crafted meticulously from custom synthetic data generated in a textbook style, and it exemplifies our commitment to advancing the boundaries of AI through both creative architecture and thoughtful data curation.

## Quick Start 🚀

- Jump straight into using Boomer-4b:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

- tokenizer = AutoTokenizer.from_pretrained("budecosystem/Boomer-4b")
- model = AutoModelForCausalLM.from_pretrained("budecosystem/Boomer-4b", torch_dtype=torch.bfloat16)
inputs = tokenizer("Newton's second law", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
@@ -48,26 +48,26 @@ The training was finely tuned with the following hyperparameters:

## Evaluations and Comparisons 🏅

- Boomer-4b has been rigorously evaluated across several benchmarks:

| Model | MMLU | ARC | HellaSwag | GSM8K | Winogrande | MATH | MathQA | DROP | LogiQA |
|-------|------|-----|-----------|-------|------------|------|--------|------|--------|
- | **Boomer-4b** | 55.59 | 58.53 | **74.70** | 47.76 | **72.22** | 4.00 | 35.98 | 0.74 | 31.80 |
| GeneZC/MiniChat-3B | 39.17 | 44.03 | 67.19 | 10.54 | 65.27 | - | - | - | - |
| openlm-research/open_llama_3b_v2 | 27.12 | 44.03 | 71.60 | 0.91 | 67.01 | - | - | - | - |
| microsoft/phi-2 | 58.11 | 61.09 | 75.11 | 54.81 | 74.35 | - | - | - | - |
| TinyLlama/TinyLlama-1.1B-intermediate | 26.04 | 33.87 | 60.31 | 1.44 | 59.51 | - | - | - | - |

- ## Why Boomer-4b? ✨

- Boomer-4b's strong performance across a variety of benchmarks showcases its robustness and versatility in complex reasoning and understanding tasks. It continues our pursuit of excellence in AI, building on the foundation laid by Boomer-1b.

- ## Limitations of Boomer-4b

- Despite these results, Boomer-4b struggles with intricate mathematical problem-solving and sophisticated logical reasoning, as reflected in its low MATH and LogiQA scores. This variability suggests limits in how uniformly it applies its knowledge across reasoning tasks, pointing to areas for further refinement.

## Acknowledgments 🙏

- A special thanks to the open-source community and the researchers who paved the way for innovations like Boomer. Our team's dedication to curating the dataset and fine-tuning the model has been instrumental in achieving this milestone.

- Dive into the future of AI with Boomer-4b and explore its capabilities in pushing the boundaries of language understanding and beyond.
 
## Introduction 🎉

+ In the spirit of open innovation, we're thrilled to share our pioneering work on pretraining with a custom architecture and dataset. **boomer-4b**, our 3.51-billion-parameter model, represents a significant stride in the AI field. It was crafted meticulously from custom synthetic data generated in a textbook style, and it exemplifies our commitment to advancing the boundaries of AI through both creative architecture and thoughtful data curation.

## Quick Start 🚀

+ Jump straight into using boomer-4b:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

+ tokenizer = AutoTokenizer.from_pretrained("budecosystem/boomer-4b")
+ model = AutoModelForCausalLM.from_pretrained("budecosystem/boomer-4b", torch_dtype=torch.bfloat16)
inputs = tokenizer("Newton's second law", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
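The card's 3.51-billion-parameter figure can be sanity-checked once the model above is loaded, since any `torch.nn.Module` exposes its parameters. A minimal sketch: `count_parameters` is a hypothetical helper (not part of the repository), and the tiny `Linear` layer is a stand-in so the snippet runs without downloading the checkpoint.

```python
import torch

def count_parameters(model: torch.nn.Module) -> int:
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Stand-in module for illustration: a 10x10 linear layer has
# 10 * 10 weights plus 10 biases = 110 parameters.
toy = torch.nn.Linear(10, 10)
print(count_parameters(toy))  # 110
```

Applying the same call to the loaded boomer-4b model should report roughly 3.51e9.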
 
## Evaluations and Comparisons 🏅

+ boomer-4b has been rigorously evaluated across several benchmarks:

| Model | MMLU | ARC | HellaSwag | GSM8K | Winogrande | MATH | MathQA | DROP | LogiQA |
|-------|------|-----|-----------|-------|------------|------|--------|------|--------|
+ | **boomer-4b** | 55.59 | 58.53 | **74.70** | 47.76 | **72.22** | 4.00 | 35.98 | 0.74 | 31.80 |
| GeneZC/MiniChat-3B | 39.17 | 44.03 | 67.19 | 10.54 | 65.27 | - | - | - | - |
| openlm-research/open_llama_3b_v2 | 27.12 | 44.03 | 71.60 | 0.91 | 67.01 | - | - | - | - |
| microsoft/phi-2 | 58.11 | 61.09 | 75.11 | 54.81 | 74.35 | - | - | - | - |
| TinyLlama/TinyLlama-1.1B-intermediate | 26.04 | 33.87 | 60.31 | 1.44 | 59.51 | - | - | - | - |
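To make the comparison easier to read at a glance, the five benchmarks reported for every model (MMLU, ARC, HellaSwag, GSM8K, Winogrande) can be averaged and ranked with a short script. This is a sketch using scores copied from the evaluation table, not code from the repository.

```python
# Scores copied from the evaluation table: MMLU, ARC, HellaSwag,
# GSM8K, Winogrande (the benchmarks reported for every model).
scores = {
    "boomer-4b": [55.59, 58.53, 74.70, 47.76, 72.22],
    "GeneZC/MiniChat-3B": [39.17, 44.03, 67.19, 10.54, 65.27],
    "openlm-research/open_llama_3b_v2": [27.12, 44.03, 71.60, 0.91, 67.01],
    "microsoft/phi-2": [58.11, 61.09, 75.11, 54.81, 74.35],
    "TinyLlama/TinyLlama-1.1B-intermediate": [26.04, 33.87, 60.31, 1.44, 59.51],
}

# Mean score per model, printed best first.
means = {name: sum(vals) / len(vals) for name, vals in scores.items()}
for name, mean in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {mean:.2f}")
# microsoft/phi-2 ranks first (64.69), boomer-4b second (61.76).
```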
 
+ ## Why boomer-4b? ✨

+ boomer-4b's strong performance across a variety of benchmarks showcases its robustness and versatility in complex reasoning and understanding tasks. It continues our pursuit of excellence in AI, building on the foundation laid by boomer-1b.

+ ## Limitations of boomer-4b

+ Despite these results, boomer-4b struggles with intricate mathematical problem-solving and sophisticated logical reasoning, as reflected in its low MATH and LogiQA scores. This variability suggests limits in how uniformly it applies its knowledge across reasoning tasks, pointing to areas for further refinement.

## Acknowledgments 🙏

+ A special thanks to the open-source community and the researchers who paved the way for innovations like boomer. Our team's dedication to curating the dataset and fine-tuning the model has been instrumental in achieving this milestone.

+ Dive into the future of AI with boomer-4b and explore its capabilities in pushing the boundaries of language understanding and beyond.