Text Generation
Transformers
Safetensors
llama
text-generation-inference
MultivexAI committed on
Commit 7260d48 · verified · 1 Parent(s): 6067843

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -8,23 +8,23 @@ license: mit
 ---
 # MultivexAI/Plyx-15M
 
- **MultivexAI/Plyx-15M** is a 15 million parameter language model trained completely from scratch using the LlamaForCausalLM architecture. It is designed for maximum efficiency, showing that focusing intensely on data quality can create a highly capable foundation even at this minimal size.
+ **MultivexAI/Plyx-15M** is a 15 million parameter language model, trained from scratch using the Llama architecture.
 
- Plyx-15M is intended for quick testing, research into data efficiency, and specialized fine-tuning tasks where model size must be kept small.
+ We built this model to be a small, useful foundation for various tasks. It's a great starting point for quick tests, research projects, or fine-tuning on specialized jobs where a small model footprint is important.
 
- **Model Series Note:** This is **Version 1** of the Plyx model family. We are continuing this work and plan to release future models in various sizes. We expect to publish initial performance benchmarks for Plyx-15M here soon.
+ **Model Series Note:** This is the first model in our Plyx series. We're continuing this work and plan to release future models in various sizes. We'll be adding some initial performance benchmarks here soon.
 
  ## Pre-training Data
 
- Plyx-15M was trained exclusively on a carefully selected set of premium datasets, prioritizing accuracy and structure.
+ The model was trained on a carefully curated mix of data to build a strong foundation:
 
- 1. **`fineweb-pro`** This data is a highly refined subset of FineWeb. It was aggressively filtered using advanced, automated tools to remove common errors and noise, giving the model a clean understanding of everyday language.
- 2. **`fineweb-edu`** Content focused on education and instruction, providing the model with a solid base in clear, organized knowledge.
- 3. **`finepdfs`** Specialized knowledge sourced from millions of professional reports and complex documents (PDFs). This ensures the model is exposed to formal, technical writing styles and organized information structures.
+ 1. **`fineweb-pro`**: A heavily filtered and refined version of the FineWeb dataset. This provides a strong base in general-purpose language by removing significant noise and low-quality content.
+ 2. **`fineweb-edu`**: A subset of FineWeb containing educational and instructional content, used to ground the model in well-structured, factual information.
+ 3. **`finepdfs`**: A large collection of documents from PDFs, including professional reports and technical papers. This component introduces the model to more formal language, complex sentence structures, and data-rich formats.
 
- ### Limitations
+ ### A Note on Size and Performance
 
- **Plyx-15M is a very small model (15 million parameters)**. Its overall performance will be limited compared to models with billions of parameters. It should primarily be used for research, highly specific tasks, or as a base for fine-tuning, not as a general-purpose language model replacement.
+ To set the right expectations: **Plyx-15M is a 15-million-parameter model, which is quite small.** Its performance won't be comparable to models with billions of parameters. It's best used for research, highly specific tasks, or as a base for fine-tuning, not as a drop-in replacement for a large, general-purpose model.
 
  ## License
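Since neither version of the card includes a usage snippet, here is a minimal sketch of how a checkpoint like this is typically loaded, assuming the repo ID `MultivexAI/Plyx-15M` from the title and that the Safetensors weights load through `transformers`' standard `AutoModelForCausalLM` and `AutoTokenizer` classes (consistent with the `llama` and `text-generation-inference` tags). The generation settings are illustrative, not the authors' recommendations.

```python
# Minimal sketch: load Plyx-15M and generate text with Hugging Face transformers.
# The repo ID is taken from the card title; sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "MultivexAI/Plyx-15M"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # a LlamaForCausalLM checkpoint, per the card
model.eval()

# Sanity-check the advertised size: this should print roughly 15M.
print(f"parameters: {model.num_parameters() / 1e6:.1f}M")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,   # keep generations short for a model this small
        do_sample=True,      # illustrative sampling settings
        temperature=0.8,
        top_p=0.95,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

At 15 million parameters, the checkpoint runs comfortably on a CPU, which is much of the appeal of a model this size.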