Commit f72a055 (verified) by MultivexAI · 1 parent: d3fe2a0

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -18,9 +18,9 @@ Plyx-15M is intended for quick testing, research into data efficiency, and speci

  Plyx-15M was trained exclusively on a carefully selected set of premium datasets, prioritizing accuracy and structure.

- 1. **`fineweb-pro` (Ultra-Clean Web Text):** This data is a highly refined subset of general internet content. It was aggressively filtered using advanced, automated tools to remove common errors and noise, giving the model a clean understanding of everyday language.
- 2. **`fineweb-edu` (Learning Materials):** Content focused on education and instruction, providing the model with a solid base in clear, organized knowledge.
- 3. **`finepdfs` (Structured Documents):** Specialized knowledge sourced from millions of professional reports and complex documents (PDFs). This ensures the model is exposed to formal, technical writing styles and organized information structures.
+ 1. **`fineweb-pro`** This data is a highly refined subset of general internet content. It was aggressively filtered using advanced, automated tools to remove common errors and noise, giving the model a clean understanding of everyday language.
+ 2. **`fineweb-edu`** Content focused on education and instruction, providing the model with a solid base in clear, organized knowledge.
+ 3. **`finepdfs`** Specialized knowledge sourced from millions of professional reports and complex documents (PDFs). This ensures the model is exposed to formal, technical writing styles and organized information structures.

  ### Limitations