# FAM (First Attempt Model) - Base Model
## About
FAM stands for First Attempt Model. This is a base model, created as a foundational starting point for more complex LLM solutions.
As a base model, it has been trained on a large corpus of text to predict the next token, but it has not undergone instruction fine-tuning or RLHF. It is best suited for further fine-tuning.
## Specifications
- Type: Base Model (Pre-trained)
- Architecture: GPT-2 (Causal Language Model)
- Parameters: 100M
- Context Length: 1024 tokens
- Tokenizer: Sergeial1972-user/FAM-tokenizer (BPE, BertNormalizer)
- Training Data: 1.6B tokens from the Sergeial1972-user/cleaned_C4 dataset.
- License: MIT
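
To put the 100M figure in context, the sketch below counts the parameters of a standard GPT-2 style network from its shape. FAM's layer count, hidden size, and vocabulary size are not stated in this card, so the example uses the shape of the original GPT-2 small model purely as a reference point. The function name `gpt2_param_count` is illustrative and not part of any library.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Count parameters of a GPT-2 style model with tied input/output embeddings."""
    token_emb = vocab_size * n_embd      # token embedding matrix (shared with the LM head)
    pos_emb = n_positions * n_embd       # learned position embeddings

    # Attention: fused QKV projection plus output projection, each with a bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd and project back, each with a bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * 2 * n_embd

    per_block = attn + mlp + layer_norms
    final_ln = 2 * n_embd                # final LayerNorm after the last block
    return token_emb + pos_emb + n_layer * per_block + final_ln


# Assumption: reference shape of the original GPT-2 small, NOT FAM's confirmed configuration.
reference_total = gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12)
print(f"{reference_total:,}")
```

With this reference shape the transformer blocks alone account for roughly 85M parameters, so in a model of this size the vocabulary size largely decides where the total lands.
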
## Dataset Info
The model was trained on a cleaned version of the **C4 (Colossal Clean Crawled Corpus)** dataset.
* **Total rows:** ~7.83M
* **Focus:** General English web content.
* **Link:** [Sergeial1972-user/cleaned_C4](https://huggingface.co)
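
As a rough sanity check, the figures quoted in this card can be combined into a few derived ratios. This is simple arithmetic on the rounded numbers above (1.6B tokens, ~7.83M rows, 100M parameters, 1024-token context), so the results are approximate and the variable names are only illustrative.

```python
# Rounded figures taken from this model card.
TOTAL_TOKENS = 1.6e9
TOTAL_ROWS = 7.83e6
PARAMETERS = 100e6
CONTEXT_LENGTH = 1024

tokens_per_row = TOTAL_TOKENS / TOTAL_ROWS              # average document length in tokens
tokens_per_parameter = TOTAL_TOKENS / PARAMETERS        # training tokens seen per parameter
full_context_windows = TOTAL_TOKENS / CONTEXT_LENGTH    # 1024-token windows if packed back to back

print(f"Average tokens per row:  {tokens_per_row:.0f}")
print(f"Tokens per parameter:    {tokens_per_parameter:.0f}")
print(f"Full context windows:    {full_context_windows:,.0f}")
```
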
## Limitations & Intent
* **Base Model Nature:** This model will not answer questions like a chatbot; it will only continue the text. To get specific behavior (e.g., answering questions), it requires instruction fine-tuning.
* **Scale:** With 100M parameters, its knowledge and reasoning capabilities are limited compared to larger models.
* **Content:** May reflect the biases and noise present in the C4 web-crawled data.
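
Because the model only continues text, one common workaround before any fine-tuning is to phrase a task as a document whose natural continuation is the desired answer. The sketch below builds such a few-shot prompt in plain Python. The helper `build_continuation_prompt` and the example questions are hypothetical, and at this scale the continuation may still be unreliable.

```python
def build_continuation_prompt(examples, question):
    """Format Q/A pairs plus a new question so the answer is the natural next text."""
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    # Leave the final answer open so a base model continues from here.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Hypothetical worked examples that establish the pattern.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
prompt = build_continuation_prompt(examples, "What is the capital of Italy?")
print(prompt)
```

The resulting string would be passed to the model as the text to continue. Truncating the output at the first blank line is one simple way to keep only the first answer.
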