# FAM (First Attempt Model): Base Model

## About

FAM stands for First Attempt Model. This is a base model, created as a foundational starting point for more complex LLM solutions.

As a base model, it has been trained on a large corpus of text to predict the next token, but it has not undergone instruction fine-tuning or RLHF. It is best suited as a starting point for further fine-tuning.

## Specifications

* **Type:** Base Model (Pre-trained)
* **Architecture:** GPT-2 (Causal Language Model)
* **Parameters:** 100M
* **Context Length:** 1024 tokens
* **Tokenizer:** Sergeial1972-user/FAM-tokenizer (BPE, BertNormalizer)
* **Training Data:** 1.6B tokens from the Sergeial1972-user/cleaned_C4 dataset
* **License:** MIT
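
The specifications above suggest how the model might be loaded and used for plain text continuation. The following is a minimal sketch, assuming the weights are published in a Transformers-compatible repository. `MODEL_ID` is a placeholder that this card does not confirm, and the helper names `clamp_new_tokens` and `continue_text` are made up for illustration. Only the tokenizer id and the 1024-token context length come from the card.

```python
CONTEXT_LENGTH = 1024  # context length listed in the specifications
TOKENIZER_ID = "Sergeial1972-user/FAM-tokenizer"  # tokenizer repo named on this card
MODEL_ID = "Sergeial1972-user/FAM"  # ASSUMPTION: placeholder, replace with the real model repo id


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can be generated without exceeding the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window; shorten it")
    return min(requested, context_length - prompt_tokens)


def continue_text(prompt: str, max_new_tokens: int = 50) -> str:
    """Continue `prompt` with the base model (downloads weights on first call)."""
    # Imported lazily so clamp_new_tokens works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Keep prompt + generated tokens within the 1024-token window.
    budget = clamp_new_tokens(inputs["input_ids"].shape[1], max_new_tokens)
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),
        max_new_tokens=budget,
        do_sample=True,
        top_p=0.95,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `continue_text("The weather in London is")` would return the prompt followed by a sampled continuation. The sampling settings (`top_p=0.95`) are arbitrary choices for the sketch, not recommendations from the model author.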


## Dataset Info
The model was trained on a cleaned version of the **C4 (Colossal Clean Crawled Corpus)** dataset. 
* **Total rows:** ~7.83M 
* **Focus:** General English web content.
* **Link:** [Sergeial1972-user/cleaned_C4](https://huggingface.co)
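
To put these figures in perspective, the rounded numbers on this card can be combined into two rough ratios. This is only back-of-the-envelope arithmetic: it assumes the 1.6B training tokens correspond to one pass over all ~7.83M rows, which the card does not state.

```python
TOTAL_TOKENS = 1.6e9   # training tokens, as listed on this card
TOTAL_ROWS = 7.83e6    # approximate dataset rows, as listed on this card
PARAMS = 100e6         # parameter count, as listed on this card

# Average document length, ASSUMING every row was seen exactly once.
tokens_per_row = TOTAL_TOKENS / TOTAL_ROWS

# Training tokens seen per model parameter.
tokens_per_param = TOTAL_TOKENS / PARAMS

print(f"~{tokens_per_row:.0f} tokens per row")
print(f"~{tokens_per_param:.0f} training tokens per parameter")
```

Under that assumption, the average row is roughly 200 tokens long, well below the 1024-token context length.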

## Limitations & Intent
* **Base Model Nature:** This model will not answer questions like a chatbot; it will only continue the text. To get specific behavior (e.g., answering questions), it requires instruction fine-tuning.
* **Scale:** With 100M parameters, its knowledge and reasoning capabilities are limited compared to larger models.
* **Content:** May reflect the biases and noise present in the C4 web-crawled data.
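
Because the model only continues text, one common workaround without fine-tuning is to phrase a request as a document whose natural continuation is the desired output. The sketch below builds such a few-shot prompt. The helper name `build_few_shot_prompt` and the `Q:`/`A:` layout are illustrative assumptions, and a 100M-parameter model may still produce unreliable answers with this approach.

```python
def build_few_shot_prompt(examples, question):
    """Format Q/A pairs so that the answer is the natural next piece of text."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # Leave the final answer open so the model continues from "A:".
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]
prompt = build_few_shot_prompt(examples, "What is the capital of Italy?")
print(prompt)
```

The resulting string can be passed to the model as an ordinary prompt. Generation should then be cut off at the first blank line, since a base model will otherwise keep inventing further question and answer pairs.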

Safetensors: model size 0.1B params, tensor type F32
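
The model size and tensor type give a rough estimate of the memory needed just to hold the weights. This is a quick calculation, assuming exactly 100M parameters (the card rounds to 0.1B) and 4 bytes per F32 value. Activations, optimizer state, and framework overhead are not included.

```python
PARAMS = 100_000_000   # rounded parameter count from this card
BYTES_PER_PARAM = 4    # F32 stores each value in 4 bytes

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_mb = weight_bytes / 1e6       # decimal megabytes
weight_mib = weight_bytes / 2**20    # binary mebibytes

print(f"{weight_mb:.0f} MB ({weight_mib:.1f} MiB)")  # prints "400 MB (381.5 MiB)"
```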