	---
	language:
	- en
	license: mit
	tags:
	- text-generation
	- causal-lm
	- gpt2
	- chat
	- conversational
	pipeline_tag: text-generation
	datasets:
	- LucidexAi/VIBE-2K
	- HuggingFaceTB/instruct-data-basics-smollm-H4
	- MuskumPillerum/General-Knowledge
	library_name: transformers
	---

	# FuadeAI-50M

	A causal language model with roughly 50 million parameters (51.5M), trained for conversational chat. It uses a GPT-2 architecture with a custom configuration and the GPT-2 tokenizer extended with custom special tokens.

	## Model Details

	| Property | Value |
	|---|---|
	| Parameters | 51.5M |
	| Architecture | GPT-2 (custom config) |
	| Hidden size | 512 |
	| Layers | 8 |
	| Attention heads | 8 |
	| Context length | 1024 tokens |
	| Tokenizer | GPT-2 + custom special tokens |
	| Training precision | FP16 |
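As a rough sanity check, the parameter count in the table can be reproduced from the other rows. The sketch below assumes the standard GPT-2 layout (tied input/output embeddings, a 4x MLP expansion, learned position embeddings) and a vocabulary of 50,257 base GPT-2 tokens plus 5 added special tokens. The exact vocabulary size is an assumption, not a figure stated in this card, and `gpt2_param_count` is a hypothetical helper.

```python
def gpt2_param_count(hidden=512, layers=8, context=1024, vocab=50257 + 5):
    """Estimate parameters for a standard GPT-2 layout with tied embeddings.

    The vocab default (base GPT-2 plus 5 special tokens) is an assumption.
    """
    token_emb = vocab * hidden            # token embedding, shared with the LM head
    pos_emb = context * hidden            # learned position embedding
    # Per block: attention + MLP weights (12 * d^2) plus biases and LayerNorms (13 * d)
    per_layer = 12 * hidden**2 + 13 * hidden
    final_ln = 2 * hidden                 # final LayerNorm scale and bias
    return token_emb + pos_emb + layers * per_layer + final_ln

total = gpt2_param_count()
print(f"{total / 1e6:.1f}M parameters")  # prints "51.5M parameters"
```

About half of the total sits in the token embedding, which is typical for a model this small with a GPT-2-sized vocabulary.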

	## Special Tokens

	| Token | Purpose |
	|---|---|
	| `<\|startoftext\|>` | Beginning of conversation |
	| `<user>` / `</user>` | Wraps user message |
	| `<assistant>` / `</assistant>` | Wraps assistant response |
	| `<\|endoftext\|>` | End of conversation |
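To make the layout concrete, here is a minimal sketch of how a single-turn prompt could be assembled from these tokens, and how a reply could be pulled back out of raw decoded text. The helper names `format_prompt` and `extract_reply` are hypothetical, and the sketch assumes the tokenizer's BOS token is `<|startoftext|>` and that the model closes its reply with `</assistant>` or `<|endoftext|>`.

```python
BOS = "<|startoftext|>"  # assumed to match tokenizer.bos_token
EOS = "<|endoftext|>"

def format_prompt(user_message, bos=BOS):
    """Wrap one user message and open the assistant turn for generation."""
    return f"{bos}<user>{user_message}</user><assistant>"

def extract_reply(raw_text):
    """Return the assistant text from a decoded string that still has special tokens."""
    # Take everything after the first <assistant> tag
    reply = raw_text.split("<assistant>", 1)[-1]
    # Cut at whichever end marker appears first
    for stop in ("</assistant>", EOS):
        reply = reply.split(stop, 1)[0]
    return reply.strip()

print(format_prompt("Hello!"))
```

Leaving the `<assistant>` tag open at the end of the prompt is what cues the model to continue with a response.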

	## Training Data

	- [LucidexAi/VIBE-2K](https://huggingface.co/datasets/LucidexAi/VIBE-2K)
	- [HuggingFaceTB/instruct-data-basics-smollm-H4](https://huggingface.co/datasets/HuggingFaceTB/instruct-data-basics-smollm-H4)
	- [MuskumPillerum/General-Knowledge](https://huggingface.co/datasets/MuskumPillerum/General-Knowledge) (4k random rows)
	- Custom synthetic dataset for identity and conversational grounding

	## How To Use

	### Transformers
	```python
	from transformers import GPT2Tokenizer, GPT2LMHeadModel
	import torch

	# Load model and tokenizer
	tokenizer = GPT2Tokenizer.from_pretrained("Fu01978/FuadeAI-50M")
	model = GPT2LMHeadModel.from_pretrained("Fu01978/FuadeAI-50M")
	model.eval()

	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
	model = model.to(device)

	# Chat function: wraps the prompt in the model's special tokens and samples a reply
	def chat(prompt, temperature=0.4, top_p=0.9, max_new_tokens=100):
	    formatted = (
	        f"{tokenizer.bos_token}"
	        f"<user>{prompt}</user>"
	        f"<assistant>"
	    )
	    inputs = tokenizer(formatted, return_tensors="pt").to(device)

	    with torch.no_grad():
	        output = model.generate(
	            **inputs,
	            max_new_tokens=max_new_tokens,
	            do_sample=True,
	            temperature=temperature,
	            top_p=top_p,
	            repetition_penalty=1.2,
	            no_repeat_ngram_size=3,
	            eos_token_id=tokenizer.eos_token_id,
	            pad_token_id=tokenizer.pad_token_id,
	        )

	    # Drop the prompt tokens and decode only the newly generated ones
	    generated = output[0][inputs["input_ids"].shape[-1]:]
	    return tokenizer.decode(generated, skip_special_tokens=True).strip()

	# Example usage
	print(chat("Hello!"))
	print(chat("Who invented the first telephone?"))
	print(chat("Who are you?"))
	```

	### Generation Tips

	- `temperature=0.45`: balanced creativity and coherence (recommended; the example above defaults to 0.4)
	- `temperature=0.2`: more focused and deterministic answers
	- `temperature=0.8`: more creative but less reliable
	- `repetition_penalty=1.2`: keeps responses from looping (recommended)
	- `max_new_tokens=100`: increase for longer responses
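To see why lower temperatures give more focused answers, the sketch below applies temperature scaling and a top-p (nucleus) cutoff to a made-up set of logits. It is a pure-Python illustration of the idea, not the `transformers` implementation. The function names and the logit values are assumptions chosen for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
for t in (0.2, 0.45, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in top_p_filter(probs, top_p=0.9)])
```

At the lowest temperature nearly all probability lands on the top token, so the nucleus shrinks to a single candidate. At higher temperatures more tokens survive the cutoff, which is where the extra variety (and extra risk) comes from.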

	## Limitations

	- With only about 50M parameters, the model's factual recall is imperfect and some answers may be incorrect. Always verify its factual claims.
	- Coverage of topics is limited compared to large-scale models.
	- Not suitable for factual research, medical/legal/financial advice, or any high-stakes decision making.
	- The context window is limited to 1024 tokens in total (prompt plus response).
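Because the 1024-token limit covers both prompt and response, it can help to check the prompt length before generating. The following is a minimal sketch with hypothetical helpers `prompt_budget` and `fits_in_context`. It works on any list of token ids, such as the `input_ids` a tokenizer returns.

```python
CONTEXT_LENGTH = 1024  # from the Model Details table

def prompt_budget(max_new_tokens=100, context_length=CONTEXT_LENGTH):
    """Number of prompt tokens that still leaves room for the requested response."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return budget

def fits_in_context(prompt_ids, max_new_tokens=100, context_length=CONTEXT_LENGTH):
    """True if the prompt plus the requested response fits in the context window."""
    return len(prompt_ids) <= prompt_budget(max_new_tokens, context_length)

print(prompt_budget(100))  # prints 924
```

If a prompt does not fit, shortening the user message is safer than slicing token ids directly, since blind truncation could cut off the leading special tokens the model expects.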

	## Intended Use

	- Learning and experimentation with small language models
	- Lightweight conversational agent for low-stakes applications
	- Fine-tuning base for domain-specific chat applications