LH-Tech-AI/Quark-0.5M

LH-Tech-AI committed 15 days ago

Commit a5f05f9 · verified · 1 Parent(s): e91767a

Create README.md

Files changed (1)

README.md +60 -0

README.md ADDED

	@@ -0,0 +1,60 @@

+---
+license: apache-2.0
+datasets:
+- HuggingFaceFW/fineweb-edu
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- llama
+- tiny-model
+- sub-1M
+- cpu
+- small
+- tiny
+- quark
+- 1m
+---
+# Quark-0.5M
+**Quark-0.5M** is an ultra-lightweight Llama-based model with only **465,504 parameters**.
+It was trained from scratch to show how much high-quality data (FineWeb-Edu) can achieve with an extremely small architecture.
+## Model Details
+- **Architecture:** Llama-based
+- **Parameters:** 465,504
+- **Vocabulary Size:** 500 (Custom Byte-Level BPE)
+- **Hidden Size:** 96
+- **Intermediate Size:** 192
+- **Layers:** 4
+- **Heads:** 4
+- **Context Length:** 256 tokens
+## Training
+- **Dataset:** 400 million tokens from `HuggingFaceFW/fineweb-edu` (`sample-10BT` subset)
+- **Training Time:** ~42 minutes on a single Kaggle T4 GPU
+- **Final Loss:** 2.46
+- **Optimizer:** AdamW with Cosine Learning Rate Decay
+## Intended Use
+Quark is a research project exploring the limits of "Micro-LLMs". It can form mostly grammatical English sentences and structured lists, despite fitting into less than 1 MB of disk space.
+## Performance Example
+> **Prompt:** "Artificial intelligence is "
+>
+> **Output:** "Artificial intelligence is very important. These are more likely to be adapted with the people of the following:
+> - Subjects, evidence and social treatment for reduces costs..."
+## How to Use
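+```python
+from transformers import LlamaForCausalLM, PreTrainedTokenizerFast
+
+# Load the model weights and the custom byte-level BPE tokenizer from the Hub.
+model = LlamaForCausalLM.from_pretrained("LH-Tech-AI/Quark-0.5M")
+tokenizer = PreTrainedTokenizerFast.from_pretrained("LH-Tech-AI/Quark-0.5M")
+prompt = "The scientific method is"
+# Llama does not accept token_type_ids, so keep the tokenizer from returning them.
+inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
+# Sample up to 50 new tokens; the low temperature keeps the output conservative.
+outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.4)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```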
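
The parameter count listed under Model Details can be checked by hand from the other figures in that list. The sketch below is not taken from the repository; it assumes a standard Llama-style layout with no bias terms, untied input and output embeddings, and as many key/value heads as attention heads. None of these are stated in the README, and the function name `llama_param_count` is purely illustrative.

```python
# Sketch: count the parameters of a Llama-style decoder from the README's
# Model Details. ASSUMPTIONS (not stated in the README): no bias terms,
# untied input/output embeddings, one key/value head per attention head,
# and RMSNorm layers that each hold a single weight vector.
# The head count does not appear because, under these assumptions, the
# attention projections are all hidden x hidden regardless of head count.

def llama_param_count(vocab: int, hidden: int, intermediate: int, layers: int) -> int:
    embed = vocab * hidden            # token embedding table
    attn = 4 * hidden * hidden        # q, k, v and output projections
    mlp = 3 * hidden * intermediate   # gate, up and down projections
    norms = 2 * hidden                # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    final_norm = hidden               # RMSNorm before the output head
    lm_head = vocab * hidden          # assumed separate from the embedding
    return embed + layers * per_layer + final_norm + lm_head

total = llama_param_count(vocab=500, hidden=96, intermediate=192, layers=4)
print(total)  # 465504

# Raw weight size, ignoring any file-format overhead.
for name, bytes_per_param in (("16-bit", 2), ("32-bit", 4)):
    print(name, total * bytes_per_param, "bytes")
```

Under these assumptions the result matches the stated 465,504 parameters. The raw weights come to 931,008 bytes at 2 bytes per parameter and 1,862,016 bytes at 4 bytes per parameter, so the "less than 1 MB" figure would hold for 16-bit storage; the README does not say which precision the checkpoint uses.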