---
language: en
license: apache-2.0
tags:
- compact-ai
- interleaved-thinking
- transformer
- pytorch
- reasoning
datasets:
- custom
---

# Compact AI Model with Interleaved Thinking

A compact AI model that implements interleaved thinking for enhanced reasoning capabilities. The model combines an efficient transformer architecture with parallel reasoning paths to achieve better performance on complex tasks.

## Model Details

### Model Description

This is a compact AI model designed for efficient inference while maintaining strong reasoning capabilities through interleaved thinking. The model uses multiple parallel reasoning paths that work together to solve complex problems.

### Model Architecture

- **Base Architecture**: Transformer with efficient attention mechanisms
- **Key Features**:
  - Interleaved thinking with parallel reasoning paths
  - Hierarchical reasoning at different abstraction levels
  - Adaptive memory compression
  - Early stopping based on confidence thresholds
  - RoPE positional embeddings
  - FlashAttention support

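The parallel reasoning paths behind interleaved thinking can be sketched as a module that runs several small "thinkers" over the same hidden states and mixes their outputs with a learned gate. This is an illustrative sketch, not the repo's actual implementation; the class name `ParallelReasoningBlock`, the feed-forward path design, and the softmax gate are all assumptions.

```python
import torch
import torch.nn as nn

class ParallelReasoningBlock(nn.Module):
    """Sketch of interleaved thinking: several parallel reasoning paths
    process the same hidden states and a learned gate mixes their outputs.
    Names and structure are illustrative, not the repo's actual code."""

    def __init__(self, dim: int, num_paths: int = 4):
        super().__init__()
        # Each path is a small independent feed-forward "thinker".
        self.paths = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_paths)
        )
        # The gate produces per-path mixing weights from the input itself.
        self.gate = nn.Linear(dim, num_paths)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        outs = torch.stack([p(x) for p in self.paths], dim=-1)  # (B, S, D, P)
        weights = torch.softmax(self.gate(x), dim=-1)           # (B, S, P)
        return (outs * weights.unsqueeze(2)).sum(dim=-1)        # (B, S, D)

block = ParallelReasoningBlock(dim=256)
y = block(torch.randn(2, 10, 256))
print(y.shape)  # torch.Size([2, 10, 256])
```

The gate lets the model weight each path per token, so paths can specialize without a hard routing decision.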
### Model Sizes

- **Tiny**: ~50M parameters (256 dim, 8 layers, 8 heads)
- **Small**: ~100M parameters (512 dim, 12 layers, 8 heads)
- **Medium**: ~200M parameters (768 dim, 16 layers, 12 heads)

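As a rough cross-check, a back-of-envelope estimate of the dense transformer backbone alone (assuming a standard 4x feed-forward ratio and a hypothetical 50k-token vocabulary, neither confirmed by the repo) comes in below the listed totals; the remainder would be carried by the additional reasoning-path and compression modules.

```python
def backbone_params(dim: int, layers: int, vocab: int = 50_000, ffn_mult: int = 4) -> int:
    """Rough dense-transformer backbone estimate (weight matrices only;
    biases and norms ignored). vocab and ffn_mult are assumptions, not
    the model's actual values."""
    attn = 4 * dim * dim                # Q, K, V, and output projections
    ffn = 2 * ffn_mult * dim * dim      # up- and down-projection
    return layers * (attn + ffn) + vocab * dim  # plus token embeddings

for name, (dim, layers) in {"tiny": (256, 8), "small": (512, 12), "medium": (768, 16)}.items():
    print(f"{name}: ~{backbone_params(dim, layers) / 1e6:.0f}M backbone params")
```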
## Usage

### Installation

```bash
pip install torch transformers
```

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("likhonsheikh/compact-ai-model")
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/compact-ai-model")
```

### Inference

```python
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### API Usage

The model also ships with a FastAPI-based API server:

```bash
uvicorn compact_ai_model.api.main:app --host 0.0.0.0 --port 8000
```

## Training

### Requirements

- Python 3.8+
- PyTorch 2.0+
- CUDA-compatible GPU (recommended)

### Training Script

```bash
python compact_ai_model/training/train.py
```

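The actual hyperparameters and data pipeline live in `train.py`. As a sketch of what a next-token training step does, here is a minimal loop on a toy stand-in model with random data; the model, sizes, and learning rate are illustrative only.

```python
import torch
import torch.nn as nn

# Toy stand-in model: embedding -> linear head. The real model is the
# compact transformer from this repo; this only illustrates the step shape.
vocab, dim = 100, 32
model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab, (4, 16))        # (batch, seq) random "data"
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift for next-token prediction

for step in range(5):
    logits = model(inputs)                       # (batch, seq-1, vocab)
    loss = loss_fn(logits.reshape(-1, vocab), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.isfinite(loss).item())  # True
```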
## Performance

### Benchmarks

- **MMLU**: Coming soon
- **ARC**: Coming soon
- **HellaSwag**: Coming soon

### Efficiency

- Memory-efficient attention mechanisms
- Adaptive compression for long contexts
- Early stopping to reduce computation

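Confidence-based early stopping can be illustrated as follows: run the layer stack one block at a time and exit once a prediction head is confident enough. The repo's actual exit criterion is not documented here, so the max-softmax test below is an assumption.

```python
import torch
import torch.nn as nn

def early_exit_forward(layers, head, x, threshold: float = 0.9):
    """Run layers one by one; after each, check the head's max softmax
    probability and stop once every token clears the threshold.
    Illustrative only; the repo's actual criterion may differ."""
    for i, layer in enumerate(layers):
        x = layer(x)
        probs = torch.softmax(head(x), dim=-1)
        confidence = probs.max(dim=-1).values.min()  # least-confident token
        if confidence >= threshold:
            return x, i + 1  # exited early after i+1 layers
    return x, len(layers)

dim, vocab = 64, 100
layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(8))
head = nn.Linear(dim, vocab)
x = torch.randn(2, 10, dim)

# threshold > 1 can never be met by a softmax, so all 8 layers run here.
out, used = early_exit_forward(layers, head, x, threshold=2.0)
print(used)  # 8
```

With a realistic threshold (e.g. 0.9), easy inputs exit after a few layers while hard inputs use the full stack, which is where the compute savings come from.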
## Limitations

- Currently uses a simple tokenizer for demonstration
- Model is not yet fine-tuned on large datasets
- API is still in development

## Citation

```bibtex
@misc{compact-ai-model,
  title={Compact AI Model with Interleaved Thinking},
  author={Likhon Sheikh},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/likhonsheikh/compact-ai-model}
}
```

## License

This model is released under the Apache 2.0 license.