How to use Oscilla/OSLO-1.5B-MLX-4bit with MLX:
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Oscilla/OSLO-1.5B-MLX-4bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
How to use Oscilla/OSLO-1.5B-MLX-4bit with MLX LM:
Generate or start a chat session
# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Oscilla/OSLO-1.5B-MLX-4bit"
Run an OpenAI-compatible server
# Install MLX LM
uv tool install mlx-lm
# Start the server (the port is set explicitly so it matches the curl call below)
mlx_lm.server --model "Oscilla/OSLO-1.5B-MLX-4bit" --port 8000
# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Oscilla/OSLO-1.5B-MLX-4bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
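The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server is reachable at http://localhost:8000 as in the curl example and that it returns an OpenAI-style response body. The names `BASE_URL`, `build_chat_request`, and `chat` are illustrative and not part of MLX LM.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example; adjust to your server.
BASE_URL = "http://localhost:8000"
MODEL_ID = "Oscilla/OSLO-1.5B-MLX-4bit"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (without sending) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.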
OSLO 1.5B MLX
OSLO 1.5B MLX is the MLX-optimized version of Oscilla's first lightweight language model, designed for fast, private AI inference on Apple Silicon.
Model Details
Model Description
OSLO 1.5B MLX is an MLX-formatted version of OSLO 1.5B, optimized for efficient inference on Apple Silicon. It is designed for fast, local conversations, writing assistance, coding, and research while taking advantage of Apple's unified memory architecture.
- Developed by: Oscilla
- Model type: Causal language model
- Language(s): English
- License: More information needed
- Base model: Qwen/Qwen2.5-1.5B-Instruct
- Converted from: Oscilla/OSLO-1.5B-BETA
Uses
Direct Use
OSLO 1.5B MLX is intended for local inference using the MLX framework on Apple Silicon devices.
Common use cases include:
- Conversational AI
- Writing assistance
- Coding support
- Research
- On-device AI applications
- Apple platform development
Downstream Use
This model is designed for applications built with MLX, including native macOS and iOS experiences through the Oscilla platform.
Out-of-Scope Use
This model should not be relied upon for legal, medical, financial, or other safety-critical decisions. Outputs should always be verified where accuracy is important.
Bias, Risks, and Limitations
Like other language models, OSLO 1.5B MLX may:
- Produce inaccurate information
- Reflect biases present in training data
- Struggle with complex reasoning tasks
- Have limited world knowledge beyond its training cutoff
Recommendations
Always verify important information and use appropriate human oversight for high-impact applications.
Getting Started
Install MLX-LM:
pip install mlx-lm
Run the model:
mlx_lm.generate \
  --model Oscilla/OSLO-1.5B-MLX-4bit \
  --prompt "What is AI?"
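For use inside an application, the command above has a rough Python counterpart. This sketch adds an optional system prompt and prior turns before generating; `build_messages` and `ask` are hypothetical helper names, and the sketch assumes the tokenizer's chat template accepts a `system` role, as the Qwen2.5 base model's template does.

```python
def build_messages(user_text, system_text=None, history=None):
    """Assemble a chat-format message list: system, prior turns, new user turn."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_text})
    return messages


def ask(model, tokenizer, user_text, system_text=None, history=None,
        max_tokens=256):
    """Generate one reply with an already-loaded MLX model and tokenizer."""
    # Imported lazily so build_messages can be used without MLX installed.
    from mlx_lm import generate

    prompt = tokenizer.apply_chat_template(
        build_messages(user_text, system_text, history),
        add_generation_prompt=True,
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

A typical call loads the model once with `model, tokenizer = load("Oscilla/OSLO-1.5B-MLX-4bit")` from `mlx_lm` and then reuses both objects across calls to `ask`, which avoids reloading the weights for every prompt.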
Training Details
Training Data
OSLO 1.5B MLX is a converted version of OSLO 1.5B and shares the same fine-tuning process and training data.
Training Procedure
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Fine-tuning: LoRA
- Merge: LoRA adapter merged into the base weights, producing a standalone Hugging Face model
- Conversion: MLX
- Quantization: 4-bit (MLX quantized)
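The repository name indicates 4-bit quantization, and a back-of-the-envelope calculation shows roughly what that saves in memory. The sketch below rests on assumptions that are not stated in this card: a nominal 1.5 billion parameters, a group size of 64 with one 16-bit scale and one 16-bit bias per group, and every weight being quantized. Actual file sizes will differ, since some layers may be kept at higher precision.

```python
def quantized_size_bytes(n_params, bits=4, group_size=64,
                         scale_bits=16, bias_bits=16):
    """Estimate weight storage for group-wise affine quantization."""
    # Each group of `group_size` weights also stores one scale and one bias.
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return n_params * bits_per_weight / 8


N_PARAMS = 1.5e9  # assumption: nominal count from the model name, not measured

fp16_gb = N_PARAMS * 16 / 8 / 1e9
q4_gb = quantized_size_bytes(N_PARAMS) / 1e9
print(f"fp16: {fp16_gb:.2f} GB, 4-bit: {q4_gb:.2f} GB")
# prints "fp16: 3.00 GB, 4-bit: 0.84 GB"
```

Under these assumptions the effective cost is 4.5 bits per weight, so the quantized weights take a little under 30% of the 16-bit footprint.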
Technical Specifications
Architecture
- Decoder-only Transformer
- Instruction tuned
- MLX optimized
- Apple Silicon optimized
More Information
OSLO is Oscilla's family of language models focused on private, local AI experiences for Apple devices. The MLX versions are optimized specifically for efficient inference on Apple Silicon using Apple's MLX framework.
Model Card Authors
Oscilla
Model Card Contact
Oscilla