---
base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gpt_oss
license: apache-2.0
language:
- en
new_version: EpistemeAI/VibeCoder-20B-alpha-0.001
---
# Model card
# Test our endpoint
[FriendliAI](https://friendli.ai/suite/WTHFpZnt6oAT/VGDaGrYOXeIm/dedicated-endpoints/depoqch056a4j4a/playground)
# Summary
This is a first-generation vibe-code alpha (preview) LLM. It’s optimized to produce both natural-language and code completions directly from loosely structured, “vibe coding” prompts. Compared to earlier-generation LLMs, it has lower prompt-engineering overhead and smoother latent-space interpolation, making it easier to guide toward usable code. The following capabilities can be leveraged:
- **Agentic capabilities**: Use the native capabilities of OpenAI's gpt-oss-20b model for function calling, web browsing, Python code execution, and Structured Outputs.
- This model was trained on the [harmony response](https://github.com/openai/harmony) format and should only be used with that format; it will not work correctly otherwise.
# Vibe-Code LLM
This is a **first-generation vibe-code LLM**.
It’s optimized to produce both natural-language and code completions directly from loosely structured, *“vibe coding”* prompts.
Unlike earlier LLMs that demanded rigid prompt engineering, vibe-code interaction lowers the overhead: you can sketch intent, describe functionality in free-form language, or mix pseudo-code with natural text. The model interpolates smoothly in latent space, making it easier to guide toward usable and executable code.
---
## Key Features
- **Low Prompt-Engineering Overhead**
Accepts incomplete or intuitive instructions, reducing the need for explicit formatting or rigid templates.
- **Latent-Space Interpolation**
Transitions fluidly between natural-language reasoning and syntax-aware code generation. Produces semantically coherent code blocks even when the prompt is under-specified.
- **Multi-Domain Support**
Handles a broad range of programming paradigms: Python, JavaScript, C++, shell scripting, and pseudo-code scaffolding.
- **Context-Sensitive Completion**
Leverages attention mechanisms to maintain coherence across multi-turn coding sessions.
- **Syntax-Aware Decoding**
Biases output distribution toward syntactically valid tokens, improving out-of-the-box executability of code.
- **Probabilistic Beam & Sampling Controls**
Supports temperature scaling, top-k, and nucleus (top-p) sampling to modulate creativity vs. determinism.
- **Hybrid Text + Code Responses**
Generates inline explanations, design rationales, or docstrings alongside code for improved readability and maintainability.
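To make the sampling controls above concrete, here is a minimal sketch of temperature scaling plus nucleus (top-p) filtering in plain Python. The `top_p_filter` name and the toy logits are illustrative, not part of this model's API; in practice you would simply pass `temperature` and `top_p` as generation parameters.

```python
import math

def top_p_filter(logits, top_p=0.9, temperature=1.0):
    """Return token probabilities restricted to the smallest set of tokens
    whose cumulative probability reaches top_p (nucleus sampling)."""
    # Temperature scaling: values < 1.0 sharpen the distribution (more
    # deterministic), values > 1.0 flatten it (more creative).
    scaled = [l / temperature for l in logits]
    # Softmax with max-subtraction for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk tokens from most to least likely, keeping the "nucleus".
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens.
    z = sum(probs[i] for i in kept)
    return {i: probs[i] / z for i in kept}

# With a sharply peaked distribution and a tight top_p, only the
# highest-probability token survives.
print(top_p_filter([2.0, 1.0, 0.1], top_p=0.6))
```

Sampling is then drawn from this renormalized nucleus rather than the full vocabulary, which trims the low-probability tail while preserving diversity among plausible tokens.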
---
## Example Usage
```plaintext
Prompt:
"make me a fast vibe function that sorts numbers but with a cool twist"
Response:
- Natural explanation of sorting method
- Code snippet (e.g., Python quicksort variant)
- Optional playful commentary to match the vibe
```
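For illustration only, a hand-written sketch of the kind of snippet the model might return for the prompt above (the `vibe_sort` name and random-pivot "twist" are hypothetical examples, not actual model output):

```python
import random

def vibe_sort(nums):
    """Quicksort with a playful twist: a random pivot keeps the vibes fresh
    and avoids worst-case behavior on already-sorted input."""
    if len(nums) <= 1:
        return nums
    pivot = random.choice(nums)
    smaller = [n for n in nums if n < pivot]
    equal = [n for n in nums if n == pivot]
    larger = [n for n in nums if n > pivot]
    return vibe_sort(smaller) + equal + vibe_sort(larger)

print(vibe_sort([5, 3, 8, 1, 9, 2]))  # -> [1, 2, 3, 5, 8, 9]
```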
---
## Ideal Applications
- Rapid prototyping & exploratory coding
- Creative coding workflows with minimal boilerplate
- Educational contexts where explanation + code matter equally
- Interactive REPLs, notebooks, or editor assistants that thrive on loose natural-language input
---
## Limitations
- Not tuned for production-grade formal verification.
- May require post-processing or linting to ensure strict compliance with project coding standards.
- Designed for *“fast prototyping vibes”*, not for long-horizon enterprise-scale codebases.
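One lightweight post-processing gate, in the spirit of the linting note above, is to reject generated Python that fails to parse before it reaches your codebase. A minimal sketch using the standard-library `ast` module (`is_valid_python` is a hypothetical helper, not part of this model's tooling):

```python
import ast

def is_valid_python(code: str) -> bool:
    """Cheap post-processing check: accept a generated snippet only if it
    parses as valid Python. Catches truncated or malformed completions,
    but not logic errors or style violations."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def add(a, b):\n    return a + b\n"))  # True
print(is_valid_python("def broken(:\n"))                      # False
```

A full pipeline would typically follow this with a linter or formatter to enforce project coding standards.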
# Inference examples
## Transformers
You can use this model with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually via the chat template or with the [openai-harmony](https://github.com/openai/harmony) package.
To get started, install the dependencies needed to set up your environment:
```
pip install -U transformers kernels torch
```
For Google Colab (free or Pro):
```
!pip install -q --upgrade torch
!pip install -q transformers triton==3.4 kernels
!pip uninstall -q torchvision torchaudio -y
```
Once set up, you can run the model with the snippet below:
```py
from transformers import pipeline

model_id = "EpistemeAI/VibeCoder-20B-alpha"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Let’s start with the header and navigation for the landing page. Start by creating the top header section for the dashboard. We’ll add the content blocks below afterward."},
]

outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```
### Amazon SageMaker
```py
import json

import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

# Hub model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "EpistemeAI/VibeCoder-20B-alpha",
    "SM_NUM_GPUS": json.dumps(1),
}

# Create the Hugging Face Model class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.2.3"),
    env=hub,
    role=role,
)

# Deploy the model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# Send a request
predictor.predict({
    "inputs": "Hi, what can you help me with?",
})
```
# Uploaded finetuned model
- **Developed by:** EpistemeAI
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gpt-oss-20b-unsloth-bnb-4bit
This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)