|
|
--- |
|
|
base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit |
|
|
tags: |
|
|
- text-generation-inference |
|
|
- transformers |
|
|
- unsloth |
|
|
- gpt_oss |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
new_version: EpistemeAI/VibeCoder-20B-alpha-0.001 |
|
|
--- |
|
|
# Model card |
|
|
|
|
|
# Test our endpoint |
|
|
[FriendliAI](https://friendli.ai/suite/WTHFpZnt6oAT/VGDaGrYOXeIm/dedicated-endpoints/depoqch056a4j4a/playground) |
|
|
|
|
|
# Summary |
|
|
This is a first-generation vibe-code alpha (preview) LLM. It is optimized to produce both natural-language and code completions directly from loosely structured, “vibe coding” prompts. Compared to earlier-generation LLMs, it has lower prompt-engineering overhead and smoother latent-space interpolation, making it easier to guide toward usable code. The following capabilities can be leveraged:
|
|
- **Agentic capabilities**: Use the native capabilities of OpenAI's gpt-oss-20b models for function calling, web browsing, Python code execution, and Structured Outputs.
|
|
- This model was trained on the [harmony response](https://github.com/openai/harmony) format and should only be used with the harmony format, as it will not work correctly otherwise.
|
|
|
|
|
# Vibe-Code LLM |
|
|
|
|
|
This is a **first-generation vibe-code LLM**. |
|
|
It’s optimized to produce both natural-language and code completions directly from loosely structured, *“vibe coding”* prompts. |
|
|
|
|
|
Unlike earlier LLMs that demanded rigid prompt engineering, vibe-code interaction lowers the overhead: you can sketch intent, describe functionality in free-form language, or mix pseudo-code with natural text. The model interpolates smoothly in latent space, making it easier to guide toward usable and executable code. |
|
|
|
|
|
--- |
|
|
|
|
|
## Key Features |
|
|
|
|
|
- **Low Prompt-Engineering Overhead** |
|
|
Accepts incomplete or intuitive instructions, reducing the need for explicit formatting or rigid templates. |
|
|
|
|
|
- **Latent-Space Interpolation** |
|
|
Transitions fluidly between natural-language reasoning and syntax-aware code generation. Produces semantically coherent code blocks even when the prompt is under-specified. |
|
|
|
|
|
- **Multi-Domain Support** |
|
|
Handles a broad range of programming paradigms: Python, JavaScript, C++, shell scripting, and pseudo-code scaffolding. |
|
|
|
|
|
- **Context-Sensitive Completion** |
|
|
Leverages attention mechanisms to maintain coherence across multi-turn coding sessions. |
|
|
|
|
|
- **Syntax-Aware Decoding** |
|
|
Biases output distribution toward syntactically valid tokens, improving out-of-the-box executability of code. |
|
|
|
|
|
- **Probabilistic Beam & Sampling Controls** |
|
|
Supports temperature scaling, top-k, and nucleus (top-p) sampling to modulate creativity vs. determinism. |
|
|
|
|
|
- **Hybrid Text + Code Responses** |
|
|
Generates inline explanations, design rationales, or docstrings alongside code for improved readability and maintainability. |
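To make the sampling controls above concrete, here is a small self-contained sketch (plain Python with toy numbers, not the model's internals) of how temperature scaling, top-k filtering, and nucleus (top-p) filtering reshape a next-token distribution:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_temperature(logits, temperature):
    # Lower temperature sharpens the distribution (more deterministic);
    # higher temperature flattens it (more creative).
    return [x / temperature for x in logits]

def top_k_filter(probs, k):
    # Keep only the k most likely tokens, then renormalize.
    kept = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

def top_p_filter(probs, p):
    # Keep the smallest set of tokens whose cumulative probability >= p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= p:
            break
    filtered = [probs[i] if i in kept else 0.0 for i in range(len(probs))]
    total = sum(filtered)
    return [q / total for q in filtered]

logits = [2.0, 1.0, 0.5, 0.1]  # toy next-token logits
sharp = softmax(apply_temperature(logits, 0.5))   # sharper than softmax(logits)
print(top_k_filter(softmax(logits), 2))   # only the top-2 tokens survive
print(top_p_filter(softmax(logits), 0.9)) # nucleus of cumulative mass >= 0.9
```

In practice these correspond to the `temperature`, `top_k`, and `top_p` generation parameters exposed by common inference stacks.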
|
|
|
|
|
--- |
|
|
|
|
|
## Example Usage |
|
|
|
|
|
```plaintext |
|
|
Prompt: |
|
|
"make me a fast vibe function that sorts numbers but with a cool twist" |
|
|
|
|
|
Response: |
|
|
- Natural explanation of sorting method |
|
|
- Code snippet (e.g., Python quicksort variant) |
|
|
- Optional playful commentary to match the vibe |
|
|
``` |
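As a concrete (purely hypothetical) illustration, a response to the prompt above might include a code snippet along these lines: a quicksort variant with a random pivot and a playful `reverse_the_vibe` toggle. The function name and parameter are invented for this example, not actual model output.

```python
import random

def vibe_sort(numbers, reverse_the_vibe=False):
    """Quicksort with a randomly chosen pivot; set reverse_the_vibe=True
    to flip the result into descending order."""
    if len(numbers) <= 1:
        result = list(numbers)
    else:
        pivot = random.choice(numbers)
        lows = [n for n in numbers if n < pivot]
        mids = [n for n in numbers if n == pivot]
        highs = [n for n in numbers if n > pivot]
        result = vibe_sort(lows) + mids + vibe_sort(highs)
    return result[::-1] if reverse_the_vibe else result

print(vibe_sort([5, 3, 8, 1, 9, 2]))                         # [1, 2, 3, 5, 8, 9]
print(vibe_sort([5, 3, 8, 1, 9, 2], reverse_the_vibe=True))  # [9, 8, 5, 3, 2, 1]
```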
|
|
|
|
|
--- |
|
|
|
|
|
## Ideal Applications |
|
|
|
|
|
- Rapid prototyping & exploratory coding |
|
|
- Creative coding workflows with minimal boilerplate |
|
|
- Educational contexts where explanation + code matter equally |
|
|
- Interactive REPLs, notebooks, or editor assistants that thrive on loose natural-language input |
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Not tuned for production-grade formal verification. |
|
|
- May require post-processing or linting to ensure strict compliance with project coding standards. |
|
|
- Designed for *“fast prototyping vibes”*, not for long-horizon enterprise-scale codebases. |
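A minimal post-processing sketch for the linting caveat above, assuming your pipeline receives responses with code wrapped in markdown fences (the helper names here are illustrative, not part of the model's API): extract fenced code blocks and syntax-check them before they touch your codebase.

```python
import ast
import re

# Matches fenced code blocks (optionally tagged python/py) in a response.
FENCE = re.compile(r"`{3}(?:python|py)?\n(.*?)`{3}", re.DOTALL)

def extract_code_blocks(response: str) -> list[str]:
    """Return the contents of all fenced code blocks in a model response."""
    return [m.strip() for m in FENCE.findall(response)]

def is_valid_python(code: str) -> bool:
    """Cheap lint gate: does the snippet parse as Python at all?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

fence = "`" * 3  # avoid writing literal fences inside this example
response = f"Sure!\n{fence}python\ndef add(a, b):\n    return a + b\n{fence}"
blocks = extract_code_blocks(response)
print(len(blocks), is_valid_python(blocks[0]))  # 1 True
```

A real pipeline would typically follow this gate with a proper linter or formatter to enforce project style.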
|
|
|
|
|
|
|
|
|
|
|
# Inference examples |
|
|
|
|
|
## Transformers |
|
|
|
|
|
You can use this model with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually via the chat template or with the [openai-harmony](https://github.com/openai/harmony) package.
|
|
|
|
|
To get started, install the necessary dependencies to set up your environment:
|
|
|
|
|
```shell
|
|
pip install -U transformers kernels torch |
|
|
``` |
|
|
|
|
|
For Google Colab (free/Pro):

```shell
|
|
!pip install -q --upgrade torch |
|
|
|
|
|
!pip install -q transformers triton==3.4 kernels |
|
|
|
|
|
!pip uninstall -q torchvision torchaudio -y |
|
|
``` |
|
|
|
|
|
Once set up, you can run the model with the snippet below:
|
|
|
|
|
```py
from transformers import pipeline
import torch

model_id = "EpistemeAI/VibeCoder-20B-alpha"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Let’s start with the header and navigation for the landing page. Start by creating the top header section for the dashboard. We’ll add the content blocks below afterward."},
]

outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```
|
|
|
|
|
### Amazon SageMaker |
|
|
```py
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'EpistemeAI/VibeCoder-20B-alpha',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.2.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "Hi, what can you help me with?",
})
```
|
|
|
|
|
# Uploaded finetuned model |
|
|
|
|
|
- **Developed by:** EpistemeAI |
|
|
- **License:** apache-2.0 |
|
|
- **Finetuned from model:** unsloth/gpt-oss-20b-unsloth-bnb-4bit
|
|
|
|
|
This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |