legolasyiu committed: Update README.md
Commit 4726313 · verified · 1 Parent(s): 00b60f0
language:
  - en
---
# Summary
This is a **debug-dataset version** of the vibe-code LLM. It is optimized to produce both natural-language and code completions directly from loosely structured, “vibe coding” prompts. Compared to earlier-generation LLMs, it has lower prompt-engineering overhead and smoother latent-space interpolation, making it easier to guide toward usable code. The following capabilities can be leveraged:
- **Agentic capabilities**: Use the OpenAI gpt-oss-120b model’s native capabilities for function calling, web browsing, Python code execution, and Structured Outputs.
- This model was trained on the [harmony response](https://github.com/openai/harmony) format and should only be used with that format; it will not work correctly otherwise.

# Vibe-Code LLM

This is a **first-generation vibe-code LLM**.
It’s optimized to produce both natural-language and code completions directly from loosely structured, *“vibe coding”* prompts.

Unlike earlier LLMs that demanded rigid prompt engineering, vibe-code interaction lowers the overhead: you can sketch intent, describe functionality in free-form language, or mix pseudo-code with natural text. The model interpolates smoothly in latent space, making it easier to guide toward usable and executable code.

---

## Key Features

- **Low Prompt-Engineering Overhead**
  Accepts incomplete or intuitive instructions, reducing the need for explicit formatting or rigid templates.

- **Latent-Space Interpolation**
  Transitions fluidly between natural-language reasoning and syntax-aware code generation. Produces semantically coherent code blocks even when the prompt is under-specified.

- **Multi-Domain Support**
  Handles a broad range of programming paradigms: Python, JavaScript, C++, shell scripting, and pseudo-code scaffolding.

- **Context-Sensitive Completion**
  Leverages attention mechanisms to maintain coherence across multi-turn coding sessions.

- **Syntax-Aware Decoding**
  Biases the output distribution toward syntactically valid tokens, improving out-of-the-box executability of code.

- **Probabilistic Beam & Sampling Controls**
  Supports temperature scaling, top-k, and nucleus (top-p) sampling to modulate creativity vs. determinism.

- **Hybrid Text + Code Responses**
  Generates inline explanations, design rationales, or docstrings alongside code for improved readability and maintainability.

- **Product Requirements Document (PRD) Generation**
  Automatically creates detailed PRDs that outline the purpose, features, user stories, technical considerations, and success metrics for new products or features. These PRDs serve as a single source of truth for product managers, engineers, and designers, ensuring alignment across teams, reducing miscommunication, and accelerating the product development lifecycle. PRDs can be structured with sections such as problem statements, goals, assumptions, dependencies, user flows, and acceptance criteria, ready for direct integration into project management tools.
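
  For example, a sketch of what a PRD-style prompt and the shape of its response could look like (the section names here are illustrative, not actual model output):

  ```plaintext
  Prompt:
  "draft a PRD for a dark-mode toggle in our settings page"

  Response:
  - Problem statement and goals
  - User stories and acceptance criteria
  - Assumptions, dependencies, and success metrics
  ```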

---
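The temperature and nucleus (top-p) sampling controls listed above can be made concrete with a minimal, self-contained sketch. The token probabilities here are toy values, not model output:

```py
import math

def apply_temperature(logits, temperature=1.0):
    """Rescale logits by 1/temperature, then softmax.
    Lower temperature sharpens the distribution (more deterministic)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, p=0.9):
    """Nucleus sampling: keep the smallest set of tokens whose cumulative
    probability reaches p, then renormalize over the kept set."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, pr in ranked:
        kept.append(tok)
        total += pr
        if total >= p:
            break
    norm = sum(probs[t] for t in kept)
    return {t: probs[t] / norm for t in kept}

# toy next-token distribution
probs = {"def": 0.5, "class": 0.3, "import": 0.15, "lambda": 0.05}
print(top_p_filter(probs, p=0.9))  # drops the low-probability tail ("lambda")
```

In a `transformers` generation call, these correspond to the `temperature`, `top_k`, and `top_p` generation parameters.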

## Dataset
Trained on a debugged vibe-coder dataset.

## Benchmark

### 📊 Model Evaluation Results

Benchmark results have not yet been reported.

**Notes:**
- The `(+value)` notation indicates the delta over the baseline evaluation.
- Metrics marked with `↑` denote that higher is better.
- Dashes (`—`) indicate results not yet reported or evaluated.


## Example Usage

```plaintext
Prompt:
"make me a fast vibe function that sorts numbers but with a cool twist"

Response:
- Natural explanation of the sorting method
- Code snippet (e.g., a Python quicksort variant)
- Optional playful commentary to match the vibe
```
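As an illustration only (not actual model output), here is the kind of code such a prompt might elicit: a quicksort variant whose "cool twist" is a randomized pivot. The function name is made up for this sketch:

```py
import random

def vibe_sort(nums):
    """Quicksort with a twist: a randomly chosen pivot keeps the vibes
    (and the expected runtime) at O(n log n)."""
    if len(nums) <= 1:
        return nums
    pivot = random.choice(nums)
    lows  = [n for n in nums if n < pivot]
    mids  = [n for n in nums if n == pivot]
    highs = [n for n in nums if n > pivot]
    return vibe_sort(lows) + mids + vibe_sort(highs)

print(vibe_sort([7, 3, 9, 1, 4]))  # → [1, 3, 4, 7, 9]
```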

---

## Ideal Applications

- Rapid prototyping & exploratory coding
- Creative coding workflows with minimal boilerplate
- Educational contexts where explanation + code matter equally
- Interactive REPLs, notebooks, or editor assistants that thrive on loose natural-language input

---

## Limitations

- Not tuned for production-grade formal verification.
- May require post-processing or linting to ensure strict compliance with project coding standards.
- Designed for *“fast prototyping vibes”*, not for long-horizon, enterprise-scale codebases.

# Inference examples

## Transformers

You can use this model with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually using the chat template, or use the [openai-harmony](https://github.com/openai/harmony) package.

To get started, install the necessary dependencies to set up your environment:

```
pip install -U transformers kernels torch
```

For Google Colab (free or Pro):
```
!pip install -q --upgrade torch
!pip install -q transformers triton==3.4 kernels
!pip uninstall -q torchvision torchaudio -y
```

Once set up, you can run the model with the snippet below:

```py
from transformers import pipeline

model_id = "EpistemeAI/VCoder-120b-1.0"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Let’s start with the header and navigation for the landing page. Start by creating the top header section for the dashboard. We’ll add the content blocks below afterward."},
]

outputs = pipe(
    messages,
    max_new_tokens=3000,
)
print(outputs[0]["generated_text"][-1])
```

### Amazon SageMaker

```py
import json

import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

# Hub model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "EpistemeAI/VCoder-120b-1.0",
    "SM_NUM_GPUS": json.dumps(1),
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.2.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "Hi, what can you help me with?",
})
```

# Uploaded finetuned model

- **Developed by:** EpistemeAI