Text Generation
Transformers
Safetensors
qwen3_moe
turkish
türkiye
ai
lamapi
next-codex
coder
codex
open-source
30b
Mixture of Experts
mixture-of-experts
code-generation
coding
llm
transformer
artificial-intelligence
4-bit precision
bitsandbytes
Instructions to use thelamapi/next-codex with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use thelamapi/next-codex with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="thelamapi/next-codex")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("thelamapi/next-codex")
    model = AutoModelForCausalLM.from_pretrained("thelamapi/next-codex")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use thelamapi/next-codex with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "thelamapi/next-codex"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "thelamapi/next-codex",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
docker model run hf.co/thelamapi/next-codex
- SGLang
How to use thelamapi/next-codex with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "thelamapi/next-codex" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "thelamapi/next-codex",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "thelamapi/next-codex" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "thelamapi/next-codex",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use thelamapi/next-codex with Docker Model Runner:
docker model run hf.co/thelamapi/next-codex
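The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes a server is already running locally (port 8000 for vLLM, port 30000 for SGLang, as in the commands above); the helper names `build_completion_request` and `complete` are illustrative and not part of any library.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",  # assumption: local vLLM server on its default port
    model="thelamapi/next-codex",
    max_tokens=512,
    temperature=0.5,
):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible completion responses carry the text in choices[i].text
    return body["choices"][0]["text"]
```

With a server running, `complete("def fibonacci(n):")` would return the generated continuation as a string; pass `base_url="http://localhost:30000"` to target the SGLang server instead.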
Update README.md
README.md
CHANGED
@@ -101,16 +101,7 @@ Unlike traditional dense models, **Next-Codex** utilizes a sparse architecture w
 
 **Next-Coder 30B** achieves state-of-the-art results among open-weights coding models, balancing extreme efficiency with high accuracy.
 
-
-| :--- | :--- | :---: | :---: | :---: |
-| **HumanEval** | Python Code Generation | **82.4%** | 48.2% | 79.3% |
-| **MBPP** | Basic Python Programming | **86.1%** | 56.0% | 84.0% |
-| **HumanEval-JS** | JavaScript Generation | **78.5%** | 43.1% | 74.2% |
-| **GSM8K** | Math & Logic | **89.0%** | 40.2% | 78.0% |
-| **LiveCodeBench** | Hard/Competition Problems | **41.2%** | 22.0% | 38.5% |
-
-*(Benchmarks run using 0-shot and few-shot settings comparable to standard reporting)*
-
+Benchmarks are being conducted...
 ---
 
 ## 🚀 Installation & Usage