Instructions for using OpenGenerativeAI/Bifrost-27B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use OpenGenerativeAI/Bifrost-27B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenGenerativeAI/Bifrost-27B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenGenerativeAI/Bifrost-27B")
model = AutoModelForCausalLM.from_pretrained("OpenGenerativeAI/Bifrost-27B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
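
The slice in the final `print` call is there because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. Below is a minimal sketch of that slicing on plain Python lists. The token IDs are made up for illustration, and `split_generated` is a hypothetical helper, not part of Transformers.

```python
def split_generated(output_ids, prompt_len):
    """Split a generate()-style result into (echoed prompt IDs, new IDs).

    Assumes the decoder-only convention: the returned sequence starts
    with the prompt tokens, and the continuation follows them.
    """
    return output_ids[:prompt_len], output_ids[prompt_len:]


# Toy token IDs standing in for real tokenizer output (illustrative only).
prompt_ids = [2, 105, 2364, 107]
output_ids = prompt_ids + [4521, 236888, 106]

echoed, new_ids = split_generated(output_ids, len(prompt_ids))
print(new_ids)  # prints [4521, 236888, 106]
```

In the snippet above, `inputs["input_ids"].shape[-1]` plays the role of `prompt_len`, so only the model's continuation is decoded.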

Local Apps

vLLM

How to use OpenGenerativeAI/Bifrost-27B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenGenerativeAI/Bifrost-27B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
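
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server is reachable at the address from the curl example and returns the usual OpenAI-style response, with the reply at `choices[0].message.content`. The helper names `build_chat_request`, `extract_reply`, and `ask` are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    return response_body["choices"][0]["message"]["content"]


def ask(base_url, model, prompt, timeout=60):
    """Send one user message and return the reply (needs a running server)."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))


# Example call, assuming the vLLM server from the commands above is running:
# ask("http://localhost:8000/v1", "OpenGenerativeAI/Bifrost-27B",
#     "What is the capital of France?")
```

Building the request separately from sending it lets the payload be inspected or tested without a live server.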

Use Docker

docker model run hf.co/OpenGenerativeAI/Bifrost-27B

SGLang

How to use OpenGenerativeAI/Bifrost-27B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenGenerativeAI/Bifrost-27B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenGenerativeAI/Bifrost-27B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
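
The vLLM and SGLang examples differ only in the port they are called on (8000 and 30000 in the commands shown here). Client code that targets either server could keep that difference in one place. This is a small sketch under that assumption: `DEFAULT_PORTS` and `chat_completions_url` are hypothetical names, and the ports are taken from the examples above, not from any server configuration.

```python
# Ports as used in the curl examples above (assumed, not read from the servers).
DEFAULT_PORTS = {"vllm": 8000, "sglang": 30000}


def chat_completions_url(backend, host="localhost"):
    """Return the chat completions URL for a locally served backend."""
    try:
        port = DEFAULT_PORTS[backend.lower()]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return f"http://{host}:{port}/v1/chat/completions"


print(chat_completions_url("vllm"))
print(chat_completions_url("sglang"))
```

If either server is started on a non-default port, the mapping needs to be changed to match.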

Docker Model Runner
How to use OpenGenerativeAI/Bifrost-27B with Docker Model Runner:
```
docker model run hf.co/OpenGenerativeAI/Bifrost-27B
```

futureHQ committed on Mar 16, 2025

Commit d1482dd (verified) · 1 Parent(s): c203405

Create README.md

Files changed (1)

README.md +79 -0

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+license: gemma
+library_name: transformers
+pipeline_tag: text-generation
+extra_gated_heading: Access Gemma on Hugging Face
+extra_gated_prompt: >-
+  To access Gemma on Hugging Face, you’re required to review and agree to
+  Google’s usage license. To do this, please ensure you’re logged in to Hugging
+  Face and click below. Requests are processed immediately.
+extra_gated_button_content: Acknowledge license
+base_model: google/gemma-3-27b-it
+tags:
+- transformers
+- gemma3
+- gemma
+- google
+- Bifröst
+- Bifrost
+- code
+---
+## Bifröst-27B
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a834a8895fd6416e29576f/sAXfe0cQdULI_GEVxBstw.png)
+Bifröst-27B is a language model built on the Gemma 3 architecture (base model: `google/gemma-3-27b-it`) and fine-tuned for secure, efficient, enterprise-grade code generation with reasoning. Designed to meet rigorous standards of safety, accuracy, and reliability, Bifröst helps organizations streamline software development workflows while prioritizing security and compliance.
+### Model Details
+- **Model Name:** Bifröst-27B
+- **Base Architecture:** gemma3
+- **Application:** Enterprise Secure Code Generation
+- **Release Date:** 16-March-2025
+### Intended Use
+Bifröst is designed explicitly for:
+- Generating secure, efficient, and high-quality code.
+- Supporting development tasks within regulated enterprise environments.
+- Enhancing productivity by automating routine coding tasks without compromising security.
+### Features
+- **Security-Focused Training:** Specialized training regimen emphasizing secure coding practices, vulnerability reduction, and adherence to security standards.
+- **Enterprise-Optimized Performance:** Tailored to support various programming languages and enterprise frameworks with robust, context-aware suggestions.
+- **Compliance-Driven Design:** Incorporates features to aid in maintaining compliance with industry-specific standards (e.g., GDPR, HIPAA, SOC 2).
+### Limitations
+- Bifröst should be used under human supervision to ensure code correctness and security compliance.
+- Model-generated code should undergo appropriate security and quality assurance checks before deployment.
+### Ethical Considerations
+- Users are encouraged to perform regular audits and compliance checks on generated outputs.
+- Enterprises should implement responsible AI practices to mitigate biases or unintended consequences.
+### Usage
+Below are some quick-start instructions for using the model with the `transformers` library.
+#### Installation
+```sh
+$ pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3
+```
+#### Running with the `pipeline` API
+```python
+from transformers import pipeline
+import torch
+pipe = pipeline(
+    "text-generation",
+    model="OpenGenerativeAI/Bifrost-27B",
+    device="cuda",
+    torch_dtype=torch.bfloat16
+)
+messages = [{"role": "user", "content": "Generate a secure API key management system."}]
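+# Pass the chat messages positionally: the text-generation pipeline's first parameter is `text_inputs`, not `text`.
+output = pipe(messages, max_new_tokens=200)
+# For chat input, "generated_text" holds the full message list; the last entry is the model's reply.
+print(output[0]["generated_text"][-1]["content"])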
+```
+## Terms of Use
+This model is released under the **Gemma license**. Users must comply with [Google's Gemma Terms of Use](https://ai.google.dev/gemma/terms), including restrictions on redistribution, modification, and commercial use.