---
license: apache-2.0
tags:
- text-generation
- text
- chat
pipeline_tag: text-generation
---

<p align="center">
  <img alt="Continue-1-OSS" src="https://github.com/SVECTOR-CORPORATION/Continue-1-OSS/blob/main/Continue-1-OSS-image-banner.jpg?raw=true" width="800">
</p>

# Continue-1-OSS

### Advanced Text Generation Model

<div align="left" style="line-height: 1;">
  <a href="https://spec-chat.tech" target="_blank" style="margin: 2px;">
    <img alt="SVECTOR" src="https://img.shields.io/badge/💬%20Spec%20Chat-Spec%20Chat-blue?style=plastic" style="display: inline-block; vertical-align: middle;"/>
  </a>
  
  <a href="https://huggingface.co/SVECTOR-CORPORATION" target="_blank" style="margin: 2px;">
    <img alt="SVECTOR" src="https://img.shields.io/badge/🤗%20Hugging%20Face-SVECTOR-536af5?color=536af5&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  
  <a href="https://huggingface.co/SVECTOR-CORPORATION/Continue-1-OSS/blob/main/LICENSE" style="margin: 2px;">
    <img alt="License" src="https://img.shields.io/badge/License-Apache%202.0-blue?color=1e88e5&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  
  <a href="https://github.com/SVECTOR-CORPORATION/Continue-1-OSS" target="_blank" style="margin: 2px;">
    <img alt="GitHub" src="https://img.shields.io/badge/GitHub-Continue--1--OSS-181717?logo=github&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

## Introduction

We are thrilled to introduce **Continue-1-OSS**, an advanced text generation model developed by SVECTOR. It is built on the Continue-1 architecture and optimized for high-quality text generation, instruction following, and long-context understanding.

**Continue-1-OSS** is engineered to provide:

- **Superior Instruction Following:** Accurately follows complex, multi-step instructions
- **Long Context:** Handles inputs of up to 128K (131,072) tokens
- **Natural Conversations:** Human-like dialogue with strong reasoning capabilities
- **Tool Integration:** Built-in support for function calling and external tool use
- **Open Source:** Fully accessible under the Apache 2.0 license for research and commercial use

This model combines the power of transformer architecture with advanced training techniques to deliver exceptional performance across a wide range of natural language tasks.

### Model Specifications

- **Base Architecture:** Continue1ForCausalLM (transformer decoder)
- **Model Type:** continue_oss
- **Parameters:** 3 Billion
- **Context Length:** 131,072 tokens
- **Vocabulary Size:** 128,256 tokens
- **Hidden Size:** 3072
- **Number of Layers:** 28
- **Attention Heads:** 24
- **License:** Apache 2.0


## Requirements

To use Continue-1-OSS, install the required dependencies:

```bash
pip install transformers torch accelerate  # accelerate is needed for device_map="auto"
pip install vllm  # For fast inference (optional but recommended)
```

## Quickstart

### Basic Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "SVECTOR-CORPORATION/Continue-1-OSS"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Prepare conversation
messages = [
    {"role": "user", "content": "What is machine learning?"}
]

# Apply chat template and generate
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

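# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)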
```

### Using vLLM (Recommended for Production)

For higher-throughput inference, install vLLM and load the model through it:

```bash
pip install vllm
```

```python
from vllm import LLM, SamplingParams

# Initialize model
llm = LLM(
    model="SVECTOR-CORPORATION/Continue-1-OSS",
    trust_remote_code=True,
    max_model_len=8192
)

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.9,
    max_tokens=512
)

# Generate
messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

outputs = llm.chat(messages, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```

**Default System Prompt:** "You are Continue-1-OSS, an advanced AI assistant developed by SVECTOR. You are designed to be helpful, harmless, and honest."

## Advanced Features

### Multi-Turn Conversations

```python
messages = [
    {"role": "system", "content": "You are Continue-1-OSS, a helpful AI assistant."},
    {"role": "user", "content": "What is quantum computing?"},
    {"role": "assistant", "content": "Quantum computing is a type of computing that uses quantum mechanics principles..."},
    {"role": "user", "content": "Can you explain that more simply?"}
]
```
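A conversation like the one above is usually built up turn by turn. The following is a minimal sketch of that bookkeeping, assuming the plain `role`/`content` message dictionaries shown here; the helper names `append_turn` and `trim_history` are hypothetical and are not part of the model's API.

```python
from typing import Dict, List

Message = Dict[str, str]
ALLOWED_ROLES = {"system", "user", "assistant"}


def append_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role}")
    return history + [{"role": role, "content": content}]


def trim_history(history: List[Message], max_messages: int) -> List[Message]:
    """Keep all system messages plus the most recent `max_messages` others."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    if max_messages <= 0:
        return system
    return system + rest[-max_messages:]


history = [{"role": "system", "content": "You are Continue-1-OSS, a helpful AI assistant."}]
history = append_turn(history, "user", "What is quantum computing?")
history = append_turn(history, "assistant", "Quantum computing is a type of computing that uses quantum mechanics principles...")
history = append_turn(history, "user", "Can you explain that more simply?")

# Drop the oldest non-system messages once the conversation grows too long
trimmed = trim_history(history, max_messages=1)
print(len(history), len(trimmed))
```

The resulting list can be passed to `tokenizer.apply_chat_template` exactly as in the Basic Usage example. Trimming by message count is only one possible policy; trimming by token count would follow the same pattern.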

### Tool Calling Support

Continue-1-OSS supports function calling for tool integration:

```python
messages = [
    {"role": "user", "content": "What's the weather in San Francisco?"}
]

# The model can generate JSON function calls
# Example output: {"name": "get_weather", "parameters": {"location": "San Francisco"}}
```
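Once the model emits a JSON function call, the application still has to parse it and run the matching function. Below is a minimal sketch of that step, assuming the `{"name": ..., "parameters": ...}` shape shown in the example output. The `get_weather` stub and the `TOOLS` registry are hypothetical placeholders, not something shipped with the model.

```python
import json
from typing import Any, Callable, Dict


def get_weather(location: str) -> str:
    """Hypothetical stub; a real tool would query a weather service."""
    return f"Weather lookup requested for {location}"


# Registry mapping tool names to Python callables (illustrative only)
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_tool_call(raw_output: str) -> Any:
    """Parse a JSON tool call and run the matching registered function."""
    call = json.loads(raw_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("parameters", {}))


model_output = '{"name": "get_weather", "parameters": {"location": "San Francisco"}}'
result = dispatch_tool_call(model_output)
print(result)
```

In a real application the tool result would typically be appended to the conversation and sent back to the model for a final answer. Rejecting unknown tool names, as done here, keeps model output from calling arbitrary functions.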


## Use Cases

Continue-1-OSS excels at:

- **Conversational AI:** Build chatbots and virtual assistants with natural dialogue
- **Content Generation:** Generate articles, stories, and creative content
- **Code Assistance:** Help with coding tasks, debugging, and code explanations
- **Question Answering:** Answer questions based on context with high accuracy
- **Summarization:** Condense long documents into concise summaries
- **Data Extraction:** Extract structured data from unstructured text
- **Tool Integration:** Call functions and use external tools intelligently
- **Education:** Create educational content and tutoring assistance
- **Customer Service:** Automated support with natural language understanding

## Performance

- **Quality:** State-of-the-art instruction following and text generation
- **Speed:** Fast inference with vLLM optimization
- **Memory:** ~7 GB GPU memory (BF16), ~14 GB (FP32)
- **Context:** Accepts up to 128K tokens; quality may degrade on very long inputs (see Limitations)
- **Efficiency:** Competitive with much larger models on many tasks
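The memory figures above can be sanity-checked with simple arithmetic. This sketch assumes the rounded "3 Billion" parameter count from the specifications and counts weights only, which gives about 6 GB in BF16 and 12 GB in FP32. The remaining gap to the listed figures is presumably activations, KV cache, and framework overhead, though the card does not break this down.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 3e9  # rounded "3 Billion" from the model specifications

bf16_gb = weight_memory_gb(NUM_PARAMS, 2)  # BF16 stores 2 bytes per parameter
fp32_gb = weight_memory_gb(NUM_PARAMS, 4)  # FP32 stores 4 bytes per parameter

print(f"BF16 weights: {bf16_gb:.1f} GB")
print(f"FP32 weights: {fp32_gb:.1f} GB")
```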

## Model Architecture

Continue-1-OSS uses a custom architecture based on the transformer decoder:

- **Architecture Class:** `Continue1ForCausalLM`
- **Config Class:** `Continue1Config`
- **Hidden Size:** 3072
- **Num Layers:** 28
- **Num Attention Heads:** 24
- **Intermediate Size:** 8192
- **Vocab Size:** 128,256
- **Max Position Embeddings:** 131,072

The model uses RoPE (Rotary Position Embeddings) for positional encoding and supports extended context through position interpolation.
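To illustrate the idea, here is a small pure-Python sketch of rotary embeddings with linear position interpolation. It is not the model's implementation: the base of 10000, the adjacent-pair layout, and the `trained_len`/`target_len` values are assumptions chosen for the example, and the scaling configuration Continue-1-OSS actually uses is not stated in this card.

```python
import math
from typing import List


def rope_rotate(vec: List[float], position: float, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    dim = len(vec)
    if dim % 2:
        raise ValueError("dimension must be even")
    out: List[float] = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def interpolate_position(position: int, trained_len: int, target_len: int) -> float:
    """Linearly squeeze positions of a longer context into the trained range."""
    scale = min(1.0, trained_len / target_len)
    return position * scale


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.0, 0.0, 1.0]
k = [0.5, 0.5, 1.0, 0.0]

# Attention scores depend only on the relative offset (2 in both cases)
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 3))
score_b = dot(rope_rotate(q, 12), rope_rotate(k, 10))
print(math.isclose(score_a, score_b, abs_tol=1e-9))

# Hypothetical lengths: position 8192 maps back inside a 4096-token trained range
print(interpolate_position(8192, trained_len=4096, target_len=8192))
```

The first check shows the property that makes RoPE attractive: the query-key score is a function of relative distance. Interpolation rescales positions so that a longer sequence reuses the angle range seen during training.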

## Training

Continue-1-OSS was developed using:
- High-quality instruction datasets covering diverse tasks
- Conversational and reasoning data for improved dialogue
- Code and technical content for developer assistance
- Multi-turn dialogue for contextual understanding

Training utilized:
- Advanced optimization techniques
- Careful hyperparameter tuning
- Quality filtering and data curation
- Evaluation on diverse benchmarks


## Limitations

As with any language model, Continue-1-OSS has certain limitations:

- **Knowledge Cutoff:** Training data is limited to information available up to December 2023
- **Factual Accuracy:** May occasionally generate incorrect or outdated information
- **Specialized Domains:** Performance may vary on highly specialized technical knowledge
- **Long Context:** Very long contexts (>64K tokens) may impact generation quality
- **Languages:** Primarily optimized for English; other languages have limited support
- **Reasoning:** Complex multi-step reasoning may require careful prompting
- **Compute:** Requires GPU for optimal performance (CPU is significantly slower)

## Ethical Considerations

SVECTOR is committed to responsible AI development. Users should:

- **Transparency:** Disclose when content is AI-generated
- **Verification:** Always fact-check important information generated by the model
- **Bias Awareness:** Be aware the model may reflect biases present in training data
- **Privacy:** Do not input personal or sensitive information without proper safeguards
- **Safety:** Implement content filtering and guardrails for production applications
- **Responsible Use:** Do not use for illegal purposes, misinformation, or harmful content
- **Attribution:** Credit the model when used in public projects or research

## Performance Tips

1. **Temperature Settings:**
   - 0.0-0.3 for factual/deterministic tasks
   - 0.7-0.9 for creative tasks
   
2. **Context Management:**
   - The model supports 128K tokens, but consider truncating inputs for faster inference
   - Use sliding window for very long documents

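3. **Batch Processing:**
   - Use vLLM for efficient batched inference in production
   - Group similar-length prompts together

4. **Memory Optimization:**
   - Load the model in 4-bit with `bitsandbytes` (`pip install bitsandbytes`) to reduce GPU memory use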

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "SVECTOR-CORPORATION/Continue-1-OSS",
    trust_remote_code=True,
    quantization_config=quantization_config,
    device_map="auto"
)
```
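The Context Management and Batch Processing tips above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: `sliding_windows` and `bucket_by_length` are hypothetical helpers, the token IDs are stand-ins for real tokenizer output, and character length is used as a rough proxy for token length when grouping prompts.

```python
from typing import Iterator, List, Sequence


def sliding_windows(tokens: Sequence[int], window: int, stride: int) -> Iterator[Sequence[int]]:
    """Yield overlapping windows that together cover the whole token sequence."""
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("need 0 < stride <= window")
    start = 0
    while True:
        yield tokens[start:start + window]
        if start + window >= len(tokens):
            break
        start += stride


def bucket_by_length(prompts: List[str], batch_size: int) -> List[List[str]]:
    """Sort prompts by length, then cut them into batches of similar size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(prompts, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


token_ids = list(range(10))  # stand-in for tokenizer output
windows = list(sliding_windows(token_ids, window=4, stride=2))
print(len(windows), windows[-1])

batches = bucket_by_length(["aaaa", "a", "aaa", "aa"], batch_size=2)
print(batches)
```

A stride smaller than the window leaves an overlap between consecutive chunks, so text near a boundary appears in two windows. Grouping prompts of similar length reduces the padding needed within each batch.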


## License

This model is released under the **Apache License 2.0**. You are free to use, modify, and distribute this model for both commercial and non-commercial purposes. See the [LICENSE](https://huggingface.co/SVECTOR-CORPORATION/Continue-1-OSS/blob/main/LICENSE) file for complete details.

---

<p align="center">
    <i>Developed by <a href="https://www.svector.co.in">SVECTOR</a></i>
</p>