🧩 Arche 3.5 Codium 3B – LoRA Adapter

QLoRA fine-tune of Qwen2.5-Coder-3B-Instruct

This repository contains only the LoRA adapter (checkpoint-2200), not a merged model. Use it with the base model to generate code, or continue training from where we left off.

📧 Contact: opensynapselabs@proton.me

🚀 Why This Adapter?

Lightweight – only ~100 MB, downloads in seconds.
Fully trainable – you can continue fine-tuning or merge it into the base model.
Community-ready – perfect for collaborative improvements (more data, longer training).
Local & private – no API keys, all inference runs on your machine.

⚠️ Note: This checkpoint (step 2200) is not fully converged (loss ~0.66). We stopped early due to Colab T4 GPU session limits. The adapter still generates valid code for many tasks – and you can easily continue training!

📦 Usage

1. Load base model + adapter (inference)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "Qwen/Qwen2.5-Coder-3B-Instruct"
adapter_path = "opensynapselabs/arche3.5-codium-3b-lora"

# Load base model (fp16 recommended)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(model, adapter_path)

# Generate
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that returns the two largest numbers from a list."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,
    do_sample=True,
    top_p=0.95,
)

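# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))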

2. Merge adapter into base model (optional)

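# Fold the LoRA weights into the base weights and drop the PEFT wrappers.
# Assumes `model` is the fp16 PeftModel loaded in step 1.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged_model")
tokenizer.save_pretrained("./merged_model")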

3. Resume training from this checkpoint

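The checkpoint includes full optimizer and scheduler states. transformers.Trainer resumes from a local checkpoint directory, so download the repository first (this requires accepted gated access and a logged-in Hugging Face token) and pass the local path. This assumes the checkpoint files sit at the repository root:

from huggingface_hub import snapshot_download

checkpoint_dir = snapshot_download("opensynapselabs/arche3.5-codium-3b-lora")
trainer.train(resume_from_checkpoint=checkpoint_dir)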

See our training notebook for details.
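
Resuming after a Colab session timeout comes down to locating the newest checkpoint directory in the output folder. The following is a minimal sketch, assuming checkpoints use the Trainer's default `checkpoint-<step>` naming. The helper name `find_latest_checkpoint` and the Drive path in the usage comment are illustrative and not taken from the training notebook; transformers also ships a comparable helper, `transformers.trainer_utils.get_last_checkpoint`.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the Trainer's default checkpoint folder names, e.g. "checkpoint-2200".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir: str) -> Optional[str]:
    """Return the path of the highest-numbered checkpoint-<step> directory, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None

    best_step = -1
    best_path = None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            step = int(match.group(1))  # compare numerically, not as strings
            if step > best_step:
                best_step, best_path = step, child
    return str(best_path) if best_path is not None else None


# Hypothetical usage with an already constructed `trainer`:
# last = find_latest_checkpoint("/content/drive/MyDrive/arche-checkpoints")
# trainer.train(resume_from_checkpoint=last)  # None starts a fresh run
```

Comparing step numbers as integers matters here: a plain string sort would rank `checkpoint-400` above `checkpoint-2200`.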

🔧 Adapter Configuration

| Parameter | Value |
|---|---|
| LoRA rank `r` | 16 |
| LoRA alpha | 32 |
| Dropout | 0.05 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Quantization | 4-bit NF4 (double quant) |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 1024 |
| Training steps | 2200 |
| Effective batch size | 24 (3 × 8 accumulation) |
| Loss at step 2200 | ~0.66 |
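
To continue training with matching settings, the values above map onto the usual peft and transformers config objects roughly as follows. This is a sketch, not the exact training script: entries marked "assumption" (bias, task type, compute dtype, fp16, the output directory) are not stated in this card, and `save_steps=200` comes from the Training Setup section. The library imports are kept inside `build_configs` so the constants can be inspected without GPU packages installed.

```python
# Values copied from the configuration table; "assumption" marks anything not stated in the card.
LORA_KWARGS = dict(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",            # assumption
    task_type="CAUSAL_LM",  # assumption: usual choice for decoder-only models
)

BNB_KWARGS = dict(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

TRAINING_KWARGS = dict(
    per_device_train_batch_size=3,
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",
    save_steps=200,
    fp16=True,  # assumption: the T4 has no bf16 support
)

MAX_SEQ_LENGTH = 1024  # applied when tokenizing; not a TrainingArguments field


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Per-device batch × accumulation steps × number of devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_configs(output_dir: str = "./arche-lora-out"):
    """Instantiate the library objects (needs torch, peft, transformers and a CUDA GPU)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig, TrainingArguments

    lora = LoraConfig(**LORA_KWARGS)
    bnb = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16,  # assumption
        **BNB_KWARGS,
    )
    args = TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
    return lora, bnb, args
```

With these values the LoRA update is scaled by `lora_alpha / r = 2.0`, and `effective_batch_size(TRAINING_KWARGS)` gives the 24 listed in the table.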

📊 Real Example Generations (Zero-shot)

Below are actual outputs from the adapter on the first ten HumanEval tasks (temperature=0.2). Eight of the ten are correct as generated; HE/1 contains a syntax error and HE/6 a logic error, which we attribute to early stopping.

📌 has_close_elements (HE/0)

Prompt:

from typing import List

def has_close_elements(numbers: List[float], threshold: float) -> bool:
    """ Check if in given list of numbers, are any two numbers closer to each other
    than given threshold.
    >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
    False
    >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
    True
    """

Generated:

    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if abs(numbers[i] - numbers[j]) < threshold:
                return True
    return False

📌 separate_paren_groups (HE/1)

Prompt:

from typing import List

def separate_paren_groups(paren_string: str) -> List[str]:
    """ Input to this function is a string containing multiple groups of nested parentheses.
    Separate those groups into separate strings and return the list of those.
    >>> separate_paren_groups("( ) (( )) (( )( ))")
    ['()', '(())', '(()())']
    """

Generated:

from typing import List

def separate_paren_groups(paren_string: str) -> List[str:
    result = []
    current_group = ""
    depth = 0
    for char in paren_string:
        if char == '(':
            depth += 1
            current_group += char
        elif char == ')':
            depth -= 1
            current_group += char
            if depth == 0:
                result.append(current_group)
                current_group = ""
    return result

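⚠️ Note: The generated signature contains a syntax error (`List[str:` instead of `List[str]:`), so the code does not parse as written. The function body itself is correct. Slips like this are typical for an early checkpoint.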

📌 truncate_number (HE/2)

Prompt:

def truncate_number(number: float) -> float:
    """ Return the decimal part of a positive floating point number.
    >>> truncate_number(3.5)
    0.5
    """

Generated:

def truncate_number(number: float) -> float:
    return number % 1

📌 below_zero (HE/3)

Prompt:

from typing import List

def below_zero(operations: List[int]) -> bool:
    """ Return True if balance falls below zero at any point, otherwise False.
    >>> below_zero([1, 2, 3])
    False
    >>> below_zero([1, 2, -4, 5])
    True
    """

Generated:

    balance = 0
    for op in operations:
        balance += op
        if balance < 0:
            return True
    return False

📌 mean_absolute_deviation (HE/4)

Prompt:

from typing import List

def mean_absolute_deviation(numbers: List[float]) -> float:
    """ Calculate Mean Absolute Deviation: MAD = average | x - mean(x) |
    >>> mean_absolute_deviation([1.0, 2.0, 3.0, 4.0])
    1.0
    """

Generated:

    mean = sum(numbers) / len(numbers)
    mad = sum(abs(num - mean) for num in numbers) / len(numbers)
    return mad

📌 intersperse (HE/5)

Prompt:

from typing import List

def intersperse(numbers: List[int], delimeter: int) -> List[int]:
    """ Insert delimeter between every two consecutive elements.
    >>> intersperse([1, 2, 3], 4)
    [1, 4, 2, 4, 3]
    """

Generated:

    result = []
    for i, n in enumerate(numbers):
        result.append(n)
        if i < len(numbers) - 1:
            result.append(delimeter)
    return result

📌 parse_nested_parens (HE/6)

Prompt:

from typing import List

def parse_nested_parens(paren_string: str) -> List[int]:
    """ For each group of nested parentheses, output the deepest level of nesting.
    >>> parse_nested_parens("(()()) ((())) () ((())(()))")
    [2, 3, 1, 3]
    """

Generated:

    stack = []
    result = []
    for char in paren_string:
        if char == '(':
            stack.append(char)
        elif char == ')':
            depth = len(stack)
            stack.pop()
            result.append(depth)
    return result
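
⚠️ Note: The function as generated is incorrect. It appends one value per closing parenthesis instead of one maximum per group, so it does not return `[2, 3, 1, 3]` for the docstring example. For comparison, here is a hand-written corrected sketch (not model output) that splits on spaces and tracks the maximum depth within each group:

```python
from typing import List


def parse_nested_parens(paren_string: str) -> List[int]:
    """Return the deepest nesting level for each space-separated group."""
    result = []
    for group in paren_string.split():
        depth = 0
        max_depth = 0
        for char in group:
            if char == '(':
                depth += 1
                max_depth = max(max_depth, depth)
            elif char == ')':
                depth -= 1
        result.append(max_depth)  # one entry per group, not per parenthesis
    return result
```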

📌 filter_by_substring (HE/7)

Prompt:

from typing import List

def filter_by_substring(strings: List[str], substring: str) -> List[str]:
    """ Filter list of strings for ones that contain given substring.
    >>> filter_by_substring(['abc', 'bacd', 'cde', 'array'], 'a')
    ['abc', 'bacd', 'array']
    """

Generated:

    return [s for s in strings if substring in s]

📌 sum_product (HE/8)

Prompt:

from typing import List, Tuple

def sum_product(numbers: List[int]) -> Tuple[int, int]:
    """ Return tuple of (sum, product) of all integers. Empty list: (0, 1).
    >>> sum_product([1, 2, 3, 4])
    (10, 24)
    """

Generated:

from typing import List, Tuple

def sum_product(numbers: List[int]) -> Tuple[int, int]:
    """ Return tuple of (sum, product) of all integers. Empty list: (0, 1).
    >>> sum_product([1, 2, 3, 4])
    (10, 24)
    """
    if not numbers:
        return (0, 1)
    else:
        total_sum = sum(numbers)
        total_product = 1
        for num in numbers:
            total_product *= num
        return (total_sum, total_product)

📌 rolling_max (HE/9)

Prompt:

from typing import List

def rolling_max(numbers: List[int]) -> List[int]:
    """ Generate list of rolling maximum element found until given moment.
    >>> rolling_max([1, 2, 3, 2, 3, 4, 2])
    [1, 2, 3, 3, 3, 4, 4]
    """

Generated:

from typing import List

def rolling_max(numbers: List[int]) -> List[int]:
    """ Generate list of rolling maximum element found until given moment.
    >>> rolling_max([1, 2, 3, 2, 3, 4, 2])
    [1, 2, 3, 3, 3, 4, 4]
    """
    if len(numbers) == 0:
        return []
    max_so_far = numbers[0]
    result = [max_so_far]
    for num in numbers[1:]:
        if num > max_so_far:
            max_so_far = num
        result.append(max_so_far)
    return result
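
Samples like the ones above can be sanity-checked against their docstring examples before trusting them. The sketch below is one simple way to do that, not the official HumanEval harness; `passes_example` is an illustrative name. For body-only generations, pass the prompt and the completion concatenated as `source`.

```python
from typing import Any, Tuple


def passes_example(source: str, func_name: str, args: Tuple[Any, ...], expected: Any) -> bool:
    """Run `source` in a fresh namespace and compare func_name(*args) with `expected`.

    Returns False on syntax errors, runtime errors, or a wrong result.
    exec() runs arbitrary code: only use this on trusted code or inside a sandbox.
    """
    namespace: dict = {}
    try:
        exec(compile(source, "<generated>", "exec"), namespace)
        return namespace[func_name](*args) == expected
    except Exception:  # SyntaxError, NameError, KeyError, ...
        return False


# The HE/2 sample, and a signature with the same kind of typo as HE/1.
good = "def truncate_number(number: float) -> float:\n    return number % 1\n"
broken = "def f(s: str) -> List[str:\n    return []\n"

print(passes_example(good, "truncate_number", (3.5,), 0.5))  # True
print(passes_example(broken, "f", ("()",), []))               # False (does not compile)
```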

🧪 Training Details

Dataset Mix (50K samples)

| Dataset | Purpose | Samples | Share |
|---|---|---|---|
| `iamtarun/python_code_instructions_18k_alpaca` | Python generation | 20,000 | 40% |
| `likaixin/InstructCoder` | Code editing | 10,000 | 20% |
| `m-a-p/CodeFeedback-Filtered-Instruction` | Instruction following | 10,000 | 20% |
| `TokenBender/code_instructions_122k_alpaca_style` | General coding | 10,000 | 20% |

All formatted with Qwen's chat template (<|im_start|>system/user/assistant).
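
To show what that formatting looks like, the sketch below renders one Alpaca-style record as a ChatML string. The field names `instruction`, `input` and `output` and the system prompt are assumptions (the system prompt is borrowed from the Usage example; the card does not say which one was used in training). In practice `tokenizer.apply_chat_template` produces this layout for you.

```python
from typing import Dict

# Assumption: borrowed from the Usage section, not confirmed as the training prompt.
DEFAULT_SYSTEM = "You are a helpful coding assistant."


def to_chatml(record: Dict[str, str], system: str = DEFAULT_SYSTEM) -> str:
    """Render an Alpaca-style record as a ChatML training string."""
    user = record["instruction"]
    if record.get("input"):
        # Optional context is appended to the instruction.
        user = f"{user}\n\n{record['input']}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{record['output']}<|im_end|>\n"
    )


example = {
    "instruction": "Add two numbers.",
    "input": "",
    "output": "def add(a, b):\n    return a + b",
}
print(to_chatml(example))
```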

Training Setup

Effective batch size: 24 (per-device batch = 3, gradient accumulation = 8)
Hardware: Google Colab T4 GPU (16 GB VRAM)
Checkpointing: every 200 steps to Google Drive with full optimizer + scheduler state
Resume support: automatic continuation from last checkpoint after session timeout
Early stop reason: Colab T4 session limit (~12 hours), not convergence
Steps completed: 2200 (out of ~5200 planned)
Loss at step 2200: ~0.66

Note on step count: 2200 steps × 24 (effective batch) = 52,800 training examples processed. At this batch size one pass over the 50K samples takes about 2,083 steps, so the checkpoint has seen roughly 1.06 epochs of data. Training was interrupted before reaching the planned ~5200 steps.
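
The step arithmetic follows directly from the batch settings. Here is a small sketch using only numbers stated in this section (50K samples, per-device batch 3, accumulation 8, 2200 steps); single-GPU training is assumed and the function names are illustrative.

```python
def steps_per_epoch(num_samples: int, per_device_batch: int, grad_accum: int,
                    num_devices: int = 1) -> float:
    """Optimizer steps needed for one pass over the data (unrounded)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    return num_samples / effective_batch


def epochs_seen(steps: int, num_samples: int, per_device_batch: int, grad_accum: int,
                num_devices: int = 1) -> float:
    """Fraction of the dataset processed after `steps` optimizer steps."""
    effective_batch = per_device_batch * grad_accum * num_devices
    return steps * effective_batch / num_samples


print(round(steps_per_epoch(50_000, 3, 8), 1))     # 2083.3
print(round(epochs_seen(2200, 50_000, 3, 8), 3))   # 1.056
```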

🤝 Join Open Synapse Labs

We are an open community building local-first coding models. You are invited to contribute – especially to continue training this adapter!

🔐 Gated Access (Model Card)

This repository uses gated access. Anyone can request access, and the team reviews every request personally.

How to request access:

Create a Hugging Face account
Visit this model's page: opensynapselabs/arche3.5-codium-3b-lora
Click "Request access" – write a short intro (e.g., "I want to help with code models")
We review every request manually and will reach out to you via HF messages or email

💡 Tip: Mention your GitHub, relevant skills, or what you'd like to work on – it helps us approve faster.

Other ways to help:

Train the adapter further (more steps, better data, longer context)
Merge and share the fully trained model
Improve Arche Code – our CLI agent
Open issues, suggest datasets, write documentation

Direct contact: opensynapselabs@proton.me

📜 License

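Apache License 2.0 for the adapter weights: you may use, modify, and distribute them, including commercially, under the license's attribution and notice terms. The base model is licensed separately by the Qwen team.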

🙏 Acknowledgements

Qwen team for the base model
Dataset creators and open-source ML community
Google Colab for free T4 GPU time

⚡ What's Next?

Continue training to loss < 0.4 (estimated ~2-3 days of GPU time)
Release Arche Codium 12B MoE – expert mixture with lazy loading
Tighter integration with Arche Code CLI

Built with ❤️ by Open Synapse Labs – code that stays on your machine.
