YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Anni Logo
Anni

Hugging Face ModelScope Model GitHub Repo

Anni is a high-performance code assistant built upon the Qwen3 14B architecture. Fine-tuned on the OpenCodeReasoning-2 dataset, Anni is engineered to excel in deep algorithmic reasoning, competitive programming logic, and the implementation of complex, high-efficiency data structures.


πŸš€ Model Overview

Property Value
Base Model Qwen3 14B
Model Type Language Model for Code
Context Length 32,000 tokens
Precision BF16 / safetensors (merged)
Inference Framework vLLM compatible

πŸ’» Usage

Get started immediately using the provided Google Colab notebooks:

  • (Recommended) GGUF Inference : Open the Colab Notebook to run standard inference.

  • vLLM Serving: Open the Colab Notebook to run inference using the vLLM server.


πŸ› οΈ Development Setup

Prerequisites

  1. Python Dependencies:
    pip install -r requirements.txt
    
  2. System Tools: Ensure tmux is installed on your system (required for training scripts).

Configuration

  1. Environment Variables: Rename the example environment file and add your API tokens (WandB, HuggingFace, ModelScope).

    mv config/example.env config/.env
    # Edit config/.env with your keys
    
  2. Training Config: Edit config/config.yaml to adjust hyperparameters.

    • Note: Specify the LOCAL_STORAGE_PATH in src/train.py before starting training.

Running Training

To start the training process, run the shell script:

./scripts/train.sh

πŸ“‚ Project Structure

Source (src/)

File Description
preprocess.py Downloads the OpenCodeReasoning-2 dataset and preprocesses it for training.
train.py Downloads the base model and fine-tunes it on the preprocessed dataset.
save.py Loads the fine-tuned LoRA adapters and saves the model as merged 16-bit and GGUF formats.
upload.py Uploads the merged model to Hugging Face and ModelScope.

Scripts (scripts/)

File Description
train.sh Runs the training script with specified parameters.
eval.sh Evaluates the model on the LiveCodeBench dataset.
serve.sh Serves the model using the vLLM server.
terminate_train.sh Terminates the training process.

Frontend (web/)

The frontend code for Anni is available in the web directory. πŸ‘‰ View Frontend Documentation


βš–οΈ License

This repository’s model and its training code are released under the MIT License.
All other elements, such as frontend code, project name and logo, are trademarks of the developer and owner of this repository (Hans) and may not be used without explicit permission.


πŸ“š Training Dataset Notice

The training dataset includes openly licensed sources under CC-BY-4.0, which permits commercial use with attribution.

Attribution:

Note: The dataset itself is not included in this model release.


⚠️ Disclaimer

This model may generate incorrect or unsafe code. Evaluate and verify outputs before using in production.

Downloads last month
1
Safetensors
Model size
15B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for BigJuicyData/Anni

Quantizations
2 models