---
license: mit
language:
- en
- ko
- code
library_name: transformers
tags:
- code-llama
- code-review
- fine-tuning
- SFT
- LoRA
pipeline_tag: text-generation
base_model:
- codellama/CodeLlama-7b-hf
---

# Model Card for codellama-7b-code-review

This model card describes the `codellama-7b-code-review` model.

## Model Details

This model is fine-tuned from Meta's `codellama/CodeLlama-7b-hf` to review and provide feedback on code changes (`diffs`) from GitHub Pull Requests. It was trained primarily on JavaScript and React code reviews and aims to generate constructive feedback from a senior engineer's perspective on topics such as code quality, architecture, performance, and conventions.

- **Developed by:** [ken123777](https://huggingface.co/ken123777)
- **Model type:** Causal Language Model
- **Language(s):** English, Korean, diff format
- **License:** apache-2.0
- **Finetuned from model:** `codellama/CodeLlama-7b-hf`

### Model Sources

- **Repository:** [https://huggingface.co/ken12377/codellama-7b-code-review](https://huggingface.co/ken12377/codellama-7b-code-review)

## Uses

### Direct Use

This model can be used directly for code-review automation. Given code changes in `diff` format as input, the model generates review comments on them.

**Warning:** Content generated by the model always requires review; the final decision must be made by a human developer.

### Downstream Use

This model can serve as a base for further fine-tuning on a specific project's internal coding conventions or on more specialized review criteria.

### Out-of-Scope Use

This model is specialized for code-review tasks. It may not perform well for other purposes such as general-purpose chat, code generation, or translation. In particular, inputting code that is not in `diff` format may lead to unexpected results.
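As a minimal sketch of preparing a direct-use input, the diff can be wrapped in the instruction/input/response template this card uses for inference. The helper name `build_review_prompt` and the English wording of the instruction are illustrative, not part of the released model:

```python
def build_review_prompt(diff_code: str) -> str:
    """Wrap a unified diff in the instruction/input/response template the
    model expects (helper name and English wording are illustrative)."""
    return (
        "### Instruction:\n"
        "The code below is the diff of a pull request. Provide detailed, "
        "specific feedback on at least three areas for improvement.\n\n"
        "### Input:\n"
        "```diff\n"
        f"{diff_code}\n"
        "```\n\n"
        "### Response:\n\n1. "
    )

# Example: a toy one-line diff
prompt = build_review_prompt("-const x = 1;\n+const x = 2;")
print(prompt.splitlines()[0])  # → ### Instruction:
```

Ending the prompt with `1. ` nudges the model to answer as a numbered list of review comments.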

## Bias, Risks, and Limitations

- **Data bias:** The model was trained on public GitHub Pull Request data, so it may be biased towards the specific coding styles or patterns present in that data.
- **Inaccuracy (hallucination):** The model may occasionally generate feedback that is factually incorrect or out of context. Generated reviews always need verification.
- **Limited knowledge:** The model's knowledge is limited to the data available at the time of fine-tuning and may not reflect the latest library or framework updates.

### Recommendations

Treat the code reviews generated by the model as a draft or assistive tool that supports the development process, not as a final judgment. Critical changes should always be reviewed by a human expert.

## How to Get Started with the Model

**Note:** This model may be available in two versions: **Adapter** and **Merged**. Use the code path appropriate for your model type.

#### 1. Using the Adapter Model (`ken12377/codellama-7b-code-review-adapter`)

To use the adapter model, first load the base model and then apply the adapter with the `peft` library.

#### 2. Using the Merged Model (`ken12377/codellama-7b-code-review`)

If the model is fully merged with the base model, you can load it directly without `peft`.

````python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# --- Configuration (choose one) ---
# 1. For the adapter model
use_adapter = True
base_model_name = "codellama/CodeLlama-7b-hf"
adapter_or_model_name = "ken12377/codellama-7b-code-review-adapter"

# 2. For the merged model
# use_adapter = False
# adapter_or_model_name = "ken12377/codellama-7b-code-review"

# --- Load model and tokenizer ---
if use_adapter:
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(adapter_or_model_name)
    model = PeftModel.from_pretrained(base_model, adapter_or_model_name)
else:
    tokenizer = AutoTokenizer.from_pretrained(adapter_or_model_name)
    model = AutoModelForCausalLM.from_pretrained(
        adapter_or_model_name,
        torch_dtype=torch.float16,
        device_map="auto",
    )

model.eval()

# --- Inference ---
diff_code = """
--- a/src/components/LoginForm.js
+++ b/src/components/LoginForm.js
-import React from 'react';
+import React, { useState } from 'react';

-const LoginForm = () => (
-  <form>
-    <label>Email: <input type="email" /></label>
-    <br />
-    <label>Password: <input type="password" /></label>
-    <br />
-    <button type="submit">Log In</button>
-  </form>
-);
+const LoginForm = () => {
+  const [credentials, setCredentials] = useState({ email: '', password: '' });
+  /* ... (rest of the diff code) ... */
+};

export default LoginForm;
"""

# Prompt (the original card uses a Korean instruction; an English equivalent is shown)
prompt = f"""### Instruction:
The code below is the diff of a pull request. Provide detailed, specific feedback on at least three distinct areas where the code could be improved.

### Input:
```diff
{diff_code}
```

### Response:

1. """

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, repetition_penalty=1.2)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
````

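Because the prompt seeds the response with `1. `, the generated review typically comes back as a numbered list. A small post-processing helper (the name `split_feedback` is our own, not part of this repository) can split it into discrete comments:

```python
import re

def split_feedback(response: str) -> list[str]:
    """Split a numbered review ('2. ...', '3. ...') into separate comments.
    The prompt already seeds item 1, so the decoded text starts mid-item."""
    text = "1. " + response.strip()  # restore the seeded "1. " prefix
    # Split on item numbers at the start of a line
    items = re.split(r"(?m)^\s*\d+\.\s+", text)
    return [item.strip() for item in items if item.strip()]

comments = split_feedback(
    "Consider debouncing input handlers.\n2. Validate the email format.\n3. Avoid inline styles."
)
print(len(comments))  # → 3
```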
## Training Details

### Training Data

The model was fine-tuned on `review_dataset.json`, a file containing public Pull Request data collected from GitHub. The dataset is structured in an `instruction`, `input` (diff), `output` (review comment) format.

### Training Procedure

The model was fine-tuned with the QLoRA technique, using the `SFTTrainer` from the `trl` library and applying 4-bit quantization and LoRA (Low-Rank Adaptation) for efficient training.

#### Training Hyperparameters

- **model:** `codellama/CodeLlama-7b-hf`
- **max_seq_length:** 4096
- **lora_alpha:** 128
- **lora_dropout:** 0.1
- **lora_r:** 64
- **learning_rate:** 2e-4
- **optimizer:** paged_adamw_32bit
- **gradient_accumulation_steps:** 8
- **per_device_train_batch_size:** 2
- **max_steps:** 1900
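A hedged sketch of how such a QLoRA run could be wired up with `trl` and `peft` under the hyperparameters listed above. The quantization settings, the LoRA `target_modules`, the output directory, and the exact `trl` API (recent versions use `SFTConfig`) are assumptions on our part, not details confirmed by this card:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig

# 4-bit quantization for QLoRA (settings assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf", quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

# LoRA configuration matching the card's hyperparameters; target_modules assumed
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# instruction / input / output records, as described under Training Data
dataset = load_dataset("json", data_files="review_dataset.json", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="codellama-7b-code-review-adapter",
        max_seq_length=4096,
        learning_rate=2e-4,
        optim="paged_adamw_32bit",
        gradient_accumulation_steps=8,
        per_device_train_batch_size=2,
        max_steps=1900,
    ),
)
trainer.train()
```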

## Compute Infrastructure

- **Hardware type:** RunPod Cloud GPU
- **Cloud provider:** RunPod