---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
library_name: peft
model_name: typescript-slm-7b-reasoning
tags:
- typescript
- code-generation
- react
- nextjs
- angular
- nodejs
- lora
- sft
- 7b
- reasoning
- transformers
- trl
license: mit
pipeline_tag: text-generation
language:
- en
---

# TypeScript SLM 7B - Reasoning Variant

A 7B-parameter reasoning model for TypeScript code generation, optimized for React, Next.js, Angular, and Node.js.

## Model Details

- **Base Model**: [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
- **Model Size**: 7B parameters
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Context Length**: 2048 tokens
- **LoRA Rank**: 64
- **Training Dataset**: 5,000 high-quality TypeScript samples

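To give a feel for what a LoRA rank of 64 means, the sketch below counts the parameters LoRA adds to a single linear layer. A frozen `d_out × d_in` weight gains two trainable factors of shapes `rank × d_in` and `d_out × rank`, so `rank * (d_in + d_out)` extra parameters. The 3584 × 3584 layer size and the helper name `lora_param_count` are illustrative assumptions, not values taken from this adapter's configuration, and this card does not state which layers were adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # Factor A has shape (rank, d_in); factor B has shape (d_out, rank).
    return rank * (d_in + d_out)


# Assumed square projection for illustration; only rank=64 comes from this card.
d_in = d_out = 3584
added = lora_param_count(d_in, d_out, rank=64)
full = d_in * d_out

print(f"LoRA parameters: {added:,}")
print(f"Full layer:      {full:,}")
print(f"Ratio:           {added / full:.2%}")
```

Under these assumptions the adapter trains only a few percent of the parameters of each adapted layer, which is why a 7B model can be fine-tuned on a single GPU.
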
## Reasoning Capabilities

This variant includes chain-of-thought reasoning for better code understanding and generation.

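Models in the DeepSeek-R1-Distill family typically wrap their chain of thought in `<think>...</think>` tags before the final answer. Assuming this adapter preserves that format (not verified here), a small helper can separate the reasoning from the answer. The function name `split_reasoning` and the sample string are hypothetical and only serve as illustration.

```python
import re

# Assumed format: reasoning wrapped in <think>...</think>, answer after it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Hypothetical decoded output, used only to illustrate the helper.
sample = (
    "<think>The user wants a typed add function.</think>\n"
    "const add = (a: number, b: number): number => a + b;"
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If the decoded output contains no complete `<think>` block, the helper returns the whole text as the answer, so it is safe to apply to outputs that skip the reasoning step.
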
## Training Configuration

- Batch Size: 2
- Gradient Accumulation: 16
- Effective Batch Size: 32
- Learning Rate: 0.0001
- Epochs: 3
- Hardware: Google Colab A100 40GB

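The effective batch size is the per-device batch size multiplied by the gradient accumulation steps (2 × 16 = 32). The sketch below reproduces that arithmetic and estimates the number of optimizer steps. It assumes a single GPU and that all 5,000 samples are used for training with no held-out split; neither is confirmed by this card, and the exact count also depends on how the trainer handles the final partial batch.

```python
import math

per_device_batch_size = 2
gradient_accumulation_steps = 16
num_epochs = 3
num_samples = 5_000  # assumption: the full dataset is used for training
num_devices = 1      # assumption: a single A100

effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)
steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"Effective batch size: {effective_batch_size}")
print(f"Optimizer steps per epoch (approx.): {steps_per_epoch}")
print(f"Total optimizer steps (approx.): {total_steps}")
```
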
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "sylvester-francis/typescript-slm-7b-reasoning")

# Generate code
prompt = "Write a React component with TypeScript:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))
```

## Repository

https://github.com/sylvester-francis/slm-typescript-model