---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- terraform
- terraform-configuration
- infrastructure-as-code
- iac
- hashicorp
- codellama
- lora
- qlora
- peft
- code-generation
- devops
- cloud
- automation
- configuration-management
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# terraform-codellama-7b

A specialized LoRA fine-tuned model for Terraform infrastructure-as-code generation, built on CodeLlama-7b-Instruct-hf. This model excels at generating Terraform configurations, HCL (HashiCorp Configuration Language) code, and infrastructure automation scripts.

## Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tuned version of CodeLlama-7b-Instruct-hf, specifically optimized for generating Terraform configuration files. It was trained on public Terraform Registry documentation to learn Terraform syntax, resource configurations, and best practices.

### Key Features

- **Specialized for Terraform**: Fine-tuned specifically for infrastructure-as-code generation
- **Efficient Training**: Uses QLoRA (4-bit quantization + LoRA) for memory-efficient training
- **Public Data Only**: Trained exclusively on public Terraform Registry documentation
- **Production Ready**: Optimized for real-world Terraform development workflows

## Model Details

- **Developed by**: Rafi Al Attrach, Patrick Schmitt, Nan Wu, Helena Schneider, Stefania Saju (TUM + IBM research project)
- **Model type**: LoRA fine-tuned CodeLlama
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuned from**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Training method**: QLoRA (4-bit quantization + LoRA)

### Technical Specifications

- **Base Model**: CodeLlama-7b-Instruct-hf
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)
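
For reference, a minimal sketch of how these specifications map onto a `peft`/`bitsandbytes` configuration; the exact training script is not published, so treat this as an approximation rather than the authors' setup:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit (fp4) quantization, as listed above; the compute dtype is an assumption
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters taken directly from the specifications
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```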

## Uses

### Direct Use

This model is designed for:

- Generating Terraform configuration files
- Infrastructure-as-code development
- Terraform resource configuration
- DevOps automation
- Cloud infrastructure planning

### Example Use Cases

```python
# Generate an AWS EC2 instance configuration
prompt = "Create a Terraform configuration for an AWS EC2 instance with t3.medium instance type"

# Generate an Azure resource group
prompt = "Create a Terraform configuration for an Azure resource group in West Europe"

# Generate a GCP compute instance
prompt = "Create a Terraform configuration for a GCP compute instance with Ubuntu 20.04"
```

## How to Get Started

### Installation

```bash
pip install transformers torch peft accelerate bitsandbytes
```

### Loading the Model

#### GPU Usage (Recommended)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the base model with 4-bit quantization (GPU)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set a pad token (CodeLlama's tokenizer has none by default)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

#### CPU Usage (Alternative)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model in full precision (CPU compatible; roughly 28 GB of RAM for 7B fp32 weights)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float32,
    device_map="cpu",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set a pad token (CodeLlama's tokenizer has none by default)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```
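
If you load the adapter onto a full-precision model (as in the CPU example above), you can optionally merge the LoRA weights into the base model for simpler deployment. This is a standard PEFT operation, not something the card prescribes; the output path below is illustrative:

```python
# Merge the adapter into the base weights and drop the PEFT wrapper.
# Avoid merging into a 4-bit quantized model.
merged = model.merge_and_unload()
merged.save_pretrained("terraform-codellama-7b-merged")
tokenizer.save_pretrained("terraform-codellama-7b-merged")
```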

### Usage Example

```python
def generate_terraform(prompt, max_length=512):
    # Tokenize and move inputs to the model's device (GPU or CPU)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example usage
prompt = "Create a Terraform configuration for an AWS S3 bucket with versioning enabled"
result = generate_terraform(prompt)
print(result)
```
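
The card does not state which prompt template was used during fine-tuning. If raw prompts underperform, wrapping them in the base model's instruction format is worth trying; this mirrors CodeLlama-Instruct's template and is an assumption, not a documented requirement of this adapter:

```python
# CodeLlama-Instruct-style wrapper (assumed, not confirmed for this adapter)
instruct_prompt = f"[INST] {prompt} [/INST]"
result = generate_terraform(instruct_prompt)
```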

## Training Details

### Training Data

- **Source**: Public Terraform Registry documentation
- **Data Type**: Terraform configuration files and documentation
- **Preprocessing**: Standard text preprocessing, truncated to a maximum sequence length of 512 tokens (see the sketch below)
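
The card describes preprocessing only at this level of detail, so the following is a hypothetical sketch; the `"text"` field name is an assumption:

```python
# Hypothetical preprocessing step: tokenize and truncate to 512 tokens
def preprocess(example):
    return tokenizer(
        example["text"],   # assumed dataset field name
        truncation=True,
        max_length=512,
    )
```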

### Training Procedure

- **Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)

### Training Hyperparameters

- **Training regime**: 4-bit mixed precision
- **LoRA Dropout**: 0.0
- **Learning Rate**: Tuned for QLoRA training (exact value not published)
- **Batch Size**: Chosen for memory efficiency (exact value not published)

## Limitations and Bias

### Known Limitations

- **Context Length**: Limited to 512 tokens by the training configuration
- **Domain Specificity**: Optimized for Terraform; may not perform well on other infrastructure tools
- **Base Model Limitations**: Inherits the limitations of CodeLlama-7b-Instruct-hf

### Recommendations

- Use for Terraform-specific tasks only
- Validate generated configurations before deployment (see the example below)
- Keep the 512-token context limit in mind for complex configurations
- For production use, always review and test generated code
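
One lightweight way to follow the validation recommendation is Terraform's own toolchain; for example, after saving a generated configuration to `main.tf` (an illustrative file name):

```bash
terraform fmt -check           # verify canonical formatting
terraform init -backend=false  # install providers without configuring state
terraform validate             # check syntax and provider schemas
```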

## Environmental Impact

- **Training Method**: QLoRA significantly reduces computational requirements compared to full fine-tuning
- **Hardware**: Trained with memory-efficient 4-bit quantization (specific hardware not reported)
- **Carbon Footprint**: Lower than full fine-tuning due to QLoRA's efficiency

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{terraform-codellama-7b,
  title  = {terraform-codellama-7b: A LoRA Fine-tuned Model for Terraform Code Generation},
  author = {Rafi Al Attrach and Patrick Schmitt and Nan Wu and Helena Schneider and Stefania Saju},
  year   = {2024},
  url    = {https://huggingface.co/rafiaa/terraform-codellama-7b}
}
```

## Related Models

- **Base Model**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Enhanced Version**: [rafiaa/terraform-cloud-codellama-7b](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) (recommended; also trained on cloud provider documentation)

## Model Card Contact

- **Author**: rafiaa
- **Model Repository**: [Hugging Face model page](https://huggingface.co/rafiaa/terraform-codellama-7b)
- **Issues**: Please report issues through the Hugging Face model page

---

*This model is part of a research project conducted in early 2024, focusing on specialized code generation for infrastructure-as-code tools.*