---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- terraform
- terraform-configuration
- infrastructure-as-code
- iac
- hashicorp
- codellama
- lora
- qlora
- peft
- code-generation
- devops
- cloud
- automation
- configuration-management
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---
# terraform-codellama-7b
A specialized LoRA fine-tuned model for Terraform infrastructure-as-code generation, built on CodeLlama-7b-Instruct-hf. This model excels at generating Terraform configurations, HCL (HashiCorp Configuration Language) code, and infrastructure automation scripts.
## Model Description
This model is a LoRA (Low-Rank Adaptation) fine-tuned version of CodeLlama-7b-Instruct-hf, specifically optimized for generating Terraform configuration files. It was trained on public Terraform Registry documentation to understand Terraform syntax, resource configurations, and best practices.
### Key Features
- **Specialized for Terraform**: Fine-tuned specifically for infrastructure-as-code generation
- **Efficient Training**: Uses QLoRA (4-bit quantization + LoRA) for memory-efficient training
- **Public Data Only**: Trained exclusively on public Terraform Registry documentation
- **Production Ready**: Optimized for real-world Terraform development workflows
## Model Details
- **Developed by**: Rafi Al Attrach, Patrick Schmitt, Nan Wu, Helena Schneider, Stefania Saju (TUM + IBM Research Project)
- **Model type**: LoRA fine-tuned CodeLlama
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuned from**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Training method**: QLoRA (4-bit quantization + LoRA)
### Technical Specifications
- **Base Model**: CodeLlama-7b-Instruct-hf
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)
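For reference, these specifications correspond to a PEFT `LoraConfig` roughly as sketched below. This is an illustration assembled from the values on this card, not the original training script:

```python
from peft import LoraConfig

# LoRA settings mirroring the specifications listed above
lora_config = LoraConfig(
    r=64,                                 # LoRA rank
    lora_alpha=16,                        # LoRA alpha
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```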
## Uses
### Direct Use
This model is designed for:
- Generating Terraform configuration files
- Infrastructure-as-code development
- Terraform resource configuration
- DevOps automation
- Cloud infrastructure planning
### Example Use Cases
```python
# Example prompts for different cloud providers
prompts = [
    "Create a Terraform configuration for an AWS EC2 instance with t3.medium instance type",
    "Create a Terraform configuration for an Azure resource group in West Europe",
    "Create a Terraform configuration for a GCP compute instance with Ubuntu 20.04",
]
```
## How to Get Started
### Installation
```bash
pip install transformers torch peft accelerate bitsandbytes
```
### Loading the Model
#### GPU Usage (Recommended)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model with 4-bit quantization (GPU)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the Terraform LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# CodeLlama has no pad token by default; reuse the EOS token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```
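Note that recent versions of `transformers` deprecate the bare `load_in_4bit` argument in favor of an explicit `BitsAndBytesConfig`. A minimal equivalent sketch, matching the fp4 quantization listed in the specifications:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Explicit 4-bit quantization config (fp4, as in the training setup)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
```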
#### CPU Usage (Alternative)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model in full precision (CPU compatible, but slow for a 7B model)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float32,
    device_map="cpu"
)

# Load the Terraform LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# CodeLlama has no pad token by default; reuse the EOS token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```
### Usage Example
```python
def generate_terraform(prompt, max_length=512):
    # Move inputs to the same device as the model (CPU or GPU)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Create a Terraform configuration for an AWS S3 bucket with versioning enabled"
result = generate_terraform(prompt)
print(result)
```
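The base model is instruction-tuned, so wrapping prompts in CodeLlama's `[INST]` template may improve responses. The examples on this card use raw prompts and the fine-tuning prompt format is not documented here, so treat this as an optional experiment:

```python
# Assumption: the [INST] wrapper follows the base model's chat format;
# compare outputs with and without it to see which the adapter prefers.
instruct_prompt = f"[INST] {prompt} [/INST]"
print(generate_terraform(instruct_prompt))
```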
## Training Details
### Training Data
- **Source**: Public Terraform Registry documentation
- **Data Type**: Terraform configuration files and documentation
- **Preprocessing**: Standard text preprocessing with a maximum sequence length of 512 tokens
### Training Procedure
- **Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)
### Training Hyperparameters
- **Training regime**: 4-bit quantized base model with mixed-precision adapter updates
- **LoRA Dropout**: 0.0
- **Learning Rate**: Not published here; chosen for stable QLoRA training
- **Batch Size**: Not published here; sized for memory efficiency (illustrative values in the sketch below)
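Purely as an illustration, a QLoRA run with the `transformers` `Trainer` is often configured along these lines. Every numeric value below is a common default, not the value used to train this model:

```python
from transformers import TrainingArguments

# Illustrative placeholders only; NOT the hyperparameters used to train this model
training_args = TrainingArguments(
    output_dir="terraform-codellama-7b",
    num_train_epochs=3,               # matches the card
    per_device_train_batch_size=4,    # placeholder
    gradient_accumulation_steps=4,    # placeholder
    learning_rate=2e-4,               # common QLoRA starting point
    fp16=True,
    logging_steps=10,
)
```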
## Limitations and Bias
### Known Limitations
- **Context Length**: Limited to 512 tokens due to training configuration
- **Domain Specificity**: Optimized for Terraform, may not perform well on other infrastructure tools
- **Base Model Limitations**: Inherits limitations from CodeLlama-7b-Instruct-hf
### Recommendations
- Use for Terraform-specific tasks only
- Validate generated configurations before deployment (see the CLI sketch after this list)
- Consider the 512-token context limit for complex configurations
- For production use, always review and test generated code
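A practical way to act on these recommendations is to run generated HCL through Terraform's own checks before applying it. A sketch using the standard Terraform CLI, assuming the model output was saved to `main.tf`:

```bash
# Format and statically validate the generated configuration
terraform fmt main.tf
terraform init -backend=false   # install providers without configuring state
terraform validate
terraform plan                  # review the execution plan before any apply
```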
## Environmental Impact
- **Training Method**: QLoRA updates only the small LoRA adapter weights on top of a frozen, 4-bit-quantized base model
- **Hardware**: 4-bit quantization keeps GPU memory requirements far below those of full fine-tuning
- **Carbon Footprint**: Correspondingly lower than a full fine-tune of a 7B model
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{terraform-codellama-7b,
  title={terraform-codellama-7b: A LoRA Fine-tuned Model for Terraform Code Generation},
  author={Rafi Al Attrach and Patrick Schmitt and Nan Wu and Helena Schneider and Stefania Saju},
  year={2024},
  url={https://huggingface.co/rafiaa/terraform-codellama-7b}
}
```
## Related Models
- **Base Model**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Enhanced Version**: [rafiaa/terraform-cloud-codellama-7b](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) (Recommended - includes cloud provider documentation)
## Model Card Contact
- **Author**: rafiaa
- **Model Repository**: [HuggingFace Model](https://huggingface.co/rafiaa/terraform-codellama-7b)
- **Issues**: Please report issues through the HuggingFace model page
---
*This model is part of a research project conducted in early 2024, focusing on specialized code generation for infrastructure-as-code tools.*