File size: 9,690 Bytes

81148b8

---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- terraform
- terraform-configuration
- infrastructure-as-code
- iac
- hashicorp
- codellama
- lora
- qlora
- peft
- code-generation
- devops
- cloud
- aws
- azure
- gcp
- multi-cloud
- automation
- configuration-management
- cloud-infrastructure
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# terraform-cloud-codellama-7b

**RECOMMENDED MODEL** - An advanced LoRA fine-tuned model for comprehensive Terraform infrastructure-as-code generation, supporting multiple cloud providers (AWS, Azure, GCP). This model generates Terraform configurations, HCL code, and multi-cloud infrastructure automation scripts.

## Model Description

This is the **enhanced model** - an advanced version of terraform-codellama-7b that has been additionally trained on AWS, Azure, and GCP public documentation. It provides superior performance for multi-cloud Terraform development with deep understanding of cloud provider-specific resources and best practices.

### Key Features

- **Multi-Cloud Support**: Trained on AWS, Azure, and GCP documentation
- **Enhanced Performance**: Superior to the base terraform-codellama-7b model
- **Production Ready**: Optimized for real-world multi-cloud infrastructure development
- **Comprehensive Coverage**: Handles complex cloud provider-specific configurations
- **Efficient Training**: Uses QLoRA (4-bit quantization + LoRA) for memory efficiency

## Model Details

- **Developed by**: Rafi Al Attrach, Patrick Schmitt, Nan Wu, Helena Schneider, Stefania Saju (TUM + IBM Research Project)
- **Model type**: LoRA fine-tuned CodeLlama (Enhanced)
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuned from**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Training method**: QLoRA (4-bit quantization + LoRA)
- **Base Model**: Built on [rafiaa/terraform-codellama-7b](https://huggingface.co/rafiaa/terraform-codellama-7b)

### Technical Specifications

- **Base Model**: CodeLlama-7b-Instruct-hf
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3 (Stage 1) + Additional training (Stage 2)
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)

## Uses

### Direct Use

This model is designed for:
- **Multi-cloud Terraform development**
- **AWS resource configuration** (EC2, S3, RDS, Lambda, etc.)
- **Azure resource management** (Virtual Machines, Storage Accounts, App Services, etc.)
- **GCP resource deployment** (Compute Engine, Cloud Storage, Cloud SQL, etc.)
- **Complex infrastructure orchestration**
- **Cloud provider-specific best practices**

### Example Use Cases

```python
# Generate AWS multi-service infrastructure
prompt = "Create a Terraform configuration for an AWS application with VPC, EC2, RDS, and S3"
```

```python
# Generate Azure App Service with database
prompt = "Create a Terraform configuration for an Azure App Service with PostgreSQL database"
```

```python
# Generate GCP Kubernetes cluster
prompt = "Create a Terraform configuration for a GCP GKE cluster with node pools"
```

```python
# Generate multi-cloud setup
prompt = "Create a Terraform configuration for a hybrid cloud setup using AWS and Azure"
```

## How to Get Started

### Installation

```bash
pip install transformers torch peft accelerate bitsandbytes
```

### Loading the Model

#### GPU Usage (Recommended)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model with 4-bit quantization (GPU)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "rafiaa/terraform-cloud-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

#### CPU Usage (Alternative)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model (CPU compatible)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float32,
    device_map="cpu"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "rafiaa/terraform-cloud-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

### Usage Example

```python
def generate_terraform(prompt, max_length=512):
    inputs = tokenizer(prompt, return_tensors="pt")
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example: Multi-cloud infrastructure
prompt = """
Create a Terraform configuration for a multi-cloud setup:
- AWS: VPC with public/private subnets, EC2 instances
- Azure: Storage account and App Service
- GCP: Cloud SQL database
"""

result = generate_terraform(prompt)
print(result)
```

### Advanced Usage

```python
# Cloud-specific prompts
aws_prompt = "Create a Terraform configuration for AWS EKS cluster with managed node groups"
azure_prompt = "Create a Terraform configuration for Azure Kubernetes Service (AKS)"
gcp_prompt = "Create a Terraform configuration for GCP Cloud Run service"

# Generate configurations
aws_config = generate_terraform(aws_prompt)
azure_config = generate_terraform(azure_prompt)
gcp_config = generate_terraform(gcp_prompt)
```

## Training Details

### Training Data

**Stage 1**: Public Terraform Registry documentation
**Stage 2**: Additional training on:
- **AWS Documentation**: EC2, S3, RDS, Lambda, VPC, IAM, etc.
- **Azure Documentation**: Virtual Machines, Storage Accounts, App Services, Key Vault, etc.
- **GCP Documentation**: Compute Engine, Cloud Storage, Cloud SQL, GKE, etc.

### Training Procedure

- **Method**: QLoRA (4-bit quantization + LoRA)
- **Two-Stage Training**: 
  1. Terraform Registry documentation
  2. Cloud provider documentation (AWS, Azure, GCP)
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3 (Stage 1) + Additional training (Stage 2)
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)

### Training Hyperparameters

- **Training regime**: 4-bit mixed precision
- **LoRA Dropout**: 0.0
- **Learning Rate**: Optimized for QLoRA training
- **Batch Size**: Optimized for memory efficiency

## Performance Comparison

| Model | Terraform Knowledge | AWS Support | Azure Support | GCP Support | Multi-Cloud Capability |
|-------|-------------------|-------------|---------------|-------------|-------------------|
| terraform-codellama-7b | Excellent | Limited | Limited | Limited | Basic |
| **terraform-cloud-codellama-7b** | Excellent | Excellent | Excellent | Excellent | Advanced |

## Limitations and Bias

### Known Limitations

- **Context Length**: Limited to 512 tokens due to training configuration
- **Domain Specificity**: Optimized for Terraform and cloud infrastructure
- **Base Model Limitations**: Inherits limitations from CodeLlama-7b-Instruct-hf
- **Cloud Provider Updates**: May not include the latest cloud provider features

### Recommendations

- Use for Terraform and cloud infrastructure tasks
- Validate generated configurations before deployment
- Consider the 512-token context limit for complex configurations
- For production use, always review and test generated code
- Stay updated with cloud provider documentation for latest features

## Environmental Impact

- **Training Method**: QLoRA reduces computational requirements significantly
- **Hardware**: Trained using efficient 4-bit quantization
- **Carbon Footprint**: Reduced compared to full fine-tuning due to QLoRA efficiency
- **Two-Stage Approach**: Efficient incremental training

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{terraform-cloud-codellama-7b,
  title={terraform-cloud-codellama-7b: A Multi-Cloud LoRA Fine-tuned Model for Terraform Code Generation},
  author={Rafi Al Attrach and Patrick Schmitt and Nan Wu and Helena Schneider and Stefania Saju},
  year={2024},
  url={https://huggingface.co/rafiaa/terraform-cloud-codellama-7b}
}
```

## Related Models

- **Base Model**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Stage 1 Model**: [rafiaa/terraform-codellama-7b](https://huggingface.co/rafiaa/terraform-codellama-7b)
- **This Model**: [rafiaa/terraform-cloud-codellama-7b](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) (Recommended)

## Model Card Contact

- **Author**: rafiaa
- **Model Repository**: [HuggingFace Model](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b)
- **Issues**: Please report issues through the HuggingFace model page

## Acknowledgments

- **Research Project**: Early 2024 research project at TUM + IBM
- **Training Data**: Public documentation from Terraform Registry, AWS, Azure, and GCP
- **Base Model**: Meta's CodeLlama-7b-Instruct-hf
- **Training Method**: QLoRA for efficient fine-tuning

---

*This model represents the culmination of a two-stage fine-tuning approach, combining Terraform expertise with comprehensive cloud provider knowledge for superior infrastructure-as-code generation.*