---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- terraform
- terraform-configuration
- infrastructure-as-code
- iac
- hashicorp
- codellama
- lora
- qlora
- peft
- code-generation
- devops
- cloud
- automation
- configuration-management
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# terraform-codellama-7b

A specialized LoRA fine-tuned model for Terraform infrastructure-as-code generation, built on CodeLlama-7b-Instruct-hf. This model excels at generating Terraform configurations, HCL (HashiCorp Configuration Language) code, and infrastructure automation scripts.

## Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tuned version of CodeLlama-7b-Instruct-hf, specifically optimized for generating Terraform configuration files. It was trained on public Terraform Registry documentation to learn Terraform syntax, resource configurations, and best practices.

### Key Features

- **Specialized for Terraform**: Fine-tuned specifically for infrastructure-as-code generation
- **Efficient Training**: Uses QLoRA (4-bit quantization + LoRA) for memory-efficient training
- **Public Data Only**: Trained exclusively on public Terraform Registry documentation
- **Production Ready**: Optimized for real-world Terraform development workflows

## Model Details

- **Developed by**: Rafi Al Attrach, Patrick Schmitt, Nan Wu, Helena Schneider, Stefania Saju (TUM + IBM research project)
- **Model type**: LoRA fine-tuned CodeLlama
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuned from**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Training method**: QLoRA (4-bit quantization + LoRA)

### Technical Specifications

- **Base Model**: CodeLlama-7b-Instruct-hf
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)
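
For reference, a minimal sketch of how these specifications map onto a `peft`/`bitsandbytes` configuration; the exact training script is not published, so treat this as an approximation rather than the authors' setup:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit (fp4) quantization, as listed above; the compute dtype is an assumption
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters taken directly from the specifications
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```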

## Uses

### Direct Use

This model is designed for:

- Generating Terraform configuration files
- Infrastructure-as-code development
- Terraform resource configuration
- DevOps automation
- Cloud infrastructure planning

### Example Use Cases

```python
# Generate an AWS EC2 instance configuration
prompt = "Create a Terraform configuration for an AWS EC2 instance with t3.medium instance type"

# Generate an Azure resource group
prompt = "Create a Terraform configuration for an Azure resource group in West Europe"

# Generate a GCP compute instance
prompt = "Create a Terraform configuration for a GCP compute instance with Ubuntu 20.04"
```

## How to Get Started

### Installation

```bash
pip install transformers torch peft accelerate bitsandbytes
```

### Loading the Model

#### GPU Usage (Recommended)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the base model with 4-bit quantization (GPU)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set a pad token (CodeLlama's tokenizer has none by default)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

#### CPU Usage (Alternative)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model in full precision (CPU compatible; roughly 28 GB of RAM for 7B fp32 weights)
base_model = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float32,
    device_map="cpu",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "rafiaa/terraform-codellama-7b")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Set a pad token (CodeLlama's tokenizer has none by default)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```
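
If you load the adapter onto a full-precision model (as in the CPU example above), you can optionally merge the LoRA weights into the base model for simpler deployment. This is a standard PEFT operation, not something the card prescribes; the output path below is illustrative:

```python
# Merge the adapter into the base weights and drop the PEFT wrapper.
# Avoid merging into a 4-bit quantized model.
merged = model.merge_and_unload()
merged.save_pretrained("terraform-codellama-7b-merged")
tokenizer.save_pretrained("terraform-codellama-7b-merged")
```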

### Usage Example

```python
def generate_terraform(prompt, max_length=512):
    # Tokenize and move inputs to the model's device (GPU or CPU)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example usage
prompt = "Create a Terraform configuration for an AWS S3 bucket with versioning enabled"
result = generate_terraform(prompt)
print(result)
```
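
The card does not state which prompt template was used during fine-tuning. If raw prompts underperform, wrapping them in the base model's instruction format is worth trying; this mirrors CodeLlama-Instruct's template and is an assumption, not a documented requirement of this adapter:

```python
# CodeLlama-Instruct-style wrapper (assumed, not confirmed for this adapter)
instruct_prompt = f"[INST] {prompt} [/INST]"
result = generate_terraform(instruct_prompt)
```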

## Training Details

### Training Data

- **Source**: Public Terraform Registry documentation
- **Data Type**: Terraform configuration files and documentation
- **Preprocessing**: Standard text preprocessing, truncated to a maximum sequence length of 512 tokens (see the sketch below)
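
The card describes preprocessing only at this level of detail, so the following is a hypothetical sketch; the `"text"` field name is an assumption:

```python
# Hypothetical preprocessing step: tokenize and truncate to 512 tokens
def preprocess(example):
    return tokenizer(
        example["text"],   # assumed dataset field name
        truncation=True,
        max_length=512,
    )
```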

### Training Procedure

- **Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 64
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, v_proj
- **Training Epochs**: 3
- **Max Sequence Length**: 512
- **Quantization**: 4-bit (fp4)

### Training Hyperparameters

- **Training regime**: 4-bit mixed precision
- **LoRA Dropout**: 0.0
- **Learning Rate**: Tuned for QLoRA training (exact value not published)
- **Batch Size**: Chosen for memory efficiency (exact value not published)

## Limitations and Bias

### Known Limitations

- **Context Length**: Limited to 512 tokens by the training configuration
- **Domain Specificity**: Optimized for Terraform; may not perform well on other infrastructure tools
- **Base Model Limitations**: Inherits the limitations of CodeLlama-7b-Instruct-hf

### Recommendations

- Use for Terraform-specific tasks only
- Validate generated configurations before deployment (see the example below)
- Keep the 512-token context limit in mind for complex configurations
- For production use, always review and test generated code
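
One lightweight way to follow the validation recommendation is Terraform's own toolchain; for example, after saving a generated configuration to `main.tf` (an illustrative file name):

```bash
terraform fmt -check           # verify canonical formatting
terraform init -backend=false  # install providers without configuring state
terraform validate             # check syntax and provider schemas
```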

## Environmental Impact

- **Training Method**: QLoRA significantly reduces computational requirements compared to full fine-tuning
- **Hardware**: Trained with memory-efficient 4-bit quantization (specific hardware not reported)
- **Carbon Footprint**: Lower than full fine-tuning due to QLoRA's efficiency

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{terraform-codellama-7b,
  title  = {terraform-codellama-7b: A LoRA Fine-tuned Model for Terraform Code Generation},
  author = {Rafi Al Attrach and Patrick Schmitt and Nan Wu and Helena Schneider and Stefania Saju},
  year   = {2024},
  url    = {https://huggingface.co/rafiaa/terraform-codellama-7b}
}
```

## Related Models

- **Base Model**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
- **Enhanced Version**: [rafiaa/terraform-cloud-codellama-7b](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) (recommended; also trained on cloud provider documentation)

## Model Card Contact

- **Author**: rafiaa
- **Model Repository**: [Hugging Face model page](https://huggingface.co/rafiaa/terraform-codellama-7b)
- **Issues**: Please report issues through the Hugging Face model page

---

*This model is part of a research project conducted in early 2024, focusing on specialized code generation for infrastructure-as-code tools.*