---
language:
- es
tags:
- docker
- devops
- cli
- bash
- text-to-code
- grape-malbec
license: apache-2.0
base_model:
- Salesforce/codet5-small
library_name: adapter-transformers
datasets:
- jrodriiguezg/malbec-nl2docker-es
---
# Model Card for grape-malbec
**Model Name:** grape-malbec
**Task:** Text-to-Text Generation (Natural Language to Docker)
**Language:** Spanish (Input), Bash/Docker CLI (Output)
**Domain:** DevOps, Containerization
**Base Model:** Salesforce/codet5-small
## Model Details
### Model Description
**grape-malbec** is a specialized language model designed to act as an intelligent Command Line Interface (CLI) assistant for Docker operations. It translates natural language instructions—often characterized by typos, technical slang, or ambiguity—into precise, safe, and context-aware Docker commands.
The model is engineered with a "safety-first" architecture, specifically trained to distinguish between informational queries and destructive actions to prevent accidental data loss in production environments.
* **Developed for:** DevOps Engineers, Developers, and CLI Tool Integrations.
* **Model Type:** Fine-tuned sequence-to-sequence (encoder-decoder) language model, based on CodeT5.
* **Knowledge Domain:** Docker Container Lifecycle (Run, Stop, Remove, Exec, Logs, Network, Volume).
## Uses
### Direct Use
The model is intended to be integrated into terminal emulators, shell plugins (like Zsh/Fish autosuggestions), or chat-based DevOps assistants. It takes a user's intent and the current directory context as input and outputs a single, executable shell command.
### Key Capabilities
1. **Safety-Driven Intent Inference:**
* Strictly differentiates between status checks (e.g., "cómo va", "estado") and destructive commands. A request for status never results in a `docker rm` command.
* Distinguishes between creating a new container (`docker run`) and entering an existing one (`docker exec`) based on subtle linguistic cues.
2. **Smart Port Mapping:**
* Possesses knowledge of standard internal ports for popular services (e.g., 80 for Nginx, 3306 for MySQL).
* If a user requests "Nginx en el puerto 90", the model correctly generates `-p 90:80`, mapping the custom host port to the correct standard container port.
3. **Context Isolation:**
* The model receives file system context (`pwd`, `ls`) but is trained to ignore irrelevant files (media, documents) and only utilize relevant files (like `Dockerfile` or `docker-compose.yml`) when necessary for the command.
4. **Robustness to Noise:**
* Highly tolerant of misspellings (e.g., "nginix", "posgres", "aranca") and informal verbs (e.g., "cárgate", "levanta", "tira").
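The safety-first behavior above is learned at training time; an integration can layer a deterministic check on top before anything is executed. The sketch below is one possible guard (the name `is_destructive` and the subcommand list are assumptions, not part of the model) that flags destructive Docker commands for explicit user confirmation:

```python
import re

# Subcommands that delete containers, images, volumes, or networks; extend as needed.
DESTRUCTIVE = re.compile(r"^docker\s+(rm|rmi|volume\s+rm|network\s+rm|system\s+prune)\b")

def is_destructive(command: str) -> bool:
    """Return True if the generated command should require user confirmation."""
    return bool(DESTRUCTIVE.match(command.strip()))

print(is_destructive("docker ps -a"))        # False
print(is_destructive("docker rm -f cache"))  # True
```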
## Examples
**Example 1: Smart Port Mapping (Mapeo Inteligente)**
* **Input:** `Contexto: [] | despliega un apache en el puerto 90`
* **Output:** `docker run -d -p 90:80 httpd`
**Example 2: Deployment Scenario (Despliegue)**
* **Input:** `Contexto: [] | levanta un redis persistente llamado cache-prod`
* **Output:** `docker run -d --name cache-prod -v redis_data:/data redis`
**Example 3: Safety correction (Status vs Remove)**
* **Input:** `Contexto: [] | dime el estado del contenedor ubuntu`
* **Output:** `docker ps -a --filter name=ubuntu`
*(Note: Prevents generation of `docker rm` which can occur in untrained models due to token similarity in small datasets)*
**Example 4: Run vs Exec (Ambigüedad Ejecutar)**
* **Input:** `Contexto: [] | ejecuta un bash en el contenedor db`
* **Output:** `docker exec -it db bash`
**Example 5: Typo Tolerance (Tolerancia a errores)**
* **Input:** `Contexto: [] | stopea el contenedor nginix`
* **Output:** `docker stop nginx`
## How to Get Started
You can use this model with the Hugging Face `transformers` library:
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration
model_name = "jrodriiguezg/grape-malbec"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
# Define the context (simulated environment)
context = "Contexto: ['pwd=/home/user', 'ls=data.zip, notes.txt']"
instruction = "despleiga un apache en el puerto 90"  # intentional typo ("despliega") to show noise tolerance
# Prepare input
input_text = f"translate Spanish to Bash: {context} | {instruction}"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# Generate command
outputs = model.generate(input_ids, max_length=128)
command = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(command)
# Output: docker run -d -p 90:80 httpd
```
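Since the model's output is meant to be executed directly, an integration may also want to verify that the generation is a single plain `docker` invocation before running it. A minimal, illustrative validator (the function name is an assumption, not part of this model card's API):

```python
import shlex

def looks_like_single_docker_command(command: str) -> bool:
    """Accept only one plain `docker ...` call: no chaining, pipes, or substitution."""
    if any(tok in command for tok in (";", "&&", "||", "|", "`", "$(")):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:  # unbalanced quotes, etc.
        return False
    return bool(parts) and parts[0] == "docker"

print(looks_like_single_docker_command("docker run -d -p 90:80 httpd"))  # True
print(looks_like_single_docker_command("docker ps; rm -rf /"))           # False
```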