	# Humanizer LoRA Adapter

	This is a LoRA (Low-Rank Adaptation) adapter for Llama 3 8B Instruct that rewrites formal text in more natural, human-like language.

	## Model Details

	- Base Model: meta-llama/Meta-Llama-3-8B-Instruct
	- Adapter Type: LoRA (Low-Rank Adaptation)
	- LoRA Rank: 32
	- LoRA Alpha: 64
	- Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
	- Task: Text humanization - converting formal/academic text to conversational style
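To make the rank and alpha values above concrete, here is a toy sketch of how a LoRA update modifies a single weight matrix. It assumes the standard LoRA scaling of `alpha / r` (not the rank-stabilized variant); the helper names `matmul` and `lora_effective_weight` and the tiny matrices are illustrative only and are not part of this adapter.

```python
# Toy illustration of a LoRA update: W_eff = W + (alpha / r) * (B @ A).
# Matrices are tiny for readability; the adapter in this card uses r=32, alpha=64.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(A)                      # A has shape (r, in_features)
    scale = alpha / r               # standard LoRA scaling (an assumption here)
    delta = matmul(B, A)            # (out, r) @ (r, in) -> (out, in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]

# Scaling factor implied by the values listed in this card.
card_scale = 64 / 32

# Hypothetical 2x2 example with rank 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # shape (r=1, in=2)
B = [[0.5], [0.0]]      # shape (out=2, r=1)
W_eff = lora_effective_weight(W, A, B, alpha=2)
print(card_scale, W_eff)
```

With rank 32 and alpha 64, the low-rank update is scaled by 2 before it is added to the frozen base weights.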

	## Files Included

	This adapter includes all necessary files:
	- `adapter_config.json` - LoRA configuration
	- `adapter_model.safetensors` - LoRA weights
	- `special_tokens_map.json` - Special tokens mapping
	- `tokenizer.json` - Tokenizer vocabulary
	- `tokenizer_config.json` - Tokenizer configuration
	- `training_args.bin` - Training arguments

	## Usage

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# Load base model and tokenizer
	base_model = "meta-llama/Meta-Llama-3-8B-Instruct"
	model = AutoModelForCausalLM.from_pretrained(
	    base_model,
	    torch_dtype=torch.float16,
	    device_map="auto",
	    trust_remote_code=True,
	)
	tokenizer = AutoTokenizer.from_pretrained(base_model)

	# Load LoRA adapter on top of the base model
	adapter_name = "arda24/Humanizer"
	model = PeftModel.from_pretrained(model, adapter_name)

	# Prepare input (triple-quoted so the line breaks are preserved)
	prompt = """### Instruction:
	rewrite this text in a natural and human like way

	### Input:
	The system requires authentication before proceeding.

	### Response:
	"""

	# Tokenize and move the tensors to the GPU when one is available
	inputs = tokenizer(prompt, return_tensors="pt")
	if torch.cuda.is_available():
	    inputs = {k: v.cuda() for k, v in inputs.items()}

	# Generate humanized text with conservative sampling settings
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=256,
	    temperature=0.3,
	    do_sample=True,
	    top_p=0.7,
	    repetition_penalty=1.05,
	    no_repeat_ngram_size=2,
	)

	# The decoded text contains the prompt; keep only what follows the marker
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	humanized_text = response.split("### Response:")[1].strip()
	print(humanized_text)
	```
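When processing more than one text, the prompt construction and response extraction from the usage example can be wrapped in small helpers. The following is a minimal sketch; the names `build_prompt` and `extract_response` are hypothetical and not part of the adapter or of any library, and the fallback for a missing marker is an added assumption.

```python
# Hypothetical helpers mirroring the prompt format used in the usage example.
INSTRUCTION = "rewrite this text in a natural and human like way"
RESPONSE_MARKER = "### Response:"

def build_prompt(text, instruction=INSTRUCTION):
    """Wrap the input text in the Instruction/Input/Response template."""
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{text.strip()}\n\n"
        f"{RESPONSE_MARKER}\n"
    )

def extract_response(decoded):
    """Return the text after the first response marker.

    Falls back to the whole string if the marker is missing, instead of
    raising an IndexError.
    """
    parts = decoded.split(RESPONSE_MARKER)
    if len(parts) > 1:
        return parts[1].strip()
    return decoded.strip()

prompt = build_prompt("The system requires authentication before proceeding.")
print(prompt)
```

`build_prompt` would replace the hard-coded prompt string, and `extract_response` would be applied to the result of `tokenizer.decode(...)`.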

	## Example

	Input: "The system requires authentication before proceeding."

	Output: "You need to log in first before you can access the system."

	## Training Configuration

	- LoRA Rank: 32
	- LoRA Alpha: 64
	- Learning Rate: 1e-5
	- Batch Size: 1
	- Gradient Accumulation Steps: 16
	- Training Steps: ~4000
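The batch size and gradient accumulation values above combine into an effective batch size. The arithmetic below is a rough sketch; it assumes training on a single device and that "Training Steps" counts optimizer updates, neither of which is stated in this card.

```python
# Values taken from the training configuration in this card.
batch_size = 1
grad_accum_steps = 16
training_steps = 4000  # approximate ("~4000" in the card)

# Assumption: one device, and each training step is one optimizer update.
effective_batch = batch_size * grad_accum_steps
examples_seen = effective_batch * training_steps

print(f"effective batch size: {effective_batch}")
print(f"examples processed (approx.): {examples_seen}")
```

Under these assumptions each update averages gradients over 16 examples, and the run processes roughly 64,000 examples in total (counting repeats if the dataset is smaller than that).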

	## Advantages of LoRA

	- Smaller size: only ~50MB, versus several GB for the full model
	- Faster loading: loads quickly on top of the base model
	- Flexible: Can be combined with other adapters
	- Efficient: Uses minimal additional parameters
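The number of adapter parameters follows from the rank and the shapes of the adapted layers. The sketch below is a back-of-the-envelope estimate, assuming the published Llama 3 8B dimensions (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128) and that all seven target modules are adapted in every layer. These assumptions are not confirmed by this card, and under them the estimate comes out larger than the ~50MB quoted in the list above, so treat the size of `adapter_model.safetensors` on the Hub as authoritative.

```python
# Assumed Llama 3 8B dimensions (not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128      # 8 key/value heads x head dimension 128
LAYERS = 32
RANK = 32             # LoRA rank from this card

# (in_features, out_features) for each target module in one decoder layer.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each adapted module adds A (rank x in) and B (out x rank).
params_per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total_params = params_per_layer * LAYERS

for name, bytes_per_param in (("float16", 2), ("float32", 4)):
    size_mb = total_params * bytes_per_param / 1e6
    print(f"{name}: {total_params:,} parameters, about {size_mb:.0f} MB")
```

Either way the adapter stays a small fraction of the base model, which has roughly 8 billion parameters.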

	## Limitations

	- Works best with formal/academic text
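	- May occasionally add citations that are not in the input, so outputs should be checked
	- Conservative sampling settings (low `temperature` and `top_p`, as in the usage example) are recommended when only minimal changes are wanted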
	- Not suitable for creative writing or fiction
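One way to catch the citation issue listed above is to scan each output for citation-like patterns that do not appear in the input. This is a minimal sketch: the function name `find_new_citations` and the two regular expressions are illustrative assumptions, they cover only bracketed numbers and simple author-year forms, and they will miss other citation styles.

```python
import re

# Illustrative patterns only: "[1]", "[1, 2]", "[3-5]" and "(Smith, 2020)",
# "(Smith et al., 2020)". Other citation styles are not detected.
CITATION_PATTERNS = [
    re.compile(r"\[\d+(?:\s*[,-]\s*\d+)*\]"),
    re.compile(r"\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)"),
]

def find_new_citations(source, output):
    """Return citation-like strings found in output but absent from source."""
    found = []
    for pattern in CITATION_PATTERNS:
        for match in pattern.findall(output):
            if match not in source:
                found.append(match)
    return found

source = "The system requires authentication before proceeding."
output = "You need to log in first before you can access the system [1]."
print(find_new_citations(source, output))
```

A non-empty result can be used to flag the output for review or to trigger regeneration.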

	## License

	This adapter is based on Llama 3 8B Instruct and is subject to the same license terms as the base model (the Meta Llama 3 Community License).