| # Humanizer LoRA Adapter | |
| This is a LoRA (Low-Rank Adaptation) adapter for Llama3 8B Instruct that converts formal text into more natural, human-like language. | |
| ## Model Details | |
| - **Base Model**: meta-llama/Meta-Llama-3-8B-Instruct | |
| - **Adapter Type**: LoRA (Low-Rank Adaptation) | |
| - **LoRA Rank**: 32 | |
| - **LoRA Alpha**: 64 | |
| - **Target Modules**: {'k_proj', 'gate_proj', 'q_proj', 'v_proj', 'up_proj', 'down_proj', 'o_proj'} | |
| - **Task**: Text humanization - converting formal/academic text to conversational style | |
| ## Files Included | |
| This adapter includes all necessary files: | |
| - `adapter_config.json` - LoRA configuration | |
| - `adapter_model.safetensors` - LoRA weights | |
| - `special_tokens_map.json` - Special tokens mapping | |
| - `tokenizer.json` - Tokenizer vocabulary | |
| - `tokenizer_config.json` - Tokenizer configuration | |
| - `training_args.bin` - Training arguments | |
| ## Usage | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForCausalLM | |
| from peft import PeftModel | |
| # Load base model and tokenizer | |
| base_model = "meta-llama/Meta-Llama-3-8B-Instruct" | |
| model = AutoModelForCausalLM.from_pretrained( | |
| base_model, | |
| torch_dtype=torch.float16, | |
| device_map="auto", | |
| trust_remote_code=True | |
| ) | |
| tokenizer = AutoTokenizer.from_pretrained(base_model) | |
| # Load LoRA adapter | |
| adapter_name = "arda24/Humanizer" | |
| model = PeftModel.from_pretrained(model, adapter_name) | |
| # Prepare input | |
| prompt = "### Instruction: | |
| rewrite this text in a natural and human like way | |
| ### Input: | |
| The system requires authentication before proceeding. | |
| ### Response: | |
| " | |
| # Generate humanized text | |
| inputs = tokenizer(prompt, return_tensors="pt") | |
| if torch.cuda.is_available(): | |
| inputs = {k: v.cuda() for k, v in inputs.items()} | |
| outputs = model.generate( | |
| **inputs, | |
| max_new_tokens=256, | |
| temperature=0.3, | |
| do_sample=True, | |
| top_p=0.7, | |
| repetition_penalty=1.05, | |
| no_repeat_ngram_size=2 | |
| ) | |
| response = tokenizer.decode(outputs[0], skip_special_tokens=True) | |
| humanized_text = response.split("### Response:")[1].strip() | |
| print(humanized_text) | |
| ``` | |
| ## Example | |
| **Input**: "The system requires authentication before proceeding." | |
| **Output**: "You need to log in first before you can access the system." | |
| ## Training Configuration | |
| - **LoRA Rank**: 32 | |
| - **LoRA Alpha**: 64 | |
| - **Learning Rate**: 1e-5 | |
| - **Batch Size**: 1 | |
| - **Gradient Accumulation Steps**: 16 | |
| - **Training Steps**: ~4000 | |
| ## Advantages of LoRA | |
| - **Smaller size**: Only ~50MB vs several GB for full model | |
| - **Faster loading**: Loads quickly on top of base model | |
| - **Flexible**: Can be combined with other adapters | |
| - **Efficient**: Uses minimal additional parameters | |
| ## Limitations | |
| - Works best with formal/academic text | |
| - May occasionally add citations if not properly controlled | |
| - Conservative settings recommended for minimal changes | |
| - Not suitable for creative writing or fiction | |
| ## License | |
| This adapter is based on Llama3 8B Instruct and follows the same license terms. | |