NewsScope-lora / README.md
nidhipandya's picture
Update README.md
16ddedf verified
---
license: mit
language:
- en
library_name: peft
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
- llama
- lora
- claim-extraction
- fact-checking
- news
pipeline_tag: text-generation
---
# NewsScope LoRA Adapter
This repository contains a **LoRA adapter** fine-tuned for **schema-grounded claim extraction** from news articles.
It produces structured JSON outputs with:
- domain
- headline
- key_points
- whos_involved
- how_it_unfolded
- claims (2-3 verifiable claims with evidence)
## Key Result (Human Evaluation)
- **NewsScope:** 89.4% accuracy
- **GPT-4o-mini baseline:** 93.7%
- Reported difference is not statistically significant (p=0.07)
## Important: LLaMA License
You must accept the **Meta LLaMA** license for the base model on Hugging Face:
`meta-llama/Meta-Llama-3.1-8B-Instruct`
Then either:
- run `huggingface-cli login`, or
- set `HF_TOKEN` in your environment.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = AutoModelForCausalLM.from_pretrained(
"meta-llama/Meta-Llama-3.1-8B-Instruct",
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "nidhipandya/NewsScope-lora")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```
## Training Details
- **Base model:** meta-llama/Meta-Llama-3.1-8B-Instruct
- **LoRA rank:** 16
- **Training set size:** 315 articles (URLs + annotations; article text not publicly redistributed)
- **Notes:** Training reproduction requires fetching article text from URLs due to copyright.
## Links
- **Code:** https://github.com/nidhip1611/NewsScope
- **Benchmark:** GitHub Releases (benchmark.zip)
- **Paper:** arXiv (TBD)
## Citation
```bibtex
@article{pandyaNewsscope,
title={NewsScope: Schema-Grounded Cross-Domain News Claim Extraction with Open Models},
author={Pandya, Nidhi},
journal={arXiv preprint arXiv:TBD},
year={TBD}
}
```