	---
	license: mit
	tags:
	- toxicity-detection
	- roberta
	- vicunaUC
	- pytorch-lightning
	datasets:
	- simocorbo/toxicthesis-vicunaUC-dataset
	language:
	- en
	---

	# ToxicThesis: RoBERTa Model for vicunaUC

	This model is part of the ToxicThesis framework for analyzing toxicity in text using multiple neural architectures.

	## Model Details

	- Architecture: RoBERTa
	- System Under Test (SUT): vicunaUC
	- Task: Classification (3 classes)
	- Loss Function: Cross-Entropy
	- Framework: PyTorch Lightning
	- Input: Text strings
	- Output: Class probabilities (3 classes)
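
To make the loss and output entries concrete, here is a minimal pure-Python sketch of how three raw scores (logits) become class probabilities through softmax, and how cross-entropy scores them against a true label. The logit values are made up for illustration, and this is the standard textbook formulation rather than code taken from the ToxicThesis repository.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[target])

# Hypothetical logits for one input over the 3 classes
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
loss = cross_entropy(logits, target=0)
```

The predicted class is the index with the highest probability, and the loss shrinks as the probability of the true class approaches 1.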



	## Training Data

	This model was trained on the vicunaUC dataset (`simocorbo/toxicthesis-vicunaUC-dataset`), which consists of text samples labeled for toxicity. The training process involved:
	- Preprocessing and tokenization appropriate for the architecture
	- Data augmentation and balancing techniques
	- Validation-based early stopping
	- Hyperparameter tuning via grid/random search
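
As an illustration of validation-based early stopping, the sketch below walks through a list of per-epoch validation losses and reports when training would stop. The `patience` and `min_delta` parameters and the loss values are assumptions chosen for the example; the settings actually used for this model are not stated in this card.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never triggered: training ran through the final epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses, one per epoch
stop_epoch, best_epoch = early_stopping_epoch(
    [0.90, 0.70, 0.65, 0.66, 0.68, 0.67, 0.64], patience=3
)
```

In this example the loss stops improving after epoch 2, so training halts at epoch 5 and the checkpoint from epoch 2 would be kept.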

	## Usage

	### Installation

	```bash
	pip install torch transformers huggingface_hub
	```

	### Download and Load

	```python
	from huggingface_hub import hf_hub_download
	import torch
	from transformers import RobertaTokenizer

	# Download checkpoint
	checkpoint_path = hf_hub_download(
	    repo_id="simocorbo/toxicthesis-vicunauc-roberta-classification-3",
	    filename="checkpoints/best.pt",
	)

	# Load tokenizer
	tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

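	# Load checkpoint on CPU.
	# On PyTorch >= 2.6, torch.load defaults to weights_only=True. If loading fails
	# because the checkpoint stores non-tensor objects, pass weights_only=False,
	# but only for files you trust.
	checkpoint = torch.load(checkpoint_path, map_location='cpu')

	# Note: Full model reconstruction requires the ToxicThesis repository.
	# The model uses RoBERTa with a custom classification head.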
	```
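
Before rebuilding the model, it can help to check what the checkpoint actually contains. The helper below is a sketch that assumes the loaded object is a dictionary and that, as in PyTorch Lightning checkpoints, the weights may sit under a `state_dict` key. The layout of `best.pt` is not documented in this card, so both points are assumptions, and `summarize_checkpoint` is a hypothetical helper, not part of ToxicThesis.

```python
def summarize_checkpoint(checkpoint, max_entries=10):
    """Return (top_level_keys, [(param_name, shape), ...]) for a loaded checkpoint dict."""
    top_level_keys = list(checkpoint.keys())
    # Lightning-style checkpoints nest weights under "state_dict";
    # otherwise assume the dict itself maps parameter names to tensors.
    state = checkpoint.get("state_dict", checkpoint)
    entries = []
    for name, value in list(state.items())[:max_entries]:
        shape = getattr(value, "shape", None)  # non-tensor values have no shape
        entries.append((name, tuple(shape) if shape is not None else None))
    return top_level_keys, entries

# Usage with the `checkpoint` object loaded above:
# keys, entries = summarize_checkpoint(checkpoint)
# print(keys)
# for name, shape in entries:
#     print(name, shape)
```

The parameter names and shapes show which layers the reconstruction code in the repository needs to provide.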

	### Predict

	```python
	# Tokenize input
	text = "Your text here"
	inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=512)

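	# Run inference (`model` must first be rebuilt with the ToxicThesis repo code).
	# Assumption: the model returns raw logits of shape (1, 3).
	with torch.no_grad():
	    logits = model(**inputs)

	# Cross-entropy training pairs with softmax, giving one probability per class
	probs = torch.softmax(logits, dim=-1).squeeze(0)
	predicted_class = int(probs.argmax())

	print(f"Predicted class: {predicted_class}, probabilities: {probs.tolist()}")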
	```

	## Output Interpretation

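	- The model outputs a probability for each of the 3 toxicity classes; the predicted class is the one with the highest probability
	- For a binary decision, apply a threshold to the probability of the class or classes you treat as toxic, and adjust it to your use case
	- A lower threshold raises recall at the cost of precision; a higher threshold does the opposite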
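
One way to turn the 3-class output into a binary decision is to sum the probability of the classes you treat as toxic and compare it with a threshold. The sketch below assumes, purely for illustration, that class 0 is non-toxic and classes 1 and 2 are toxic. This card does not document the class semantics, so check the repository before relying on that mapping. The probabilities and labels are made up.

```python
def is_toxic(probs, threshold=0.5, toxic_classes=(1, 2)):
    """Flag an input when the summed probability of the toxic classes reaches the threshold."""
    return sum(probs[i] for i in toxic_classes) >= threshold

def precision_recall(prob_rows, labels, threshold):
    """Precision and recall of `is_toxic` against binary ground-truth labels."""
    preds = [is_toxic(p, threshold) for p in prob_rows]
    tp = sum(1 for p, y in zip(preds, labels) if p and y)
    fp = sum(1 for p, y in zip(preds, labels) if p and not y)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up probabilities and labels, only to show the trade-off
prob_rows = [[0.9, 0.05, 0.05], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.1, 0.2, 0.7]]
labels = [False, False, True, True]
low = precision_recall(prob_rows, labels, threshold=0.3)
high = precision_recall(prob_rows, labels, threshold=0.8)
```

On this toy data the low threshold catches every toxic example but also flags one benign one, while the high threshold makes no false alarms and misses one toxic example.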

	## Limitations

	- Model performance may degrade on out-of-distribution data
	- Bias may exist based on the training data characteristics
	- Context-dependent toxicity may not always be captured accurately
	- Performance may vary across demographic groups and topics

	## Ethical Considerations

	This model is designed for toxicity detection research and should be used responsibly:
	- Do not use for automated censorship without human oversight
	- Be aware of potential biases in toxicity detection
	- Consider the impact on free speech and expression
	- Use in combination with human moderation for production systems

	## Training Details

	This model was trained as part of the ToxicThesis framework comparing multiple architectures:
	- RNTN (Recursive Neural Tensor Networks): Compositional semantics via parse trees
	- TreeLSTM: Tree-structured LSTM networks for hierarchical processing
	- Linear: FastText embeddings + logistic regression baseline
	- RoBERTa: Transformer-based pre-trained language model

	### Hyperparameters

	See `hparams.yaml` for the complete training configuration, including:
	- Learning rate and optimizer settings
	- Batch size and number of epochs
	- Architecture-specific parameters
	- Regularization and dropout rates

	## Repository

	Full code and training scripts: [ToxicThesis](https://github.com/simo-corbo/ToxicThesis)

	For complete usage examples and model reconstruction code, please refer to the repository.

	## Citation

	```bibtex
	@software{toxicthesis2025,
	title={ToxicThesis: Multi-Architecture Toxicity Analysis Framework},
	author={Simone Corbo},
	year={2025},
	url={https://github.com/simo-corbo/ToxicThesis}
	}
	```

	## Files

	- `checkpoints/best.pt` - Best model checkpoint (by validation loss)
	- `hparams.yaml` - Complete hyperparameter configuration
	- `train.csv` - Training metrics per epoch
	- `val.csv` - Validation metrics per epoch
	- `test.csv` - Final test set evaluation (if available)
	- `patterns.json` - Mined syntactic patterns (decision tree structures)
	- `README.md` - This documentation
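
Because `best.pt` is selected by validation loss, the per-epoch metrics in `val.csv` can show which epoch it corresponds to. The sketch below uses only the standard library. The column names `epoch` and `val_loss` are assumptions, since this card does not list the CSV columns; adjust them to match the actual header.

```python
import csv
import io

def best_epoch(csv_file, epoch_col="epoch", loss_col="val_loss"):
    """Return (epoch, loss) for the row with the lowest validation loss."""
    rows = list(csv.DictReader(csv_file))
    best = min(rows, key=lambda row: float(row[loss_col]))
    return int(best[epoch_col]), float(best[loss_col])

# Stand-in for open("val.csv", newline=""); the values are made up
sample = io.StringIO("epoch,val_loss\n0,0.91\n1,0.74\n2,0.69\n3,0.72\n")
epoch, loss = best_epoch(sample)
```

With the real file, pass `open("val.csv", newline="")` in place of the in-memory sample.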

	### Analysis Files (if generated)

	- `predictions.csv` - Model predictions on test set
	- `word_scores.csv` - Word-level toxicity scores
	- `word_toxicity_variance.csv` - Variance analysis per word
	- `word_variance_rank.csv` - Ranked words by variance

	## Contact

	For questions, issues, or contributions, please open an issue on the [ToxicThesis repository](https://github.com/simo-corbo/ToxicThesis).