metadata
license: mit
tags:
  - toxicity-detection
  - tree
  - deepseek
  - pytorch-lightning
datasets:
  - simocorbo/toxicthesis-deepseek-dataset
language:
  - en

ToxicThesis: TreeLSTM Model for DeepSeek

This model is part of the ToxicThesis framework for analyzing toxicity in text using multiple neural architectures.

Model Details

  • Architecture: TreeLSTM
  • System Under Test (SUT): deepseek
  • Task: Classification (4 classes)
  • Loss Function: Cross-Entropy
  • Framework: PyTorch Lightning
  • Input: Text strings
  • Output: Class probabilities (4 classes)
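
For reference, the standard cross-entropy loss for a 4-class classifier can be written as below. This is the textbook definition, not a formula taken from the ToxicThesis code; the symbols $z$ (logits), $p$ (class probabilities), and $y$ (one-hot label) are introduced here for illustration only.

```latex
p_c = \frac{\exp(z_c)}{\sum_{k=1}^{4} \exp(z_k)}, \qquad
\mathcal{L}(y, p) = -\sum_{c=1}^{4} y_c \log p_c
```

With a one-hot label, the sum reduces to $-\log p_{c^\ast}$, the negative log-probability assigned to the true class $c^\ast$.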

Training Data

This model was trained on the simocorbo/toxicthesis-deepseek-dataset dataset, which consists of text samples labeled for toxicity. The training process involved:

  • Preprocessing and tokenization appropriate for the architecture
  • Data augmentation and balancing techniques
  • Validation-based early stopping
  • Hyperparameter tuning via grid/random search

Usage

Installation

pip install torch huggingface_hub stanza numpy

Download and Load

from huggingface_hub import hf_hub_download
import torch
import stanza

# Download checkpoint
checkpoint_path = hf_hub_download(
    repo_id="simocorbo/toxicthesis-deepseek-tree-classification-4",
    filename="checkpoints/best.pt"
)

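# Load checkpoint onto the CPU
# Note: PyTorch 2.6+ defaults to weights_only=True in torch.load. If loading
# fails because the checkpoint stores non-tensor objects, pass
# weights_only=False, and only for files you trust.
checkpoint = torch.load(checkpoint_path, map_location='cpu')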

# Initialize Stanza for constituency parsing
stanza.download('en')
nlp = stanza.Pipeline('en', processors='tokenize,pos,constituency')

# Note: Full model reconstruction requires the ToxicThesis repository
# Clone: git clone https://github.com/simo-corbo/ToxicThesis
# Then import the appropriate model class

Predict

# This model requires constituency parse trees
# See the ToxicThesis repository for complete usage:
# https://github.com/simo-corbo/ToxicThesis

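# Basic usage pattern:
text = "This is a sample text"
doc = nlp(text)
# Each parsed sentence exposes its constituency tree
tree = doc.sentences[0].constituency
print(tree)  # bracketed parse of the first sentence
# Converting the tree into model input and running inference
# requires the full ToxicThesis codebase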
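
To make the input format more concrete, here is a minimal sketch of how a bracketed constituency parse can be turned into a nested structure and walked. It is pure Python and does not use the model or Stanza. The bracketed string is a hand-written example rather than real parser output, and `parse_bracketed` and `leaves` are hypothetical helper names that do not come from the ToxicThesis repository.

```python
def parse_bracketed(s):
    """Parse a bracketed tree string into (label, children) tuples.

    Leaves (words) are returned as plain strings.
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            # A bare token is a leaf word.
            word = tokens[pos]
            pos += 1
            return word
        pos += 1  # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read_node())
        pos += 1  # skip ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the end of the tree")
    return tree


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


# Hand-written example parse (illustrative only).
example = "(ROOT (S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN sample) (NN text)))))"
tree = parse_bracketed(example)
print(tree[0])
print(leaves(tree))
```

A tree-structured model composes representations bottom-up over a structure like this, which is why a parser is needed before inference.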

Output Interpretation

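  • Classification output: one probability per toxicity class (4 classes); the predicted class is the one with the highest probability
  • To derive a binary toxic/non-toxic decision from the 4-class output, choose a threshold suited to your use case
  • Raising the threshold generally favors precision and lowering it favors recall; weigh this trade-off when setting it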
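
The interpretation steps above can be sketched as follows. This is an illustration under stated assumptions and not the repository's inference code: the logits are made-up numbers, the choice of which class indices count as "toxic" is hypothetical (the card does not define the class ordering), and `softmax`, `predict_class`, and `binary_decision` are helper names introduced here.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def binary_decision(probs, toxic_classes, threshold=0.5):
    """Flag as toxic when the summed probability of the chosen classes
    reaches the threshold. `toxic_classes` is an assumption the user
    must supply; it is not defined by the model card."""
    toxic_mass = sum(probs[i] for i in toxic_classes)
    return toxic_mass >= threshold


# Made-up logits for a 4-class output (illustrative only).
logits = [2.0, 0.5, 0.1, -1.0]
probs = softmax(logits)

print(predict_class(probs))
# Hypothetical mapping: classes 1-3 treated as toxic.
print(binary_decision(probs, toxic_classes=(1, 2, 3), threshold=0.5))
print(binary_decision(probs, toxic_classes=(1, 2, 3), threshold=0.25))
```

Lowering the threshold flags more inputs, which trades precision for recall.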

Limitations

  • Model performance may degrade on out-of-distribution data
  • Bias may exist based on the training data characteristics
  • Context-dependent toxicity may not always be captured accurately
  • Performance may vary across demographic groups and topics

Ethical Considerations

This model is designed for toxicity detection research and should be used responsibly:

  • Do not use for automated censorship without human oversight
  • Be aware of potential biases in toxicity detection
  • Consider the impact on free speech and expression
  • Use in combination with human moderation for production systems

Training Details

This model was trained as part of the ToxicThesis framework comparing multiple architectures:

  • RNTN (Recursive Neural Tensor Networks): Compositional semantics via parse trees
  • TreeLSTM: Tree-structured LSTM networks for hierarchical processing
  • Linear: FastText embeddings + logistic regression baseline
  • RoBERTa: Transformer-based pre-trained language model

Hyperparameters

See hparams.yaml for complete training configuration including:

  • Learning rate and optimizer settings
  • Batch size and number of epochs
  • Architecture-specific parameters
  • Regularization and dropout rates

Repository

Full code and training scripts: ToxicThesis (https://github.com/simo-corbo/ToxicThesis)

For complete usage examples and model reconstruction code, please refer to the repository.

Citation

@software{toxicthesis2025,
  title={ToxicThesis: Multi-Architecture Toxicity Analysis Framework},
  author={Simone Corbo},
  year={2025},
  url={https://github.com/simo-corbo/ToxicThesis}
}

Files

  • checkpoints/best.pt - Best model checkpoint (by validation loss)
  • hparams.yaml - Complete hyperparameter configuration
  • train.csv - Training metrics per epoch
  • val.csv - Validation metrics per epoch
  • test.csv - Final test set evaluation (if available)
  • patterns.json - Mined syntactic patterns (decision tree structures)
  • README.md - This documentation
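
As a rough illustration of how the per-epoch metric files could be inspected, the sketch below finds the epoch with the lowest validation loss using only the standard library. The column names `epoch` and `val_loss` are assumptions (the card does not document the CSV schema), the sample rows are made up, and `best_epoch` is a hypothetical helper.

```python
import csv
import io


def best_epoch(csv_text, metric="val_loss", epoch_col="epoch"):
    """Return (epoch, value) for the row with the lowest `metric`.

    Column names are assumptions; adjust them to the actual CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [r for r in reader if r.get(metric) not in (None, "")]
    if not rows:
        raise ValueError(f"no rows with a value for column {metric!r}")
    best = min(rows, key=lambda r: float(r[metric]))
    return int(best[epoch_col]), float(best[metric])


# Made-up sample data standing in for val.csv (illustrative only).
sample = "epoch,val_loss\n0,1.20\n1,0.95\n2,1.01\n"
print(best_epoch(sample))
```

For a real file, read its contents with `open(path).read()` and pass the text to the function after checking the header row.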

Analysis Files (if generated)

  • predictions.csv - Model predictions on test set
  • word_scores.csv - Word-level toxicity scores
  • word_toxicity_variance.csv - Variance analysis per word
  • word_variance_rank.csv - Ranked words by variance

Contact

For questions, issues, or contributions, please open an issue on the ToxicThesis repository.