license: mit
tags:
  - toxicity-detection
  - tree
  - deepseek
  - pytorch-lightning
datasets:
  - simocorbo/toxicthesis-deepseek-dataset
language:
  - en

ToxicThesis: TreeLSTM Model for Deepseek

This model is part of the ToxicThesis framework for analyzing toxicity in text using multiple neural architectures.

Model Details

Architecture: TreeLSTM
System Under Test (SUT): deepseek
Task: Classification (4 classes)
Loss Function: Cross-Entropy
Framework: PyTorch Lightning
Input: Text strings
Output: Class probabilities (4 classes)

Training Data

This model was trained on the deepseek dataset (simocorbo/toxicthesis-deepseek-dataset), which consists of text samples labeled for toxicity. The training process involved:

Preprocessing and tokenization appropriate for the architecture
Data augmentation and balancing techniques
Validation-based early stopping
Hyperparameter tuning via grid/random search
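
The card does not say how early stopping was configured. The following is a minimal sketch of validation-based early stopping, assuming a patience rule on validation loss. The function name and the `patience` and `min_delta` parameters are illustrative and are not taken from the ToxicThesis code.

```python
def best_epoch_with_patience(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) under a patience rule.

    Training stops at the first epoch where the validation loss has not
    improved by more than min_delta for `patience` consecutive epochs.
    Both values are hypothetical defaults, not the author's settings.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Ran out of epochs before the patience budget was exhausted.
    return best_epoch, len(val_losses) - 1
```

The checkpoint kept under such a rule is the one from `best_epoch`, not from the epoch where training stopped.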

Usage

Installation

pip install torch huggingface_hub stanza numpy

Download and Load

from huggingface_hub import hf_hub_download
import torch
import stanza

# Download checkpoint
checkpoint_path = hf_hub_download(
    repo_id="simocorbo/toxicthesis-deepseek-tree-classification-4",
    filename="checkpoints/best.pt"
)

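# Load checkpoint on CPU.
# Note: PyTorch 2.6+ defaults to weights_only=True; if loading fails because
# the checkpoint stores non-tensor objects, pass weights_only=False
# (only for files you trust).
checkpoint = torch.load(checkpoint_path, map_location='cpu')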

# Initialize Stanza for constituency parsing
stanza.download('en')
nlp = stanza.Pipeline('en', processors='tokenize,pos,constituency')

# Note: Full model reconstruction requires the ToxicThesis repository
# Clone: git clone https://github.com/simo-corbo/ToxicThesis
# Then import the appropriate model class

Predict

# This model requires constituency parse trees
# See the ToxicThesis repository for complete usage:
# https://github.com/simo-corbo/ToxicThesis

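# Basic usage pattern:
text = "This is a sample text"
doc = nlp(text)
# Stanza exposes one constituency tree per sentence
trees = [sentence.constituency for sentence in doc.sentences]
# Converting these trees to model inputs and running inference
# requires the full ToxicThesis codebase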
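
Stanza constituency trees print in bracketed form, for example `(ROOT (S ...))`. To illustrate the hierarchical structure a tree model composes bottom-up, here is a minimal sketch that parses such a string into nested tuples. The helper names and the example string are hypothetical (the string is hand-written, not real parser output), and the input format the ToxicThesis code expects may differ.

```python
def parse_bracketed(s):
    """Parse a bracketed tree like '(S (NP (DT This)) (VP (VBZ works)))'.

    Returns nested (label, children) tuples; a leaf word is (word, []).
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            word = tokens[pos]
            pos += 1
            return (word, [])
        pos += 1  # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read_node())
        pos += 1  # skip ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def leaves(node):
    """Return the words at the leaves, left to right."""
    label, children = node
    if not children:
        return [label]
    return [word for child in children for word in leaves(child)]


def depth(node):
    """Number of levels between this node and its deepest leaf."""
    _, children = node
    return 0 if not children else 1 + max(depth(c) for c in children)


# Hand-written example string, not actual Stanza output.
tree = parse_bracketed(
    "(ROOT (S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN sample)))))"
)
words = leaves(tree)
```

A tree model visits the nodes from the leaves upward, so `depth(tree)` gives a rough sense of how many composition steps separate the words from the root representation.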

Output Interpretation

Classification output: Probabilities for 4 toxicity classes
For a binary toxic/non-toxic decision, combine the class probabilities and apply a threshold chosen for your use case
Consider the trade-off between precision and recall when setting the threshold
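
As a sketch of the interpretation above: if the model returns raw logits, a softmax turns them into class probabilities, and a binary decision can be made by summing the probability of the classes treated as toxic. Which class indices count as toxic, the threshold value, and the logits below are all assumptions for illustration; the card does not document them.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_decision(probs, toxic_classes, threshold=0.5):
    """Flag as toxic when the summed probability of toxic_classes
    reaches the threshold. Returns (flag, summed probability)."""
    toxic_mass = sum(probs[i] for i in toxic_classes)
    return toxic_mass >= threshold, toxic_mass


# Made-up logits for one input; real values come from the model.
probs = softmax([2.0, 0.5, 0.1, -1.0])
predicted_class = max(range(len(probs)), key=probs.__getitem__)

# Assumption: classes 2 and 3 are the ones treated as toxic.
is_toxic, mass = binary_decision(probs, toxic_classes=(2, 3), threshold=0.3)
```

Lowering the threshold flags more inputs (higher recall, lower precision); raising it does the opposite.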

Limitations

Model performance may degrade on out-of-distribution data
Bias may exist based on the training data characteristics
Context-dependent toxicity may not always be captured accurately
Performance may vary across demographic groups and topics

Ethical Considerations

This model is designed for toxicity detection research and should be used responsibly:

Do not use for automated censorship without human oversight
Be aware of potential biases in toxicity detection
Consider the impact on free speech and expression
Use in combination with human moderation for production systems

Training Details

This model was trained as part of the ToxicThesis framework comparing multiple architectures:

RNTN (Recursive Neural Tensor Networks): Compositional semantics via parse trees
TreeLSTM: Tree-structured LSTM networks for hierarchical processing
Linear: FastText embeddings + logistic regression baseline
RoBERTa: Transformer-based pre-trained language model
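
For readers unfamiliar with tree-structured LSTMs, one common formulation is the Child-Sum Tree-LSTM, shown below for a node j with input x_j and children C(j). This is given as background only: the card does not state which TreeLSTM variant this model uses, and models over constituency trees often use an N-ary variant with separate parameters per child position.

```latex
\begin{aligned}
\tilde{h}_j &= \sum_{k \in C(j)} h_k \\
i_j &= \sigma\!\left(W^{(i)} x_j + U^{(i)} \tilde{h}_j + b^{(i)}\right) \\
f_{jk} &= \sigma\!\left(W^{(f)} x_j + U^{(f)} h_k + b^{(f)}\right), \quad k \in C(j) \\
o_j &= \sigma\!\left(W^{(o)} x_j + U^{(o)} \tilde{h}_j + b^{(o)}\right) \\
u_j &= \tanh\!\left(W^{(u)} x_j + U^{(u)} \tilde{h}_j + b^{(u)}\right) \\
c_j &= i_j \odot u_j + \sum_{k \in C(j)} f_{jk} \odot c_k \\
h_j &= o_j \odot \tanh(c_j)
\end{aligned}
```

Each child gets its own forget gate f_{jk}, so the node can keep or discard information from individual subtrees before the root state h_j is passed to the classifier.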

Hyperparameters

See hparams.yaml for complete training configuration including:

Learning rate and optimizer settings
Batch size and number of epochs
Architecture-specific parameters
Regularization and dropout rates

Repository

Full code and training scripts: ToxicThesis (https://github.com/simo-corbo/ToxicThesis)

For complete usage examples and model reconstruction code, please refer to the repository.

Citation

@software{toxicthesis2025,
  title={ToxicThesis: Multi-Architecture Toxicity Analysis Framework},
  author={Simone Corbo},
  year={2025},
  url={https://github.com/simo-corbo/ToxicThesis}
}

Files

checkpoints/best.pt - Best model checkpoint (by validation loss)
hparams.yaml - Complete hyperparameter configuration
train.csv - Training metrics per epoch
val.csv - Validation metrics per epoch
test.csv - Final test set evaluation (if available)
patterns.json - Mined syntactic patterns (decision tree structures)
README.md - This documentation
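
Since the best checkpoint is selected by validation loss, the per-epoch metrics can be used to locate that epoch. The sketch below assumes `val.csv` has `epoch` and `val_loss` columns; these column names are a guess, so check the file header first. The inline data is a stand-in, not real training output.

```python
import csv
import io


def best_row(csv_file, metric="val_loss"):
    """Return the row (as a dict of strings) with the lowest `metric`."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("no rows found")
    return min(rows, key=lambda row: float(row[metric]))


# Stand-in data with assumed column names; in practice pass open("val.csv").
sample = io.StringIO("epoch,val_loss\n0,0.91\n1,0.74\n2,0.79\n")
best = best_row(sample)
```

`csv.DictReader` returns every value as a string, so convert fields such as `best["epoch"]` before doing arithmetic on them.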

Analysis Files (if generated)

predictions.csv - Model predictions on test set
word_scores.csv - Word-level toxicity scores
word_toxicity_variance.csv - Variance analysis per word
word_variance_rank.csv - Ranked words by variance

Contact

For questions, issues, or contributions, please open an issue on the ToxicThesis repository.