herman3996
/

comments_classifier

Text Classification

PyTorch

English

herman3996 committed on Feb 19

Commit

f5fddd0

verified ·

1 Parent(s): 400950c

Update README.md

README.md +87 -10

@@ -6,19 +6,96 @@ tags:
 - pytorch
 ---
-# Comment Network
-## Setup
-### YouTube Data API Key Setup
-1. Go to [Google Cloud Console - API Credentials](https://console.cloud.google.com/apis/credentials)
-2. Create a new project or select an existing one
-3. Click **Create Credentials** → **API Key**
-4. Copy the generated API key + https://console.developers.google.com/apis/api/youtube.googleapis.com/overview
-5. Create a `.env` file in the project root (or copy from `.env-example`)
-6. Paste your API key in the `.env` file:
 ```
-YOUTUBE_API_KEY=your_api_key_here
 ```

 - pytorch
 ---
+# Comments Classifier (RuBERT fine-tune)
+A Russian-language comment classification model fine-tuned on top of **RuBERT**. Developed as part of the Lubarsky Comments Model project.
+## Overview
+The model was fine-tuned on a labeled dataset of Russian-language comments. Its goal is to automatically determine the category/type of a given comment.
+The repository contains three ready-to-use **standalone applications** built with **PyInstaller** (no Python installation or dependencies required), along with a quality assurance dataset:
+| File | Size | Description |
+|---|---|---|
+| `run_trainer.zip` | ~2.6 GB | Application for fine-tuning the model |
+| `run_prediction.zip` | ~2.5 GB | Application for running predictions |
+| `run_classifier.zip` | ~60 MB | Application for manual comment classification |
+| `QA_dataset.csv` | ~75 kB | Quality assurance dataset |
+---
+## Quick Start
+> ⚠️ **No Python installation required**: all three programs are self-contained Windows `.exe` applications.
+### 1. Download the ZIP archive
+Download one or more archives from this page.
+### 2. Extract the archive
+Extract the downloaded archive to a convenient location. For `run_classifier`, for example, the folder structure looks like this:
 ```
+run_classifier/
+├── _internal/          # internal dependencies (do not modify)
+└── run_classifier.exe  # executable file
 ```
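For scripted setups, the extraction step can be sketched in Python. This is a minimal sketch, not part of the project: the helper name `extract_app` is hypothetical, and it assumes each archive unpacks to a top-level folder named after the archive, with the `.exe` next to `_internal/` as in the layout above.

```python
import zipfile
from pathlib import Path


def extract_app(archive: Path, dest: Path) -> Path:
    """Extract a downloaded archive and return the path to its .exe.

    Assumption: <name>.zip unpacks to <name>/<name>.exe alongside
    <name>/_internal/, matching the run_classifier layout.
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

    app_dir = dest / archive.stem
    exe = app_dir / f"{archive.stem}.exe"

    # Fail early if the bundle is incomplete; the .exe depends on _internal/.
    if not exe.is_file():
        raise FileNotFoundError(f"expected executable not found: {exe}")
    if not (app_dir / "_internal").is_dir():
        raise FileNotFoundError(f"missing _internal folder in {app_dir}")
    return exe
```

Calling `extract_app(Path("run_classifier.zip"), Path("apps"))` would return the path to `run_classifier.exe` if the archive matches the assumed layout, and raise `FileNotFoundError` otherwise.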
+### 3. Run the `.exe`
+Double-click the `.exe` file, or launch it from a terminal opened in that application's extracted folder:
+```powershell
+.\run_classifier.exe
+.\run_prediction.exe
+.\run_trainer.exe
+```
+---
+## Application Descriptions
+**`run_classifier`** — a tool for manual or batch comment classification. Useful for quick review and labeling.
+**`run_prediction`** — the main inference application. Takes comments as input and returns predicted classes.
+**`run_trainer`** — fine-tunes the model on new data. Allows you to retrain the classifier on your own dataset.
+---
+## Environment Configuration
+The repository includes a `.env` file with environment variables (e.g., file paths, parameters). Edit it as needed before running the applications.
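Before editing, it can help to list the variables the file currently defines. The following is a minimal sketch that assumes the `.env` file uses plain `KEY=VALUE` lines with optional `#` comments; the helper name `load_env` is hypothetical, and the README does not document the actual variable names.

```python
from pathlib import Path


def load_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.is_file():
        for name, value in load_env(env_file).items():
            print(f"{name} = {value}")
```

This only reads the file; the applications themselves may parse `.env` differently, so treat the output as a quick inspection aid.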
+---
+## Source Code
+The full source code (training, data labeling, scripts) is available on GitHub:
+👉 [gerageragera39/Lubarsky_Comments_Model](https://github.com/gerageragera39/Lubarsky_Comments_Model)
+Source repository structure:
+- `data_hand_classifier/` — tools for manual data labeling
+- `rubert_trainer/` — RuBERT fine-tuning scripts
+- `dataset.csv` — main training dataset
+- `test_comments.csv` — test set
+- `result.png` — training results visualization
+---
+## Technical Details
+- **Base model:** RuBERT (DeepPavlov)
+- **Framework:** PyTorch + HuggingFace Transformers
+- **Build:** PyInstaller (standalone Windows executables)
+- **Data language:** Russian
+- **Task:** Text Classification
+---
+## License
+MIT License