---
title: README
emoji: 🦀
colorFrom: red
colorTo: red
sdk: static
pinned: false
---
## Description
Our goal was to create a Proof of Concept (PoC) solution for matching messages from Telegram marketplaces.
We developed two models:
- **RoSBERTa-hermes-ru**: Trained for **location recognition**, **category labeling**, and **inside-outside location classification**.
- **rubert-tiny-separater**: Trained for **supply and demand** classification.
## Architecture and Pretraining
### [RoSBERTa-hermes-ru](https://huggingface.co/poc-embeddings/RoSBERTa-hermes-ru)
RoSBERTa-hermes-ru is based on [ai-forever/ru-en-RoSBERTa](https://huggingface.co/ai-forever/ru-en-RoSBERTa) and adds multiple heads for downstream tasks:
- **Backbone**: Fully unfrozen and fine-tuned together with the **NER head** for location recognition.
- **Allocator head**: Trained to determine whether a message contains the user's actual location.
- **Tags head with a single adapter layer**: Trained to label messages with categories that describe their content, such as tools, medicine, and clothing.
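
As an illustration of how the three head outputs could be combined into one record per message, here is a minimal post-processing sketch. It does not load the model. The `B-LOC`/`I-LOC` tagging scheme, the 0.5 thresholds, and the names `ParsedMessage`, `extract_locations`, and `combine_heads` are all assumptions made for this example and are not taken from the model repository.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed decision threshold for both the allocator and the tags head.
THRESHOLD = 0.5


@dataclass
class ParsedMessage:
    locations: List[str]      # location strings found by the NER head
    is_user_location: bool    # allocator head decision
    tags: List[str]           # categories selected by the tags head


def extract_locations(tokens: List[str], ner_labels: List[str]) -> List[str]:
    """Group tokens labelled B-LOC / I-LOC into location strings (assumed BIO scheme)."""
    spans, current = [], []
    for token, label in zip(tokens, ner_labels):
        if label == "B-LOC":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == "I-LOC" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


def combine_heads(
    tokens: List[str],
    ner_labels: List[str],
    allocator_prob: float,
    tag_probs: Dict[str, float],
) -> ParsedMessage:
    """Merge token-level and message-level predictions into one record."""
    locations = extract_locations(tokens, ner_labels)
    tags = sorted(tag for tag, prob in tag_probs.items() if prob >= THRESHOLD)
    return ParsedMessage(
        locations=locations,
        # The allocator decision only makes sense if a location was found.
        is_user_location=bool(locations) and allocator_prob >= THRESHOLD,
        tags=tags,
    )


# Hand-written example predictions, not real model output.
result = combine_heads(
    tokens=["Продам", "дрель", "в", "Нижнем", "Новгороде"],
    ner_labels=["O", "O", "O", "B-LOC", "I-LOC"],
    allocator_prob=0.9,
    tag_probs={"tools": 0.8, "medicine": 0.1, "clothing": 0.2},
)
```

In this sketch the NER output is token-level while the allocator and tags outputs are message-level, which is why they are merged only after location spans have been grouped.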
### [rubert-tiny-separater](https://huggingface.co/poc-embeddings/rubert-tiny-separater)
rubert-tiny-separater is based on [sergeyzh/rubert-tiny-turbo](https://huggingface.co/sergeyzh/rubert-tiny-turbo) with a linear layer on top. The whole model was trained to classify message types from Telegram marketplaces.
**Labels**:
- **Supply**: Somebody wants to sell something or provide a service.
- **Demand**: Somebody wants to buy something or hire someone.
- **Noise**: Messages unrelated to the topic.
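
A minimal sketch of turning the linear layer's three output scores into one of the labels above. It assumes a single-label softmax setup and a label order of Supply, Demand, Noise; both are assumptions, so check the model's configuration for the real mapping. The names `ID2LABEL`, `softmax`, and `classify` are hypothetical.

```python
import math
from typing import List, Tuple

# Assumed index-to-label order; the actual mapping may differ.
ID2LABEL = {0: "Supply", 1: "Demand", 2: "Noise"}


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: List[float]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hand-written example scores, not real model output.
label, confidence = classify([0.2, 2.5, -1.0])
```

With these example scores the second entry is the largest, so `label` is `"Demand"` under the assumed order.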
## Supported Languages
Russian and English.