	---
	model-index:
	- name: poltextlab/media2-25-26-v1-1001
	  results:
	  - task:
	      type: text-classification
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 71%
	    - name: F1-Score
	      type: f1
	      value: 70%
	tags:
	- text-classification
	- transformers
	- roberta
	metrics:
	- accuracy
	- f1_score
	language:
	- en
	base_model:
	- xlm-roberta-large
	pipeline_tag: text-classification
	library_name: transformers
	license: cc-by-4.0
	extra_gated_prompt: Our models are intended for academic projects and academic research
	  only. If you are not affiliated with an academic institution, please reach out to
	  us at huggingface [at] poltextlab [dot] com for further inquiry. If we cannot clearly
	  determine your academic affiliation and use case based on your form data, your request
	  may be rejected. Please allow us a few business days to manually review subscriptions.
	extra_gated_fields:
	  Country: country
	  Institution: text
	  Institution Email: text
	  Full Name: text
	  Please specify your academic project/use case you want to use the models for: text
	---

	# media2-25-26-v1-1001

	This is an `xlm-roberta-large` model fine-tuned for text classification with the poltextLAB Media2 codebook, which builds on the Comparative Agendas Project (CAP) codebook.


	# How to use the model

	```python
	from transformers import AutoTokenizer, pipeline

	# The model uses the tokenizer of its base model, xlm-roberta-large.
	tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")

	# The repository is gated, so pass a Hugging Face access token with read permission.
	pipe = pipeline(
	    model="poltextlab/media2-25-26-v1-1001",
	    task="text-classification",
	    tokenizer=tokenizer,
	    use_fast=False,
	    token="<your_hf_read_only_token>",
	)

	text = "<text_to_classify>"
	pipe(text)
	```
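
For a single input, a `text-classification` pipeline returns a list of dictionaries with `label` and `score` keys. As a minimal sketch, the hypothetical helper below keeps a predicted label only when its score clears a threshold and returns `None` otherwise, so that low-confidence items can be routed to manual coding. The helper name, the 0.5 threshold, and the sample outputs (including the label strings) are illustrative assumptions, not part of the model or its API.

```python
def confident_labels(outputs, min_score=0.5):
    """Return each predicted label, or None when its score is below min_score."""
    return [
        item["label"] if item["score"] >= min_score else None
        for item in outputs
    ]


# Hypothetical outputs in the shape a text-classification pipeline returns;
# the label strings and scores are made up for illustration.
sample_outputs = [
    {"label": "LABEL_A", "score": 0.91},
    {"label": "LABEL_B", "score": 0.34},
]

print(confident_labels(sample_outputs))
```

With the real pipeline, this might be called as `confident_labels(pipe(["first text", "second text"]))`. A suitable threshold depends on the use case and would need to be chosen on validation data.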


	# Classification Report

	## Overall Performance:

	Evaluated on a test set of 1601 English samples.

	* Accuracy: 71%
	* Macro Avg: Precision: 0.67, Recall: 0.62, F1-score: 0.62
	* Weighted Avg: Precision: 0.74, Recall: 0.71, F1-score: 0.70

	## Per-Class Metrics:

	| Label | Precision | Recall | F1-score | Support |
	|--------:|------------:|---------:|-----------:|----------:|
	| 1 | 0.77 | 0.80 | 0.78 | 50 |
	| 2 | 0.74 | 0.78 | 0.76 | 50 |
	| 3 | 0.74 | 0.74 | 0.74 | 50 |
	| 4 | 0.70 | 0.86 | 0.77 | 50 |
	| 5 | 0.86 | 0.76 | 0.81 | 50 |
	| 6 | 0.83 | 0.98 | 0.90 | 50 |
	| 7 | 0.85 | 0.88 | 0.86 | 50 |
	| 8 | 0.87 | 0.94 | 0.90 | 50 |
	| 9 | 0.87 | 0.82 | 0.85 | 50 |
	| 10 | 0.77 | 0.94 | 0.85 | 50 |
	| 12 | 0.56 | 0.88 | 0.69 | 50 |
	| 13 | 0.88 | 0.86 | 0.87 | 50 |
	| 14 | 0.73 | 0.76 | 0.75 | 50 |
	| 15 | 0.51 | 0.86 | 0.64 | 50 |
	| 16 | 0.75 | 0.86 | 0.80 | 50 |
	| 17 | 0.63 | 0.76 | 0.69 | 50 |
	| 18 | 0.91 | 0.82 | 0.86 | 50 |
	| 19 | 0.51 | 0.82 | 0.63 | 50 |
	| 20 | 0.62 | 0.92 | 0.74 | 50 |
	| 21 | 0.75 | 0.80 | 0.78 | 50 |
	| 23 | 0.52 | 0.78 | 0.62 | 50 |
	| 24 | 0.71 | 0.57 | 0.63 | 42 |
	| 25 | 0.92 | 0.48 | 0.63 | 23 |
	| 26 | 0.92 | 0.56 | 0.70 | 43 |
	| 27 | 0.00 | 0.00 | 0.00 | 18 |
	| 28 | 0.00 | 0.00 | 0.00 | 9 |
	| 29 | 0.43 | 0.27 | 0.33 | 33 |
	| 30 | 0.72 | 0.28 | 0.41 | 46 |
	| 31 | 0.89 | 0.44 | 0.59 | 36 |
	| 32 | 0.00 | 0.00 | 0.00 | 20 |
	| 33 | 0.12 | 0.08 | 0.10 | 12 |
	| 34 | 0.07 | 0.14 | 0.10 | 7 |
	| 35 | 0.93 | 0.71 | 0.81 | 35 |
	| 36 | 0.00 | 0.00 | 0.00 | 3 |
	| 37 | 1.00 | 0.82 | 0.90 | 44 |
	| 38 | 0.81 | 0.81 | 0.81 | 42 |
	| 39 | 1.00 | 0.39 | 0.57 | 33 |
	| 40 | 0.88 | 0.21 | 0.34 | 33 |
	| 41 | 1.00 | 0.78 | 0.88 | 32 |
	| 998 | 0.92 | 0.55 | 0.69 | 40 |
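
The macro and weighted averages reported under Overall Performance differ only in how the per-class rows are combined: the macro average treats every class equally, while the weighted average scales each class by its support. The sketch below illustrates this with the F1-scores and supports of three rows from the table above (labels 6, 34 and 36). The function names are hypothetical, and the three-row subset is for illustration only; the reported figures use all 40 classes.

```python
def macro_average(rows):
    """Unweighted mean of the per-class scores: every class counts equally."""
    return sum(score for score, _ in rows) / len(rows)


def weighted_average(rows):
    """Mean of the per-class scores, weighted by each class's support."""
    total_support = sum(support for _, support in rows)
    return sum(score * support for score, support in rows) / total_support


# (F1-score, support) for labels 6, 34 and 36, copied from the table above.
rows = [(0.90, 50), (0.10, 7), (0.00, 3)]

print(f"macro F1: {macro_average(rows):.2f}")
print(f"weighted F1: {weighted_average(rows):.2f}")
```

Because rare classes such as 34 and 36 score poorly but have little support, they pull the macro average down much more than the weighted one, which is the same pattern seen in the overall figures.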

	# Inference platform
	This model is used by the [CAP Babel Machine](https://babel.poltextlab.com), an open-source and free natural language processing tool, designed to simplify and speed up projects for comparative research.

	# Cooperation
	Model performance can be significantly improved by extending our training sets. We appreciate every submission of CAP-coded corpora (of any domain and language) sent to poltextlab{at}poltextlab{dot}com or submitted through the [CAP Babel Machine](https://babel.poltextlab.com).

	## Debugging and issues
	This architecture uses the `sentencepiece` tokenizer. To run the model with a `transformers` version earlier than 4.27, you need to install `sentencepiece` manually.
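
As a rough sketch of how this requirement could be checked before loading the model, the snippet below reads the installed `transformers` version and tests whether `sentencepiece` is importable. The function names are hypothetical, and the version parsing assumes a plain `major.minor[.patch]` version string; it only reports what to do and does not install anything.

```python
from importlib import metadata, util


def version_tuple(version):
    """Turn a string such as '4.26.1' into (4, 26) for numeric comparison."""
    parts = (version.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])


def needs_manual_sentencepiece(transformers_version, sentencepiece_installed):
    """True when transformers is older than 4.27 and sentencepiece is missing."""
    too_old = version_tuple(transformers_version) < (4, 27)
    return too_old and not sentencepiece_installed


def check_environment():
    """Return a short status message for the current Python environment."""
    try:
        installed_version = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    has_sentencepiece = util.find_spec("sentencepiece") is not None
    if needs_manual_sentencepiece(installed_version, has_sentencepiece):
        return "install sentencepiece manually: pip install sentencepiece"
    return "ok"


print(check_environment())
```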