---
library_name: agri-awwer
tags:
  - asr
  - evaluation
  - agriculture
  - metrics
arxiv: "2602.03868"
language:
  - hi
  - te
  - or
---

# Agri AWWER — Agriculture-Weighted Word Error Rate Evaluation Toolkit

A lightweight Python toolkit for evaluating Automatic Speech Recognition (ASR) systems in agricultural domains. It provides the **Agriculture-Weighted Word Error Rate (AWWER)** metric alongside the standard word error rate (WER), character error rate (CER), and match error rate (MER).

AWWER penalises errors on domain-critical agricultural terms more heavily than errors on general vocabulary, giving a more realistic picture of how well an ASR system serves agricultural applications.

## Installation

```bash
# From HuggingFace (recommended)
pip install git+https://huggingface.co/DigiGreen/Agri_AWWER_Toolkit

# Optional: use jiwer for the standard metrics (WER/CER/MER)
pip install "agri-awwer[jiwer]"
```

**Zero required dependencies** — the toolkit works out of the box with only the Python standard library. `jiwer` is optional and used automatically when available for standard metrics.

## Quick Start

### AWWER — Domain-Weighted Error Rate

```python
from agri_awwer import calculate_awwer

# Define domain word weights (1-4 scale)
weights = {
    "gehun": 4,    # wheat — core agriculture term
    "keet": 4,     # pest
    "mitti": 3,    # soil
    "barish": 3,   # rain
    "gaon": 1,     # village — general vocabulary
}

reference  = "gehun mein keet laga hai"
hypothesis = "gaon mein keet laga hai"

awwer = calculate_awwer(reference, hypothesis, weights)
print(f"AWWER: {awwer:.3f}")
# gehun→gaon is a weight-4 error, so AWWER > standard WER
```

### Standard Metrics

```python
from agri_awwer import calculate_wer, calculate_cer, calculate_mer

ref = "gehun mein keet laga hai"
hyp = "gaon mein keet laga hai"

print(f"WER: {calculate_wer(ref, hyp):.3f}")
print(f"CER: {calculate_cer(ref, hyp):.3f}")
print(f"MER: {calculate_mer(ref, hyp):.3f}")
```
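
For orientation, the textbook definitions of WER and CER can be reproduced with a plain Levenshtein distance. The sketch below is not the toolkit's code: `edit_distance`, `sketch_wer`, and `sketch_cer` are hypothetical names, no text normalization is applied, spaces are counted as characters in the CER, and the reference is assumed to be non-empty.

```python
def edit_distance(ref_seq, hyp_seq):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp_seq) + 1))
    for i, r in enumerate(ref_seq, start=1):
        curr = [i]
        for j, h in enumerate(hyp_seq, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference item
                curr[j - 1] + 1,         # insertion of a hypothesis item
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


def sketch_wer(reference, hypothesis):
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def sketch_cer(reference, hypothesis):
    """Character-level edits divided by the number of reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)


ref = "gehun mein keet laga hai"
hyp = "gaon mein keet laga hai"
print(f"WER: {sketch_wer(ref, hyp):.3f}")  # 1 substitution / 5 words = 0.200
print(f"CER: {sketch_cer(ref, hyp):.3f}")  # 3 edits / 24 characters = 0.125
```

Values returned by `calculate_wer` and `calculate_cer` may differ from this sketch if the toolkit normalizes the text first or treats whitespace differently.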

### Detailed AWWER Breakdown

```python
from agri_awwer import calculate_awwer_components

# reference, hypothesis, and weights as defined in the AWWER example above
result = calculate_awwer_components(reference, hypothesis, weights)
print(f"AWWER:          {result['awwer']:.3f}")
print(f"Substitutions:  {result['n_substitutions']}")
print(f"Deletions:      {result['n_deletions']}")
print(f"Insertions:     {result['n_insertions']}")
print(f"High-weight errors: {result['high_weight_errors']}")
```

### Parse Weights from JSON

```python
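import json
from agri_awwer import calculate_awwer_from_string

ref = "gehun mein keet laga hai"
hyp = "gaon mein keet laga hai"
# Weights are passed as a JSON string: a list of [word, weight] pairs
weights_json = json.dumps([["gehun", 4], ["keet", 4], ["mitti", 3]])
awwer = calculate_awwer_from_string(ref, hyp, weights_json)
print(f"AWWER: {awwer:.3f}")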
```
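
If you want to inspect or validate such a string before scoring, the JSON list of pairs can be converted to a plain dictionary with the standard library. This is a minimal sketch: `parse_weights` is a hypothetical helper, not part of `agri_awwer`, and the range check simply assumes the 1-4 scale used in this README.

```python
import json


def parse_weights(weights_json):
    """Convert a JSON list of [word, weight] pairs into a dict (hypothetical helper)."""
    weights = {}
    for word, weight in json.loads(weights_json):
        weight = int(weight)
        # Assumption: weights follow the 1-4 scale described in this README.
        if not 1 <= weight <= 4:
            raise ValueError(f"weight for {word!r} is outside 1-4: {weight}")
        weights[str(word)] = weight
    return weights


weights_json = json.dumps([["gehun", 4], ["keet", 4], ["mitti", 3]])
print(parse_weights(weights_json))
```

The resulting dictionary has the same shape as the `weights` argument passed to `calculate_awwer` in the first example.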

## Weight Scale

| Weight | Category | Examples |
|--------|----------|----------|
| **4** | Core agriculture terms | Crop names, pests, farming practices |
| **3** | Strongly agriculture-related | Soil types, weather, planting seasons |
| **2** | Indirectly related | Quantities, measurement units, locations |
| **1** | General vocabulary | Default for words not in the lexicon |
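
This README does not spell out the AWWER formula, so the following is only one plausible reading of the weight scale, not the toolkit's implementation. It assumes that a substitution or deletion costs the weight of the affected reference word, that an insertion costs the weight of the inserted word, that words missing from the lexicon get weight 1, and that the total cost is divided by the summed weight of the reference words. The name `sketch_weighted_wer` is hypothetical.

```python
def sketch_weighted_wer(reference, hypothesis, weights, default_weight=1):
    """Weighted word error rate under the assumptions stated above."""
    ref = reference.split()
    hyp = hypothesis.split()

    def weight(word):
        # Words not in the lexicon fall back to the default weight.
        return weights.get(word, default_weight)

    # prev[j] holds the minimum weighted cost of turning ref[:i] into hyp[:j].
    prev = [0]
    for h in hyp:
        prev.append(prev[-1] + weight(h))   # only insertions so far
    for r in ref:
        curr = [prev[0] + weight(r)]        # only deletions so far
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else weight(r))
            deletion = prev[j] + weight(r)
            insertion = curr[j - 1] + weight(h)
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    total_weight = sum(weight(r) for r in ref)  # assumes a non-empty reference
    return prev[-1] / total_weight


weights = {"gehun": 4, "keet": 4, "mitti": 3, "barish": 3, "gaon": 1}
ref = "gehun mein keet laga hai"
hyp = "gaon mein keet laga hai"
print(f"{sketch_weighted_wer(ref, hyp, weights):.3f}")  # 4 / 11 = 0.364
print(f"{sketch_weighted_wer(ref, hyp, {}):.3f}")       # all weights 1: 1 / 5 = 0.200
```

Under these assumptions the single `gehun` error costs 4 out of a total reference weight of 11, which is why the weighted score exceeds the unweighted 0.200. The value returned by `calculate_awwer` may differ if the toolkit defines the metric differently.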

## Language Support

Built-in text normalization for:
- **Hindi** (default): candrabindu, visarga, nukta removal
- **Telugu**: candrabindu, visarga removal
- **Odia**: candrabindu, visarga, nukta, isshar removal

Pass the `language` parameter to any metric function:

```python
calculate_awwer(ref, hyp, weights, language="telugu")
calculate_wer(ref, hyp, language="odia")
```
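
To illustrate what removing these signs means at the Unicode level, here is a minimal sketch using only the standard library. It is not the toolkit's normalizer, which may apply further steps: `strip_signs` and `SIGNS_TO_STRIP` are hypothetical names, and the code points are taken from the Unicode charts for Devanagari, Telugu, and Oriya.

```python
import unicodedata

# Assumed mapping from language name to the signs listed above.
SIGNS_TO_STRIP = {
    "hindi": "\u0901\u0903\u093C",       # candrabindu, visarga, nukta
    "telugu": "\u0C01\u0C03",            # candrabindu, visarga
    "odia": "\u0B01\u0B03\u0B3C\u0B70",  # candrabindu, visarga, nukta, isshar
}


def strip_signs(text, language="hindi"):
    """Remove the listed signs from text (hypothetical helper)."""
    # NFD splits precomposed letters such as U+0958 into base letter + nukta,
    # so the nukta can be removed like any other sign.
    decomposed = unicodedata.normalize("NFD", text)
    table = {ord(sign): None for sign in SIGNS_TO_STRIP[language]}
    return unicodedata.normalize("NFC", decomposed.translate(table))


word = "\u091A\u093E\u0901\u0926"          # Devanagari word containing a candrabindu
print(strip_signs(word) == "\u091A\u093E\u0926")
```

Stripping these signs before alignment means that spelling variants differing only in such marks are not counted as recognition errors.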

## Related Resources

- **Paper**: [Benchmarking Automatic Speech Recognition for Indian Languages in Agricultural Contexts](https://arxiv.org/abs/2602.03868)
- **Dataset**: [Agri STT Benchmarking Dataset](https://huggingface.co/datasets/DigiGreen/Agri_STT_Benchmarking_Dataset) — 10,864 audio-transcript pairs across Hindi, Telugu, and Odia

## Citation

```bibtex
@misc{digigreen2025awwer,
  title   = {Agri {AWWER}: Agriculture-Weighted Word Error Rate Evaluation Toolkit},
  author  = {{Digital Green}},
  year    = {2025},
  url     = {https://huggingface.co/DigiGreen/Agri_AWWER_Toolkit},
}
```

## License

Apache 2.0