---
title: "python-crfsuite"
type: "resource"
url: "https://github.com/scrapinghub/python-crfsuite"
---

## Overview

python-crfsuite provides Python bindings for the CRFsuite conditional random field toolkit, enabling efficient sequence labeling in Python.

## Key Information

| Field | Value |
|-------|-------|
| **GitHub** | https://github.com/scrapinghub/python-crfsuite |
| **PyPI** | https://pypi.org/project/python-crfsuite/ |
| **Documentation** | https://python-crfsuite.readthedocs.io/ |
| **License** | MIT (python-crfsuite), BSD (CRFsuite) |
| **Latest Version** | 0.9.12 (December 2025) |
| **Stars** | 771 |

## Features

- **Fast Performance**: Faster than the official SWIG-based CRFsuite wrapper
- **No External Dependencies**: CRFsuite bundled; NumPy/SciPy not required
- **Python 2 & 3 Support**: Historically worked with both Python versions; check PyPI for the versions supported by current releases
- **Cython-based**: High-performance C++ bindings

## Installation

```bash
# Using pip
pip install python-crfsuite

# Using conda
conda install -c conda-forge python-crfsuite
```

## Usage

### Training

```python
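import pycrfsuite

# Toy data for illustration only: each sequence is a list of items,
# each item is a list of feature strings, with one label per item
X_train = [
    [['word=john'], ['word=runs']],
    [['word=mary'], ['word=sleeps']],
]
y_train = [
    ['NOUN', 'VERB'],
    ['NOUN', 'VERB'],
]

# Create trainer (L-BFGS is used unless another algorithm is selected)
trainer = pycrfsuite.Trainer(verbose=True)

# Add training data, one (features, labels) sequence pair at a time
for xseq, yseq in zip(X_train, y_train):
    trainer.append(xseq, yseq)

# Set parameters
trainer.set_params({
    'c1': 1.0,           # L1 regularization coefficient
    'c2': 0.001,         # L2 regularization coefficient
    'max_iterations': 100,
    # Also generate transition features for label pairs unseen in training
    'feature.possible_transitions': True
})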

# Train model
trainer.train('model.crfsuite')
```

### Inference

```python
import pycrfsuite

# Load model
tagger = pycrfsuite.Tagger()
tagger.open('model.crfsuite')

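# Predict one label per item; x_seq uses the same feature format as training
x_seq = [['word=john'], ['word=sleeps']]
y_pred = tagger.tag(x_seq)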
```

### Feature Format

Each sequence is a list of items (one per token), and each item is a list of feature strings, conventionally written as `name=value`:

```python
features = [
    ['word=hello', 'pos=NN', 'is_capitalized=True'],
    ['word=world', 'pos=NN', 'is_capitalized=False'],
]
```
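
Feature lists like the one above are usually built by a small extraction function rather than written by hand. The following is a minimal sketch of that idea; `token_features` and `sentence_features` are hypothetical helper names, and the chosen features (lowercased word, capitalization, suffix, neighbouring words) are illustrative assumptions, not something python-crfsuite prescribes.

```python
def token_features(tokens, i):
    """Return the feature strings for the token at position i (hypothetical helper)."""
    word = tokens[i]
    feats = [
        "bias",                                       # constant feature shared by every token
        "word=" + word.lower(),
        "is_capitalized=" + str(word[:1].isupper()),
        "suffix3=" + word[-3:],
    ]
    # Context features: previous and next word, or boundary markers
    if i == 0:
        feats.append("BOS")
    else:
        feats.append("prev_word=" + tokens[i - 1].lower())
    if i == len(tokens) - 1:
        feats.append("EOS")
    else:
        feats.append("next_word=" + tokens[i + 1].lower())
    return feats


def sentence_features(tokens):
    """Convert a tokenized sentence into one feature list per token."""
    return [token_features(tokens, i) for i in range(len(tokens))]


xseq = sentence_features(["Hello", "world"])
```

The resulting `xseq` has the same shape as `features` above, so it can be passed to `trainer.append(xseq, yseq)` or `tagger.tag(xseq)`.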

## Training Algorithms

| Algorithm | Description |
|-----------|-------------|
| `lbfgs` | Limited-memory BFGS (default) |
| `l2sgd` | SGD with L2 regularization |
| `ap` | Averaged Perceptron |
| `pa` | Passive Aggressive |
| `arow` | Adaptive Regularization of Weights |
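
To use one of the algorithms in the table, the name can be passed to the `Trainer` constructor through its `algorithm` argument. The sketch below wraps that in a hypothetical helper, `make_trainer`, which rejects names not listed above before touching the library. The helper and its error message are assumptions made for illustration.

```python
# Algorithm names taken from the table above
ALGORITHMS = ("lbfgs", "l2sgd", "ap", "pa", "arow")


def make_trainer(algorithm="lbfgs", params=None):
    """Hypothetical helper: build a Trainer for one of the listed algorithms."""
    if algorithm not in ALGORITHMS:
        raise ValueError(
            f"unknown algorithm {algorithm!r}; expected one of {ALGORITHMS}"
        )
    # Imported lazily so the name check above works without the package installed
    import pycrfsuite

    trainer = pycrfsuite.Trainer(algorithm=algorithm, verbose=False)
    if params:
        trainer.set_params(params)
    return trainer
```

For example, `make_trainer("l2sgd", {"c2": 0.01, "max_iterations": 50})` would return a trainer configured for SGD. Each algorithm accepts its own parameter set, and `trainer.params()` lists the names valid for the selected one.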

## Parameters (L-BFGS)

| Parameter | Default | Description |
|-----------|---------|-------------|
| `c1` | 0 | L1 regularization coefficient |
| `c2` | 1.0 | L2 regularization coefficient |
| `max_iterations` | 2147483647 (effectively unlimited) | Maximum iterations |
| `num_memories` | 6 | Number of memories for L-BFGS |
| `epsilon` | 1e-5 | Convergence threshold |
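
As a rough guide to how `c1` and `c2` interact, the objective minimized during L-BFGS training can be written as below. This is a sketch based on the usual formulation of regularized CRF training. The symbols $w$ (feature weights), $N$ (number of training sequences), and $x^{(n)}, y^{(n)}$ (the $n$-th feature and label sequence) are notation introduced here, and the exact scaling of the L2 term should be checked against the CRFsuite manual.

$$
\min_{w} \; -\sum_{n=1}^{N} \log p\bigl(y^{(n)} \mid x^{(n)}; w\bigr) \;+\; c_1 \lVert w \rVert_1 \;+\; c_2 \lVert w \rVert_2^2
$$

With the defaults (`c1 = 0`, `c2 = 1.0`) only the L2 term is active. A non-zero `c1` adds the L1 term, which tends to drive many weights to exactly zero and so yields a sparser model.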

## Related Projects

- **sklearn-crfsuite**: Scikit-learn compatible wrapper
- **CRFsuite**: Original implementation, written in C with a C++ API

## Citation

```bibtex
@misc{python-crfsuite,
  author = {Scrapinghub},
  title = {python-crfsuite: Python binding to CRFsuite},
  year = {2014},
  publisher = {GitHub},
  url = {https://github.com/scrapinghub/python-crfsuite}
}

@misc{crfsuite,
  author = {Okazaki, Naoaki},
  title = {CRFsuite: A fast implementation of Conditional Random Fields},
  year = {2007},
  url = {http://www.chokkan.org/software/crfsuite/}
}
```