metadata
title: python-crfsuite
type: resource
url: https://github.com/scrapinghub/python-crfsuite
Overview
python-crfsuite provides Python bindings for the CRFsuite conditional random field toolkit, enabling efficient sequence labeling in Python.
Key Information
| Field | Value |
|---|---|
| GitHub | https://github.com/scrapinghub/python-crfsuite |
| PyPI | https://pypi.org/project/python-crfsuite/ |
| Documentation | https://python-crfsuite.readthedocs.io/ |
| License | MIT (python-crfsuite), BSD (CRFsuite) |
| Latest Version | 0.9.12 (December 2025) |
| Stars | 771 |
Features
- Fast Performance: Faster than official SWIG wrapper
- No External Dependencies: CRFsuite bundled; NumPy/SciPy not required
- Python 2 & 3 Support: Older releases work with both Python versions; recent releases target Python 3 only
- Cython-based: Wraps the CRFsuite C++ API through Cython for low call overhead
Installation
# Using pip
pip install python-crfsuite
# Using conda
conda install -c conda-forge python-crfsuite
Usage
Training
import pycrfsuite
# Create trainer
trainer = pycrfsuite.Trainer(verbose=True)
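# Add training data: X_train is a list of feature sequences,
# y_train the matching list of label sequences
for xseq, yseq in zip(X_train, y_train):
    trainer.append(xseq, yseq)
# Set parameters
trainer.set_params({
    'c1': 1.0,  # L1 regularization coefficient
    'c2': 0.001,  # L2 regularization coefficient
    'max_iterations': 100,  # stop after at most 100 iterations
    'feature.possible_transitions': True  # also include transitions not observed in the training data
})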
# Train model
trainer.train('model.crfsuite')
Inference
import pycrfsuite
# Load model
tagger = pycrfsuite.Tagger()
tagger.open('model.crfsuite')
# Predict
y_pred = tagger.tag(x_seq)
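
Beyond the predicted labels, it is often useful to see how confident the model is. The sketch below assumes an opened pycrfsuite.Tagger, whose `marginal(label, position)` and `probability(label_sequence)` methods refer to the sequence most recently passed to `tag`. The helper name `tag_with_confidence` is hypothetical and not part of the library.

```python
def tag_with_confidence(tagger, xseq):
    """Return (labels, per-token marginals, sequence probability) for xseq.

    `tagger` is expected to be an opened pycrfsuite.Tagger, or any object
    that offers the same tag/marginal/probability methods.
    """
    # tag() also makes xseq the tagger's current sequence, which the
    # marginal() and probability() calls below rely on.
    labels = tagger.tag(xseq)
    marginals = [tagger.marginal(label, t) for t, label in enumerate(labels)]
    return labels, marginals, tagger.probability(labels)
```

Low marginals flag the tokens where the model is least certain, which can help when deciding what to review by hand.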
Feature Format
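Each sequence is a list of items (one per token), and each item is a list of feature strings, conventionally written as name=value. Dicts mapping feature names to float weights are also accepted.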
features = [
['word=hello', 'pos=NN', 'is_capitalized=True'],
['word=world', 'pos=NN', 'is_capitalized=False'],
]
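
One way to produce sequences in this shape is a small feature function applied to every token. This is a minimal sketch: the function names (`token_features`, `sentence_features`) and the particular features chosen are illustrative assumptions, not something python-crfsuite prescribes.

```python
def token_features(tokens, i):
    """Build name=value feature strings for the token at position i."""
    word = tokens[i]
    features = [
        'bias',
        'word=' + word.lower(),
        'is_capitalized=%s' % word[:1].isupper(),
        'is_digit=%s' % word.isdigit(),
    ]
    # Context features: neighbouring words, or sentence-boundary markers.
    if i == 0:
        features.append('BOS')
    else:
        features.append('prev_word=' + tokens[i - 1].lower())
    if i == len(tokens) - 1:
        features.append('EOS')
    else:
        features.append('next_word=' + tokens[i + 1].lower())
    return features


def sentence_features(tokens):
    """Return one feature list per token, ready for Trainer.append or Tagger.tag."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

The same function should be used at training and inference time so that feature names match.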
Training Algorithms
| Algorithm | Description |
|---|---|
| lbfgs | Limited-memory BFGS (default) |
| l2sgd | SGD with L2 regularization |
| ap | Averaged Perceptron |
| pa | Passive Aggressive |
| arow | Adaptive Regularization of Weights |
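
An algorithm from the table can be chosen with the trainer's `select` method (or the `algorithm` argument of the `Trainer` constructor). The helper below is a hedged sketch: `configure_trainer` and `ALGORITHMS` are hypothetical names, and the trainer is passed in so the function works with any object offering `select` and `set_params`, such as a pycrfsuite.Trainer.

```python
ALGORITHMS = ('lbfgs', 'l2sgd', 'ap', 'pa', 'arow')


def configure_trainer(trainer, algorithm='lbfgs', params=None):
    """Select a training algorithm, then apply its parameters."""
    if algorithm not in ALGORITHMS:
        raise ValueError('unknown algorithm: %r' % (algorithm,))
    # Select first: the set of valid parameter names depends on the algorithm.
    trainer.select(algorithm)
    if params:
        trainer.set_params(params)
    return trainer
```

For example, `configure_trainer(pycrfsuite.Trainer(verbose=False), 'l2sgd', {'c2': 0.1})` would prepare an SGD trainer with a custom L2 coefficient.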
Parameters (L-BFGS)
| Parameter | Default | Description |
|---|---|---|
| c1 | 0 | L1 regularization coefficient |
| c2 | 1.0 | L2 regularization coefficient |
| max_iterations | unlimited | Maximum iterations |
| num_memories | 6 | Number of memories for L-BFGS |
| epsilon | 1e-5 | Convergence threshold |
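
The two regularization coefficients are usually tuned together. As a rough sketch, the generator below enumerates candidate parameter dicts that could each be passed to `trainer.set_params` before a training run. The name `lbfgs_param_grid` and the default cap of 100 iterations are assumptions made for illustration.

```python
from itertools import product


def lbfgs_param_grid(c1_values, c2_values, max_iterations=100):
    """Yield one L-BFGS parameter dict per (c1, c2) combination."""
    for c1, c2 in product(c1_values, c2_values):
        yield {'c1': c1, 'c2': c2, 'max_iterations': max_iterations}
```

Each candidate would be trained into its own model file and compared on held-out data; the comparison itself is left out here.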
Related Projects
- sklearn-crfsuite: Scikit-learn compatible wrapper
- CRFsuite: The underlying C implementation, which also exposes a C++ API
Citation
@misc{python-crfsuite,
author = {Scrapinghub},
title = {python-crfsuite: Python binding to CRFsuite},
year = {2014},
publisher = {GitHub},
url = {https://github.com/scrapinghub/python-crfsuite}
}
@misc{crfsuite,
author = {Okazaki, Naoaki},
title = {CRFsuite: A fast implementation of Conditional Random Fields},
year = {2007},
url = {http://www.chokkan.org/software/crfsuite/}
}