CommerceExpert v1
Fine-tuned LLMs for Salesforce Commerce Cloud error classification
Three open-source models (Llama-3.2-1B, LFM-2.5-Thinking, Qwen-2.5-7B) trained on curated SFCC error data for real-time error categorization and event classification.
π Overview
CommerceExpert classifies SFCC runtime errors into standard categories. Choose the model that fits your deployment:
- Llama-3.2-1B β Lightweight, edge deployment (Cloudflare Workers, on-device)
- LFM-2.5-Thinking β Reasoning-focused, explainable predictions (audit trails)
- Qwen-2.5-7B β Highest accuracy, GPU-required (cloud, batch processing)
All three models are trained on the same dataset and ready for production integration.
π― Use Cases
β
Real-time SFCC error classification
β
Automated error triage and categorization
β
Integration with alerting systems
β
Event-driven ecommerce analytics
β
Compliance and audit logging (with LFM reasoning output)
π Model Details
| Property | Value |
|---|---|
| Training Data | 1,000 SFCC errors (200 real + 800 synthetic) |
| Eval Data | 300 held-out examples |
| Error Classes | 9 categories (Network, Auth, Quota, Validation, NotFound, Payment, InternalError, BusinessLogic, Integration) |
| Fine-Tune Method | LoRA (r=16, alpha=32) |
| Training Epochs | 1 (smoke test) |
| Format | Alpaca (Llama, Qwen), ChatML (LFM) |
| License | Apache 2.0 |
π Quick Start
Installation
pip install transformers peft unsloth torch
Load & Infer (Llama)
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="darleison/CommerceExpert-Llama-v1",
max_seq_length=2048,
load_in_4bit=True,
)
model = FastLanguageModel.for_inference(model)
# Classify an error
error_json = '{"code": "ECONNREFUSED", "message": "Connection refused", "context": "api-gateway"}'
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Classify the following Salesforce Commerce Cloud runtime error into a standard category.
### Input:
{error_json}
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=16)
result = tokenizer.batch_decode(outputs)[0]
print(result) # β "Network"
Load & Infer (Qwen)
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="darleison/CommerceExpert-Qwen-v1",
max_seq_length=4096,
load_in_4bit=True,
)
# Same inference as above
Load & Infer (LFM - with reasoning)
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="darleison/CommerceExpert-LFM-v1",
max_seq_length=4096,
load_in_4bit=True,
)
# Same inference, longer output includes reasoning chain
outputs = model.generate(**inputs, max_new_tokens=256)
Error Categories
The model predicts one of 9 error classes:
- Network β Connection, DNS, timeout errors
- Authentication β Auth failures, invalid tokens, permission denied
- Quota β Rate limits, API quotas, concurrent limits
- Validation β JSON, schema, required field errors
- NotFound β 404, missing resources
- Payment β Payment processing failures
- InternalError β Server errors, crashes, exceptions
- BusinessLogic β Out of stock, invalid coupon, shipping unavailable
- Integration β Webhook failures, external service errors
π Performance
Accuracy (on 300-example golden test set)
| Model | Baseline | Fine-Tuned | Improvement |
|---|---|---|---|
| Llama-3.2-1B | [INSERT]% | [INSERT]% | [INSERT]+% |
| LFM-2.5-Thinking | [INSERT]% | [INSERT]% | [INSERT]+% |
| Qwen-2.5-7B | [INSERT]% | [INSERT]% | [INSERT]+% |
Results from evaluation on held-out test set. Baseline = zero-shot pre-trained model.
Inference Speed
| Model | Tokens/sec (GPU) | Tokens/sec (CPU) | Model Size |
|---|---|---|---|
| Llama-3.2-1B | ~500 | ~50 | ~650 MB |
| LFM-2.5-Thinking | ~300 | ~30 | ~650 MB |
| Qwen-2.5-7B | ~100 | ~10 | ~3.5 GB |
βοΈ Training Configuration
LoRA:
rank: 16
alpha: 32
dropout: 0
target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
Training:
epochs: 1
batch_size: 2 (per device)
gradient_accumulation_steps: 4
learning_rate: 2e-4
optimizer: adamw_8bit
scheduler: linear
warmup_steps: 5
weight_decay: 0.01
Data:
seed: 42 (reproducible)
train: 1000 examples (800 synthetic, 200 real)
test: 300 examples (held-out)
π¬ Data Generation
Training data expanded from seed dataset using:
- Error Taxonomy β 9 categories with subcategories
- Synthetic Generation β Realistic error patterns from templates
- Cartridge Integration β Error codes from 18 real SFCC integrations (payment, shipping, fraud, etc.)
- B2C Patterns β Real log examples and error formats
Final dataset: balanced class distribution, diverse error contexts, realistic for production SFCC systems.
π¦ Integration with PulsarJS
CommerceExpert integrates as the ML classification layer in PulsarJS, an event-centric knowledge graph platform for ecommerce attribution.
from unsloth import FastLanguageModel
class CommerceExpertClassifier:
def __init__(self, model_name="llama"):
models = {
"llama": "darleison/CommerceExpert-Llama-v1",
"qwen": "darleison/CommerceExpert-Qwen-v1",
"lfm": "darleison/CommerceExpert-LFM-v1",
}
self.model, self.tokenizer = FastLanguageModel.from_pretrained(
model_name=models[model_name],
max_seq_length=4096,
load_in_4bit=True,
)
self.model = FastLanguageModel.for_inference(self.model)
def classify(self, error_dict):
"""Classify an SFCC error."""
prompt = f"""...[format prompt]..."""
inputs = self.tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = self.model.generate(**inputs, max_new_tokens=16)
return self.tokenizer.batch_decode(outputs)[0]
# Usage
classifier = CommerceExpertClassifier("llama")
error = {"code": "E_TIER_QUOTA", "message": "Quota exceeded"}
category = classifier.classify(error) # β "Quota"
β οΈ Limitations
- Single epoch training β Smoke test only. Production use may benefit from multi-epoch training.
- Synthetic data β Training data includes generated examples. Real-world error variations may differ.
- SFCC-specific β Domain-specific to Salesforce Commerce Cloud. Performance on generic errors may be lower.
- Language β English only. Non-English error messages not tested.
- Model selection β Choose based on deployment constraints (edge vs. accuracy vs. explainability trade-off).
π Roadmap
| Version | Focus | Status |
|---|---|---|
| v1.0 | Initial release, 9 error classes | β Live |
| v1.1 | Multi-epoch training, expanded taxonomy | π Planned |
| v2.0 | Event classification, remediation suggestions | π‘ Future |
π Reproducibility
All training code and data generation scripts are included:
src/data/data_generator.pyβ Synthetic error generationsrc/data/cartridge_extractor.pyβ Extract from GitHub cartridgessrc/data/data_validator.pyβ Quality assurancesrc/data/data_pipeline.pyβ Full pipeline orchestrationsrc/eval/eval_compare_models.pyβ Model evaluation
Training notebooks (Colab) available in training/notebooks/.
π Citation
@software{CommerceExpert2026,
title={CommerceExpert v1: Fine-Tuned LLMs for SFCC Error Classification},
author={Filho, Darleison},
year={2026},
month={April},
url={https://huggingface.co/darleison/sfcc-commerce-expert-v1}
}
π License
Apache 2.0 β Free for research, commercial, and derivative use. See LICENSE for details.
π Support
- Issues: GitHub Issues
- Discussions: HuggingFace Discussions
Ready to use. Fully reproducible. Open source.
- Downloads last month
- -
Model tree for DarleisonBarrosFilho/sfcc-commerce-expert-v1
Base model
meta-llama/Llama-3.2-1B-Instruct