Can a single 1.1B-parameter LLM be cheaply specialised per English variety with LoRA adapters?
Omkar fine-tuned TinyLlama-1.1B-Chat-v1.0 with a separate LoRA adapter for each English variety in the
BESSTIE sarcasm dataset (en-AU, en-IN, en-UK) and evaluated every adapter
across all three test sets, producing a full 3×3 cross-variety evaluation matrix.
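The 3×3 matrix amounts to scoring every (training variety, test variety) pair. A minimal sketch of that loop follows; `cross_variety_matrix` and the `evaluate` callable are hypothetical names introduced here for illustration, standing in for whatever code loads an adapter and scores it on a test split.

```python
from typing import Callable, Dict, Sequence

# The three BESSTIE varieties named in the text.
VARIETIES = ("en-AU", "en-IN", "en-UK")


def cross_variety_matrix(
    evaluate: Callable[[str, str], float],
    varieties: Sequence[str] = VARIETIES,
) -> Dict[str, Dict[str, float]]:
    """Return matrix[train_variety][test_variety] = score.

    `evaluate(train, test)` is a hypothetical callable: it is assumed to
    load the adapter trained on `train` and return its score on the
    `test` variety's test set.
    """
    return {
        train: {test: evaluate(train, test) for test in varieties}
        for train in varieties
    }


if __name__ == "__main__":
    # Dummy scorer used only to show the shape of the result;
    # it does not reproduce any reported number.
    dummy = lambda train, test: 1.0 if train == test else 0.0
    for train, row in cross_variety_matrix(dummy).items():
        print(train, row)
```

Diagonal cells (train and test variety equal) are the same-dialect scores; the six remaining cells measure cross-variety transfer.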
LoRA settings: r=16, alpha=32, dropout=0.05, all-linear target modules.
The model answers "yes" or "no" to a single sarcasm question, with completion-only loss applied to that one label token.
WeightedRandomSampler rebalances training exposure without duplicating rows; evaluation uses a validation-tuned threshold on logit_yes − logit_no.
Macro-F1 / Macro-P / Macro-R reported as mean ± std over 3 seeds. Rows are the variety the adapter was trained on; the test variety changes within each row.
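One way a validation-tuned threshold on logit_yes − logit_no could work is sketched below in plain Python. This is an illustration under assumptions, not the project's code: the function names are hypothetical, the selection objective is assumed to be macro-F1 (the text does not say which metric the threshold was tuned for), and label 1 is assumed to mean "sarcastic".

```python
from typing import List, Sequence, Tuple


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores for labels 0 and 1."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def predict(margins: Sequence[float], threshold: float) -> List[int]:
    """Predict 1 ("yes") when logit_yes - logit_no exceeds the threshold."""
    return [1 if m > threshold else 0 for m in margins]


def tune_threshold(
    val_margins: Sequence[float], val_labels: Sequence[int]
) -> Tuple[float, float]:
    """Pick the threshold with the best validation macro-F1.

    Candidates are midpoints between consecutive distinct margins, plus
    one value below the minimum and one above the maximum.
    Assumes `val_margins` is non-empty.
    """
    uniq = sorted(set(val_margins))
    candidates = (
        [uniq[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
        + [uniq[-1] + 1.0]
    )
    best = max(
        candidates,
        key=lambda t: macro_f1(val_labels, predict(val_margins, t)),
    )
    return best, macro_f1(val_labels, predict(val_margins, best))
```

The threshold chosen on the validation split would then be applied unchanged to the test margins, so the test set plays no part in selecting it.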
On its own dialect, the en-UK adapter leads with Macro-F1 0.7724 ± 0.0088, closely
followed by en-AU at 0.7603 ± 0.0291. The en-IN adapter lags at
0.5964 ± 0.0817 with a much wider seed spread — sarcasm in Indian English is the hardest of
the three for this 1.1B-parameter base.
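For reference, a mean ± std summary over seeds can be computed as in the sketch below. The helper name is hypothetical, the example scores are placeholders rather than results from this project, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which variant was reported.

```python
import statistics
from typing import Sequence


def summarize_seeds(scores: Sequence[float]) -> str:
    """Format per-seed scores as 'mean ± std' with four decimals.

    Uses the sample standard deviation (n - 1 denominator); this is an
    assumption, statistics.pstdev would give the population variant.
    """
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    return f"{mean:.4f} ± {std:.4f}"


if __name__ == "__main__":
    # Placeholder per-seed values for illustration only.
    print(summarize_seeds([0.5, 0.6, 0.7]))
```

With only three seeds the standard deviation is itself a noisy estimate, so a wide spread is a sign of instability across seeds more than a precise measurement of it.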
Cross-variety transfer collapses just as in Ryan's RoBERTa sarcasm matrix: every off-diagonal cell drops well
below the same-dialect score, with the worst pair (en-AU → en-IN) falling to
0.50. This suggests that sarcasm cues remain dialect-specific even when the underlying model is a much
larger LLM.