A couple of months ago I fine-tuned Qwen3 Embeddings with LoRA on the LSPC (Large-Scale Product Corpus) dataset. This time I went the opposite way: a small, task-specific 80M-parameter encoder with bidirectional attention, trained end-to-end. It outperforms the Qwen3 LoRA baseline on the same data (0.9315 vs. 0.8360 macro-F1). Details and code: https://blog.ivan.digital/beating-qwen3-lora-with-a-tiny-pytorch-encoder-on-the-large-scale-product-corpus-afe536de205f
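
For a rough sense of the shape of such a model, here is a minimal PyTorch sketch of a bidirectional transformer encoder with a classification head, trained end-to-end with cross-entropy. All hyperparameters (vocab size, depth, width, class count, sequence length) are illustrative placeholders, not the actual configuration from the post; see the linked write-up for the real code.

```python
# Minimal sketch: small bidirectional encoder + classification head,
# trained end-to-end. Hyperparameters below are placeholders.
import torch
import torch.nn as nn

class TinyEncoderClassifier(nn.Module):
    def __init__(self, vocab_size=30_000, d_model=512, n_heads=8,
                 n_layers=8, n_classes=25, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True)
        # No causal mask is passed, so attention is bidirectional:
        # every token attends to the whole sequence.
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, input_ids, attention_mask):
        pos = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok_emb(input_ids) + self.pos_emb(pos)
        # src_key_padding_mask expects True at padding positions.
        x = self.encoder(x, src_key_padding_mask=(attention_mask == 0))
        # Mean-pool over non-padding tokens, then classify.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (x * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
        return self.head(pooled)

model = TinyEncoderClassifier()
ids = torch.randint(1, 30_000, (4, 128))          # dummy token ids
mask = torch.ones_like(ids)                        # no padding here
logits = model(ids, mask)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 25, (4,)))
loss.backward()  # standard end-to-end supervised training step
```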