Request: DOI
#18 by alesvigne - opened
I want to try this model
I wouldn't. I trained this a lot, and it seemed very static and resistant to updating its memory. My training set may not have been appropriate, but I've moved on to Qwen-2.5, which trains well in contrast to distilbert2, understands context better, and gives responses that are meaty and relevant.