Thought on architecture
#6 by sometimesanotion - opened
This is already one of my favorite models for its size! Could a derivative of it have dense layers appended to the head, to yield a model that can easily be fine-tuned without expert collapse? I see there are two dense layers at the foot.
Is that a sensible strategy for a model deployed on laptops and workstations? Is LiquidAI interested in making such models?
sometimesanotion changed discussion status to closed