Initialize rope embeddings properly for the entropy model (#72) 0da051f unverified Srinivasan Iyer sviyer committed on Feb 25, 2025
disable reshard after forward (#56) 9d907fe unverified Srinivasan Iyer sviyer committed on Feb 13, 2025
Changes for training entropy model and correcting attention in local models (#25) 6ffeb66 unverified par-meta committed on Jan 17, 2025