PRISM: Demystifying Retention and Interaction in Mid-Training

community
https://bharat-runwal.github.io/PRISM/

AI & ML interests

Mid-Training

Bharat Runwal, Ashish Sunil Agrawal, Rameswar Panda

ashish23 
authored 2 papers 5 months ago

Translation Errors Significantly Impact Low-Resource Languages in Cross-Lingual Learning

Paper • 2402.02080 • Published Feb 3, 2024 • 2

TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments

Paper • 2510.01179 • Published Oct 1, 2025 • 27
rpand002 
authored a paper over 1 year ago

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Paper • 2406.12034 • Published Jun 17, 2024 • 16
rpand002 
authored a paper almost 2 years ago

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 33
bharatR 
authored 3 papers about 2 years ago

Reprogramming under constraints: Revisiting efficient and reliable transferability of lottery tickets

Paper • 2308.14969 • Published Aug 29, 2023

APP: Anytime Progressive Pruning

Paper • 2204.01640 • Published Apr 4, 2022

From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers

Paper • 2402.01911 • Published Feb 2, 2024 • 2
rpand002 
authored a paper about 2 years ago

Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 25