---
title: README
emoji: šŸ‘
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---

# ShallowMind - Just like DeepMind, but way more stupid 🧠

Hi there! My name is Alessandro, and I'm an AI research engineer. ShallowMind is my workspace for training and experimenting with language models. The name is playful, but the goal is straightforward: to build increasingly capable models while exploring new ideas in pretraining and reasoning.

---

## Research Interests

- **Information-theoretic pretraining**
  Looking at ways to identify and prioritize the most informative tokens, to see whether current scaling laws can be adjusted (a toy sketch appears at the end of this README). Work in progress; I'll share results once experiments are further along.
- **Reasoning models**
  Testing approaches that improve step-by-step and compositional reasoning.
- **Architectural variations**
  Extending my training pipeline to support Mixture-of-Experts (MoE) and other non-standard components (see the MoE sketch at the end of this README).

---

## Current Work

- Built a **custom pre-training pipeline** and pre-trained a first model from scratch (~1B scale) as a proof of concept.
- Iterating on the pipeline to add **MoE layers** and **information-gain-based logic**.
- Next steps:
  - Fine-tune the first model into **Promptasaurus-Zero**.
  - Train **Blahblahthron-7B** as a larger-scale follow-up experiment.

---

## Roadmap

- Share ablations and code from early experiments.
- Scale training to larger models.
- Document results on token selection and reasoning tasks.

---
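
## Toy Sketches

To make the token-selection idea above concrete, here is a minimal sketch of one possible proxy for "informative" tokens: averaging the language-modeling loss over only the highest-loss tokens in each sequence. This is an illustration under my own assumptions (the function names and the loss-based selection criterion are placeholders), not the actual ShallowMind pipeline code.

```python
import torch
import torch.nn.functional as F


def per_token_loss(logits, labels):
    """Per-token cross-entropy, shape (batch, seq); padding (-100) gets loss 0."""
    return F.cross_entropy(
        logits.flatten(0, 1), labels.flatten(),
        reduction="none", ignore_index=-100,
    ).view_as(labels).float()


def selective_lm_loss(logits, labels, keep_ratio=0.5):
    """Average the LM loss over only the top `keep_ratio` highest-loss tokens.

    High per-token loss is used here as a cheap stand-in for "high
    information gain"; a real implementation might score tokens with a
    reference model instead.
    """
    losses = per_token_loss(logits, labels)           # (batch, seq)
    mask = torch.zeros_like(losses, dtype=torch.bool)
    for i in range(labels.size(0)):
        valid = labels[i] != -100
        n_valid = int(valid.sum().item())
        if n_valid == 0:
            continue                                  # skip all-padding rows
        k = max(1, int(n_valid * keep_ratio))
        scores = losses[i].masked_fill(~valid, float("-inf"))
        mask[i, scores.topk(k).indices] = True        # keep the k hardest tokens
    return (losses * mask).sum() / mask.sum().clamp(min=1)
```

In a training loop this drops in where the usual `F.cross_entropy` call would go; the optimizer, data loading, and everything else stay unchanged.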
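
Similarly, here is a minimal sketch of the kind of MoE layer I plan to add: a top-1 token router over a handful of feed-forward experts. Again, the class name and dimensions are hypothetical placeholders rather than the pipeline's actual components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyMoE(nn.Module):
    """Top-1 routed Mixture-of-Experts feed-forward block (toy version)."""

    def __init__(self, d_model=256, d_ff=1024, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (batch, seq, d_model) -> flatten tokens for routing.
        tokens = x.view(-1, x.size(-1))
        gates = F.softmax(self.router(tokens), dim=-1)  # (tokens, n_experts)
        weight, choice = gates.max(dim=-1)              # top-1 expert per token
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            sel = choice == e
            if sel.any():
                # Scale by the gate weight so routing stays differentiable.
                out[sel] = weight[sel, None] * expert(tokens[sel])
        return out.view_as(x)
```

This block would sit where a dense feed-forward layer normally goes in a transformer; production MoE implementations add load-balancing losses and per-expert capacity limits, which I've omitted here for brevity.

---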