---
title: README
emoji: 👁
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---
# ShallowMind - Just like DeepMind, but way more stupid 🧠

Hi there! My name is Alessandro, and I'm an AI research engineer.

ShallowMind is my workspace for training and experimenting with language models.

The name is playful, but the goal is straightforward: to build increasingly capable models while exploring new ideas in pretraining and reasoning.
---

## Research Interests

- **Information-theoretic pretraining**
  Looking at ways to identify and prioritize the most informative tokens, to see whether current scaling laws can be adjusted. (Work in progress; I'll share results once experiments are further along. A toy sketch follows this list.)
- **Reasoning models**
  Testing approaches that improve step-by-step and compositional reasoning.
- **Architectural variations**
  Extending my training pipeline to support Mixture-of-Experts (MoE) and other non-standard components.
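As a rough illustration of the token-selection idea, here is a toy PyTorch sketch that up-weights the training loss of high-surprisal tokens under a frozen reference model. Everything in it (the function name, the surprisal-based weighting, the `alpha` knob) is my own illustrative assumption, not the actual pipeline:

```python
import torch
import torch.nn.functional as F

def informativeness_weighted_loss(logits, targets, ref_logits, alpha=1.0):
    """Cross-entropy re-weighted by per-token surprisal under a reference model.

    logits:     (batch, seq, vocab) from the model being trained
    ref_logits: (batch, seq, vocab) from a frozen, cheaper reference model
    targets:    (batch, seq) next-token ids
    alpha:      sharpness of the weighting (0 recovers plain cross-entropy)
    """
    # Per-token cross-entropy of the model being trained, unreduced.
    ce = F.cross_entropy(
        logits.flatten(0, 1), targets.flatten(), reduction="none"
    ).view_as(targets)

    # Surprisal of each target token under the reference model:
    # high surprisal is used as a proxy for "informative" tokens.
    with torch.no_grad():
        ref_logp = F.log_softmax(ref_logits, dim=-1)
        surprisal = -ref_logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        # Softmax over the sequence keeps the mean weight at ~1,
        # so the overall loss scale stays comparable.
        weights = torch.softmax(alpha * surprisal, dim=-1) * targets.size(-1)

    return (weights * ce).mean()
```

The reference model here is just one possible informativeness proxy; other signals (pointwise mutual information, loss deltas between checkpoints) would slot into the same weighting hook.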
---

## Current Work

- Built a **custom pre-training pipeline** and pre-trained a first model from scratch (~1B scale) as a proof of concept.
- Iterating on the pipeline to add **MoE layers** and **information-gain–based logic** (a minimal MoE sketch follows this list).
- Next steps:
  - Fine-tune the first model into **Promptasaurus-Zero**.
  - Train **Blahblahthron-7B** as a larger-scale follow-up experiment.
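For the MoE direction, below is a minimal top-k routed feed-forward layer of the kind the pipeline would need to support. It's a generic textbook sketch (the layer sizes, top-2 routing, and omitted load-balancing loss are all placeholder assumptions), not code from my pipeline:

```python
import torch
import torch.nn as nn

class MoEFeedForward(nn.Module):
    """Minimal Mixture-of-Experts FFN with learned top-k token routing."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (batch, seq, d_model); flatten tokens so routing is per token.
        tokens = x.reshape(-1, x.size(-1))
        gate = self.router(tokens).softmax(dim=-1)         # (n_tokens, n_experts)
        weights, idx = gate.topk(self.top_k, dim=-1)       # each token picks top-k experts
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize the kept gates

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = idx[:, k] == e                      # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(tokens[mask])
        return out.view_as(x)
```

A production version would add a load-balancing auxiliary loss and batched expert dispatch; the explicit loop here just keeps the routing logic readable.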
---

## Roadmap

- Share ablations and code from early experiments.
- Scale training to larger models.
- Document results on token selection and reasoning tasks.

---