I’m deeply interested in building and understanding modern AI systems, especially Large Language Models (LLMs) and Small Language Models (SLMs).
My focus is on how transformers actually work under the hood: tokenization, attention, training dynamics, evaluation, and optimization techniques such as fine-tuning, quantization, and distillation.
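To make the attention piece concrete, here's the kind of minimal sketch I mean: single-head scaled dot-product attention in plain NumPy, with toy shapes and random inputs, no masking or multi-head logic:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

# Toy example: 4 tokens, one 8-dimensional head.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)  # shape (4, 8): one context-mixed vector per token
```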
I enjoy experimenting with open-source models, training lightweight ones on limited hardware, and exploring practical setups like RAG pipelines, basic agentic workflows, and prompt engineering grounded in real use cases (not just demos).
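As an illustration of the retrieval half of RAG, here's a toy retrieval step sketched with TF-IDF and cosine similarity from scikit-learn; the corpus and query are made up, and a real pipeline would swap in dense embeddings and an actual LLM call:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mini-corpus standing in for a document store.
docs = [
    "BPE merges frequent character pairs into subword units.",
    "Quantization stores weights in lower precision to shrink models.",
    "Retrieval-augmented generation grounds answers in fetched context.",
]
query = "How does retrieval help generation?"

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)
query_vec = vectorizer.transform([query])

# Rank documents by similarity and use the best match as context.
scores = cosine_similarity(query_vec, doc_vecs)[0]
context = docs[scores.argmax()]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
```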
Currently learning more about:
- LLM evaluation & failure modes
- Tokenizers (BPE, byte-level, unigram; see the sketch after this list)
- Efficient training with small datasets
- Retrieval + generation systems
- Applied ML over theoretical hype
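For the tokenizer item above, this is a minimal BPE training sketch using the Hugging Face `tokenizers` library; the corpus is a throwaway toy and the vocabulary size is arbitrary, purely to show the moving parts:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

# Train a tiny BPE vocabulary on a throwaway corpus.
corpus = ["low lower lowest", "new newer newest", "wide wider widest"]

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(vocab_size=200, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer)

# Frequent fragments (e.g. "low", "est") surface as merged subwords.
print(tokenizer.encode("lowest newest").tokens)
```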
My long-term goal is to build useful, reliable, and explainable AI systems, not just chase benchmarks.