---
title: TorchCode
emoji: 🔥
colorFrom: red
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
---
# 🔥 TorchCode

**Crack the PyTorch interview.** Practice implementing operators and architectures from scratch — the exact skills top ML teams test for.

*Like LeetCode, but for tensors. Self-hosted. Jupyter-based. Instant feedback.*

[![PyTorch](https://img.shields.io/badge/PyTorch-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white)](https://pytorch.org)
[![Jupyter](https://img.shields.io/badge/Jupyter-F37626?style=for-the-badge&logo=jupyter&logoColor=white)](https://jupyter.org)
[![Docker](https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white)](https://www.docker.com)
[![Python](https://img.shields.io/badge/Python_3.11-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://python.org)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow?style=for-the-badge)](LICENSE)

[![GitHub stars](https://img.shields.io/github/stars/duoan/TorchCode?style=social)](https://github.com/duoan/TorchCode)
[![GitHub Container Registry](https://img.shields.io/badge/ghcr.io-TorchCode-blue?style=flat-square&logo=github)](https://ghcr.io/duoan/torchcode)
[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Spaces-TorchCode-blue?style=flat-square)](https://huggingface.co/spaces/duoan/TorchCode)
![Problems](https://img.shields.io/badge/problems-40-orange?style=flat-square)
![GPU](https://img.shields.io/badge/GPU-not%20required-brightgreen?style=flat-square)

[![Star History Chart](https://api.star-history.com/svg?repos=duoan/TorchCode&type=Date)](https://star-history.com/#duoan/TorchCode&Date)
---

## 🎯 Why TorchCode?

Top companies (Meta, Google DeepMind, OpenAI, etc.) expect ML engineers to implement core operations **from memory on a whiteboard**. Reading papers isn't enough — you need to be able to write `softmax`, `LayerNorm`, `MultiHeadAttention`, and full Transformer blocks yourself. TorchCode gives you a **structured practice environment** with:

| | Feature | Description |
|---|---|---|
| 🧩 | **40 curated problems** | The most frequently asked PyTorch interview topics |
| ⚖️ | **Automated judge** | Correctness checks, gradient verification, and timing |
| 🎨 | **Instant feedback** | Colored pass/fail per test case, just like competitive programming |
| 💡 | **Hints when stuck** | Nudges without full spoilers |
| 📖 | **Reference solutions** | Study optimal implementations after your attempt |
| 📊 | **Progress tracking** | What you've solved, best times, and attempt counts |
| 🔄 | **One-click reset** | Toolbar button to reset any notebook back to its blank template — practice the same problem as many times as you want |
| [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](#) | **Open in Colab** | Every notebook has an "Open in Colab" badge + toolbar button — run problems in Google Colab with zero setup |

No cloud. No signup. No GPU needed. Just `make run` — or try it instantly on Hugging Face.

---

## 🚀 Quick Start

### Option 0 — Try it online (zero install)

**[Launch on Hugging Face Spaces](https://huggingface.co/spaces/duoan/TorchCode)** — opens a full JupyterLab environment in your browser. Nothing to install.

Or open any problem directly in Google Colab — every notebook has an [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/duoan/TorchCode/blob/master/templates/01_relu.ipynb) badge.
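For a feel of what you'll actually write once a notebook is open: the Softmax problem, for instance, is all about numerical stability. A minimal sketch (assuming the `my_softmax(x, dim)` signature from the problem set below — this is not the reference solution):

```python
import torch

def my_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Subtract the per-row max before exponentiating: exp() of large
    # logits overflows to inf, but softmax is shift-invariant, so the
    # subtraction changes nothing mathematically while keeping values finite.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = shifted.exp()
    return exp / exp.sum(dim=dim, keepdim=True)

x = torch.tensor([[1000.0, 1001.0, 1002.0]])  # naive exp(x) would overflow
print(my_softmax(x, dim=-1))
```

The judge then checks an attempt like this against hidden test cases, including exactly these extreme-logit inputs.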
### Option 0b — Use the judge in Colab (pip)

In Google Colab, install the judge from PyPI so you can run `check(...)` without cloning the repo:

```bash
!pip install torch-judge
```

Then in a notebook cell:

```python
from torch_judge import check, status, hint, reset_progress

status()       # list all problems and your progress
check("relu")  # run tests for the "relu" task
hint("relu")   # show a hint
```

### Option 1 — Pull the pre-built image (fastest)

```bash
docker run -p 8888:8888 -e PORT=8888 ghcr.io/duoan/torchcode:latest
```

### Option 2 — Build locally

```bash
make run
```

Open **http://localhost:8888** — that's it. Works with both Docker and Podman (auto-detected).

---

## 📋 Problem Set

> **Frequency**: 🔥 = very likely in interviews, ⭐ = commonly asked, 💡 = emerging / differentiator

### 🧱 Fundamentals — "Implement X from scratch"

The bread and butter of ML coding interviews. You'll be asked to write these without `torch.nn`.

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 1 | ReLU | `relu(x)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 🔥 | Activation functions, element-wise ops |
| 2 | Softmax | `my_softmax(x, dim)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 🔥 | Numerical stability, exp/log tricks |
| 16 | Cross-Entropy Loss | `cross_entropy_loss(logits, targets)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 🔥 | Log-softmax, logsumexp trick |
| 17 | Dropout | `MyDropout` (nn.Module) | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 🔥 | Train/eval mode, inverted scaling |
| 18 | Embedding | `MyEmbedding` (nn.Module) | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 🔥 | Lookup table, `weight[indices]` |
| 19 | GELU | `my_gelu(x)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | ⭐ | Gaussian error linear unit, `torch.erf` |
| 20 | Kaiming Init | `kaiming_init(weight)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | ⭐ | `std = sqrt(2/fan_in)`, variance scaling |
| 21 | Gradient Clipping | `clip_grad_norm(params, max_norm)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | ⭐ | Norm-based clipping, direction preservation |
| 31 | Gradient Accumulation | `accumulated_step(model, opt, ...)` | ![Easy](https://img.shields.io/badge/Easy-4CAF50?style=flat-square) | 💡 | Micro-batching, loss scaling |
| 40 | Linear Regression | `LinearRegression` (3 methods) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | Normal equation, GD from scratch, nn.Linear |
| 3 | Linear Layer | `SimpleLinear` (nn.Module) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | `y = xW^T + b`, Kaiming init, `nn.Parameter` |
| 4 | LayerNorm | `my_layer_norm(x, γ, β)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | Normalization, running stats, affine transform |
| 7 | BatchNorm | `my_batch_norm(x, γ, β)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Batch vs layer statistics, train/eval behavior |
| 8 | RMSNorm | `rms_norm(x, weight)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | LLaMA-style norm, simpler than LayerNorm |
| 15 | SwiGLU MLP | `SwiGLUMLP` (nn.Module) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Gated FFN, `SiLU(gate) * up`, LLaMA/Mistral-style |
| 22 | Conv2d | `my_conv2d(x, weight, ...)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | Convolution, unfold, stride/padding |

### 🧠 Attention Mechanisms — The heart of modern ML interviews

If you're interviewing for any role touching LLMs or Transformers, expect at least one of these.

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 23 | Cross-Attention | `MultiHeadCrossAttention` (nn.Module) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Encoder-decoder, Q from decoder, K/V from encoder |
| 5 | Scaled Dot-Product Attention | `scaled_dot_product_attention(Q, K, V)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 🔥 | `softmax(QK^T/√d_k)V`, the foundation of everything |
| 6 | Multi-Head Attention | `MultiHeadAttention` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 🔥 | Parallel heads, split/concat, projection matrices |
| 9 | Causal Self-Attention | `causal_attention(Q, K, V)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 🔥 | Autoregressive masking with `-inf`, GPT-style |
| 10 | Grouped Query Attention | `GroupQueryAttention` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | ⭐ | GQA (LLaMA 2), KV sharing across heads |
| 11 | Sliding Window Attention | `sliding_window_attention(Q, K, V, w)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | ⭐ | Mistral-style local attention, O(n·w) complexity |
| 12 | Linear Attention | `linear_attention(Q, K, V)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Kernel trick, `φ(Q)(φ(K)^T V)`, O(n·d²) |
| 14 | KV Cache Attention | `KVCacheAttention` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 🔥 | Incremental decoding, cache K/V, prefill vs decode |
| 24 | RoPE | `apply_rope(q, k)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 🔥 | Rotary position embedding, relative position via rotation |
| 25 | Flash Attention | `flash_attention(Q, K, V, block_size)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Tiled attention, online softmax, memory-efficient |

### 🏗️ Architecture & Adaptation — Put it all together

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 26 | LoRA | `LoRALinear` (nn.Module) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Low-rank adaptation, frozen base + `BA` update |
| 27 | ViT Patch Embedding | `PatchEmbedding` (nn.Module) | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 💡 | Image → patches → linear projection |
| 13 | GPT-2 Block | `GPT2Block` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | ⭐ | Pre-norm, causal MHA + MLP (4x, GELU), residual connections |
| 28 | Mixture of Experts | `MixtureOfExperts` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | ⭐ | Mixtral-style, top-k routing, expert MLPs |

### ⚙️ Training & Optimization

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 29 | Adam Optimizer | `MyAdam` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Momentum + RMSProp, bias correction |
| 30 | Cosine LR Scheduler | `cosine_lr_schedule(step, ...)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | ⭐ | Linear warmup + cosine annealing |

### 🎯 Inference & Decoding

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 32 | Top-k / Top-p Sampling | `sample_top_k_top_p(logits, ...)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | Nucleus sampling, temperature scaling |
| 33 | Beam Search | `beam_search(log_prob_fn, ...)` | ![Medium](https://img.shields.io/badge/Medium-FF9800?style=flat-square) | 🔥 | Hypothesis expansion, pruning, EOS handling |
| 34 | Speculative Decoding | `speculative_decode(target, draft, ...)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Accept/reject, draft model acceleration |

### 🔬 Advanced — Differentiators

| # | Problem | What You'll Implement | Difficulty | Freq | Key Concepts |
|:---:|---------|----------------------|:----------:|:----:|--------------|
| 35 | BPE Tokenizer | `SimpleBPE` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Byte-pair encoding, merge rules, subword splits |
| 36 | INT8 Quantization | `Int8Linear` (nn.Module) | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Per-channel quantize, scale/zero-point, buffer vs param |
| 37 | DPO Loss | `dpo_loss(chosen, rejected, ...)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Direct preference optimization, alignment training |
| 38 | GRPO Loss | `grpo_loss(logps, rewards, group_ids, eps)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | Group relative policy optimization, RLAIF, within-group normalized advantages |
| 39 | PPO Loss | `ppo_loss(new_logps, old_logps, advantages, clip_ratio)` | ![Hard](https://img.shields.io/badge/Hard-F44336?style=flat-square) | 💡 | PPO clipped surrogate loss, policy gradient, trust region |

---

## ⚙️ How It Works

Each problem has **two** notebooks:

| File | Purpose |
|------|---------|
| `01_relu.ipynb` | ✍️ Blank template — write your code here |
| `01_relu_solution.ipynb` | 📖 Reference solution — check when stuck |

### Workflow

```text
1. Open a blank notebook         → Read the problem description
2. Implement your solution       → Use only basic PyTorch ops
3. Debug freely                  → print(x.shape), check gradients, etc.
4. Run the judge cell            → check("relu")
5. See instant colored feedback  → ✅ pass / ❌ fail per test case
6. Stuck? Get a nudge            → hint("relu")
7. Review the reference solution → 01_relu_solution.ipynb
8. Click 🔄 Reset in the toolbar → Blank slate — practice again!
```

### In-Notebook API

```python
from torch_judge import check, hint, status

check("relu")             # Judge your implementation
hint("causal_attention")  # Get a hint without full spoiler
status()                  # Progress dashboard — solved / attempted / todo
```

---

## 📅 Suggested Study Plan

> **Total: ~12–16 hours spread across 3–4 weeks. Perfect for interview prep on a deadline.**

| Week | Focus | Problems | Time |
|:----:|-------|----------|:----:|
| **1** | 🧱 Foundations | ReLU → Softmax → CE Loss → Dropout → Embedding → GELU → Linear → LayerNorm → BatchNorm → RMSNorm → SwiGLU MLP → Conv2d | 2–3 hrs |
| **2** | 🧠 Attention Deep Dive | SDPA → MHA → Cross-Attn → Causal → GQA → KV Cache → Sliding Window → RoPE → Linear Attn → Flash Attn | 3–4 hrs |
| **3** | 🏗️ Architecture + Training | GPT-2 Block → LoRA → MoE → ViT Patch → Adam → Cosine LR → Grad Clip → Grad Accumulation → Kaiming Init | 3–4 hrs |
| **4** | 🎯 Inference + Advanced | Top-k/p Sampling → Beam Search → Speculative Decoding → BPE → INT8 Quant → DPO Loss → GRPO Loss → PPO Loss + speed run | 3–4 hrs |

---

## 🏛️ Architecture

```text
┌──────────────────────────────────────────────┐
│  Docker / Podman Container                   │
│                                              │
│  JupyterLab (:8888)                          │
│  ├── templates/        (reset on each run)   │
│  ├── solutions/        (reference impl)      │
│  ├── torch_judge/      (auto-grading)        │
│  ├── torchcode-labext  (JLab plugin)         │
│  │     🔄 Reset — restore template           │
│  │     🔗 Colab — open in Colab              │
│  └── PyTorch (CPU), NumPy                    │
│                                              │
│  Judge checks:                               │
│    ✓ Output correctness (allclose)           │
│    ✓ Gradient flow (autograd)                │
│    ✓ Shape consistency                       │
│    ✓ Edge cases & numerical stability        │
└──────────────────────────────────────────────┘
```

Single container. Single port. No database. No frontend framework. No GPU.

## 🛠️ Commands

```bash
make run    # Build & start (http://localhost:8888)
make stop   # Stop the container
make clean  # Stop + remove volumes + reset all progress
```

## 🧩 Adding Your Own Problems

TorchCode uses auto-discovery — just drop a new file in `torch_judge/tasks/`:

```python
TASK = {
    "id": "my_task",
    "title": "My Custom Problem",
    "difficulty": "medium",
    "function_name": "my_function",
    "hint": "Think about broadcasting...",
    "tests": [
        ...
    ],
}
```

No registration needed. The judge picks it up automatically.

---

## 📦 Publishing `torch-judge` to PyPI (maintainers)

The judge is published as a separate package so Colab users can `pip install torch-judge` without cloning the repo.

### Automatic (GitHub Action)

Pushing to `master` after changing the package version triggers [`.github/workflows/pypi-publish.yml`](.github/workflows/pypi-publish.yml), which builds and uploads to PyPI. No git tag is required.

1. **Bump version** in `torch_judge/_version.py` (e.g. `__version__ = "0.1.1"`).
2. **Configure PyPI Trusted Publisher** (one-time):
   - PyPI → Your project **torch-judge** → **Publishing** → **Add a new pending publisher**
   - Owner: `duoan`, Repository: `TorchCode`, Workflow: `pypi-publish.yml`, Environment: (leave empty)
   - Run the workflow once (push a version bump to `master`, or **Actions → Publish torch-judge to PyPI → Run workflow**); PyPI will then link the publisher.
3. **Release**: commit the version bump and `git push origin master`.

Alternatively, if you prefer not to use Trusted Publishing, use an API token: add the repository secret `PYPI_API_TOKEN` (value = `pypi-...` from PyPI) and set `TWINE_USERNAME=__token__` and `TWINE_PASSWORD` from that secret in the workflow.

### Manual

```bash
pip install build twine
python -m build
twine upload dist/*
```

Version is in `torch_judge/_version.py`; bump it before each release.

---

## ❓ FAQ
**Do I need a GPU?**

No. Everything runs on CPU. The problems test correctness and understanding, not throughput.
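For example, the numerical-stability problems are entirely about values, not speed — the logsumexp trick behind the Cross-Entropy problem runs in milliseconds on CPU. An illustrative sketch (assuming the `cross_entropy_loss(logits, targets)` signature from the problem set; not the reference solution):

```python
import torch

def cross_entropy_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # log-softmax via logsumexp: log p_y = z_y - logsumexp(z), computed
    # stably without ever materializing exp(z) for huge logits.
    log_probs = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
    return -log_probs[torch.arange(len(targets)), targets].mean()

logits = torch.randn(4, 10)
targets = torch.tensor([1, 0, 3, 9])
print(cross_entropy_loss(logits, targets))
```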
**Can I keep my solutions between runs?**

Blank templates reset on every `make run` so you practice from scratch. Save your work under a different filename if you want to keep it. You can also click the 🔄 Reset button in the notebook toolbar at any time to restore the blank template without restarting.
**Can I use Google Colab instead?**

Yes! Every notebook has an Open in Colab badge at the top. Click it to open the problem directly in Google Colab — no Docker or local setup needed. You can also use the Colab toolbar button inside JupyterLab.
**How are solutions graded?**

The judge runs your function against multiple test cases using `torch.allclose` for numerical correctness, verifies gradients flow properly via autograd, and checks edge cases specific to each operation.
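A simplified sketch of that kind of check — illustrative only, the real `torch_judge` harness differs, and `judge_one_case` is a hypothetical helper name:

```python
import torch

def judge_one_case(user_fn, ref_fn, x: torch.Tensor, atol: float = 1e-6) -> bool:
    """Compare a user implementation against a reference: values, then gradients."""
    x_user = x.clone().requires_grad_(True)
    x_ref = x.clone().requires_grad_(True)

    out_user, out_ref = user_fn(x_user), ref_fn(x_ref)
    if not torch.allclose(out_user, out_ref, atol=atol):
        return False  # wrong output values

    # Backprop the same upstream gradient through both and compare:
    # this catches implementations that break the autograd graph.
    out_user.sum().backward()
    out_ref.sum().backward()
    return torch.allclose(x_user.grad, x_ref.grad, atol=atol)

print(judge_one_case(lambda x: x.clamp(min=0), torch.relu, torch.randn(8)))
```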
**Who is this for?**

Anyone preparing for ML/AI engineering interviews at top tech companies, or anyone who wants to deeply understand how PyTorch operations work under the hood.
---

## 🤝 Contributors

Thanks to everyone who has contributed to TorchCode — see the [GitHub contributors graph](https://github.com/duoan/TorchCode/graphs/contributors).

---
**Built for engineers who want to deeply understand what they build.**

If this helped your interview prep, consider giving it a ⭐

---

### ☕ Buy Me a Coffee

*Scan to support*