---
license: bsd-2-clause
task_categories:
- text-generation
language:
- en
tags:
- common-lisp
- macros
- code-generation
- program-transformation
pretty_name: Common Lisp Macro Transformations
size_categories:
- n<1K
---

# Common Lisp Macro Transformations

A fine-tuning dataset for training models to generate Common Lisp macros. Each example is a **(before-code) → (macro-definition) → (after-expansion)** triple.

## Idea

Instead of fine-tuning a model to "write code" directly, fine-tune it to generate **CL macros**: code that writes code. The model learns to recognize recurring patterns in the AST and to produce the transformation that abstracts them, rather than the final expanded output.

## Sources

- **Let Over Lambda** — Doug Hoyte's production macro collection (thephoeron/let-over-lambda)
- **On Lisp** — Paul Graham's classic Common Lisp macro utilities

## Dataset Structure

Each record contains:
- `instruction` — Task description with the code pattern to address
- `input` — The "before" code showing the pattern that needs a macro
- `output` — The `defmacro` form that solves it
- `category` — Macro category (capture-management, anaphoric, dispatch, control-flow, DSL, compiler-macro, efficiency, scope)
- `technique` — Comma-separated techniques used (gensym, nested-backquote, dlambda, anaphor, code-walking, symbol-macrolet, defsetf, tagbody-go, once-only, macrolet, compiler-macro, recursive-expansion)
- `complexity` — basic, intermediate, or advanced
- `quality_score` — Classifier score from 0.0 to 1.0
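
As a rough illustration of the record layout above, the sketch below checks one record against the listed fields and allowed values. The `validate_record` helper and `example_record` are hypothetical and not part of the dataset: the record is an invented example built around the well-known `aif` macro from On Lisp, and its `quality_score` is a placeholder. The sketch also assumes that category and technique values are stored exactly as spelled in the list above.

```python
from typing import List

# Field names and allowed values copied from the field list above.
# Assumption: the dataset stores them with exactly this spelling and case.
REQUIRED_FIELDS = {"instruction", "input", "output", "category",
                   "technique", "complexity", "quality_score"}
CATEGORIES = {"capture-management", "anaphoric", "dispatch", "control-flow",
              "DSL", "compiler-macro", "efficiency", "scope"}
TECHNIQUES = {"gensym", "nested-backquote", "dlambda", "anaphor",
              "code-walking", "symbol-macrolet", "defsetf", "tagbody-go",
              "once-only", "macrolet", "compiler-macro", "recursive-expansion"}
COMPLEXITIES = {"basic", "intermediate", "advanced"}


def validate_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record looks well-formed."""
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        return [f"missing fields: {sorted(missing)}"]

    problems = []
    if record["category"] not in CATEGORIES:
        problems.append(f"unknown category: {record['category']!r}")
    # `technique` is a comma-separated string; tolerate spaces after commas.
    unknown = [t.strip() for t in record["technique"].split(",")
               if t.strip() not in TECHNIQUES]
    if unknown:
        problems.append(f"unknown techniques: {unknown}")
    if record["complexity"] not in COMPLEXITIES:
        problems.append(f"unknown complexity: {record['complexity']!r}")
    score = record["quality_score"]
    if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
        problems.append(f"quality_score out of range: {score!r}")
    return problems


# Hypothetical record for illustration only (not an actual row of the dataset).
example_record = {
    "instruction": "Write a macro that binds the test result to `it` "
                   "so both branches can refer to it.",
    "input": "(let ((it (find-user id)))\n"
             "  (if it (greet it) (report-missing id)))",
    "output": "(defmacro aif (test-form then-form &optional else-form)\n"
              "  `(let ((it ,test-form))\n"
              "     (if it ,then-form ,else-form)))",
    "category": "anaphoric",
    "technique": "anaphor",
    "complexity": "basic",
    "quality_score": 0.9,  # placeholder value
}

print(validate_record(example_record))
```

A check like this can be run over every row after loading to catch label drift before training.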

## Categories

| Category | Description | Examples |
|---|---|---|
| capture-management | Hygienic macro writing utilities | defmacro/g!, defmacro!, with-gensyms |
| anaphoric | Deliberate variable capture for conciseness | aif, alambda, alet, aand |
| dispatch | Keyword-based dispatch and inter-closure protocols | dlambda, pandoriclet, with-pandoric |
| control-flow | New evaluation semantics via macros | nlet-tail, condlet, if-match, choose |
| DSL | Domain-specific embedded languages | defunits, _f (generalized setf), dbind |
| compiler-macro | Compile-time optimization of function calls | fformat compiler macro |
| efficiency | Performance-oriented macro techniques | sortf (sorting networks) |
| scope | Lexical scope manipulation | pandoric-eval |

## Use for Fine-tuning

The data is stored as JSONL with `instruction`, `input`, and `output` fields, so it can be loaded directly for fine-tuning:

```python
from datasets import load_dataset
ds = load_dataset("j14i/cl-macros", split="train")
```
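
Once loaded, records usually need to be filtered and turned into prompt/completion pairs. The following is a minimal sketch under stated assumptions: the prompt template, the `select_records` and `to_training_pair` helpers, the `0.8` threshold, and the `sample_rows` placeholders are all hypothetical choices for illustration, not something the dataset prescribes. It works on any iterable of dict-like rows, so it runs here on in-memory samples without downloading anything.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical Alpaca-style template; swap in whatever your trainer expects.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def select_records(records: Iterable[dict],
                   min_score: float = 0.8,
                   complexity: Optional[str] = None) -> List[dict]:
    """Keep rows at or above `min_score`, optionally restricted to one complexity."""
    return [
        r for r in records
        if r["quality_score"] >= min_score
        and (complexity is None or r["complexity"] == complexity)
    ]


def to_training_pair(record: dict) -> Dict[str, str]:
    """Map one record to a prompt/completion pair; the defmacro form is the target."""
    prompt = PROMPT_TEMPLATE.format(
        instruction=record["instruction"], input=record["input"]
    )
    return {"prompt": prompt, "completion": record["output"]}


# Placeholder rows standing in for real dataset records.
sample_rows = [
    {"instruction": "Abstract the repeated pattern.", "input": "(before-code-1)",
     "output": "(defmacro m1 ...)", "complexity": "basic", "quality_score": 0.95},
    {"instruction": "Abstract the repeated pattern.", "input": "(before-code-2)",
     "output": "(defmacro m2 ...)", "complexity": "basic", "quality_score": 0.60},
    {"instruction": "Abstract the repeated pattern.", "input": "(before-code-3)",
     "output": "(defmacro m3 ...)", "complexity": "advanced", "quality_score": 0.85},
]

pairs = [to_training_pair(r) for r in select_records(sample_rows, min_score=0.8)]
print(len(pairs))
```

Because a loaded `datasets.Dataset` yields dict rows when iterated, the same helpers should work on `ds` from the snippet above; for larger data, `ds.filter(...)` and `ds.map(...)` would be the more idiomatic route.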

Target model size: ≤ 30B parameters. The domain is narrow (pattern matching on ASTs and the transformations between them), so a smaller model should suffice.