---
name: exam-sheet
description: >
  Extract structured data from photos of exam sheets and return it via the
  submit_exam tool call. The pipeline renders the data into LaTeX
  deterministically; you never write LaTeX yourself.
allowed-tools: Read, Write, Edit, Bash, Glob, MultiTool
---

# Data Extraction Skill

You receive photos of exam sheets (typically Moroccan lycée math exams, but possibly any structured document with questions and point values). Your job: **extract the content** into a structured JSON object and call the `submit_exam` tool. The pipeline takes care of all LaTeX formatting.

## What to extract

### header

| Field | What goes there | Example |
|---|---|---|
| `top_left` | Page number or identifier | `"1/3"`, `"Page 4"` |
| `school` | School / ministry / institution. Use `\n` for line breaks. **NO Arabic characters**: transliterate to French. | `"Ministère de l'éducation Nationale\nGSCP CASA"` |
| `exam_info` | Exam type, date. Use `\n` for line breaks. | `"Bac-blanc --Maths\n18-05-2018"` |
| `right` | Duration, code, symbol, etc. | `"Durée :3h"` |

### intro_page (OPTIONAL, only when the FIRST photo is a cover page)

A cover page is a standalone first page that contains **no exam questions**, only the exam title, a matière/niveau/durée/coefficient table, a bullet list describing each exercise, and usually a calculator-usage notice.

**Only fill `intro_page` when all of these are true:**

- It is the **first** photo (`image_index == 0`).
- The photo has **no numbered questions** (`1)`, `2)`, `a)`...).
- The photo is clearly introductory / administrative metadata.

Leave `intro_page` unset (null/omitted) for any exam whose first photo already starts with questions.

| Field | Type | Description |
|---|---|---|
| `title` | string | Main heading (e.g. `"DEVOIR SURVEILLÉ 3"`). `""` if none. |
| `subtitle` | string | Optional secondary heading. `""` if none. |
| `info_rows` | array | Key/value pairs from the administrative table. Each item: `{"label": "Matière", "value": "Mathématiques"}`. Empty array if none. |
| `bullets` | array | One item per bullet on the cover page. Each item: `{"description": "Le problème se rapporte à l'analyse", "bareme": "10,75pts"}`. The renderer draws the dotted line and parentheses; do NOT include them in `description`. Leave `bareme` as `""` if the bullet has no point value. Keep inline `$math$` in `description` if present. |
| `footer` | string | Closing notice (e.g. `"L'usage de la calculatrice est autorisé"`). `""` if none. |

**Note on `title`/`subtitle`**: the renderer shares the main header bar (`school`, `exam_info`, …) with the cover page, so the cover does NOT re-display the exam type. Leave `title` and `subtitle` empty unless the cover has genuinely different text that would otherwise be lost.

### sections (array)

Each exercise, partie, or problème is one section, **in document order**.

| Field | Type | Description |
|---|---|---|
| `title` | string | Exactly as printed: `"Exercice1"`, `"Partie II :"`, `"Problème"` |
| `bareme` | string | **Per-exercise** barème, e.g. `"3points"`, `"11points"`. Leave `""` if the barème is per-question instead. |
| `intro` | string | Optional setup paragraph that appears between the title and the first question. Raw text with inline `$math$`. `""` if none. |
| `rows` | array | One entry per question / sub-question / sub-part header / figure, in order. |

### rows (array inside each section)

| Field | Type | Required | Description |
|---|---|---|---|
| `bareme` | string | no | Per-question barème: `"0,5"`, `"0,75"`, `"0,25"`. Empty or omitted if this row has no individual barème. |
| `content` | string | yes | The question text, including inline `$math$`, question numbering (`1) a)`), and sub-part headers (`\textbf{I-} Soit $g$...`). The renderer wraps this in `\textit{}` automatically; do NOT add `\textit{}` yourself. |
| `figure_id` | string | no | If this row is a **figure placeholder** (blank rectangle for hand-drawing), set this to the figure's id (e.g. `"fig1"`). In that case the `content` field is ignored. |
| `figure_width_cm` | number | no | Width of the blank box in cm. Required if `figure_id` is set. Typically 5-8 cm. |

### figures (array)

One entry per figure on the page:

| Field | Type | Description |
|---|---|---|
| `id` | string | Short unique name matching the `figure_id` in a row (e.g. `"fig1"`) |
| `image_index` | integer | 0-based index into the uploaded photos |
| `bbox` | [x1,y1,x2,y2] | Fractional bounding box in the source photo. Used for sizing the blank rectangle. Be precise about the width/height ratio. |

## Critical rules

1. **NO LaTeX document structure.** You don't write `\documentclass`, `\begin{document}`, longtable code, or any preamble. You return structured JSON via the tool call.

2. **Faithfulness.** Transcribe every question, every sub-question, and every barème value exactly as printed. Don't skip, summarize, or rephrase.

3. **Arabic → French.** Translate or transliterate all Arabic text into its French equivalent. Common translations:

   | Arabic | French |
   |---|---|
   | الامتحان الوطني الموحد للبكالوريا | Examen National Unifié du Baccalauréat |
   | الدورة العادية 2025 | Session Normale 2025 |
   | مادة الرياضيات | Mathématiques |
   | مسلك العلوم الفيزيائية | Filière Sciences Physiques |
   | (خيار فرنسية) | (Option Français) |
   | الصفحة / صفحة | Page |
   | الموضوع | Le Sujet |

4. **Math notation.** Use `$...$` for inline math. Keep the original notation: `$\vec{u}(-2;2;0)$` with semicolons (Moroccan convention), `$\lim\limits_{x\to 0^+}$`, etc.

5. **Question numbering in content.** Include the numbering as part of the content string: `"1) a) Vérifier que..."`, `"b) Montrer que..."`. For sub-part headers: `"\\textbf{I-} Soit $g$ la fonction..."`.

6. **Per-exercise vs per-question barème.** Look at the original:
   - If ONE number appears next to the exercise title (e.g. "3points" in the left margin at the title level) → per-exercise. Set `section.bareme = "3points"` and leave every `row.bareme` empty.
   - If EACH question has its own decimal (e.g. "0,5" next to 1)a), "0,75" next to 2)a)) → per-question. Set `section.bareme = ""` and set `row.bareme` on each question row.

7. **Figures.** If there's a graph, curve, geometric figure, or any other visual:
   - Add a row with `figure_id` and `figure_width_cm` at the right position in the flow.
   - Add a matching entry in `figures[]` with the bbox for sizing.
   - The pipeline generates a **blank white rectangle** with a border. The end user draws the figure by hand on the printout.

8. **Figures = EXISTING visual content ONLY.** A figure is something VISUALLY DRAWN on the page (a graph, a geometric diagram, a plotted curve). An instruction like "Tracer la courbe (C)", "Construire le triangle", "Dresser le tableau de variation", or "Recopier la courbe" is NOT a figure; it is a QUESTION asking the student to draw something. Do NOT add a `figure_id` row or a `figures` entry for drawing instructions. Only add figures for content that is already drawn/printed on the source page.

9. **Don't hallucinate.** If something is illegible, mark it as `[illisible]` in the content field.
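
Several of the rules above are mechanical enough to check before calling `submit_exam`. The following is a minimal sketch of such a check, not part of the pipeline: `validate_exam` is a hypothetical helper, the payload is assumed to be a plain Python dict shaped like the tables above, "fractional" is assumed to mean bbox coordinates in the range 0 to 1 with the top-left corner first, and the sample question text is invented for illustration.

```python
import re

# Assumption: these two Unicode blocks are enough to spot untranslated Arabic.
ARABIC = re.compile(r"[\u0600-\u06FF\u0750-\u077F]")


def validate_exam(exam: dict) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    errors = []

    # header: no Arabic characters in any field.
    for field, value in exam.get("header", {}).items():
        if ARABIC.search(str(value)):
            errors.append(f"header.{field}: contains Arabic characters")

    figure_ids = {fig["id"] for fig in exam.get("figures", [])}
    for fig in exam.get("figures", []):
        x1, y1, x2, y2 = fig["bbox"]
        # Assumption: coordinates lie in [0, 1] and (x1, y1) is the top-left corner.
        if not (0 <= x1 < x2 <= 1 and 0 <= y1 < y2 <= 1):
            errors.append(f"figures.{fig['id']}: bbox outside [0, 1] or not ordered")

    used_ids = set()
    for section in exam.get("sections", []):
        title = section.get("title", "")
        rows = section.get("rows", [])
        # Rule 6: the barème is per-exercise or per-question, never both.
        if section.get("bareme") and any(r.get("bareme") for r in rows):
            errors.append(f"{title}: both section and row barème are set")
        for row in rows:
            fid = row.get("figure_id")
            if fid:
                # Rule 7: a figure row needs a width and a matching figures[] entry.
                used_ids.add(fid)
                if fid not in figure_ids:
                    errors.append(f"{title}: figure '{fid}' missing from figures[]")
                if not row.get("figure_width_cm"):
                    errors.append(f"{title}: figure '{fid}' has no figure_width_cm")
            elif not row.get("content"):
                errors.append(f"{title}: row has neither content nor figure_id")

    for fid in sorted(figure_ids - used_ids):
        errors.append(f"figures.{fid}: not referenced by any row")
    return errors


# Deliberately faulty sample payload (invented data).
exam = {
    "header": {"top_left": "1/3", "school": "GSCP CASA",
               "exam_info": "Bac-blanc --Maths\n18-05-2018", "right": "Durée :3h"},
    "sections": [{
        "title": "Exercice1", "bareme": "3points", "intro": "",
        "rows": [
            {"content": "1) a) Vérifier que $g(1)=0$", "bareme": "0,5"},
            {"figure_id": "fig1"},
        ],
    }],
    "figures": [],
}
for problem in validate_exam(exam):
    print(problem)
```

On this sample the check reports three problems: the section mixes a per-exercise barème with a per-question one, `fig1` has no entry in `figures[]`, and its row lacks `figure_width_cm`. Rules that depend on the source photo, such as faithfulness or telling a printed figure from a drawing instruction, cannot be checked this way.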