ex-sheet / skill.md
Salem Lahlou
metadata
name: exam-sheet
description: >
  Extract structured data from photos of exam sheets and return it via the
  submit_exam tool call. The pipeline renders the data into LaTeX
  deterministically — you never write LaTeX yourself.
allowed-tools: Read, Write, Edit, Bash, Glob, MultiTool

Data Extraction Skill

You receive photos of exam sheets (typically Moroccan lycée math exams, but possibly any structured document with questions and point values).

Your job: extract the content into a structured JSON object and call the submit_exam tool. The pipeline takes care of all LaTeX formatting.

What to extract

header

| Field | What goes there | Example |
| --- | --- | --- |
| top_left | Page number or identifier | "1/3", "Page 4" |
| school | School / ministry / institution. Use \n for line breaks. NO Arabic characters: translate to French. | "Ministère de l'éducation Nationale\nGSCP CASA" |
| exam_info | Exam type, date. Use \n for line breaks. | "Bac-blanc --Maths\n18-05-2018" |
| right | Duration, code, symbol, etc. | "Durée :3h" |
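
As an illustration, a header assembled from the example values in the table might look like the sketch below. The dict simply mirrors the four fields above; building it in Python and printing it as JSON is for demonstration only and is not part of the pipeline.

```python
import json

# Hypothetical header object built from the example values in the table.
header = {
    "top_left": "1/3",
    "school": "Ministère de l'éducation Nationale\nGSCP CASA",  # "\n" marks a line break
    "exam_info": "Bac-blanc --Maths\n18-05-2018",
    "right": "Durée :3h",
}

# ensure_ascii=False keeps accented French characters readable in the output.
header_json = json.dumps(header, ensure_ascii=False, indent=2)
print(header_json)
```

In the serialized JSON, each line break appears as the two-character escape `\n` inside the string value.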

intro_page (OPTIONAL, only when the FIRST photo is a cover page)

A cover page is a standalone first page that contains no exam questions — just the exam title, a matière/niveau/durée/coefficient table, a bullet list describing each exercise, and usually a calculator-usage notice.

Only fill intro_page when all of these are true:

  • It is the first photo (image_index == 0).
  • The photo has no numbered questions (1), 2), a)...).
  • The photo is clearly introductory / administrative metadata.

Leave intro_page unset (null/omitted) for any exam whose first photo already starts with questions.
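
The three conditions above amount to a simple conjunction. A minimal sketch, assuming the two boolean inputs are judgments already made while reading the photo (the function name and parameters are hypothetical, not part of the pipeline):

```python
def should_fill_intro_page(image_index: int,
                           has_numbered_questions: bool,
                           is_administrative: bool) -> bool:
    """Return True only when all three cover-page conditions hold."""
    return image_index == 0 and not has_numbered_questions and is_administrative


# A first photo with no questions and only administrative metadata is a cover page.
print(should_fill_intro_page(0, has_numbered_questions=False, is_administrative=True))
# A first photo that already starts with questions is not.
print(should_fill_intro_page(0, has_numbered_questions=True, is_administrative=False))
```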

| Field | Type | Description |
| --- | --- | --- |
| title | string | Main heading (e.g. "DEVOIR SURVEILLÉ 3"). "" if none. |
| subtitle | string | Optional secondary heading. "" if none. |
| info_rows | array | Key/value pairs from the administrative table. Each item: {"label": "Matière", "value": "Mathématiques"}. Empty array if none. |
| bullets | array | One item per bullet on the cover page. Each item: {"description": "Le problème se rapporte à l'analyse", "bareme": "10,75pts"}. The renderer draws the dotted line and parentheses; do NOT include them in description. Leave bareme as "" if the bullet has no point value. Keep inline $math$ in description if present. |
| footer | string | Closing notice (e.g. "L'usage de la calculatrice est autorisé"). "" if none. |

Note on title/subtitle: the renderer shares the main header bar (school, exam_info, …) with the cover page, so the cover does NOT re-display the exam type. Leave title and subtitle empty unless the cover has genuinely different text that would otherwise be lost.
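
To illustrate the bullets field, the sketch below separates a transcribed cover-page bullet into description and bareme, dropping the leader dots and parentheses that the renderer redraws. The helper `split_bullet` and the raw line layout (leader dots followed by a final parenthesised barème) are assumptions made for this example only.

```python
import re


def split_bullet(raw: str) -> dict:
    """Split one transcribed cover-page bullet into description and barème.

    Assumes the barème, when present, is the final parenthesised group,
    preceded by an optional run of leader dots.
    """
    match = re.match(r"^(.*?)[\s.…]*\(([^()]*)\)\s*$", raw)
    if match:
        return {"description": match.group(1).strip(), "bareme": match.group(2).strip()}
    # No trailing parentheses: keep the text, drop any leader dots, no barème.
    return {"description": raw.rstrip(" .…"), "bareme": ""}


intro_page = {
    "title": "",      # left empty: the shared header bar already shows the exam type
    "subtitle": "",
    "info_rows": [{"label": "Matière", "value": "Mathématiques"}],
    "bullets": [split_bullet("Le problème se rapporte à l'analyse ........ (10,75pts)")],
    "footer": "L'usage de la calculatrice est autorisé",
}
print(intro_page["bullets"])
```

The heuristic would misread a bullet whose description itself ends in parentheses and has no barème, so it is only a starting point.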

sections (array)

Each exercise, partie, or problème is one section, in document order.

| Field | Type | Description |
| --- | --- | --- |
| title | string | Exactly as printed: "Exercice1", "Partie II :", "Problème" |
| bareme | string | Per-exercise barème, e.g. "3points", "11points". Leave "" if barème is per-question instead. |
| intro | string | Optional setup paragraph that appears between the title and the first question. Raw text with inline $math$. "" if none. |
| rows | array | One entry per question / sub-question / sub-part header / figure, in order. |

rows (array inside each section)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| bareme | string | no | Per-question barème: "0,5", "0,75", "0,25". Empty or omitted if this row has no individual barème. |
| content | string | yes | The question text, with inline $math$, question numbering (1) a)), sub-part headers (\textbf{I-} Soit $g$...). The renderer wraps this in \textit{} automatically; do NOT add \textit{} yourself. |
| figure_id | string | no | If this row is a figure placeholder (blank rectangle for hand-drawing), set this to the figure's id (e.g. "fig1"). The content field is ignored. |
| figure_width_cm | number | no | Width of the blank box in cm. Required if figure_id is set. Typically 5-8 cm. |
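
A section with per-question barèmes, a sub-part header, and a figure placeholder might be assembled as in the sketch below. The row texts reuse the examples from this document; passing an empty content string on the figure row is an assumption (content is marked required but is ignored for figure rows). The sketch also shows that a LaTeX command such as \textbf carries one backslash in the string value and two in the serialized JSON.

```python
import json

section = {
    "title": "Exercice1",
    "bareme": "",  # per-question barème, so the section-level field stays empty
    "intro": "",
    "rows": [
        {"bareme": "0,5", "content": "1) a) Vérifier que..."},
        {"bareme": "0,75", "content": "b) Montrer que..."},
        # One backslash in the actual string; Python source needs "\\" to write it.
        {"content": "\\textbf{I-} Soit $g$ la fonction..."},
        # Figure placeholder row; empty content is an assumption (see lead-in).
        {"content": "", "figure_id": "fig1", "figure_width_cm": 6},
    ],
}

payload = json.dumps(section, ensure_ascii=False)
print(payload)  # the sub-part header appears as \\textbf{I-} in the JSON text
```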

figures (array)

One entry per figure printed on the page:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Short unique name matching the figure_id in a row (e.g. "fig1") |
| image_index | integer | 0-based index into the uploaded photos |
| bbox | [x1,y1,x2,y2] | Fractional bounding box in the source photo. Used for sizing the blank rectangle. Be precise about the width/height ratio. |
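
The width/height ratio matters because the blank rectangle's height presumably follows from it. The sketch below shows one way a renderer could derive the height from figure_width_cm, assuming bbox coordinates are fractions (0 to 1) of the photo's width and height; the function name and the pixel-dimension parameters are hypothetical, and the actual pipeline may size the box differently.

```python
def box_height_cm(bbox, image_width_px, image_height_px, figure_width_cm):
    """Height of the blank box that preserves the figure's aspect ratio."""
    x1, y1, x2, y2 = bbox
    # Convert the fractional extents to pixels so the ratio reflects the photo's shape.
    width_px = (x2 - x1) * image_width_px
    height_px = (y2 - y1) * image_height_px
    if width_px <= 0 or height_px <= 0:
        raise ValueError("bbox must satisfy x1 < x2 and y1 < y2")
    return figure_width_cm * height_px / width_px


# A figure spanning half the width and 30% of the height of a 1000x2000 px photo.
print(box_height_cm([0.1, 0.2, 0.6, 0.5], 1000, 2000, figure_width_cm=6))
```

An error of a few percent in the bbox therefore stretches or squashes the box by the same proportion.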

Critical rules

  1. NO LaTeX document structure. You don't write \documentclass, \begin{document}, longtable code, or any preamble. You return structured JSON via the tool call.

  2. Faithfulness. Transcribe every question, every sub-question, every barème value exactly as printed. Don't skip, summarize, or rephrase.

  3. Arabic → French. Translate all Arabic text into its French equivalent. Common translations:

    | Arabic | French |
    | --- | --- |
    | الامتحان الوطني الموحد للبكالوريا | Examen National Unifié du Baccalauréat |
    | الدورة العادية 2025 | Session Normale 2025 |
    | مادة الرياضيات | Mathématiques |
    | مسلك العلوم الفيزيائية | Filière Sciences Physiques |
    | (خيار فرنسية) | (Option Français) |
    | الصفحة / صفحة | Page |
    | الموضوع | Le Sujet |

  4. Math notation. Use $...$ for inline math. Keep the original notation: $\vec{u}(-2;2;0)$ with semicolons (Moroccan convention), $\lim\limits_{x\to 0^+}$, etc.

  5. Question numbering in content. Include the numbering as part of the content string: "1) a) Vérifier que...", "b) Montrer que...". For sub-part headers: "\\textbf{I-} Soit $g$ la fonction...".

  6. Per-exercise vs per-question barème. Look at the original:

    • If ONE number appears next to the exercise title (e.g. "3points" on the left margin at the title level) → per-exercise. Set section.bareme = "3points", leave all row.bareme empty.
    • If EACH question has its own decimal (e.g. "0,5" next to 1)a), "0,75" next to 2)a)) → per-question. Set section.bareme = "", set row.bareme on each question row.

  7. Figures. If there's a graph, curve, geometric figure, or any visual:

    • Add a row with figure_id and figure_width_cm at the right position in the flow.
    • Add a matching entry in figures[] with the bbox for sizing.
    • The pipeline generates a blank white rectangle with a border. The end user draws the figure by hand on the printout.

  8. Figures = EXISTING visual content ONLY. A figure is something VISUALLY DRAWN on the page (a graph, a geometric diagram, a plotted curve). An instruction like "Tracer la courbe (C)", "Construire le triangle", "Dresser le tableau de variation", "Recopier la courbe" is NOT a figure; it is a QUESTION asking the student to draw something. Do NOT add a figure_id row or a figures entry for drawing instructions. Only add figures for content that is already drawn/printed on the source page.

  9. Don't hallucinate. If something is illegible, note it as [illisible] in the content field.
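
Rules 6 and 7 can be checked mechanically before calling the tool. The following is a minimal sketch of such a check, assuming the payload is a dict with top-level "sections" and "figures" keys as described above; `check_exam` is a hypothetical helper, not something the pipeline provides.

```python
def check_exam(exam: dict) -> list[str]:
    """Return a list of human-readable problems found in an extracted exam."""
    problems = []
    figure_ids = {fig["id"] for fig in exam.get("figures", [])}
    used_ids = set()

    for section in exam.get("sections", []):
        rows = section.get("rows", [])
        # Rule 6: barème is either per-exercise or per-question, never both.
        if section.get("bareme") and any(row.get("bareme") for row in rows):
            problems.append(f"{section.get('title')!r}: section and row barème both set")
        for row in rows:
            fig_id = row.get("figure_id")
            if not fig_id:
                continue
            used_ids.add(fig_id)
            # Rule 7: every figure row needs a width and a matching figures[] entry.
            if "figure_width_cm" not in row:
                problems.append(f"row {fig_id!r}: figure_width_cm missing")
            if fig_id not in figure_ids:
                problems.append(f"row {fig_id!r}: no matching entry in figures")

    for unused in sorted(figure_ids - used_ids):
        problems.append(f"figure {unused!r}: never referenced by a row")
    return problems


exam = {
    "sections": [{
        "title": "Exercice1",
        "bareme": "3points",
        "rows": [
            {"bareme": "0,5", "content": "1) a) Vérifier que..."},
            {"content": "", "figure_id": "fig1", "figure_width_cm": 6},
        ],
    }],
    "figures": [{"id": "fig2", "image_index": 0, "bbox": [0.1, 0.1, 0.5, 0.4]}],
}
for problem in check_exam(exam):
    print(problem)
```

The check covers structure only; faithfulness (rule 2) and the distinction between printed figures and drawing instructions (rule 8) still depend on reading the photo carefully.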