<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <title>document-ocr</title>
    <link rel="stylesheet" href="/static/style.css" />
  </head>
  <body>
    <header>
      <div class="title">
        <span class="logo">📄</span>
        <h1>document-ocr</h1>
        <span class="badge" id="badge">idle</span>
      </div>
      <div class="meta" id="models">SIE: <code>...</code></div>
      <div class="meta" id="sie-state">checking SIE...</div>
      <div class="cta-row">
        <a class="cta" href="https://github.com/superlinked/brave-new-demos/tree/main/document-ocr" target="_blank" rel="noopener">
          <span>↗</span> Source on GitHub
        </a>
        <a class="cta" href="https://github.com/superlinked/sie" target="_blank" rel="noopener">
          <span>★</span> SIE repo
        </a>
      </div>
    </header>

    <section class="hero">
      <div class="hero-text">
        <p>
          OCR is rarely a single-model problem. This demo runs three model
          classes through <strong>one SIE server</strong>: a VLM-OCR model
          transcribes the document into Markdown, a fine-tuned Donut emits a
          JSON tree directly, and a zero-shot NER model (GLiNER) pulls typed
          fields out of the recognition output. Pick a sample on the left,
          swap any of the three models in the dropdowns, and watch SIE
          hot-swap them with <em>one identifier change</em>.
        </p>
      </div>
      <div class="hero-diagram">
        <div class="diagram">
          <div class="diagram-input">image</div>
          <div class="diagram-arrow">↓</div>
          <div class="diagram-server">one SIE server · <code>client.extract(model_id, item)</code></div>
          <div class="diagram-arrows">
            <span>↓</span><span>↓</span><span>↓</span>
          </div>
          <div class="diagram-models">
            <div class="diagram-box diagram-recognition">VLM-OCR<br><span>(LightOnOCR-2-1B, PaddleOCR-VL, GLM-OCR)</span></div>
            <div class="diagram-box diagram-structured">Donut<br><span>(end-to-end JSON)</span></div>
            <div class="diagram-box diagram-ner">GLiNER<br><span>(zero-shot NER)</span></div>
          </div>
        </div>
      </div>
    </section>

    <section class="why-sie">
      <h3>Why SIE</h3>
      <p>
        Three different model architectures (a vision-language model, a
        fine-tuned encoder-decoder, a span-based NER), one inference engine,
        one HTTP API, one SDK call. Without SIE, this demo would be three
        separate inference services with three SDKs, three auth flows, and
        three rate limits. With SIE, swap a string in <code>client.extract(...)</code>
        and the underlying architecture changes.
      </p>
    </section>

    <section class="tour">
      <h3>Try these moments</h3>
      <ol class="tour-list">
        <li>
          <strong>Click any sample on the left.</strong> All three models run
          in one pipeline. The footer prints per-stage timings as each stage
          finishes.
        </li>
        <li>
          <strong>Open "See the SIE call"</strong> in any panel, then change
          the model in the dropdown above. The snippet updates to show the one
          parameter that changed. That is the swap-a-string pitch in action.
        </li>
        <li>
          <strong>Click the receipt, then the multi-column page.</strong>
          Donut (fine-tuned on receipts) dominates the first; recognition
          dominates the second. Same pipeline, different model wins.
        </li>
        <li>
          <strong>Switch NER from <code>gliner_multi</code> to
          <code>gliner_large</code>.</strong> Same labels, same input text,
          different confidence scores. Model quality is a single dropdown
          away.
        </li>
      </ol>
    </section>

    <main>
      <section class="panel" id="panel-events">
        <header><h2>Sample documents</h2></header>
        <div class="meta-row">
          <label class="model-pick">
            <span class="dropdown-label">Recognition</span>
            <select id="select-recognition"></select>
          </label>
          <label class="model-pick">
            <span class="dropdown-label">Structured</span>
            <select id="select-structured"></select>
          </label>
          <label class="model-pick">
            <span class="dropdown-label">NER</span>
            <select id="select-ner"></select>
          </label>
        </div>
        <div class="list" id="events">loading...</div>
      </section>

      <section class="panel" id="panel-recognition">
        <header>
          <h2>Recognition (Markdown)</h2>
          <span class="hint" id="recognition-meta"></span>
        </header>
        <details class="sdk-snippet">
          <summary>See the SIE call</summary>
          <pre><code id="snippet-recognition">// pick a recognition model in the dropdown</code></pre>
        </details>
        <div class="markdown" id="recognition">
          <p class="hint">Click a sample on the left.</p>
        </div>
      </section>

      <section class="panel" id="panel-extraction">
        <header>
          <h2>Extraction</h2>
          <span class="hint" id="extraction-meta"></span>
        </header>
        <details class="sdk-snippet">
          <summary>See the SIE calls</summary>
          <pre><code id="snippet-structured">// structured (Donut)</code>

<code id="snippet-ner">// NER (GLiNER)</code></pre>
        </details>
        <div class="extraction" id="extraction">
          <p class="hint">Typed fields will appear here.</p>
        </div>
      </section>
    </main>

    <footer>
      <span id="footer">SIE on <code id="sie-url">http://localhost:8080</code></span>
      <span id="timings"></span>
    </footer>

    <script src="/static/app.js"></script>
  </body>
</html>