OpenTransformer committed on
Commit
2d6d1ff
·
verified ·
1 Parent(s): 3bb7c1d

Upload inference_results/step13662915_20260324_0401.md with huggingface_hub

inference_results/step13662915_20260324_0401.md ADDED
@@ -0,0 +1,48 @@
+ # Inference Results — Step 13662915
+ **Date:** 2026-03-24 03:49-04:00 UTC
+ **Checkpoint:** pretrain_delta_step13662915.pt
+ **Model:** 698.39M params | d=1024 | 24L | 16H | rank=128 | ratio=2.0x
+ **Tokenizer:** deepseek-ai/DeepSeek-V3.2 (VOCAB=128815)
+ **Device:** CPU (GPU occupied by training)
+ **Training Progress:** 35.1% (12.278B/35B tokens) | Loss: 7.794
+
+ ---
+
+ ## AR Mode (Autoregressive)
+ **Prompt:** "Antibiotic resistance emerges in bacterial populations through mechanisms of horizontal gene transfer including conjugation transduction and transformation which allow resistance genes to spread rapidly across species boundaries"
+
+ **Output:** . [ 3 ] [ 4 ] [ 5 ] [ 6 ] [ 7 ] [ 8 ] [ 9 ] At the outbreak of the First World War , people with strong opposition to having cemented Saint-Georges as a mulatto for his wife, Nanon landed in Bordeaux . [ 10 ] On the first day of the election, most supporters had at least one o'clock on Wednesday and then there was a mighty explosion that allowed them to join the radical left behind. The next year, he joined a new state championship built on that meant, but it is not only an immigration rather than an immigration. [ 11 ] After its premiere in August 1920, Saint-Georges's account was fast-tracked under which Saint-Georges lived. [ 12 ] [ 13 ] [ 14 ] His position was given to Antoine Dauvergne instead. [ 15 ] At last, he was accused by Stanislas-Marie Maillard , who then
+
+ **Stats:** 172.50s | 200 tokens | 1.2 tok/s
+ **Notes:** Mostly grammatical English sentences, though several are semantically odd. The output ignores the antibiotic-resistance prompt and drifts to French Revolution-era content about Saint-Georges. Wikipedia-style bracketed reference markers ([3] through [15]) are present. Sentence structure is coherent, but there is no topical relevance.
+
+ ---
+
+ ## SAT-Fixed Mode (Span-Aware Transformer, fixed spans)
+ **Prompt:** "Magnetars are neutron stars possessing extraordinarily powerful magnetic fields roughly one quadrillion times stronger than Earths magnetic field which produce intense bursts of X-rays and gamma rays detectable across galactic distances"
+
+ **Output:** ,es .: der was5en.'ensusag1 siisetnt'erche49 en Unungs du und den estif a zum" Er {}lich de au sch une y France " iniree ver nur; Tungen Diece das wird manliche des14ver0 by d als Z (ig V NPR Madrid un–% seW In eineund et lein about Remann'sies im-ß fürers L àst Jagu di lesreslren die Bina ' Smentven00 ton Df M : euraumor your reu7ischen ismier his ilapp the morei ,26antan Mediate Im23g ] auf as27 mitä Iz Sune Wause if for Worldimat8 einw zu Theonars areKre20 her., [ « o not laA Überurkestands auchigenica best dem A ofelze andleta war4 Ver
+
+ **Stats:** 74.30s | 200 tokens | 2.7 tok/s
+ **Notes:** Mixed-language gibberish (German, English, and French fragments). SAT mode still produces incoherent output at this training stage. It is roughly 2.3x faster than AR mode (74.30s vs. 172.50s for 200 tokens) due to parallel span generation.
+
+ ---
+
+ ## SAT-Variable Mode (Span-Aware Transformer, variable spans)
+ **Prompt:** "Bioluminescent organisms in the deep ocean produce light through enzymatic reactions involving luciferin and luciferase serving functions ranging from predator evasion to mate attraction in perpetually dark marine environments"
+
+ **Output:** elm de es.5ungent'le' F a: . USens are denre en Vagus Teräire zum desum s Franceieren, of (ce deret Die le duktrbfMilnine man An me für il J Aus einel Musicert estu d la="'s Leungs con wasst('ien Les(ze warherlete [isch');a l zute that00 -0 qu les ,Express un sichigerh{ol- withem durch] surun Mur9 The iant CNN after" not über queonarw N sie nichtrent er auf se L orät202 S atigen they Cke dieischeniert ↑ern it24v be Newsir on mit Quichtinaraung11 ist : à La And von English Z the likegeier50ah mehr Shows have v dé19 and3 morein! isde sog es toid can in width cheock from
+
+ **Stats:** 72.65s | 200 tokens | 2.8 tok/s
+ **Notes:** Mixed-language gibberish similar to SAT-Fixed. Variable spans do not improve coherence at this stage. SAT modes need significantly more training before producing coherent output.
+
+ ---
+
+ ## Summary
+ | Mode | tok/s | Coherence | Topic Relevance |
+ |------|-------|-----------|-----------------|
+ | AR | 1.2 | Medium (grammatical sentences) | Low (off-topic drift) |
+ | SAT-fixed | 2.7 | Very Low (gibberish) | None |
+ | SAT-var | 2.8 | Very Low (gibberish) | None |
+
+ AR mode continues to show improvement, producing grammatical English sentences. SAT modes remain incoherent, which is expected at 35.1% training progress. In AR mode the model reproduces Wikipedia-style content and formatting (named entities, bracketed reference markers) but lacks prompt-following ability.
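
The tok/s figures in the report can be cross-checked from the elapsed times and token counts in each **Stats** line. The following is a minimal sketch of that arithmetic, not the script that produced the report: the `RUNS` table and the helper names `tokens_per_second` and `summarize` are hypothetical, and the numbers are copied from the Stats lines.

```python
# Elapsed seconds and generated tokens, copied from the report's Stats lines.
# The dictionary layout and function names are illustrative assumptions.
RUNS = {
    "AR": (172.50, 200),
    "SAT-fixed": (74.30, 200),
    "SAT-var": (72.65, 200),
}


def tokens_per_second(seconds, tokens):
    """Throughput as generated tokens divided by wall-clock seconds."""
    return tokens / seconds


def summarize(runs, baseline="AR"):
    """Return {mode: (tok/s rounded to 1 dp, speedup vs baseline rounded to 2 dp)}."""
    base_rate = tokens_per_second(*runs[baseline])
    rows = {}
    for mode, (seconds, tokens) in runs.items():
        rate = tokens_per_second(seconds, tokens)
        rows[mode] = (round(rate, 1), round(rate / base_rate, 2))
    return rows


if __name__ == "__main__":
    for mode, (rate, speedup) in summarize(RUNS).items():
        print(f"{mode}: {rate} tok/s, {speedup}x vs AR")
```

Because every run generated the same 200 tokens, the speedup relative to AR reduces to the ratio of elapsed times, so the comparison depends only on the measured wall-clock durations.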