The Wikipedia KenLM model is not quantized; the HPLT model was quantized with the -a 22 -q 8 -b 8 options.
50% of the HPLT dataset was used:
=== 1/5 Counting and sorting n-grams ===
Reading /home/sarpba/2TB/hplt_magyar_korpusz_felezett.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Unigram tokens 13847057702 types 96225600
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:1154707200 2:29655070720 3:55603257344 4:88965210112 5:129740939264 6:177930420224
Statistics:
1 96225600 D1=0.764073 D2=1.03791 D3+=1.27263
2 1265844926 D1=0.806391 D2=1.0989 D3+=1.30792
3 1575497442/3590763606 D1=0.874748 D2=1.16937 D3+=1.29281
4 877443852/5312014485 D1=0.925102 D2=1.26541 D3+=1.29521
5 838676466/5932092458 D1=0.953571 D2=1.35662 D3+=1.27285
6 439341856/6030728712 D1=0.585162 D2=1.55164 D3+=1.71841
Memory estimate for binary LM:
type GB
probing 111 assuming -p 1.5
probing 137 assuming -r models -p 1.5
trie 68 without quantization
trie 42 assuming -q 8 -b 8 quantization
trie 57 assuming -a 22 array pointer compression
trie 31 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:1154707200 2:20253518816 3:31509948840 4:21058652448 5:23482941048 6:14058939392
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
*******#############################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:1154707200 2:20253518816 3:31509948840 4:21058652448 5:23482941048 6:14058939392
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Name:lmplz VmPeak:473506760 kB VmRSS:0 kB RSSMax:462636584 kB user:18683.3 sys:7247.18 CPU:25930.5 real:21271.2
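The figures in the log above are easier to read once converted. The following is a minimal Python sketch, not part of the original training pipeline, that derives two things from the logged values: the share of n-grams kept per order, and the peak memory and run time in larger units. It rests on two assumptions: that a `kept/total` pair in the Statistics block means "n-grams after pruning / before pruning", and that `kB` means 1024 bytes. The helper names `kept_fractions` and `parse_summary` are hypothetical.

```python
# Values copied verbatim from the lmplz log above.
STATS = """\
1 96225600 D1=0.764073 D2=1.03791 D3+=1.27263
2 1265844926 D1=0.806391 D2=1.0989 D3+=1.30792
3 1575497442/3590763606 D1=0.874748 D2=1.16937 D3+=1.29281
4 877443852/5312014485 D1=0.925102 D2=1.26541 D3+=1.29521
5 838676466/5932092458 D1=0.953571 D2=1.35662 D3+=1.27285
6 439341856/6030728712 D1=0.585162 D2=1.55164 D3+=1.71841
"""
SUMMARY = ("Name:lmplz VmPeak:473506760 kB VmRSS:0 kB RSSMax:462636584 kB "
           "user:18683.3 sys:7247.18 CPU:25930.5 real:21271.2")


def kept_fractions(stats):
    """Map n-gram order -> (kept, total, kept/total).

    Assumption: "a/b" means a n-grams remain out of b before pruning;
    a single number means that order was not pruned.
    """
    result = {}
    for line in stats.splitlines():
        fields = line.split()
        order, counts = int(fields[0]), fields[1]
        if "/" in counts:
            kept, total = (int(x) for x in counts.split("/"))
        else:
            kept = total = int(counts)
        result[order] = (kept, total, kept / total)
    return result


def parse_summary(summary):
    """Split the final lmplz line into memory (kB) and time (seconds) fields."""
    tokens = summary.split()
    memory_kb, seconds = {}, {}
    for i, token in enumerate(tokens):
        if ":" not in token:
            continue  # a bare "kB" unit token
        key, value = token.split(":", 1)
        if i + 1 < len(tokens) and tokens[i + 1] == "kB":
            memory_kb[key] = int(value)
        elif key != "Name":
            seconds[key] = float(value)
    return memory_kb, seconds


for order, (kept, total, frac) in kept_fractions(STATS).items():
    print(f"{order}-grams: {kept} of {total} kept ({frac:.1%})")

memory_kb, seconds = parse_summary(SUMMARY)
for name, kb in memory_kb.items():
    # Assumption: 1 kB = 1024 bytes, so kB / 1024**2 gives GiB.
    print(f"{name}: {kb / 1024**2:.1f} GiB")
print(f"wall-clock time: {seconds['real'] / 3600:.2f} h")
```

Under these assumptions, training peaked at roughly 450 GiB of virtual memory and took a little under six hours of wall-clock time. That is far more than the 31 GB estimated in the log for the quantized trie binary.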