Sync from GitHub: 080bffac0f304b7f3781266ca4fb761c2974f8a6
Changed file: grammars/concepts.gbnf (+8 -10)
@@ -28,14 +28,12 @@
 
 root ::= item ("," ws item){0,7}
 item ::= word (ws word){0,3}
-# Word:
-#
-#
-#
-#
-# ("thatGreenhouseCarbon")
-#
-
-# Falcon3-10B-1.58bit.
-word ::= [a-zA-Z] [a-z0-9-]{2,19}
+# Word: any letter + alphanumerics + hyphens. Mid-word capitals
+# are permitted because legitimate concepts frequently contain them:
+# acronyms (RSA, CPU, DNA, ATP, NADPH), patronymic proper nouns
+# (McDonalds, MacPhearson), brand/product names (iPhone, eBay).
+# Length-cap of 20 chars + the defensive parser's word-count gate
+# handle the run-7 token-jam pollution ("thatGreenhouseCarbon")
+# without sacrificing legitimate mid-capital content.
+word ::= [a-zA-Z] [a-zA-Z0-9-]{2,19}
 ws ::= " "
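To see what the relaxed `word` rule changes in practice, the two versions of the grammar can be approximated with Python regular expressions. This is only a sketch: the regexes are a hand transcription of the three rules shown in the diff, the helper names `build_root` and `accepts` are hypothetical, and the actual GBNF engine may differ in details from `re`.

```python
import re

# Hand-written regex transcriptions of the old and new `word` rules.
# Assumption: these approximate the GBNF semantics; they are not the real parser.
OLD_WORD = r"[a-zA-Z][a-z0-9-]{2,19}"      # capitals allowed only in first position
NEW_WORD = r"[a-zA-Z][a-zA-Z0-9-]{2,19}"   # capitals allowed anywhere


def build_root(word: str) -> re.Pattern:
    """Compile a regex for `root` using the given `word` rule (hypothetical helper)."""
    # item ::= word (ws word){0,3}   -> one to four space-separated words
    item = rf"{word}(?: {word}){{0,3}}"
    # root ::= item ("," ws item){0,7} -> one to eight items joined by ", "
    return re.compile(rf"{item}(?:, {item}){{0,7}}")


def accepts(word_rule: str, text: str) -> bool:
    """Return True if the whole string matches `root` under the given word rule."""
    return build_root(word_rule).fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["carbon cycle, greenhouse gas", "RSA, iPhone", "thatGreenhouseCarbon"]:
        print(sample, accepts(OLD_WORD, sample), accepts(NEW_WORD, sample))
```

Under this transcription, "RSA, iPhone" is rejected by the old rule and accepted by the new one. The jammed token "thatGreenhouseCarbon" is exactly 20 characters long, so it also fits the new `word` rule; the length cap on its own would not exclude it, which is presumably why the comment also relies on the parser's word-count gate.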