Commit 539df93 (verified), committed by drzo
1 Parent(s): f34c19e

feat: initial unicosys hypergraph knowledge model (34.7M params, 203K nodes, 15K edges)

README.md ADDED
@@ -0,0 +1,85 @@
+ ---
+ license: mit
+ tags:
+ - knowledge-graph
+ - hypergraph
+ - legal-evidence
+ - graph-neural-network
+ - unicosys
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: graph-ml
+ ---
+
+ # Unicosys Hypergraph Knowledge Model
+
+ A trainable knowledge graph embedding model encoding the unified evidence
+ hypergraph for Case 2025-137857.
+
+ ## Model Description
+
+ This model encodes a **unified hypergraph** linking financial transactions,
+ email communications, legal evidence, and entity relationships into a
+ single trainable knowledge representation.
+
+ ### Architecture
+
+ | Component | Details |
+ |---|---|
+ | Node Embedding | 128-dim structural + 256-dim text |
+ | Hidden Dimension | 256 |
+ | Text Encoder | 2-layer Transformer, 4 heads |
+ | Graph Attention | 2-layer GAT, 4 heads |
+ | Link Predictor | 2-layer MLP with margin ranking loss |
+ | Total Parameters | **34,758,785** |
+
+ ### Knowledge Graph Statistics
+
+ | Metric | Count |
+ |---|---|
+ | Total Nodes | 203,642 |
+ | Total Edges | 14,675 |
+ | Cross-Links | 4,116 |
+ | Entities | 16 |
+ | Emails | 199,541 |
+ | Financial Documents | 2,290 |
+ | Timeline Events | 1,220 |
+ | LEX Schemes | 13 |
+ | Legal Filings | 7 |
+
+ ### Subsystems
+
+ | Subsystem | Nodes |
+ |---|---|
+ | Core (Entities) | 16 |
+ | Fincosys (Financial) | 3,908 |
+ | Comcosys (Communications) | 199,541 |
+ | RevStream1 (Evidence) | 144 |
+ | Ad-Res-J7 (Legal) | 33 |
+
+ ## Training
+
+ The model can be fine-tuned on link prediction tasks:
+
+ ```python
+ from model.unicosys_model import UnicosysHypergraphModel, UnicosysConfig
+
+ model = UnicosysHypergraphModel.from_pretrained("hyperholmes/unicosys-hypergraph")
+ # ... prepare training data ...
+ # model.forward(node_ids, node_type_ids, subsystem_ids, edge_index, edge_type_ids,
+ # pos_edge_index=pos, neg_edge_index=neg, labels=labels)
+ ```
+
+ ## Files
+
+ - `model.safetensors` — Model weights
+ - `config.json` — Model configuration
+ - `graph_data.safetensors` — Encoded graph tensors (nodes, edges)
+ - `tokenizer.json` — Character-level tokenizer for node labels
+ - `node_id_mapping.json` — Node ID string to integer index mapping
+ - `model_summary.json` — Compact statistics summary
+
+ ## Source
+
+ Generated by the [Unicosys](https://github.com/hyperholmes/unicosys) intelligence pipeline.
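Note on the link predictor: the README's Architecture table names a margin ranking loss, and config.json sets `margin` to 1.0 and `negative_sample_ratio` to 5. The sketch below shows what such a loss typically computes, assuming the common hinge form `max(0, margin - s_pos + s_neg)` averaged over all pairs. The function name and the layout of `neg_scores` are hypothetical and are not taken from the Unicosys code.

```python
def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over (positive, negative) score pairs.

    Assumption: neg_scores[i] is the list of scores for the negatives
    sampled against pos_scores[i] (hypothetical layout).
    """
    total, count = 0.0, 0
    for pos, negs in zip(pos_scores, neg_scores):
        for neg in negs:
            # A pair contributes only when the negative scores within
            # `margin` of the positive (or above it).
            total += max(0.0, margin - pos + neg)
            count += 1
    return total / count if count else 0.0


# One positive edge scored 2.0 against five sampled negatives,
# mirroring negative_sample_ratio = 5 in config.json.
loss = margin_ranking_loss([2.0], [[0.5, 1.5, 2.5, -1.0, 1.0]])
```

Only the negatives scored 1.5 and 2.5 fall inside the margin here, so they alone contribute to the loss.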
config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "architectures": [
+     "UnicosysHypergraphModel"
+   ],
+   "case_number": "2025-137857",
+   "dtype": "float32",
+   "edge_type_vocab": {
+     "<pad>": 0,
+     "<unk>": 1,
+     "proves": 2,
+     "transaction_evidenced_by": 3
+   },
+   "gat_dropout": 0.1,
+   "gat_num_heads": 4,
+   "gat_num_layers": 2,
+   "hidden_dim": 256,
+   "margin": 1.0,
+   "max_nodes": 250000,
+   "model_type": "unicosys_hypergraph",
+   "negative_sample_ratio": 5,
+   "node_embed_dim": 128,
+   "node_type_vocab": {
+     "<pad>": 0,
+     "<unk>": 1,
+     "email": 2,
+     "entity": 3,
+     "entity_document": 4,
+     "fund_flow_analysis": 5,
+     "hypergraph_node": 6,
+     "legal_filing": 7,
+     "lex_scheme": 8,
+     "timeline_event": 9
+   },
+   "num_cross_links": 4116,
+   "num_edge_types": 4,
+   "num_entities": 16,
+   "num_evidence": 203642,
+   "num_node_types": 10,
+   "num_subsystems": 7,
+   "subsystem_vocab": {
+     "<pad>": 0,
+     "<unk>": 1,
+     "ad_res_j7": 2,
+     "comcosys": 3,
+     "core": 4,
+     "fincosys": 5,
+     "revstream1": 6
+   },
+   "text_embed_dim": 256,
+   "text_max_length": 128,
+   "text_num_heads": 4,
+   "text_num_layers": 2,
+   "text_vocab_size": 219,
+   "transformers_version": "5.3.0"
+ }
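Note on config.json: each `num_*` count should match the size of its vocabulary, since the `<pad>` and `<unk>` entries occupy ids 0 and 1. A small check along these lines could guard against the two drifting apart. The helper name `vocab_mismatches` is illustrative, and the dictionary repeats only the relevant keys from the file above.

```python
config = {
    "num_edge_types": 4,
    "num_node_types": 10,
    "num_subsystems": 7,
    "edge_type_vocab": {
        "<pad>": 0, "<unk>": 1, "proves": 2, "transaction_evidenced_by": 3,
    },
    "node_type_vocab": {
        "<pad>": 0, "<unk>": 1, "email": 2, "entity": 3,
        "entity_document": 4, "fund_flow_analysis": 5, "hypergraph_node": 6,
        "legal_filing": 7, "lex_scheme": 8, "timeline_event": 9,
    },
    "subsystem_vocab": {
        "<pad>": 0, "<unk>": 1, "ad_res_j7": 2, "comcosys": 3,
        "core": 4, "fincosys": 5, "revstream1": 6,
    },
}


def vocab_mismatches(cfg):
    """Return (count_key, declared, actual) for every count that disagrees
    with the length of its vocabulary."""
    pairs = [
        ("num_edge_types", "edge_type_vocab"),
        ("num_node_types", "node_type_vocab"),
        ("num_subsystems", "subsystem_vocab"),
    ]
    return [
        (count_key, cfg[count_key], len(cfg[vocab_key]))
        for count_key, vocab_key in pairs
        if cfg[count_key] != len(cfg[vocab_key])
    ]


mismatches = vocab_mismatches(config)
```

For the values committed here the list is empty, so the three counts agree with their vocabularies.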
graph_data.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:33faa0851dda60392855cc055a950f565f22e40f55425c6f24460ae8cdb86f5a
+ size 5264784
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7e927a8a37ee0bcdc708ebfb937dace604b9a8d325516a19757173806cb25f63
+ size 139041988
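Note on the weight file size: the LFS pointer size can be sanity-checked against the parameter count in the README. The arithmetic below assumes every one of the 34,758,785 parameters is stored exactly once as float32 (the `dtype` in config.json) and that the file holds no non-parameter buffers; neither assumption is confirmed by the commit.

```python
PARAMS = 34_758_785          # "Total Parameters" in README.md
BYTES_PER_FLOAT32 = 4        # config.json: "dtype": "float32"
FILE_SIZE = 139_041_988      # "size" in the model.safetensors LFS pointer

tensor_bytes = PARAMS * BYTES_PER_FLOAT32
overhead = FILE_SIZE - tensor_bytes  # bytes not explained by raw weights
```

Under those assumptions the raw weights take 139,035,140 bytes and 6,848 bytes remain. That is plausible for the 8-byte length prefix and JSON header that a safetensors file carries ahead of the tensor data.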
model_summary.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "case_number": "2025-137857",
+   "total_nodes": 203642,
+   "total_edges": 15707,
+   "total_cross_links": 4116,
+   "node_types": {
+     "entity": 16,
+     "entity_document": 2290,
+     "timeline_event": 1220,
+     "hypergraph_node": 551,
+     "fund_flow_analysis": 4,
+     "email": 199541,
+     "lex_scheme": 13,
+     "legal_filing": 7
+   },
+   "edge_types": {
+     "proves": 14674,
+     "transaction_evidenced_by": 1
+   },
+   "subsystems": {
+     "core": 16,
+     "fincosys": 3908,
+     "comcosys": 199541,
+     "revstream1": 144,
+     "ad_res_j7": 33
+   },
+   "model_params": 34758785,
+   "model_architecture": {
+     "node_embed_dim": 128,
+     "text_embed_dim": 256,
+     "hidden_dim": 256,
+     "gat_layers": 2,
+     "gat_heads": 4,
+     "text_encoder_layers": 2
+   }
+ }
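Note on the summary totals: the per-type breakdowns in model_summary.json can be summed and compared with the stated totals. The sketch below copies the relevant values from the file and only does the addition; the variable names are illustrative.

```python
summary = {
    "total_nodes": 203642,
    "total_edges": 15707,
    "node_types": {
        "entity": 16, "entity_document": 2290, "timeline_event": 1220,
        "hypergraph_node": 551, "fund_flow_analysis": 4, "email": 199541,
        "lex_scheme": 13, "legal_filing": 7,
    },
    "edge_types": {"proves": 14674, "transaction_evidenced_by": 1},
    "subsystems": {
        "core": 16, "fincosys": 3908, "comcosys": 199541,
        "revstream1": 144, "ad_res_j7": 33,
    },
}

node_sum = sum(summary["node_types"].values())
subsystem_sum = sum(summary["subsystems"].values())
edge_sum = sum(summary["edge_types"].values())

# Difference between the stated edge total and the typed edge counts.
edge_gap = summary["total_edges"] - edge_sum
```

The node-type and subsystem counts both add up to `total_nodes`. The typed edge counts add up to 14,675, which matches "Total Edges" in README.md but is 1,032 fewer than `total_edges` here. The commit does not say what accounts for the difference.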
node_id_mapping.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.json ADDED
@@ -0,0 +1,122 @@
+ {
+   "vocab_size": 32000,
+   "max_length": 128,
+   "char_to_id": {
+     "p": 4,
+     "e": 5,
+     "t": 6,
+     "r": 7,
+     " ": 8,
+     "a": 9,
+     "n": 10,
+     "d": 11,
+     "w": 12,
+     "f": 13,
+     "u": 14,
+     "c": 15,
+     "i": 16,
+     "j": 17,
+     "q": 18,
+     "l": 19,
+     "y": 20,
+     "o": 21,
+     "b": 22,
+     "s": 23,
+     "k": 24,
+     "v": 25,
+     "m": 26,
+     "h": 27,
+     "g": 28,
+     "(": 29,
+     ")": 30,
+     "\u00e9": 31,
+     "z": 32,
+     ":": 33,
+     "4": 34,
+     "8": 35,
+     "3": 36,
+     "1": 37,
+     "0": 38,
+     "7": 39,
+     "6": 40,
+     "9": 41,
+     "2": 42,
+     "5": 43,
+     "x": 44,
+     ",": 45,
+     "-": 46,
+     "/": 47,
+     "&": 48,
+     ".": 49,
+     "+": 50,
+     "%": 51,
+     "#": 52,
+     "'": 53,
+     "@": 54,
+     "_": 55,
+     "*": 56,
+     "|": 57,
+     "[": 58,
+     "]": 59,
+     "!": 60,
+     "\u2122": 61,
+     "=": 62,
+     "\u00f1": 63,
+     "\u263a": 64,
+     "\u00a3": 65,
+     "\u2013": 66,
+     "\ud83d\udd17": 67,
+     "\ud83d\udc9c": 68,
+     "\u00a0": 69,
+     "\u26f3": 70,
+     "\u26a1": 71,
+     "\u23f0": 72,
+     "?": 73,
+     "\u2019": 74,
+     "\u2018": 75,
+     "\ud83c\udfe0": 76,
+     "\u2014": 77,
+     "\ud83c\udfe1": 78,
+     "\u2728": 79,
+     "\u00ae": 80,
+     "\ud83c\udf38": 81,
+     "$": 82,
+     "\ud83d\udc8c": 83,
+     "\ud83d\udcb8": 84,
+     "\ud83d\udd52": 85,
+     "\ud83d\udfe2": 86,
+     "\ud83d\ude97": 87,
+     "\ud83e\udde0": 88,
+     "\ud83d\udc64": 89,
+     "\ud83c\udf89": 90,
+     "\ud83d\ude80": 91,
+     "\ud83c\udf0e": 92,
+     "\ud83d\udc40": 93,
+     "\ufe0f": 94,
+     "\u2011": 95,
+     "\ud83c\udf82": 96,
+     "\ud83d\ude08": 97,
+     "\ud83c\udfa4": 98,
+     ";": 99,
+     "\ud83d\udcbc": 100,
+     "\ud83e\udd76": 101,
+     "\ud83d\udea8": 102,
+     "\ud83c\udf34": 103,
+     "\ud83d\udd2d": 104,
+     "\ud83d\ude0e": 105,
+     "\u2600": 106,
+     "\ud83d\udcda": 107,
+     "\ud83c\udf81": 108,
+     "\ud83e\udd16": 109,
+     "\ud83d\udc4b": 110,
+     "\u2022": 111,
+     "\u2763": 112,
+     "\ud83d\udd25": 113,
+     "\ud83c\udf40": 114,
+     "\u2764": 115,
+     "\u200d": 116,
+     "\"": 117,
+     "\ud83d\udcf1": 118
+   },
+   "next_id": 119
+ }
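Note on the tokenizer: `char_to_id` assigns ids 4 through 118 and `next_id` is 119, leaving ids 0 to 3 unassigned. The sketch below shows how a character-level encoder over such a table might work. Three assumptions are not confirmed by the file: that id 0 is padding, that id 1 is the unknown token, and that input is lowercased (the table contains only lowercase letters). The `encode` function is hypothetical and only an excerpt of the table is used.

```python
def encode(text, char_to_id, max_length=128, pad_id=0, unk_id=1):
    """Map each character to its id, truncate, then pad to max_length.

    pad_id and unk_id are assumed values; tokenizer.json does not name
    its special tokens.
    """
    chars = text.lower()[:max_length]
    ids = [char_to_id.get(ch, unk_id) for ch in chars]
    return ids + [pad_id] * (max_length - len(ids))


# Excerpt of char_to_id from tokenizer.json.
char_to_id = {"p": 4, "e": 5, "t": 6, "r": 7, " ": 8, "a": 9, "n": 10, "d": 11}

ids = encode("Peter and", char_to_id, max_length=12)
```

With the default `max_length` of 128, matching `max_length` in tokenizer.json and `text_max_length` in config.json, every label is encoded to the same fixed width. Note that config.json lists `text_vocab_size` as 219 while tokenizer.json lists `vocab_size` as 32000; this sketch does not depend on either value.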