npip99 committed on
Commit c41ce94 · verified · 1 Parent(s): 69d2e3e

Update README.md

  1. README.md +73 -3
README.md CHANGED
@@ -1,3 +1,73 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - en
+ base_model:
+ - Qwen/Qwen3-14B
+ tags:
+ - finance
+ - legal
+ - code
+ - stem
+ - medical
+ ---
+
+ <img src="https://i.imgur.com/oxvhvQu.png"/>
+
+ # zegen-1
+
+ NOTE: This is an alpha version for early testing. The model weights, and even the architecture, may change.
+
+ This is a keyword generation model that rewrites natural language queries into arrays of keyword search expressions. It is intended for retrieval pipelines where sparse keyword embedding search is available.
+
+ ## How to Use
+
+ ```python
+ # Assumes vLLM is installed; the original snippet omitted this import.
+ from vllm import LLM, SamplingParams
+
+ query = "When am I supposed to complete my security training?"
+ # Assumed generation cap; the original snippet left max_new_tokens undefined.
+ max_new_tokens = 256
+
+ llm = LLM(
+     model="zeroentropy/zegen-1",
+     tensor_parallel_size=1,
+     dtype="auto",
+     gpu_memory_utilization=0.8,
+ )
+
+ messages = [
+     {
+         "role": "system",
+         "content": """
+ Your goal is to take a User natural language query, and make a list of Slack API searches for that User query. The presumption is that there are slack messages inside of the User's slack, that answer the User's query, and you are trying to find them.
+
+ - Consider synonyms for each of the words in the query
+ - Consider alternative ways a slack message could be written, that would match a query term, even if it's not an exact synonym. For example, if the query is "How do I reset my password", a target word could be "forgot". Even though, in isolation, no query word is synonymous with "forgot".
+ - Make a final list of 7 search queries, that you believe will be specific enough to pull 1-10 options, but not too specific that you get 0 results (It's very common to get 0 results from the slack API, so be conservative, prefer 1 word or 2 word searches, but have some searches at every length).
+ - Note that capitalization doesn't make a difference, neither does suffixes such as -s or -ing. Don't include stop words either.
+ - Output your answer as a JSON string array. Do not output anything else.
+ """.strip(),
+     },
+     {"role": "user", "content": query},
+ ]
+
+ # llm.chat takes a batch of conversations, so the single conversation is wrapped in a list.
+ outputs = llm.chat(
+     messages=[messages],
+     sampling_params=SamplingParams(
+         max_tokens=max_new_tokens,
+         temperature=0.0,  # greedy decoding
+     ),
+     chat_template_kwargs={"enable_thinking": False},
+ )
+
+ # First completion of the first (and only) conversation.
+ response = outputs[0].outputs[0].text
+ # Example output:
+ """
+ [
+     "security training",
+     "security due",
+     "training deadline",
+     "security training deadline",
+     "required training",
+     "onboarding",
+     "compliance"
+ ]
+ """
+
+ ```
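
Since the system prompt asks for a JSON string array, `response` usually needs to be parsed before the keywords can be passed to a search backend. The following is a minimal sketch of that step, not part of the README in this commit. The helper name `parse_keyword_queries` is hypothetical, and the choice to lowercase and de-duplicate entries is an assumption based on the prompt's note that capitalization makes no difference.

```python
import json


def parse_keyword_queries(response: str) -> list[str]:
    """Parse a model reply into a de-duplicated list of search strings.

    Hypothetical helper: it assumes the reply contains exactly one JSON
    array of strings, possibly surrounded by stray text.
    """
    # Slice from the first "[" to the last "]" to tolerate surrounding text.
    start, end = response.find("["), response.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model response")

    items = json.loads(response[start : end + 1])
    if not isinstance(items, list) or not all(isinstance(x, str) for x in items):
        raise ValueError("expected a JSON array of strings")

    seen, queries = set(), []
    for item in items:
        # Lowercase and collapse whitespace so equivalent entries compare equal.
        key = " ".join(item.lower().split())
        if key and key not in seen:
            seen.add(key)
            queries.append(key)
    return queries


example = '["security training", "Security  Training", "onboarding", ""]'
print(parse_keyword_queries(example))  # ['security training', 'onboarding']
```

Raising `ValueError` on malformed output, rather than returning an empty list, keeps a failed generation distinguishable from a query that legitimately produced no keywords.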