Commit 333d117 by olgab42 · verified · 1 Parent(s): 7d71c9f

Update README.md

  1. README.md +70 -3
README.md CHANGED
@@ -1,3 +1,70 @@
- ---
- license: mit
- ---
+ ---
+ base_model: microsoft/Phi-3.5-mini-instruct
+ language:
+ - en
+ library_name: transformers
+ license: mit
+ license_link: https://huggingface.co/microsoft/Phi-3.5-mini-instruct/resolve/main/LICENSE
+ pipeline_tag: text-generation
+ tags:
+ - nlp
+ - ner
+ ---
+
+ ## Geo-Temporal Entity Recognition Model
+
+ This model is a fine-tuned version of [Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct). It detects location and date entities in a query, maps each location entity to the corresponding set of countries, and maps each date entity to either a start and end date or a list of months. It also returns the query with the date entities, and the location entities that are countries, removed.
+
+ The model was fine-tuned on the following prompt:
+
+ ```python
+ from datetime import date
+ today = date.today()
+
+ schema = """
+ country: Extracted list of country or countries,
+ periodStart: Period start in ISO 8601 format, filled in only if date-related entity corresponds to absolute date range, else null,
+ periodEnd: Period end in ISO 8601 format, filled in only if date-related entity corresponds to absolute date range, else null,
+ phase: A list of months indicated by integers from 1 to 12, filled in only if date-related entity corresponds to a reccuring yearly period. It should be null in case `periodStart` or `periodEnd` has been extracted.,
+ location: A list of the detected location-related entities,
+ date: A list of date-related entities,
+ cleanedQuery: A list of parts of the query cleaned from the extracted date-related entity and the location-related entity and their related parts (e.g. prepositions)
+ """
+
+ EXTENDED_INSTRUCTION_TEXT = """
+ You are a Name-Entity Recognition system specialized in extracting and processing location and date related entities from text. Follow these steps:
+
+ 1. Extract exact entities from the text:
+ - Location entities: Extract only if they are specific place names (not general terms like "sample locations")
+ - Date entities: Extract dates exactly as they appear in the text
+ Both should be extracted exactly as mentioned in the text, without modifications.
+
+ 2. For each detected location entity:
+ - Map it to corresponding country name(s)
+ - If the location itself is a country, include it in the country list
+ - If country cannot be determined, return an empty list
+
+ 3. For date-related entities, classify them into one of two categories:
+ a) Absolute date range:
+ - Convert to ISO 8601 date format (YYYY-MM-DD)
+ - Set periodStart and periodEnd
+ - Set phase to null
+ - Use %(today)s as reference for relative dates
+
+ b) Recurring yearly period:
+ - Set phase as list of integers (1-12) representing months
+ - Set periodStart and periodEnd to null
+
+ 4. Clean the query by removing:
+ - Detected date entities and their syntactic relations (e.g., prepositions)
+ - Location entities (only if they are countries) and their relations
+ Return the remaining parts as a list of strings
+
+ Return the results in JSON format matching this schema: %(schema)s
+
+ IMPORTANT:
+ - Always return all fields defined in the schema
+ - Return only the JSON without any additional explanation or notes
+ - Ensure the JSON is properly formatted and parsable
+ """ % {"today": today, "schema": schema}
+ ```
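
Since the prompt asks for JSON that always contains every schema field, a caller may want to check a response before using it. The following is a minimal sketch of such a check, not part of the commit: `validate_response` and `REQUIRED_FIELDS` are hypothetical names, and the example response is hand-written for illustration rather than actual model output.

```python
import json
from datetime import date

# Field names taken from the schema in the prompt.
REQUIRED_FIELDS = {
    "country", "periodStart", "periodEnd", "phase",
    "location", "date", "cleanedQuery",
}


def validate_response(raw: str) -> dict:
    """Parse a response string and check it against the rules stated in the prompt."""
    data = json.loads(raw)  # raises json.JSONDecodeError if the text is not valid JSON
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    # periodStart / periodEnd must be null or ISO 8601 dates.
    for key in ("periodStart", "periodEnd"):
        if data[key] is not None:
            date.fromisoformat(data[key])  # raises ValueError on a malformed date

    # phase must be null whenever an absolute date range was extracted.
    has_range = data["periodStart"] is not None or data["periodEnd"] is not None
    phase = data["phase"]
    if phase is not None:
        if has_range:
            raise ValueError("phase must be null when periodStart or periodEnd is set")
        if not isinstance(phase, list) or not all(
            isinstance(m, int) and 1 <= m <= 12 for m in phase
        ):
            raise ValueError("phase must be a list of integers from 1 to 12")
    return data


# Hypothetical response for a query such as "rainfall in Kenya from June to August".
example = json.dumps({
    "country": ["Kenya"],
    "periodStart": None,
    "periodEnd": None,
    "phase": [6, 7, 8],
    "location": ["Kenya"],
    "date": ["from June to August"],
    "cleanedQuery": ["rainfall"],
})

result = validate_response(example)
print(result["phase"])
```

The check only covers the constraints the prompt spells out (all fields present, ISO dates, `phase` exclusive with the absolute range); whether the extracted entities are correct still has to be judged separately.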