btaras22 committed · Commit 1c5e401 · verified · 1 Parent(s): 61a79f0

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +94 -0
README.md ADDED
# WorkUA Resumes Dataset

## Dataset Summary

This dataset consists of 84,316 resume entries collected from
publicly available pages on [Work.ua](https://www.work.ua/resumes/). Each entry holds structured
information extracted from a candidate's resume, including education,
work experience, skills, languages, disability status, veteran status,
whether the candidate holds a driver's license, and additional profile metadata.

The dataset is designed for research and development of:

- Resume parsing models
- Information extraction systems
- Vacancy-candidate matching algorithms
- NLP pipelines for Ukrainian-language documents
- Data engineering and ML training workflows

All personally identifying information has been removed or anonymized.

## Dataset Structure

The dataset is provided as a Polars DataFrame with **21 fields**.

### Schema Overview

    id: String
    url: String
    title: String
    candidate_name: String
    age: Int64
    city: String
    desired_salary: Int64
    employment_type: String
    work_location_preference: String
    driver_license: Boolean
    creation_date: Datetime
    other_resumes: List(Struct{title, url, resume_id, description})
    veteran: Boolean
    disability: String
    work_experiences: List(Struct{position, start_date, end_date, company, city, industry, responsibilities})
    recommendations: List(Struct{name, position})
    languages: List(Struct{language, level})
    skills: List(String)
    educations: List(Struct{institution, faculty, city, level, start_year, end_year})
    additional_educations: List(Struct{institution, start_year, end_year})
    additional_info: String

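As a rough illustration of how the schema could be used, the sketch below maps each top-level field to the Python type one might expect after decoding a record, and flags mismatches. The `EXPECTED_TYPES` mapping and the `check_record` helper are hypothetical names introduced here, not part of the dataset or of Polars, and the type mapping itself is an assumption (for example, `creation_date` is accepted as either an ISO string or a `datetime`).

```python
from datetime import datetime

# Assumed Python-side types for the 21 top-level fields in the schema overview.
# Nested List(Struct{...}) columns are only checked as lists here.
EXPECTED_TYPES = {
    "id": str, "url": str, "title": str, "candidate_name": str,
    "age": int, "city": str, "desired_salary": int,
    "employment_type": str, "work_location_preference": str,
    "driver_license": bool, "creation_date": (str, datetime),
    "other_resumes": list, "veteran": bool, "disability": str,
    "work_experiences": list, "recommendations": list, "languages": list,
    "skills": list, "educations": list, "additional_educations": list,
    "additional_info": str,
}


def check_record(record):
    """Return a list of problems; an empty list means no mismatch was found."""
    problems = []
    for field in record:
        if field not in EXPECTED_TYPES:
            problems.append(f"unexpected field: {field}")
    for field, expected in EXPECTED_TYPES.items():
        value = record.get(field)
        if value is None:
            continue  # absent or null values are tolerated
        # bool is a subclass of int in Python, so reject it explicitly for Int64 fields
        if expected is int and isinstance(value, bool):
            problems.append(f"{field}: unexpected type bool")
        elif not isinstance(value, expected):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
    return problems
```

Calling `check_record` on a decoded record returns an empty list when every present field matches the assumed type, which makes it easy to spot records that drifted from the schema.
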
## Data Example

    {
      "id": "123456",
      "url": "https://www.work.ua/resumes/123456/",
      "title": "Будівельник",
      "candidate_name": "Іван",
      "age": 32,
      "city": "Київ",
      "desired_salary": 25000,
      "employment_type": "повна",
      "work_location_preference": "офіс",
      "driver_license": true,
      "creation_date": "2025-03-10T12:30:00",
      "veteran": false,
      "disability": null,
      "skills": ["Штукатурка", "Монтаж гіпсокартону"],
      "languages": [{"language": "Українська", "level": "вільно"}],
      "educations": [
        {
          "institution": "КНУБА",
          "faculty": "Промислове та цивільне будівництво",
          "city": "Київ",
          "level": "Вища",
          "start_year": 2012,
          "end_year": 2016
        }
      ],
      "additional_info": "Готовий до відряджень."
    }

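A record shaped like the example can be decoded with Python's standard `json` module. The sketch below uses a shortened copy of the example record (an abbreviation made here for brevity) and shows one possible way to reach into the nested `languages` and `educations` lists. The variable names are illustrative only.

```python
import json
from datetime import datetime

# Shortened copy of the example record; only a few fields are kept.
raw = """
{
  "id": "123456",
  "creation_date": "2025-03-10T12:30:00",
  "disability": null,
  "skills": ["Штукатурка", "Монтаж гіпсокартону"],
  "languages": [{"language": "Українська", "level": "вільно"}],
  "educations": [{"institution": "КНУБА", "start_year": 2012, "end_year": 2016}]
}
"""

record = json.loads(raw)

# JSON null becomes None; the ISO timestamp string can be parsed explicitly.
created = datetime.fromisoformat(record["creation_date"])

# Flatten the list of {language, level} structs into a simple lookup.
language_levels = {item["language"]: item["level"] for item in record["languages"]}

# Duration of each education entry in years.
study_years = [e["end_year"] - e["start_year"] for e in record["educations"]]
```

Nested columns arrive as plain lists of dictionaries, so ordinary comprehensions are enough for simple feature extraction.
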
## Intended Use

- Training resume parsers
- Semantic search research
- Text classification
- Career recommendation systems
- Applicant ranking models

## Limitations

- Some fields may be incomplete or `null` because the original resumes vary in content and structure

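Because fields can be incomplete, it may help to measure how often each one is missing before training on the data. The following is a minimal sketch that assumes records are available as Python dictionaries. The `missing_field_counts` helper is a hypothetical name introduced here, and treating empty strings and empty lists as missing is a design choice of this sketch rather than a property of the dataset.

```python
from collections import Counter


def missing_field_counts(records, fields):
    """Count, per field, how many records lack a usable value.

    A value counts as missing when the key is absent, the value is None,
    or the value is an empty string or empty list.
    """
    counts = Counter({field: 0 for field in fields})
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or value == "" or value == []:
                counts[field] += 1
    return dict(counts)
```

Values such as `False` or `0` are not counted as missing, so Boolean fields like `veteran` are only flagged when they are genuinely absent or null.
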
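## Ethical Considerations

Personally identifying information has been removed or anonymized (see Dataset Summary). The dataset does retain potentially sensitive attributes listed in the schema, such as `age`, `veteran`, and `disability`.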