WorkUA Resumes Dataset
Dataset Summary
This dataset consists of 84,316 resume entries collected from publicly available pages on Work.ua. Each entry holds structured information extracted from a candidate's resume: education, work experience, skills, languages, disability status, veteran status, whether the candidate holds a driver's license, and additional profile metadata.
The dataset is designed for research and development of:
- Resume parsing models
- Information extraction systems
- Vacancy-candidate matching algorithms
- NLP pipelines for Ukrainian-language documents
- Data engineering and ML training workflows
All personally identifying information has been removed or anonymized.
Dataset Structure
The dataset is provided as a Polars DataFrame with 21 fields.
Schema Overview
id: String
url: String
title: String
candidate_name: String
age: Int64
city: String
desired_salary: Int64
employment_type: String
work_location_preference: String
driver_license: Boolean
creation_date: Datetime
other_resumes: List(Struct{title, url, resume_id, description})
veteran: Boolean
disability: String
work_experiences: List(Struct{position, start_date, end_date, company, city, industry, responsibilities})
recommendations: List(Struct{name, position})
languages: List(Struct{language, level})
skills: List(String)
educations: List(Struct{institution, faculty, city, level, start_year, end_year})
additional_educations: List(Struct{institution, start_year, end_year})
additional_info: String
Data Example
{
  "id": "123456",
  "url": "https://www.work.ua/resumes/123456/",
  "title": "Будівельник",
  "candidate_name": "Іван",
  "age": 32,
  "city": "Київ",
  "desired_salary": 25000,
  "employment_type": "повна",
  "work_location_preference": "офіс",
  "driver_license": true,
  "creation_date": "2025-03-10T12:30:00",
  "veteran": false,
  "disability": null,
  "skills": ["Штукатурка", "Монтаж гіпсокартону"],
  "languages": [{"language": "Українська", "level": "вільно"}],
  "educations": [
    {
      "institution": "КНУБА",
      "faculty": "Промислове та цивільне будівництво",
      "city": "Київ",
      "level": "Вища",
      "start_year": 2012,
      "end_year": 2016
    }
  ],
  "additional_info": "Готовий до відряджень."
}
Intended Use
- Training resume parsers
- Semantic search research
- Text classification
- Career recommendation systems
- Applicant ranking models
Limitations
- Some fields may be null or incomplete because the source resumes vary in how much information candidates provide
Ethical Considerations
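Personally identifying information has been removed or anonymized. The records still include potentially sensitive attributes, such as age, disability status, and veteran status.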