| # WorkUA Resumes Dataset | |
| ## Dataset Summary | |
| This dataset consists of 84,316 resume entries collected from | |
| publicly available pages on [Work.ua](https://www.work.ua/resumes/). Each entry represents structured | |
| information extracted from a candidate's resume, including education, | |
| work experience, skills, languages, disability status, veteran status, | |
| driver license presence, and additional profile metadata. | |
| The dataset is designed for research and development of: | |
| - Resume parsing models | |
| - Information extraction systems | |
| - Vacancy-candidate matching algorithms | |
| - NLP pipelines for Ukrainian-language documents | |
| - Data engineering and ML training workflows | |
| All personally identifying information has been removed or anonymized. | |
| ## Dataset Structure | |
| The dataset is provided as a Polars DataFrame with **21 fields**. | |
| ### Schema Overview | |
| id: String | |
| url: String | |
| title: String | |
| candidate_name: String | |
| age: Int64 | |
| city: String | |
| desired_salary: Int64 | |
| employment_type: String | |
| work_location_preference: String | |
| driver_license: Boolean | |
| creation_date: Datetime | |
| other_resumes: List(Struct{title, url, resume_id, description}) | |
| veteran: Boolean | |
| disability: String | |
| work_experiences: List(Struct{position, start_date, end_date, company, city, industry, responsibilities}) | |
| recommendations: List(Struct{name, position}) | |
| languages: List(Struct{language, level}) | |
| skills: List(String) | |
| educations: List(Struct{institution, faculty, city, level, start_year, end_year}) | |
| additional_educations: List(Struct{institution, start_year, end_year}) | |
| additional_info: String | |
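| ## Data Example | |
| An abridged record; the `other_resumes`, `work_experiences`, `recommendations`, and `additional_educations` fields are omitted. | |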
| { | |
| "id": "123456", | |
| "url": "https://www.work.ua/resumes/123456/", | |
| "title": "Будівельник", | |
| "candidate_name": "Іван", | |
| "age": 32, | |
| "city": "Київ", | |
| "desired_salary": 25000, | |
| "employment_type": "повна", | |
| "work_location_preference": "офіс", | |
| "driver_license": true, | |
| "creation_date": "2025-03-10T12:30:00", | |
| "veteran": false, | |
| "disability": null, | |
| "skills": ["Штукатурка", "Монтаж гіпсокартону"], | |
| "languages": [{"language": "Українська", "level": "вільно"}], | |
| "educations": [ | |
| { | |
| "institution": "КНУБА", | |
| "faculty": "Промислове та цивільне будівництво", | |
| "city": "Київ", | |
| "level": "Вища", | |
| "start_year": 2012, | |
| "end_year": 2016 | |
| } | |
| ], | |
| "additional_info": "Готовий до відряджень." | |
| } | |
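
A record like the one above can be reduced to flat features with the standard library alone. This is only a sketch: `flatten` is a hypothetical helper, not part of the dataset tooling, and the chosen features are illustrative. It treats null or missing list fields as empty, since the example shows that fields can be `null`.

```python
import json
from datetime import datetime

# Abridged copy of the example record above.
raw = """
{
  "id": "123456",
  "creation_date": "2025-03-10T12:30:00",
  "disability": null,
  "skills": ["Штукатурка", "Монтаж гіпсокартону"],
  "languages": [{"language": "Українська", "level": "вільно"}],
  "educations": [
    {"institution": "КНУБА", "level": "Вища", "start_year": 2012, "end_year": 2016}
  ]
}
"""


def flatten(record: dict) -> dict:
    """Reduce a nested record to flat features (hypothetical helper)."""
    educations = record.get("educations") or []  # tolerate null or missing lists
    return {
        "id": record["id"],
        "created": datetime.fromisoformat(record["creation_date"]),
        "has_disability_info": record.get("disability") is not None,
        "n_skills": len(record.get("skills") or []),
        "languages": [item["language"] for item in record.get("languages") or []],
        "education_years": sum(e["end_year"] - e["start_year"] for e in educations),
    }


features = flatten(json.loads(raw))
```
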
| ## Intended Use | |
| - Training resume parsers | |
| - Semantic search research | |
| - Text classification | |
| - Career recommendation systems | |
| - Applicant ranking models | |
| ## Limitations | |
| - Some fields may be null or incomplete because the source resumes vary in structure and in how much detail candidates provide | |
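
Before training, it can help to measure how often each field is actually filled in. The sketch below assumes records are available as plain Python dictionaries. `missing_rates` is a hypothetical helper, and the two sample records are made up for illustration.

```python
from collections import Counter


def missing_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Share of records where a field is absent, null, or an empty string/list."""
    missing = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or value == "" or value == []:
                missing[field] += 1
    total = len(records)
    return {field: (missing[field] / total if total else 0.0) for field in fields}


# Made-up records; only the field names come from the schema.
records = [
    {"id": "1", "desired_salary": 25000, "skills": ["Штукатурка"]},
    {"id": "2", "desired_salary": None, "skills": []},
]
rates = missing_rates(records, ["desired_salary", "skills", "additional_info"])
```

Fields with a high missing rate may be better treated as optional features than as required inputs.
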
| ## Ethical Considerations | |
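| According to the summary above, personally identifying information has been removed or anonymized. The dataset still retains attributes such as age, city, disability status, and veteran status, which should be handled with care in downstream use. | |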
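
Some projects may prefer to withhold fields such as `age`, `disability`, or `veteran` from a model altogether. The sketch below shows one way to do that on dictionary records. The set of fields to drop is an assumption for illustration, not a recommendation from the dataset authors, and `redact` is a hypothetical helper.

```python
# Assumed choice of fields to withhold; adjust to the policy that applies to your project.
SENSITIVE_FIELDS = frozenset({"url", "candidate_name", "age", "disability", "veteran"})


def redact(record: dict, drop: frozenset[str] = SENSITIVE_FIELDS) -> dict:
    """Return a copy of the record without the listed top-level fields."""
    return {key: value for key, value in record.items() if key not in drop}


record = {
    "id": "123456",
    "candidate_name": "Іван",
    "age": 32,
    "veteran": False,
    "skills": ["Штукатурка"],
}
safe = redact(record)
```

This only removes top-level keys. Nested fields such as `other_resumes`, which carry their own `url` values, would need separate handling.
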