# WorkUA Resumes Dataset

## Dataset Summary

This dataset consists of 84,316 resume entries collected from publicly available pages on [Work.ua](https://www.work.ua/resumes/). Each entry represents structured information extracted from a candidate's resume, including education, work experience, skills, languages, disability status, veteran status, driver's license presence, and additional profile metadata.

The dataset is designed for research and development of:

- Resume parsing models
- Information extraction systems
- Vacancy-candidate matching algorithms
- NLP pipelines for Ukrainian-language documents
- Data engineering and ML training workflows

All personally identifying information has been removed or anonymized.

## Dataset Structure

The dataset is provided as a Polars DataFrame with **21 fields**.

### Schema Overview

    id: String
    url: String
    title: String
    candidate_name: String
    age: Int64
    city: String
    desired_salary: Int64
    employment_type: String
    work_location_preference: String
    driver_license: Boolean
    creation_date: Datetime
    other_resumes: List(Struct{title, url, resume_id, description})
    veteran: Boolean
    disability: String
    work_experiences: List(Struct{position, start_date, end_date, company, city, industry, responsibilities})
    recommendations: List(Struct{name, position})
    languages: List(Struct{language, level})
    skills: List(String)
    educations: List(Struct{institution, faculty, city, level, start_year, end_year})
    additional_educations: List(Struct{institution, start_year, end_year})
    additional_info: String

## Data Example

An abridged record (not all 21 fields are shown):

    {
      "id": "123456",
      "url": "https://www.work.ua/resumes/123456/",
      "title": "Будівельник",
      "candidate_name": "Іван",
      "age": 32,
      "city": "Київ",
      "desired_salary": 25000,
      "employment_type": "повна",
      "work_location_preference": "офіс",
      "driver_license": true,
      "creation_date": "2025-03-10T12:30:00",
      "veteran": false,
      "disability": null,
      "skills": ["Штукатурка", "Монтаж гіпсокартону"],
      "languages": [{"language": "Українська", "level": "вільно"}],
      "educations": [
        {
          "institution": "КНУБА",
          "faculty": "Промислове та цивільне будівництво",
          "city": "Київ",
          "level": "Вища",
          "start_year": 2012,
          "end_year": 2016
        }
      ],
      "additional_info": "Готовий до відряджень."
    }

## Intended Use

- Training resume parsers
- Semantic search research
- Text classification
- Career recommendation systems
- Applicant ranking models

## Limitations

- Some fields may be incomplete because the original resumes vary in structure and level of detail.

## Ethical Considerations

These resumes do not include any sensitive or personal information.