metadata
base_model: sentence-transformers/distiluse-base-multilingual-cased-v1
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5108
- loss:CosineSimilarityLoss
widget:
- source_sentence: |-
EDUCATION
University of Roseton
Bachelor of Science in Computer Science | Graduated: 2016
Graduated Cum Laude with GPA: 3.9
Class Salutatorian
Computer Club President
Student Council Member
Winslough High School
High School Diploma | 2008 - 2012
Computer Club Vice President
Winslough Warriors Basketball Team Member
AP Scholar Award
National Honor Society Member
SKILLS AND ABILITIES
Programming Languages: Proficient in multiple programming languages
Problem Solving: Strong problem-solving and critical thinking skills
Expertise: Expertise in software systems and computer operating structures
Communication: Effective communication and teamwork skills
sentences:
- >-
Eduvator
Skills and expertise:
Degree:
University graduate, major in Computer Science or Information
Technology.
Priority is given to candidates who graduated from universities such as
University of Natural Sciences, University of Information Technology -
VNU-HCM, University of Technology, etc.
Experience:
At least 2 years of experience developing software for enterprise
solutions, especially using NodeJS.
Professional skills:
Experience with NodeJS and familiarity with ExpressJS (priority is given
to those with experience with Apollo Server).
Good knowledge of SQL and NoSQL databases, ability to write complex
queries (priority is given to those who know how to use MongoDB).
Proficient or experienced with Docker.
Good problem-solving skills, proficient in handling and manipulating
strings/characters and arrays.
Strong knowledge of the git source code management system.
Learning ability:
Ability to learn new technologies quickly.
- >-
Skills & Expertise:
Requirements:
Education:
Graduated from a regular university (with a degree), priority is given
to majors: Information Technology, Telecommunications, Information
Systems, Computer Science, etc.
Experience:
At least 3 years of experience as a Java programmer.
Professional knowledge:
Strong knowledge of Object Oriented Programming (OOP), Web Development,
Database Management System (DBMS), ORM, Design Pattern.
Proficient in Restful API, HTML, CSS, JavaScript (Jquery, ...),
Bootstrap, JSON, XML.
Proficient in PostgreSQL (or MySQL, MSSQL, Oracle), priority is given to
candidates with the ability to design DB and optimize system
performance.
Understand and apply MVC model.
Proficient in Git.
- >-
Skills & Expertise:
Requirements:
Education:
University degree or higher in Information Technology or Electronics -
Telecommunications.
Professional certificates:
CCNA or CCNP certificate.
Priority is given to candidates with certificates such as: MCSA, Azure,
VMware, CEH, OSCE, OSCP, COMPTIA Pentest+ or equivalent.
Foreign languages:
English equivalent to Toeic 400 points or higher.
Office skills:
Proficient in MS Office, Visio.
- source_sentence: >-
EDUCATION Bachelor of Technology Really Great University Really Great
Company 2016 - Present Responsible for database administration and website
design.
Developed the logic for a streamlined ad-serving platform that scales
effectively for educational institutions and online classroom management.
SKILLS Web Design Design Thinking Wireframe Creation Front-End Coding
Back-End Technology Problem Solving Computer Literacy Project Management
Tools Strong Communication Skills
sentences:
- >-
ABBANK
Skill requirements:
University or master's degree in: Information Security, Information
Technology, Software Engineering, Computer Science.
At least 5 years of experience in the field of information security.
At least 2 years of experience in the field of DevOps / DevSecOps.
At least 2 years of experience working with Cloud platforms (AWS, Azure,
GCP).
Deep knowledge of IT in the field of DevSecOps, solutions in On-premise,
Cloud native and Hybrid environments (Container, Kubernetes, Docker,
Git, CI/CD Jenkins, GitLab CI, GitHub...).
Skills related to exploiting vulnerabilities, security weaknesses.
Understanding of OWASP, MITRE Attack.
Deep knowledge of CI/CD pipeline model, deploying integrated security
automation solutions in the development process.
Develop documentation and standard processes to deploy and operate
security solutions on Cloud/DevSecOps in a stable and secure manner.
Have certificates such as CEH, OSCP, CCSP... or equivalent; having a
certificate in GCP Cloud Security is an advantage.
Education requirements:
Graduate from university or master's degree in related field.
Language requirements:
Being able to communicate in English is an advantage.
- >-
FPT
Skills and qualifications required:
General standards:
Be a Vietnamese citizen, have a permanent residence in Vietnam.
Be under 35 years old.
Be in good health to perform the job.
Have good moral qualities, no criminal record, no detention, no prison
sentence, suspended sentence, non-custodial reform, no local education
status, medical treatment, drug rehabilitation, etc.
Specific standards:
Professional qualifications:
Graduate from university or higher, regular system (including second
university, not including university transfer) at domestic universities
or graduate from university or higher at foreign universities,
affiliated universities.
Major: Graduated from majors such as Information Technology, Information
Security, Telecommunications Electronics, Information Electronics,
Mathematics - Informatics or equivalent majors.
Foreign Language:
Having one of the following certificates: TOEIC 600/990, TOEFL PBT/IPT
500/677, TOEFL CBT 173/300, TOEFL iBT 61/120, IELTS 5.5/9.0, Cambridge
Exam First (FCE), B2 European Framework, 4/6 6-level Foreign Language
Proficiency Framework for Vietnam. Accepting additional English
certificates within 24 months from the date of recruitment.
Knowledge, skills and experience:
Experience in managing server systems and storage systems is an
advantage.
Having one of the following certificates: MCITP-SA/Oracle Certified
Associate/Oracle Solaris 11 System Administrator Certification/Linux
Professional Institute LPIC-1 is an advantage.
Withstand high work pressure, accept to monitor, track systems and
handle incidents according to job requirements. Have a serious, careful
and enthusiastic working attitude.
Ability to research and learn technology quickly.
Good communication and problem-solving skills.
Ability to work in a team or independently.
- >-
Professional qualifications:
Experience: At least 1 year of participating in projects related to
Machine Learning and image processing.
Programming languages: Proficiency in Python and using frameworks such
as TensorFlow, PyTorch, Keras.
Programming skills: C/C++ skills are an advantage.
CAD experience: Experience in AI for CAD (AutoCAD, SolidWorks,...) and
reading CAD drawings in Python/C++ is an advantage.
CAD knowledge: Knowledge of 2D to 3D, 3D CAD and Gaussian Splatting is
an advantage.
Awards: Priority is given to those who have IT/mathematical awards or
scientific papers in conferences on Image Processing, Image Recognition.
Language: Good English if the candidate is capable and can work
full-time.
- source_sentence: >-
PROFESSIONAL EXPERIENCE
Giggling Platypus Co.
Software Engineer
01 Jun 2052 - Present
Designed and implemented a new microservice architecture using Borcelle,
improving application performance by 20%.
Developed and maintained critical features for the company's core product,
resulting in a 15% increase in user engagement.
Lead the development of a new automated testing framework using Rimberio,
reducing manual testing time by 30%.
Developed and implemented a real-time data processing pipeline using
Rimberio to handle a 10x increase in data volume.
Data Thynk Unlimited
Software Engineer
01 Aug 2050 - 30 Apr 2052
Developed custom software solutions to improve application performance.
Worked closely with cross-functional teams to ensure requirements and
quality standards were met.
EDUCATION
Fauget University
Bachelor of Science in Information Technology
June 2050
SKILLS & COMPETENCIES
Programming Languages: Rimberio, Borcelle, Java, Python
Software Development: System Design, Automated Testing, CI/CD
Analytical and Problem Solving Skills: Requirements Analysis, Performance
Optimization
Communication and Collaboration: Teamwork, Agile Project Management
Adaptability and Continuous Learning: Quickly Assimilate New Technologies
CERTIFICATIONS
Cloud Certified Rimberio | 2050
Software Lifecycle Professional Borcelle | 2050
Scrum Developer Borcelle | 2050
sentences:
- >-
Skills and expertise:
Education:
University graduate majoring in Information Technology or related field.
Experience:
Minimum 2 years of front-end development experience using the following
technologies:
Proficient in front-end development with JavaScript.
Experience developing applications for Windows operating systems.
Passionate about developing applications for Windows platforms.
- >-
Required Skills:
Experience:
Minimum 3 years of experience working with .NET or .NET Core.
Proficient in back-end technologies: ASP.NET Core, EF6 Code First,
Identity Server 4, RESTful API, SQL Server 2016, C#, Unit Test.
Good knowledge of ASP.NET MVC frameworks (3,4,5,6), JavaScript, jQuery,
JSON, Web API and web application security.
Additional Skills:
Experience with NodeJS or WPF, WinForms is an advantage.
Preferably with Microsoft MCP/MCSD certification in web applications or
Azure solutions.
Understanding of Agile/Scrum development methodology.
Personal Skills:
Honest and confident when working directly with customers.
Passionate about developing innovative products and willing to learn new
technologies from Microsoft Tech Stack and Azure cloud services.
Ability to analyze and translate business needs into system design and
technical solutions.
- >-
Qualifications:
Experience:
Minimum 2-3 years of experience working with Fullstack technologies,
including:
Frontend: Vue.js (TypeScript), React Native or Flutter.
Backend: Node.js, C# .NET Core.
Database: MySQL, SQL Server.
Architecture: Microservices.
Advantages:
Knowledge of Kubernetes (K8s) and ArgoCD.
Experience working with CI/CD and DevOps systems is a big plus.
Good problem-solving skills, ability to work independently and
collaborate effectively in a team.
Logical thinking and willingness to learn new technologies.
- source_sentence: >-
EDUCATION Bachelor of Technology Really Great University Really Great
Company 2016 - Present Responsible for database administration and website
design.
Developed the logic for a streamlined ad-serving platform that scales
effectively for educational institutions and online classroom management.
SKILLS Web Design Design Thinking Wireframe Creation Front-End Coding
Back-End Technology Problem Solving Computer Literacy Project Management
Tools Strong Communication Skills
sentences:
- >-
LG CNS Vietnam
Skill Requirements
University graduate majoring in IT.
Good English communication skills.
At least 1 year of experience with C#, .NET framework.
Experience with SQL and DB Function/Procedure.
Experience with complex SQL optimization (PL/SQL preferred).
Experience developing Zebra print (will be trained if joining the
company).
Experience with C++, Unix/Linux (will be trained if joining the
company).
Ability to read/write Korean is an advantage.
Education Requirements
University graduate majoring in related field.
Language Requirements
Good English communication skills.
Ability to read/write Korean is an advantage.
- |-
CA Advance
Skills & Expertise:
General Requirements:
Education:
College degree or higher.
Development Experience:
2+ years of experience developing Web systems.
Source Code Management:
2+ years of experience using Git, GitHub or GitLab.
Learning Ability:
Willing to learn and develop new languages.
Specific Experience:
2 years of experience in:
Using HTML, CSS and Bootstrap.
Using JavaScript.
Using Next.js, React.js and Redux.
Using Restful API.
1 year of experience in:
Developing on public cloud (AWS, Azure, GCP).
Developing using Agile methodology.
Experience developing team projects.
- >-
Professional Qualifications:
Education:
University degree in Information Technology, Software Engineering or
related fields.
Experience:
More than 4 years of experience as a software developer.
Skills:
Experience managing configuration management platforms such as Git,
GitHub.
Familiar with both WinForm and WebForm applications.
Understanding SQL (MSSQL).
Using tools: Confluence, Jira, GitHub, CI/CD.
Understanding of software development processes: Agile, Waterfall.
Good communication skills (preferred).
Experience with .NET both WinForm and WebForm, Node.js, Angular.
Good skills to have:
Self-management mindset.
Experience with Information Management Systems.
- source_sentence: >-
EDUCATION
BA in Management Information Systems
Duy Tan University (2021 - Expected completion 05/2025)
GPA: 3.6/4.0
TECHNICAL SKILLS
Frontend:
Languages & Frameworks: HTML, CSS, JavaScript, TypeScript
Libraries & Tools: TailwindCSS, React.js, Next.js
Backend:
Main Framework: NestJS, ExpressJS
Database & ORM: PostgreSQL, MongoDB, TypeOrm, Mongoose
Cloud Services: AWS, Elasticsearch
Messaging & Streaming: KafkaJS, WebSocket
Container & Deployment: Docker
Other Tools:
Version Control: Git, GitHub
CI/CD: Vercel
PERSONAL PROJECTS
Nestgres
GitHub Repository: Learning Project with NestJS and PostgreSQL. Integrate
AWS S3 for file storage, Docker for containerization, and Elasticsearch
for advanced search. This project helped me gain a deep understanding of
building a scalable backend system and handling big data.
Simple Todo
Live Demo: A simple to-do list application to reinforce my knowledge of
React, including component architecture, state management, and efficient
rendering.
Nestactube
GitHub Repository: A full-stack video platform combining NestJS and React,
serving video streaming from backend to frontend. Focusing on handling
large media files and ensuring smooth video playback.
MindForge
GitHub Repository - Live Demo: Clone of Notion with note creation and
editing features. Using Convex for secure data storage and Clerk for user
authentication. The project helped me develop a friendly interface and
handle complex data structures.
AWARDS & ACHIEVEMENTS
Excellent Academic Performance - 2022, 2023
Boeing Scholarship - 2022, 2023
Third Prize - Duy Tan Informatics Competition, 2023
Third Prize - Informatics Olympiad (non-specialist group), 2023
CERTIFICATIONS
Foundations of User Experience (UX) Design
sentences:
- >-
Skills and Qualifications Required:
Qualifications:
Bachelor's degree in Computer Science, Computer Networking or related
fields from Universities such as University of Science, University of
Natural Sciences, University of Information Technology, Vietnam National
University, Ho Chi Minh City, or Vietnam National University, Hanoi.
Only candidates with academic background and practical experience
directly related to Information Technology will be considered
(candidates from short-term or non-major programs will not be accepted).
Grade Point Average (GPA):
Minimum GPA: 7.0 (on a scale of 10) or 2.8 (on a scale of 4)
Technical Skills:
Strong programming skills with C/C++, along with knowledge of
object-oriented programming.
Have practical experience (1 year or more) working with C/C++.
Basic knowledge of operating systems such as Windows, Linux, and MacOS.
Understanding of network protocols and security principles. Strong team
working skills and problem solving ability.
- >-
Qualifications:
Expected Skills:
Programming Languages: Proficiency in C# and experience working with
Unity3D.
Game Development: Solid understanding of game mechanics, UI/UX, and
physics.
Cybersecurity Tools: Experience with cyber security tools with a
defensive/offensive mindset.
Performance Optimization: Solid skills in game optimization and memory
management.
Version Management Knowledge: Familiarity with Git or similar version
control systems.
Preferred Skills:
AR/VR development experience is a plus.
Knowledge of multi-player and networking concepts, along with
creativity.
Education:
Bachelor's degree in Computer Science, Game Development, or related
field, or equivalent hands-on experience.
- >-
Professional qualifications:
Experience: At least 1 year of participating in projects related to
Machine Learning and image processing.
Programming languages: Proficiency in Python and using frameworks such
as TensorFlow, PyTorch, Keras.
Programming skills: C/C++ skills are an advantage.
CAD experience: Experience in AI for CAD (AutoCAD, SolidWorks,...) and
reading CAD drawings in Python/C++ is an advantage.
CAD knowledge: Knowledge of 2D to 3D, 3D CAD and Gaussian Splatting is
an advantage.
Awards: Priority is given to those who have IT/mathematical awards or
scientific papers in conferences on Image Processing, Image Recognition.
Language: Good English if the candidate is capable and can work
full-time.
SentenceTransformer based on sentence-transformers/distiluse-base-multilingual-cased-v1
This is a sentence-transformers model finetuned from sentence-transformers/distiluse-base-multilingual-cased-v1. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/distiluse-base-multilingual-cased-v1
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 512 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
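The three modules above can be traced on toy data to see how the shapes change. This is a minimal NumPy sketch, not the library's implementation: the token vectors and the dense-layer weights are random stand-ins (assumptions for illustration), and only the pooling mode, the 768 and 512 widths, and the Tanh activation are taken from the architecture listing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the DistilBertModel output of module (0): 6 token vectors of
# width 768, where the last 2 positions are padding.
token_embeddings = rng.normal(size=(6, 768))
attention_mask = np.array([1, 1, 1, 1, 0, 0], dtype=float)

# Module (1), Pooling with pooling_mode_mean_tokens=True:
# average only the non-padding token vectors.
mask = attention_mask[:, None]
pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()

# Module (2), Dense(in_features=768, out_features=512, bias=True) + Tanh.
# The weights here are random placeholders, not the trained parameters.
W = rng.normal(scale=0.02, size=(512, 768))
b = np.zeros(512)
sentence_embedding = np.tanh(W @ pooled + b)

print(pooled.shape, sentence_embedding.shape)
```

The final Tanh keeps every component of the 512-dimensional embedding within [-1, 1].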
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'EDUCATION\nBA in Management Information Systems\nDuy Tan University (2021 - Expected completion 05/2025)\nGPA: 3.6/4.0\nTECHNICAL SKILLS\nFrontend:\n\nLanguages & Frameworks: HTML, CSS, JavaScript, TypeScript\nLibraries & Tools: TailwindCSS, React.js, Next.js\nBackend:\n\nMain Framework: NestJS, ExpressJS\nDatabase & ORM: PostgreSQL, MongoDB, TypeOrm, Mongoose\nCloud Services: AWS, Elasticsearch\n\nMessaging & Streaming: KafkaJS, WebSocket\n\nContainer & Deployment: Docker\n\nOther Tools:\n\nVersion Control: Git, GitHub\nCI/CD: Vercel\nPERSONAL PROJECTS\nNestgres\nGitHub Repository: Learning Project with NestJS and PostgreSQL. Integrate AWS S3 for file storage, Docker for containerization, and Elasticsearch for advanced search. This project helped me gain a deep understanding of building a scalable backend system and handling big data.\n\nSimple Todo\nLive Demo: A simple to-do list application to reinforce my knowledge of React, including component architecture, state management, and efficient rendering.\n\nNestactube\nGitHub Repository: A full-stack video platform combining NestJS and React, serving video streaming from backend to frontend. Focusing on handling large media files and ensuring smooth video playback.\n\nMindForge\nGitHub Repository - Live Demo: Clone of Notion with note creation and editing features. Using Convex for secure data storage and Clerk for user authentication. The project helped me develop a friendly interface and handle complex data structures.\n\nAWARDS & ACHIEVEMENTS\nExcellent Academic Performance - 2022, 2023\nBoeing Scholarship - 2022, 2023\nThird Prize - Duy Tan Informatics Competition, 2023\nThird Prize - Informatics Olympiad (non-specialist group), 2023\nCERTIFICATIONS\nFoundations of User Experience (UX) Design',
"Skills and Qualifications Required:\n\nQualifications:\n\nBachelor's degree in Computer Science, Computer Networking or related fields from Universities such as University of Science, University of Natural Sciences, University of Information Technology, Vietnam National University, Ho Chi Minh City, or Vietnam National University, Hanoi.\nOnly candidates with academic background and practical experience directly related to Information Technology will be considered (candidates from short-term or non-major programs will not be accepted).\nGrade Point Average (GPA):\n\nMinimum GPA: 7.0 (on a scale of 10) or 2.8 (on a scale of 4)\n\nTechnical Skills:\n\nStrong programming skills with C/C++, along with knowledge of object-oriented programming.\nHave practical experience (1 year or more) working with C/C++.\nBasic knowledge of operating systems such as Windows, Linux, and MacOS.\nUnderstanding of network protocols and security principles. Strong team working skills and problem solving ability.",
"Qualifications:\n\nExpected Skills:\n\nProgramming Languages: Proficiency in C# and experience working with Unity3D.\nGame Development: Solid understanding of game mechanics, UI/UX, and physics.\nCybersecurity Tools: Experience with cyber security tools with a defensive/offensive mindset.\nPerformance Optimization: Solid skills in game optimization and memory management.\nVersion Management Knowledge: Familiarity with Git or similar version control systems.\nPreferred Skills:\n\nAR/VR development experience is a plus.\nKnowledge of multi-player and networking concepts, along with creativity.\nEducation:\n\nBachelor's degree in Computer Science, Game Development, or related field, or equivalent hands-on experience.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
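Once embeddings are available, a common next step is to rank job descriptions against one resume by cosine similarity. The sketch below assumes that use case and runs on toy 3-dimensional vectors in place of real 512-dimensional model outputs; `rank_by_similarity` is a hypothetical helper, not part of the Sentence Transformers API.

```python
import numpy as np


def rank_by_similarity(query: np.ndarray, candidates: np.ndarray) -> list[tuple[int, float]]:
    """Return (candidate_index, cosine_score) pairs, best match first."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = c @ q                 # cosine similarity of each candidate to the query
    order = np.argsort(-scores)    # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]


# Toy stand-ins: one "resume" vector and three "job description" vectors.
resume = np.array([1.0, 0.0, 0.0])
jobs = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the resume
    [1.0, 1.0, 0.0],   # partially aligned
    [2.0, 0.0, 0.0],   # same direction, different length
])

ranking = rank_by_similarity(resume, jobs)
for index, score in ranking:
    print(index, round(score, 4))
```

With real data, `resume` would be the first row of `embeddings` and `jobs` the remaining rows. Vector length does not affect the ranking because both sides are normalized.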
Training Details
Training Dataset
Unnamed Dataset
- Size: 5,108 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 44 tokens, mean: 106.09 tokens, max: 128 tokens | min: 46 tokens, mean: 117.55 tokens, max: 128 tokens | min: 0.5, mean: 0.66, max: 0.74 |
- Samples:

Sample 1
sentence_0:
EDUCATION
Software Engineer (CMU)
Duy Tan University, 09/2021 - Present
GPA: 3.6/4.0
OUTSTANDING PROJECTS
Fashion Store Website
Time: 17/01 - 25/02
Role: Backend Developer
Features:
Register, log in, log out (customer rights)
Log in and manage with admin rights
Add, edit, delete products and vouchers (admin)
Display product and voucher list
Technology:
Frontend: Customer page & admin dashboard (edit from available template)
Backend: Typescript (NestJS)
Database: MongoDB (Mongoose)
GitHub Link: Fashion Store Project
SKILLS
Front-end: HTML, CSS, JavaScript, Bootstrap 5, ReactJS
Back-end: ExpressJS, NestJS (TypeScript)
Database: MySQL, MongoDB
OTHER TOOLS & TECHNOLOGIES
Tools: Git (GitHub), Docker, Postman
Other languages: C# (ASP.NET), Java (Spring Boot)
LANGUAGES
English: Good communication and reading skills
sentence_1:
Professional Qualifications:
Experience: Minimum 2 years of experience working with Java, proficient in Spring Boot.
Frontend Framework: Proficient in using one of the Frontend frameworks (such as Vue, React,...) to develop interactive user interfaces.
Database: Experience working with relational databases such as MS SQL Server, Oracle, PostgreSQL, MongoDB.
Microservice Architecture: Experience with microservice architecture and containerization technologies such as Kubernetes/Docker.
Multitasking: Ability to handle multitasking, multithreading, multiprocessing, and mechanisms such as hash tables, file processing mechanisms.
API Integration: Experience in integrating/developing RESTful, SOAP, TCP/IP APIs.
Operating System: Experience working with Enterprise Linux and Windows Server operating systems.
label: 0.65

Sample 2
sentence_0:
EDUCATION HISTORY University of Roseton Master of Science in Software Engineering Graduated: 2020 Best Thesis Awardee Berou Solutions Scholarship Recipient De Loureigh University Bachelor of Science in Computer Science Graduated: 2016 (Cum Laude) Founder of DLU Programming Club Hackathon Champion Beechtown 2015 RELEVANT SKILLS Programming Languages: JavaScript, C/C++, Java, Python, Kotlin, Go Core Skills: Problem Solving, Team Communication EDUCATION Bachelor of Science in Computer Science Rutgers University - New Brunswick, NJ 2008 - 2012 SKILLS HTML CSS JavaScript React jQuery Angular.js Vue.js Enzyme Jest Git
sentence_1:
ABBANK
Skill requirements:
Minimum 5 years of experience as a BA in CLIMS system.
Experience in lending activities.
Experience in developing test scenarios and test plans for software projects.
Understanding of software development processes.
Ensure compliance with application development standards and quality processes.
Ability to analyze and resolve complex technical issues, provide feasible and timely solutions.
Effective communication and teamwork skills when collaborating with different functional groups.
Focus on Agile implementation and willingness to learn and grow from experience.
Education requirements:
University degree in IT, banking, finance or related fields.
Language requirements:
Being able to communicate in English is an advantage.
label: 0.62

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
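With MSELoss as the loss function, the training objective is the mean squared error between the cosine similarity of each embedding pair and its float label. The NumPy sketch below illustrates that computation under those assumptions; `cosine_similarity_mse_loss` is a hypothetical name, and the 2-dimensional embeddings are toy values (only the labels 0.65 and 0.62 come from the samples above).

```python
import numpy as np


def cosine_similarity_mse_loss(emb_a: np.ndarray, emb_b: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between pairwise cosine similarity and target labels."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = np.sum(a * b, axis=1)    # one cosine score per (sentence_0, sentence_1) pair
    return float(np.mean((cos - labels) ** 2))


# Toy embeddings for two pairs; real ones would be 512-dimensional.
emb_a = np.array([[1.0, 0.0], [1.0, 1.0]])
emb_b = np.array([[1.0, 0.0], [1.0, 0.0]])
labels = np.array([0.65, 0.62])

loss = cosine_similarity_mse_loss(emb_a, emb_b, labels)
print(round(loss, 4))
```

Because the labels in this dataset fall in a narrow band (0.5 to 0.74), the loss can become small once the model's predicted similarities land in that band, so a low training loss alone does not show that pairs are ranked well.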
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
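The settings lr_scheduler_type: linear, learning_rate: 5e-05, and warmup_steps: 0 describe a learning rate that decays linearly from its starting value to zero over the run. The sketch below shows that shape with a hypothetical helper, `linear_lr`. The total of 960 optimizer steps is an assumption, derived from 5,108 samples at batch size 16 for 3 epochs on a single device.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


TOTAL_STEPS = 960  # assumed: ceil(5108 / 16) = 320 steps per epoch, times 3 epochs

for step in (0, 480, 500, 960):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under this assumption the learning rate at the logged step 500 is already below half of its initial value.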
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 1.5625 | 500 | 0.0023 |
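The fractional epoch in the log row follows from the dataset size and batch size reported above. The short calculation below reproduces it, assuming a single training device; gradient_accumulation_steps: 1 and dataloader_drop_last: False are taken from the hyperparameter list.

```python
import math

num_samples = 5108   # training samples
batch_size = 16      # per_device_train_batch_size
num_epochs = 3       # num_train_epochs

# The last, partial batch still counts as a step because dataloader_drop_last is False.
steps_per_epoch = math.ceil(num_samples / batch_size)   # 320
total_steps = steps_per_epoch * num_epochs               # 960
epoch_at_step_500 = 500 / steps_per_epoch                # 1.5625

print(steps_per_epoch, total_steps, epoch_at_step_500)
```

The single logged row therefore falls a little past the midpoint of the 3-epoch run, so the table does not report the loss at the end of training.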
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.44.2
- PyTorch: 2.5.0+cu121
- Accelerate: 0.34.2
- Datasets: 3.1.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}