---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:814
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
- source_sentence: data modeling, predictive analytics, technical writing
  sentences:
  - 'experience in data engineeringStrong understanding of Datawarehousing conceptsProficient
    in Python for building UDFs and pre-processing scriptsProficient in sourcing data
    from APIs and cloud storage systemsProficient in SQL with analytical thought processExperience
    working on Airflow orchestrationMust have experience working on any of the cloud
    platforms - AWS would be preferredExperience with CI/CD tools in a python tech
    stackExperience working on Snowflake Datawarehouse would be nice to haveCompetent
    working in secured internal network environmentsExperience working in story and
    task-tracking tools for agile workflowsMotivated and Self-Starting: able to think
    critically about problems, decipher user preferences versus hard requirements,
    and effectively use online and onsite resources to find an appropriate solution
    with little interventionPassionate about writing clear, maintainable code that
    will be used and modified by others, and able to use and modify other developers’
    work rather than recreate itBachelor’s Degree in related field'
  - 'requirements and deliver innovative solutionsPerform data cleaning, preprocessing,
    and feature engineering to improve model performanceOptimize and fine-tune machine
    learning models for scalability and efficiencyEvaluate and improve existing ML
    algorithms, frameworks, and toolkitsStay up-to-date with the latest trends and
    advancements in the field of machine learning

    RequirementsBachelor''s degree in Computer Science, Engineering, or a related
    fieldStrong knowledge of machine learning algorithms and data modeling techniquesProficiency
    in Python and its associated libraries such as TensorFlow, PyTorch, or scikit-learnExperience
    with big data technologies such as Hadoop, Spark, or Apache KafkaFamiliarity with
    cloud computing platforms such as AWS or Google CloudExcellent problem-solving
    and analytical skillsStrong communication and collaboration abilitiesAbility to
    work effectively in a fast-paced and dynamic environment'
  - "Qualifications\n\n3 to 5 years of experience in exploratory data analysisStatistics\
    \ Programming, data modeling, simulation, and mathematics Hands on working experience\
    \ with Python, SQL, R, Hadoop, SAS, SPSS, Scala, AWSModel lifecycle executionTechnical\
    \ writingData storytelling and technical presentation skillsResearch SkillsInterpersonal\
    \ SkillsModel DevelopmentCommunicationCritical ThinkingCollaborate and Build RelationshipsInitiative\
    \ with sound judgementTechnical (Big Data Analysis, Coding, Project Management,\
    \ Technical Writing, etc.)Problem Solving (Responds as problems and issues are\
    \ identified)Bachelor's Degree in Data Science, Statistics, Mathematics, Computers\
    \ Science, Engineering, or degrees in similar quantitative fields\n\n\nDesired\
    \ Qualification(s)\n\nMaster's Degree in Data Science, Statistics, Mathematics,\
    \ Computer Science, or Engineering\n\n\nHours: Monday - Friday, 8:00AM - 4:30PM\n\
    \nLocations: 820 Follin Lane, Vienna, VA 22180 | 5510 Heritage Oaks Drive, Pensacola,\
    \ FL 32526\n\nAbout Us\n\nYou have goals, dreams, hobbies, and things you're passionate\
    \ about—what's important to you is important to us. We're looking for people who\
    \ not only want to do meaningful, challenging work, keep their skills sharp and\
    \ move ahead, but who also take time for the things that matter to them—friends,\
    \ family, and passions. And we're looking for team members who are passionate\
    \ about our mission—making a difference in military members' and their families'\
    \ lives. Together, we can make it happen. Don't take our word for it:\n\n Military\
    \ Times 2022 Best for Vets Employers WayUp Top 100 Internship Programs Forbes®\
    \ 2022 The Best Employers for New Grads Fortune Best Workplaces for Women Fortune\
    \ 100 Best Companies to Work For® Computerworld® Best Places to Work in IT Ripplematch\
    \ Campus Forward Award - Excellence in Early Career Hiring Fortune Best Place\
    \ to Work for Financial and Insurance Services\n\n\n\n\nDisclaimers: Navy Federal\
    \ reserves the right to fill this role at a higher/lower grade level based on\
    \ business need. An assessment may be required to compete for this position. Job\
    \ postings are subject to close early or extend out longer than the anticipated\
    \ closing date at the hiring team’s discretion based on qualified applicant volume.\
    \ Navy Federal Credit Union assesses market data to establish salary ranges that\
    \ enable us to remain competitive. You are paid within the salary range, based\
    \ on your experience, location and market position\n\nBank Secrecy Act: Remains\
    \ cognizant of and adheres to Navy Federal policies and procedures, and regulations\
    \ pertaining to the Bank Secrecy Act."
- source_sentence: Foreign Exchange analytics, cross-border payments expertise, financial
    services reporting
  sentences:
  - 'requirements gathering to recommend SAP solutions that drive data-driven decision-making
    and operational efficiency.


    Client Engagement And Advisory


    Build and maintain robust client relationships, serving as a trusted advisor on
    SAP Analytics capabilities and industry best practices.Address client challenges
    by aligning SAP Analytics solutions with their strategic goals, enhancing their
    analytical capabilities and reporting functions.


    Project Leadership And Management


    Oversee SAP Analytics implementation projects, ensuring timely delivery within
    scope and budget.Lead and inspire cross-functional teams, promoting collaboration
    and innovation to meet and exceed project objectives.


    Risk Management And Quality Assurance


    Proactively identify and address potential project risks, developing strategies
    to mitigate them and ensure project success.Uphold the highest standards of quality
    for all project deliverables, ensuring they meet Argano’s expectations and client
    requirements.


    Change Management And Training


    Facilitate effective change management processes associated with the implementation
    of SAP Analytics solutions, minimizing business disruption.Design and conduct
    comprehensive training sessions to empower clients with the knowledge and skills
    to leverage SAP Analytics solutions fully.


    Thought Leadership And Innovation


    Maintain up-to-date knowledge of the latest SAP Analytics developments, trends,
    and best practices, positioning Argano as a thought leader in the field.Foster
    a culture of continuous improvement by sharing insights and best practices with
    clients and internal teams.


    Minimum And/or Preferred Qualifications


    Education: Bachelor''s or master''s degree in Business Administration, Computer
    Science, Information Systems, Engineering, or a related field.Experience: Minimum
    of 5+ years in SAP consulting, with extensive experience in SAP Analytics Suite
    (which includes native SAP products, Google, Azure, AWS, and other cloud vendor
    products for SAP customers), SAP Analytics Cloud (SAC), SAP Datasphere/Data Warehousing
    Cloud, SAP Embedded Modeling.Certifications: SAP certifications in Analytics,
    SAC, Datasphere/DWC, or related areas are highly regarded.Skills:Profound expertise
    in SAP Analytics, SAP Analytics Suite (which includes native SAP products, Google,
    Azure, AWS, and other cloud vendor products for SAP customers), SAP Analytics
    Cloud (SAC), SAP Datasphere/Data Warehousing Cloud, SAP Embedded Modeling.Exceptional
    project management and leadership skills, capable of guiding teams through complex
    implementations.Excellent client engagement and communication skills, adept at
    establishing trust and acting as a strategic advisor.Strong capabilities in risk
    management, quality assurance, and change management.Travel required depending
    on the project.

    This position offers a unique chance to make a significant impact on our clients''
    success and to contribute to the growth and prestige of Argano as a global leader
    in digital consultancy. If you are a seasoned expert in SAP Data & Analytics with
    a passion for digital transformation and a proven track record of delivering results,
    we invite you to join our dynamic team.


    About Us


    Argano is the first of its kind: a digital consultancy totally immersed in high-performance
    operations. We steward enterprises through ever-evolving markets, empowering them
    with transformative strategies and technologies to exceed customer expectations,
    unlock commercial innovation, and drive optimal efficiency and growth.


    Argano is an equal-opportunity employer. All applicants will be considered for
    employment without regard to race, color, religion, sex, sexual orientation, gender
    identity, national origin, veteran status, or disability status.'
  - 'experience in the industries we serve, and to partner with diverse teams of passionate,
    enterprising SVBers, dedicated to an inclusive approach to helping them grow and
    succeed at every stage of their business.


    Join us at SVB and be part of bringing our clients'' world-changing ideas to life.
    At SVB, we have the opportunity to grow and collectively make an impact by supporting
    the innovative clients and communities SVB serves. We pride ourselves in having
    both a diverse client roster and an equally diverse and inclusive organization.
    And we work diligently to encourage all with different ways of thinking, different
    ways of working, and especially those traditionally underrepresented in technology
    and financial services, to apply.


    Responsibilities


    SVB’s Foreign Exchange business is one of the largest FX providers to the Innovation
    economy. We support the transactional and risk management needs of our fast-growing
    clients as they expand and do business internationally.


    Located close to one of our Hubs in SF, NYC or Raleigh and reporting to the Managing
    Director of FX Strategy, this Business Data Analyst will be an integral part of
    the Product Strategy and Business Management team, supporting and driving the
    insights that will be used to formulate, drive and validate our strategic and
    business effectiveness.





    You will take part in complex, multi-disciplinary projects to further enable the
    Product, Trading and Sales teams. You will be a fast learner who is comfortable
    in the weeds with analytics and data manipulation whilst developing the story
    for leadership.


    This role would be a great fit for a creative, curious and energetic individual
    and offers the right candidate the opportunity to grow while creating significant
    business value by continuously improving business intelligence/reporting, processes,
    procedures, and workflow.


    The ideal candidate will have 3-5 yrs experience in Financial Services or Fintech,
    preferably with FX, Trading or Cross Border Payment experience.





    requirements.Become familiar with the evolving FX, Fintech and Banking landscape
    to overlay industry insights.Drive continued evolution of our business analytics/data
    framework in order to inform MI and product evaluation.Assist with maintenance
    and accuracy of company data within SVB’s data repositories.


    Qualifications


    Basic Requirements:


    BS/BA Degree – preferably in a quantitative discipline (e.g., Economics, Mathematics,
    Statistics) or a HS Diploma or GED with equivalent work experience3-5 years’ experience
    in financial services or fintech, ideally within FX or Cross Border Payments


    Preferred Requirements:


    Strong attention to detail with an eye for data governance and compliance


    Aptitude for framing business questions in analytic terms and translating requirements
    into useful datasets and analyses with actionable insights.'
  - 'experience, and job responsibilities, and does not encompass additional non-standard
    compensation (e.g., benefits, paid time off, per diem, etc.). Job Description:Work
    with Material Master product team to gather requirements, collect data, lead cleansing
    efforts and load/support data loads into SAP.Will need to bridge the gap between
    business and IT teams to document and set expectations of work/deliverables.Create
    and maintain trackers that show progress and hurdles to PM’s and stakeholders.Assist
    in go live of site including, collecting, cleansing and loading data into SAP
    system.Middleman between IT and business stakeholderAble to communicate data models.Knowledge
    in SAP and MDG is preferred.Years of experience: 2+ in data analytics spaceStrong
    communication skills are a must.Will be working on multiple high priority, high
    paced projects where attention to detail and organization is required.Intermediate
    to Senior position – great opportunity to learn an in-demand area of SAP MDG.Strong
    willingness to learn – no ceiling on learning and growth potential and plenty
    of work to go around. About BCforward:Founded in 1998 on the idea that industry
    leaders needed a professional service, and workforce management expert, to fuel
    the development and execution of core business and technology strategies, BCforward
    is a Black-owned firm providing unique solutions supporting value capture and
    digital product delivery needs for organizations around the world. Headquartered
    in Indianapolis, IN with an Offshore Development Center in Hyderabad, India, BCforward’s
    6,000 consultants support more than 225 clients globally.BCforward champions the
    power of human potential to help companies transform, accelerate, and scale. Guided
    by our core values of People-Centric, Optimism, Excellence, Diversity, and Accountability,
    our professionals have helped our clients achieve their strategic goals for more
    than 25 years. Our strong culture and clear values have enabled BCforward to become
    a market leader and best in class place to work.BCforward is'
- source_sentence: data modeling, statistical analysis, data visualization tools
  sentences:
  - "skills to translate the complexity of your work into tangible business goals\
    \ \n\nThe Ideal Candidate is\n\n Customer first. You love the process of analyzing\
    \ and creating, but also share our passion to do the right thing. You know at\
    \ the end of the day it’s about making the right decision for our customers. \
    \ Innovative. You continually research and evaluate emerging technologies. You\
    \ stay current on published state-of-the-art methods, technologies, and applications\
    \ and seek out opportunities to apply them.  Creative. You thrive on bringing\
    \ definition to big, undefined problems. You love asking questions and pushing\
    \ hard to find answers. You’re not afraid to share a new idea.  A leader. You\
    \ challenge conventional thinking and work with stakeholders to identify and improve\
    \ the status quo. You’re passionate about talent development for your own team\
    \ and beyond.  Technical. You’re comfortable with open-source languages and are\
    \ passionate about developing further. You have hands-on experience developing\
    \ data science solutions using open-source tools and cloud computing platforms.\
    \  Statistically-minded. You’ve built models, validated them, and backtested them.\
    \ You know how to interpret a confusion matrix or a ROC curve. You have experience\
    \ with clustering, classification, sentiment analysis, time series, and deep learning.\
    \  A data guru. “Big data” doesn’t faze you. You have the skills to retrieve,\
    \ combine, and analyze data from a variety of sources and structures. You know\
    \ understanding the data is often the key to great data science. \n\nBasic Qualifications:\n\
    \n Currently has, or is in the process of obtaining a Bachelor’s Degree plus 2\
    \ years of experience in data analytics, or currently has, or is in the process\
    \ of obtaining Master’s Degree, or currently has, or is in the process of obtaining\
    \ PhD, with an expectation that required degree will be obtained on or before\
    \ the scheduled start dat  At least 1 year of experience in open source programming\
    \ languages for large scale data analysis  At least 1 year of experience with\
    \ machine learning  At least 1 year of experience with relational databases \n\
    \nPreferred Qualifications:\n\n Master’s Degree in “STEM” field (Science, Technology,\
    \ Engineering, or Mathematics) plus 3 years of experience in data analytics, or\
    \ PhD in “STEM” field (Science, Technology, Engineering, or Mathematics)  At least\
    \ 1 year of experience working with AWS  At least 2 years’ experience in Python,\
    \ PyTorch, Scala, or R  At least 2 years’ experience with machine learning  At\
    \ least 2 years’ experience with SQL  At least 2 years' experience working with\
    \ natural language processing \n\nCapital One will consider sponsoring a new qualified\
    \ applicant for employment authorization for this position.\n\nThe minimum and\
    \ maximum full-time annual salaries for this role are listed below, by location.\
    \ Please note that this salary information is solely for candidates hired to perform\
    \ work within one of these locations, and refers to the amount Capital One is\
    \ willing to pay at the time of this posting. Salaries for part-time roles will\
    \ be prorated based upon the agreed upon number of hours to be regularly worked.\n\
    \nNew York City (Hybrid On-Site): $138,500 - $158,100 for Data Science Masters\n\
    \nSan Francisco, California (Hybrid On-site): $146,700 - $167,500 for Data Science\
    \ Masters\n\nCandidates hired to work in other locations will be subject to the\
    \ pay range associated with that location, and the actual annualized salary amount\
    \ offered to any candidate at the time of hire will be reflected solely in the\
    \ candidate’s offer letter.\n\nThis role is also eligible to earn performance\
    \ based incentive compensation, which may include cash bonus(es) and/or long term\
    \ incentives (LTI). Incentives could be discretionary or non discretionary depending\
    \ on the plan.\n\nCapital One offers a comprehensive, competitive, and inclusive\
    \ set of health, financial and other benefits that support your total well-being.\
    \ Learn more at the Capital One Careers website . Eligibility varies based on\
    \ full or part-time status, exempt or non-exempt status, and management level.\n\
    \nThis role is expected to accept applications for a minimum of 5 business days.No\
    \ agencies please. Capital One is \n\nIf you have visited our website in search\
    \ of information on employment opportunities or to apply for a position, and you\
    \ require an accommodation, please contact Capital One Recruiting at 1-800-304-9102\
    \ or via email at RecruitingAccommodation@capitalone.com . All information you\
    \ provide will be kept confidential and will be used only to the extent required\
    \ to provide needed reasonable accommodations.\n\nFor technical support or questions\
    \ about Capital One's recruiting process, please send an email to Careers@capitalone.com\n\
    \nCapital One does not provide, endorse nor guarantee and is not liable for third-party\
    \ products, services, educational tools or other information available through\
    \ this site.\n\nCapital One Financial is made up of several different entities.\
    \ Please note that any position posted in Canada is for Capital One Canada, any\
    \ position posted in the United Kingdom is for Capital One Europe and any position\
    \ posted in the Philippines is for Capital One Philippines Service Corp. (COPSSC)."
  - "experienced team that caters to niche skills demands for customers across various\
    \ technologies and verticals.\n Role Description\n This is a full-time on-site\
    \ role for a Data Engineer at Computer Data Concepts, Inc. The Data Engineer will\
    \ be responsible for day-to-day tasks related to data engineering, data modeling,\
    \ ETL (Extract Transform Load), data warehousing, and data analytics. The role\
    \ requires expertise in handling and manipulating large datasets, designing and\
    \ maintaining databases, and implementing efficient data processing systems.\n\
    \ Qualifications\n Data Engineering skillsData Modeling skillsETL (Extract Transform\
    \ Load) skillsData Warehousing skillsData Analytics skillsStrong analytical and\
    \ problem-solving abilitiesProficiency in programming languages such as Python\
    \ or SQLExperience with cloud-based data platforms like AWS or AzureKnowledge\
    \ of data visualization tools like Tableau or PowerBIExcellent communication and\
    \ teamwork skillsBachelor's degree in Computer Science, Data Science, or a related\
    \ fieldRelevant certifications in data engineering or related areas"
  - "requirements.\n\n Qualifications\n \nStrong analytical skills, with experience\
    \ in data analysis and statistical techniquesProficiency in data modeling and\
    \ data visualization toolsExcellent communication skills, with the ability to\
    \ effectively convey insights to stakeholdersExperience in business analysis and\
    \ requirements analysisProject management skillsDatabase administration knowledgeBackground\
    \ in Data Analytics and StatisticsExperience with Big Data technologies like Hadoop"
- source_sentence: ETL development, data modelling, DBT framework
  sentences:
  - "Qualifications\n Strong knowledge in Pattern Recognition and Neural NetworksProficiency\
    \ in Computer Science and StatisticsExperience with Algorithms and Data StructuresHands-on\
    \ experience in machine learning frameworks and librariesFamiliarity with cloud\
    \ platforms and big data technologiesExcellent problem-solving and analytical\
    \ skillsStrong programming skills in languages such as Python or RGood communication\
    \ and collaboration skillsMaster's or PhD in Computer Science, Data Science, or\
    \ a related field"
  - "skills as well as strong leadership qualities.\n\nThis position is eligible for\
    \ the TalentQuest employee referral program. If an employee referred you for this\
    \ job, please apply using the system-generated link that was sent to you.\n\n\
    Responsibilities\n\nDesign, develop, and evaluate large and complex predictive\
    \ models and advanced algorithms Test hypotheses/models, analyze, and interpret\
    \ resultsDevelop actionable insights and recommendationsDevelop and code complex\
    \ software programs, algorithms, and automated processesUse evaluation, judgment,\
    \ and interpretation to select right course of actionWork on problems of diverse\
    \ scope where analysis of information requires evaluation of identifiable factorsProduce\
    \ innovative solutions driven by exploratory data analysis from complex and high-dimensional\
    \ datasetsTransform data into charts, tables, or format that aids effective decision\
    \ makingUtilize effective written and verbal communication to document analyses\
    \ and present findings analyses to a diverse audience of stakeholders Develop\
    \ and maintain strong working relationships with team members, subject matter\
    \ experts, and leadersLead moderate to large projects and initiativesModel best\
    \ practices and ethical AIWorks with senior management on complex issuesAssist\
    \ with the development and enhancement practices, procedures, and instructionsServe\
    \ as technical resource for other team membersMentor lower levels\n\n\nQualifications\n\
    \n6+ years of experience with requisite competenciesFamiliar with analytical frameworks\
    \ used to support the pricing of lending productsFamiliar with analytical models/analysis\
    \ used to support credit card underwriting and account management underwriting\
    \ policiesFamiliar using GitHub for documentation and code collaboration purposesComplete\
    \ knowledge and full understanding of specializationStatistics, machine learning\
    \ , data mining, data auditing, aggregation, reconciliation, and visualizationProgramming,\
    \ data modeling, simulation, and advanced mathematics SQL, R, Python, Hadoop,\
    \ SAS, SPSS, Scala, AWSModel lifecycle executionTechnical writingData storytelling\
    \ and technical presentation skillsResearch SkillsInterpersonal SkillsAdvanced\
    \ knowledge of procedures, instructions and validation techniquesModel DevelopmentCommunicationCritical\
    \ ThinkingCollaborate and Build RelationshipsInitiative with sound judgementTechnical\
    \ (Big Data Analysis, Coding, Project Management, Technical Writing, etc.)Independent\
    \ JudgmentProblem Solving (Identifies the constraints and risks)Bachelor's Degree\
    \ in Data Science, Statistics, Mathematics, Computers Science, Engineering, or\
    \ degrees in similar quantitative fields\n\n\nDesired Qualification(s)\n\nMaster's/PhD\
    \ Degree in Data Science, Statistics, Mathematics, Computers Science, or Engineering\n\
    \n\nHours: Monday - Friday, 8:00AM - 4:30PM\n\nLocation: 820 Follin Lane, Vienna,\
    \ VA 22180\n\nAbout Us\n\nYou have goals, dreams, hobbies, and things you're passionate\
    \ about—what's important to you is important to us. We're looking for people who\
    \ not only want to do meaningful, challenging work, keep their skills sharp and\
    \ move ahead, but who also take time for the things that matter to them—friends,\
    \ family, and passions. And we're looking for team members who are passionate\
    \ about our mission—making a difference in military members' and their families'\
    \ lives. Together, we can make it happen. Don't take our word for it:\n\n Military\
    \ Times 2022 Best for Vets Employers WayUp Top 100 Internship Programs Forbes®\
    \ 2022 The Best Employers for New Grads Fortune Best Workplaces for Women Fortune\
    \ 100 Best Companies to Work For® Computerworld® Best Places to Work in IT Ripplematch\
    \ Campus Forward Award - Excellence in Early Career Hiring Fortune Best Place\
    \ to Work for Financial and Insurance Services\n\n\n\n\nDisclaimers: Navy Federal\
    \ reserves the right to fill this role at a higher/lower grade level based on\
    \ business need. An assessment may be required to compete for this position. Job\
    \ postings are subject to close early or extend out longer than the anticipated\
    \ closing date at the hiring team’s discretion based on qualified applicant volume.\
    \ Navy Federal Credit Union assesses market data to establish salary ranges that\
    \ enable us to remain competitive. You are paid within the salary range, based\
    \ on your experience, location and market position\n\nBank Secrecy Act: Remains\
    \ cognizant of and adheres to Navy Federal policies and procedures, and regulations\
    \ pertaining to the Bank Secrecy Act."
  - 'requirements and data mapping documents into a technical design.Develop, enhance,
    and maintain code following best practices and standards.Execute unit test plans
    and support regression/system testing.Debug and troubleshoot issues found during
    testing or production.Communicate project status, issues, and blockers with the
    team.Contribute to continuous improvement by identifying and addressing opportunities.

    Qualifications / Skills:Minimum of 5 years of experience in ETL/ELT development
    within a Data Warehouse.Understanding of enterprise data warehousing best practices
    and standards.Familiarity with DBT framework.Comfortable with git fundamentals
    change management.Minimum of 5 years of experience in ETL development.Minimum
    of 5 years of experience writing SQL queries.Minimum of 2 years of experience
    with Python.Minimum of 3 years of cloud experience with AWS, Azure or Google.Experience
    in P&C Insurance or Financial Services Industry preferred.Understanding of data
    warehousing best practices and standards.Experience in software engineering, including
    designing and developing systems.

    Education and/or Experience:Required knowledge & skills would typically be acquired
    through a bachelor’s degree in computer sciences or 5 or more years of related
    experience in ELT and/or Analytics Engineering'
- source_sentence: Data engineering, ETL workflows, cloud-based data solutions
  sentences:
  - "Qualifications and Skills Education: Bachelor's degree in Computer Science or\
    \ a related field. Experience: 5+ years in Software Engineering with a focus on\
    \ Data Engineering. Technical Proficiency: Expertise in Python; familiarity with\
    \ JavaScript and Java is beneficial. Proficient in SQL (Postgres, Presto/Trino\
    \ dialects), ETL workflows, and workflow orchestration systems (e.g. Airflow,\
    \ Prefect). Knowledge of modern data file formats (e.g. Parquet, Avro, ORC) and\
    \ Python data tools (e.g. pandas, Dask, Ray). Cloud and Data Solutions: Experience\
    \ in building cloud-based Data Warehouse/Data Lake solutions (AWS Athena, Redshift,\
    \ Snowflake) and familiarity with AWS cloud services and infrastructure-as-code\
    \ tools (CDK, Terraform). Communication Skills: Excellent communication and presentation\
    \ skills, fluent in English. Work Authorization: Must be authorized to work in\
    \ the US. \nWork Schedule Hybrid work schedule: Minimum 3 days per week in the\
    \ San Francisco office (M/W/Th), with the option to work remotely 2 days per week.\
    \ \nSalary Range: $165,000-$206,000 base depending on experience \nBonus: Up to\
    \ 20% annual performance bonus \nGenerous benefits package: Fully paid healthcare,\
    \ monthly reimbursements for gym, commuting, cell phone & home wifi."
  - "experience with Transformers\nNeed to be 8+ year's of work experience. \nWe need\
    \ a Data Scientist with demonstrated expertise in training and evaluating transformers\
    \ such as BERT and its derivatives.\nRequired: Proficiency with Python, pyTorch,\
    \ Linux, Docker, Kubernetes, Jupyter. Expertise in Deep Learning, Transformers,\
    \ Natural Language Processing, Large Language Models\nPreferred: Experience with\
    \ genomics data, molecular genetics. Distributed computing tools like Ray, Dask,\
    \ Spark"
  - 'Experience with LLMs and PyTorch: Extensive experience with large language models
    and proficiency in PyTorch.Expertise in Parallel Training and GPU Cluster Management:
    Strong background in parallel training methods and managing large-scale training
    jobs on GPU clusters.Analytical and Problem-Solving Skills: Ability to address
    complex challenges in model training and optimization.Leadership and Mentorship
    Capabilities: Proven leadership in guiding projects and mentoring team members.Communication
    and Collaboration Skills: Effective communication skills for conveying technical
    concepts and collaborating with cross-functional teams.Innovation and Continuous
    Learning: Passion for staying updated with the latest trends in AI and machine
    learning.


    What We Offer


    Market competitive and pay equity-focused compensation structure100% paid health
    insurance for employees with 90% coverage for dependentsAnnual lifestyle wallet
    for personal wellness, learning and development, and more!Lifetime maximum benefit
    for family forming and fertility benefitsDedicated mental health support for employees
    and eligible dependentsGenerous time away including company holidays, paid time
    off, sick time, parental leave, and more!Lively office environment with catered
    meals, fully stocked kitchens, and geo-specific commuter benefits


    Base pay for the successful applicant will depend on a variety of job-related
    factors, which may include education, training, experience, location, business
    needs, or market demands. The expected salary range for this role is based on
    the location where the work will be performed and is aligned to one of 3 compensation
    zones. This role is also eligible to participate in a Robinhood bonus plan and
    Robinhood’s equity plan. For other locations not listed, compensation can be discussed
    with your recruiter during the interview process.


    Zone 1 (Menlo Park, CA; New York, NY; Bellevue, WA; Washington, DC)


    $187,000—$220,000 USD


    Zone 2 (Denver, CO; Westlake, TX; Chicago, IL)


    $165,000—$194,000 USD


    Zone 3 (Lake Mary, FL)


    $146,000—$172,000 USD


    Click Here To Learn More About Robinhood’s Benefits.


    We’re looking for more growth-minded and collaborative people to be a part of
    our journey in democratizing finance for all. If you’re ready to give 100% in
    helping us achieve our mission—we’d love to have you apply even if you feel unsure
    about whether you meet every single requirement in this posting. At Robinhood,
    we''re looking for people invigorated by our mission, values, and drive to change
    the world, not just those who simply check off all the boxes.


    Robinhood embraces a diversity of backgrounds and experiences and provides equal
    opportunity for all applicants and employees. We are dedicated to building a company
    that represents a variety of backgrounds, perspectives, and skills. We believe
    that the more inclusive we are, the better our work (and work environment) will
    be for everyone. Additionally, Robinhood provides reasonable accommodations for
    candidates on request and respects applicants'' privacy rights. To review Robinhood''s
    Privacy Policy please review the specific policy applicable to your country.'
datasets:
- pfrenee/ai_alignment
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on sentence-transformers/all-distilroberta-v1
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: ai job validation
      type: ai-job-validation
    metrics:
    - type: cosine_accuracy
      value: 0.9801980257034302
      name: Cosine Accuracy
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: ai job test
      type: ai-job-test
    metrics:
    - type: cosine_accuracy
      value: 0.9708737730979919
      name: Cosine Accuracy
---

# SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1) on the [ai_alignment](https://huggingface.co/datasets/pfrenee/ai_alignment) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1) <!-- at revision 842eaed40bee4d61673a81c92d5689a8fed7a09f -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [ai_alignment](https://huggingface.co/datasets/pfrenee/ai_alignment)
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
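The pooling module above uses `pooling_mode_mean_tokens` (a masked average over token embeddings), and the final `Normalize()` module scales the sentence vector to unit L2 norm, so cosine similarity reduces to a dot product. A minimal NumPy sketch of those two steps, with random token embeddings standing in for the transformer output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in transformer output: 10 token embeddings of dimension 768
token_embeddings = rng.normal(size=(10, 768))
attention_mask = np.ones(10)  # 1 for real tokens, 0 for padding

# (1) Pooling (pooling_mode_mean_tokens): masked mean over tokens
masked = token_embeddings * attention_mask[:, None]
sentence_embedding = masked.sum(axis=0) / attention_mask.sum()

# (2) Normalize: scale the sentence vector to unit L2 norm
sentence_embedding /= np.linalg.norm(sentence_embedding)

print(sentence_embedding.shape)  # (768,)
```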

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pfrenee/distilroberta_ai_alignment")
# Run inference
queries = [
    "Data engineering, ETL workflows, cloud-based data solutions",
]
documents = [
    "Qualifications and Skills Education: Bachelor's degree in Computer Science or a related field. Experience: 5+ years in Software Engineering with a focus on Data Engineering. Technical Proficiency: Expertise in Python; familiarity with JavaScript and Java is beneficial. Proficient in SQL (Postgres, Presto/Trino dialects), ETL workflows, and workflow orchestration systems (e.g. Airflow, Prefect). Knowledge of modern data file formats (e.g. Parquet, Avro, ORC) and Python data tools (e.g. pandas, Dask, Ray). Cloud and Data Solutions: Experience in building cloud-based Data Warehouse/Data Lake solutions (AWS Athena, Redshift, Snowflake) and familiarity with AWS cloud services and infrastructure-as-code tools (CDK, Terraform). Communication Skills: Excellent communication and presentation skills, fluent in English. Work Authorization: Must be authorized to work in the US. \nWork Schedule Hybrid work schedule: Minimum 3 days per week in the San Francisco office (M/W/Th), with the option to work remotely 2 days per week. \nSalary Range: $165,000-$206,000 base depending on experience \nBonus: Up to 20% annual performance bonus \nGenerous benefits package: Fully paid healthcare, monthly reimbursements for gym, commuting, cell phone & home wifi.",
    "Experience with LLMs and PyTorch: Extensive experience with large language models and proficiency in PyTorch.Expertise in Parallel Training and GPU Cluster Management: Strong background in parallel training methods and managing large-scale training jobs on GPU clusters.Analytical and Problem-Solving Skills: Ability to address complex challenges in model training and optimization.Leadership and Mentorship Capabilities: Proven leadership in guiding projects and mentoring team members.Communication and Collaboration Skills: Effective communication skills for conveying technical concepts and collaborating with cross-functional teams.Innovation and Continuous Learning: Passion for staying updated with the latest trends in AI and machine learning.\n\nWhat We Offer\n\nMarket competitive and pay equity-focused compensation structure100% paid health insurance for employees with 90% coverage for dependentsAnnual lifestyle wallet for personal wellness, learning and development, and more!Lifetime maximum benefit for family forming and fertility benefitsDedicated mental health support for employees and eligible dependentsGenerous time away including company holidays, paid time off, sick time, parental leave, and more!Lively office environment with catered meals, fully stocked kitchens, and geo-specific commuter benefits\n\nBase pay for the successful applicant will depend on a variety of job-related factors, which may include education, training, experience, location, business needs, or market demands. The expected salary range for this role is based on the location where the work will be performed and is aligned to one of 3 compensation zones. This role is also eligible to participate in a Robinhood bonus plan and Robinhood’s equity plan. For other locations not listed, compensation can be discussed with your recruiter during the interview process.\n\nZone 1 (Menlo Park, CA; New York, NY; Bellevue, WA; Washington, DC)\n\n$187,000—$220,000 USD\n\nZone 2 (Denver, CO; Westlake, TX; Chicago, IL)\n\n$165,000—$194,000 USD\n\nZone 3 (Lake Mary, FL)\n\n$146,000—$172,000 USD\n\nClick Here To Learn More About Robinhood’s Benefits.\n\nWe’re looking for more growth-minded and collaborative people to be a part of our journey in democratizing finance for all. If you’re ready to give 100% in helping us achieve our mission—we’d love to have you apply even if you feel unsure about whether you meet every single requirement in this posting. At Robinhood, we're looking for people invigorated by our mission, values, and drive to change the world, not just those who simply check off all the boxes.\n\nRobinhood embraces a diversity of backgrounds and experiences and provides equal opportunity for all applicants and employees. We are dedicated to building a company that represents a variety of backgrounds, perspectives, and skills. We believe that the more inclusive we are, the better our work (and work environment) will be for everyone. Additionally, Robinhood provides reasonable accommodations for candidates on request and respects applicants' privacy rights. To review Robinhood's Privacy Policy please review the specific policy applicable to your country.",
    "experience with Transformers\nNeed to be 8+ year's of work experience. \nWe need a Data Scientist with demonstrated expertise in training and evaluating transformers such as BERT and its derivatives.\nRequired: Proficiency with Python, pyTorch, Linux, Docker, Kubernetes, Jupyter. Expertise in Deep Learning, Transformers, Natural Language Processing, Large Language Models\nPreferred: Experience with genomics data, molecular genetics. Distributed computing tools like Ray, Dask, Spark",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4493, 0.0204, 0.0266]])
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Triplet

* Datasets: `ai-job-validation` and `ai-job-test`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric              | ai-job-validation | ai-job-test |
|:--------------------|:------------------|:------------|
| **cosine_accuracy** | **0.9802**        | **0.9709**  |
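Cosine accuracy here is the fraction of (query, positive, negative) triplets for which the query embedding is more similar to the positive job description than to the negative one. A minimal NumPy sketch of that computation; the embeddings below are random stand-ins for real `model.encode(...)` outputs, constructed so the positives sit close to their anchors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in embeddings: in practice these come from model.encode(...)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))  # close to anchors
negatives = rng.normal(size=(8, 768))                  # unrelated

def cosine_sim(a, b):
    # Row-wise cosine similarity between two matrices of equal shape
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

pos_sim = cosine_sim(anchors, positives)
neg_sim = cosine_sim(anchors, negatives)

# A triplet counts as correct when the positive outranks the negative
cosine_accuracy = float((pos_sim > neg_sim).mean())
print(cosine_accuracy)  # 1.0 for these synthetic triplets
```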

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### ai_alignment

* Dataset: [ai_alignment](https://huggingface.co/datasets/pfrenee/ai_alignment) at [bb2b8ee](https://huggingface.co/datasets/pfrenee/ai_alignment/tree/bb2b8ee8b02aa81cffdd5333d4e18bbb6fc8b601)
* Size: 814 training samples
* Columns: <code>query</code>, <code>job_description_pos</code>, and <code>job_description_neg</code>
* Approximate statistics based on the first 814 samples:
  |         | query                                                                             | job_description_pos                                                                 | job_description_neg                                                                 |
  |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
  | type    | string                                                                            | string                                                                              | string                                                                              |
  | details | <ul><li>min: 8 tokens</li><li>mean: 14.97 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 349.01 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 347.16 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | query | job_description_pos | job_description_neg |
  |:------|:--------------------|:--------------------|
  | <code>Python design patterns, Snowflake data warehousing, AWS data pipeline optimization</code> | <code>Requirements:<br>- Good communication; and problem-solving abilities- Ability to work as an individual contributor; collaborating with Global team- Strong experience with Data Warehousing- OLTP, OLAP, Dimension, Facts, Data Modeling- Expertise implementing Python design patterns (Creational, Structural and Behavioral Patterns)- Expertise in Python building data application including reading, transforming; writing data sets- Strong experience in using boto3, pandas, numpy, pyarrow, Requests, Fast API, Asyncio, Aiohttp, PyTest, OAuth 2.0, multithreading, multiprocessing, snowflake python connector; Snowpark- Experience in Python building data APIs (Web/REST APIs)- Experience with Snowflake including SQL, Pipes, Stream, Tasks, Time Travel, Data Sharing, Query Optimization- Experience with Scripting language in Snowflake including SQL Stored Procs, Java Script Stored Procedures; Python UDFs- Understanding of Snowflake Internals; experience in integration with Reporting; UI applications- Stron...</code> | <code>QUALIFICATIONS Required Certifications DoD IAT Level III Certification (Must obtain within 180 days of hire). Education, Background, and Years of Experience 3-5 years of Data Analyst experience. ADDITIONAL SKILLS & QUALIFICATIONS Required Skills At least 3 years of hands-on experience with query languages, such as SQL and Kusto to facilitate robust reporting capabilities. Preferred Skills Understanding of Microsoft Power Platform. Power BI authoring, in combination with designing and integrating with data sources. Tier III, Senior Level Experience with Kusto Query Language (KQL). Tier III, Senior Level Experience with Structured Query Language (SQL). WORKING CONDITIONS Environmental Conditions Contractor site with 0%-10% travel possible. Possible off-hours work to support releases and outages. General office environment. Work is generally sedentary in nature but may require standing and walking for up to 10% of the time. The working environment is generally favorable. Lighting and temp...</code> |
  | <code>Data Science in Marketing, Customer LTV Modeling, Experimentation Frameworks</code> | <code>experience. You are comfortable with a range of statistical and ML techniques with the ability to apply them to deliver measurable business impact at Turo.<br><br>You’re someone who constantly thinks about how data can support Turo’s work across domains, actively utilizing it to work-through challenges and unlock new opportunities. You’re proficient in translating unstructured problems into tangible mathematical frameworks, and are able to bring others with you on that journey. You’re someone who enjoys working with business stakeholders to drive experimentation and foster a data-centric culture. You’re able to recognize the right tools for each problem and design solutions that scale the impact of your work. You have a passion for contributing to a best in class product and take ownership of your work from inception to implementation and beyond.<br><br>What You Will Do<br><br>Turo’s marketplace has enjoyed continued growth as a business, which has in part been achieved through significant Marketing inv...</code> | <code>requirements.Prepares and presents results of analysis along with improvements and/or recommendations to the business at all levels of management.Coordinates with global sourcing team and peers to aggregate data align reporting.Maintain data integrity of databases and make changes as required to enhance accuracy, usefulness and access.Acts as a Subject Matter Expert (SME) for key systems/processes in subject teams and day-to-day functions.Develops scenario planning tools/models (exit/maintain/grow). Prepares forecasts and analyzes trends in general business conditions.Request for Proposal (RFP) activities – inviting suppliers to participate in RFP, loading RFP into Sourcing tool, collecting RFP responses, conducting qualitative and quantitative analyses.Assists Sourcing Leads in maintaining pipeline, reports on savings targets.<br>Qualifications:Bachelors Degree is required.Minimum of 4 years of relevant procurement analyst experience.Advanced Excel skills are required.C.P.M., C.P.S.M., o...</code> |
  | <code>education workforce data analysis R Tableau</code> | <code>experience as an SME in complex enterprise-level projects, 5+ years of experience analyzing info and statistical data to prepare reports and studies for professional use, and experience working with education and workforce data.<br>If you’re interested, I'll gladly provide more details about the role and further discuss your qualifications.<br>Thanks,Stephen M HrutkaPrincipal Consultantwww.hruckus.com<br>Executive Summary: HRUCKUS is looking to hire a Data Analyst resource to provide data analysis and management support. The Data Analyst must have at least 10 years of overall experience.<br>Position Description: The role of the Data Analyst is to provide data analysis support for the Office of Education Through Employment Pathways, which is located within the Office of the Deputy Mayor for Education. This is a highly skilled position requiring familiarity with educational data and policies.<br>The position will require the resources to produce data analysis, focusing on education and workforce-relate...</code> | <code>Experience of Delta Lake, DWH, Data Integration, Cloud, Design and Data Modelling.• Proficient in developing programs in Python and SQL• Experience with Data warehouse Dimensional data modeling.• Working with event based/streaming technologies to ingest and process data.• Working with structured, semi structured and unstructured data.• Optimize Databricks jobs for performance and scalability to handle big data workloads. • Monitor and troubleshoot Databricks jobs, identify and resolve issues or bottlenecks. • Implement best practices for data management, security, and governance within the Databricks environment. Experience designing and developing Enterprise Data Warehouse solutions.• Proficient writing SQL queries and programming including stored procedures and reverse engineering existing process.• Perform code reviews to ensure fit to requirements, optimal execution patterns and adherence to established standards.<br>Qualifications:<br>• 5+ years Python coding experience.• 5+ years - SQL...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
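MultipleNegativesRankingLoss treats each query's paired positive as the correct "class" and every other document in the batch as an in-batch negative, then applies softmax cross-entropy over the scaled cosine similarities. A minimal NumPy sketch of that computation, with random unit vectors standing in for real batch embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
scale = 20.0  # matches the "scale" parameter above

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Stand-in embeddings: in practice these are model outputs for a batch
queries = normalize(rng.normal(size=(4, 768)))
positives = normalize(rng.normal(size=(4, 768)))

# cos_sim of each query against every document in the batch; with
# unit vectors this is just a matrix product, scaled by `scale`
scores = scale * queries @ positives.T

# Softmax cross-entropy where each query's own positive (the diagonal)
# is the target class -- all other documents act as in-batch negatives
log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
loss = -float(np.mean(np.diag(log_probs)))
print(loss)
```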

### Evaluation Dataset

#### ai_alignment

* Dataset: [ai_alignment](https://huggingface.co/datasets/pfrenee/ai_alignment) at [bb2b8ee](https://huggingface.co/datasets/pfrenee/ai_alignment/tree/bb2b8ee8b02aa81cffdd5333d4e18bbb6fc8b601)
* Size: 101 evaluation samples
* Columns: <code>query</code>, <code>job_description_pos</code>, and <code>job_description_neg</code>
* Approximate statistics based on the first 101 samples:
  |         | query                                                                              | job_description_pos                                                                  | job_description_neg                                                                  |
  |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                               | string                                                                               |
  | details | <ul><li>min: 10 tokens</li><li>mean: 14.79 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 61 tokens</li><li>mean: 366.96 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 27 tokens</li><li>mean: 372.63 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | query | job_description_pos | job_description_neg |
  |:-----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>Statistical programming SAS, clinical development, AAV gene therapy</code>         | <code>QUALIFICATIONS: <br><br>Education:<br><br>12 years of related experience with a Bachelor’s degree; or 8 years and a Master’s degree; or a PhD with 5 years experience; or equivalent experience<br><br>Experience:<br><br>Work experience in biotech/pharmaceutical industry or medical research for a minimum of 8 years (or 4 years for a PhD with relevant training)Experience in clinical developmentExperience in ophthalmology and/or biologic/gene therapy a plus<br><br>Skills:<br><br>Strong SAS programming skills required with proficiency in SAS/BASE, SAS Macros, SAS/Stat and ODS (proficiency in SAS/SQL, SAS/GRAPH or SAS/ACCESS is a plus)Proficiency in R programming a plusProficiency in Microsoft Office Apps, such as WORD, EXCEL, and PowerPoint (familiar with the “Chart” features in EXCEL/PowerPoint a plus)Good understanding of standards specific to clinical trials such as CDISC, SDTM, and ADaM, MedDRA, WHODRUGExperience with all clinical phases (I, II, III, and IV) is desirableExperience with BLA/IND submissions is strongly desir...</code> | <code>requirements may change at any time.<br><br>Qualifications<br><br> Qualification:<br>• BS degree in Computer Science, Computer Engineering or other relevant majors.<br>• Excellent programming, debugging, and optimization skills in general purpose programming languages<br>• Ability to think critically and to formulate solutions to problems in a clear and concise way.<br><br>Preferred Qualifications:<br>• Experience with one or more general purpose programming languages including but not limited to: Go, C/C++, Python.<br>• Good understanding in one of the following domains: ad fraud detection, risk control, quality control, adversarial engineering, and online advertising systems.<br>• Good knowledge in one of the following areas: machine learning, deep learning, backend, large-scale systems, data science, full-stack.<br><br>TikTok is 
committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workpla...</code> |
  | <code>ETL pipeline design, bulk data solutions, classified environments</code>           | <code>Skills & Experience:Must hold a TS/SCI Full Scope Polygraph clearance, and have experience working in classified environments.Professional experience with Python and a JVM language (e.g., Scala) 4+ years of experience designing and maintaining ETL pipelines Experience using Apache SparkExperience with SQL (e.g., Postgres) and NoSQL (e.g., Cassandra, ElasticSearch, etc.)databases Experience working on a cloud platform like GCP, AWS, or Azure Experience working collaboratively with git <br>Desired Skills & Experience:Understanding of Docker/Kubernetes Understanding of or interest in knowledge graphsExperienced in supporting and working with internal teams and customers in a dynamic environment Passionate about open source development and innovative technology<br>Benefits: Limitless growth and learning opportunitiesA collaborative and positive culture - your team will be as smart and driven as youA strong commitment to diversity, equity & inclusionExceedingly generous vacation leave, parental l...</code>                               | <code>experience with all aspects of the software development lifecycle, from design to deployment. Demonstrate understanding of the full life data lifecycle and the role that high-quality data plays across applications, machine learning, business analytics, and reporting. Lead and take ownership of assigned technical projects in a fast-paced environment. <br>What you need to succeed (minimum qualifications)3-5+ years of experienceFamiliar with best practices for data ingestion and data designDevelop initial queries for profiling data, validating analysis, testing assumptions, driving data quality assessment specifications, and define a path to deploymentIdentify necessary business rules for extracting data along with functional or technical risks related to data sources (e.g. 
data latency, frequency, etc.)Knowledge of working with queries/applications, including performance tuning, utilizing indexes, and materialized views to improve query performanceContinuously improve quality, efficiency, a...</code>                                        |
  | <code>Provider data analysis, healthcare compliance, business process improvement</code> | <code>requirements of health plan as it pertains to contracting, benefits, prior authorizations, fee schedules, and other business requirements.<br>•Analyze and interpret data to determine appropriate configuration changes.• Accurately interprets specific state and/or federal benefits, contracts as well as additional business requirements and converting these terms to configuration parameters.• Oversees coding, updating, and maintaining benefit plans, provider contracts, fee schedules and various system tables through the user interface.• Applies previous experience and knowledge to research and resolve claim/encounter issues, pended claims and update system(s) as necessary.• Works with fluctuating volumes of work and can prioritize work to meet deadlines and needs of user community.• Provides analytical, problem-solving foundation including definition and documentation, specifications.• Recognizes, identifies and documents changes to existing business processes and identifies new opportunities...</code>                                  | <code>experience.Required Skills: ADF pipelines, SQL, Kusto, Power BI, Cosmos (Scope Scripts). Power Bi, ADX (Kusto), ADF, ADO, Python/C#.Good to have – Azure anomaly Alerting, App Insights, Azure Functions, Azure FabricQualifications for the role 5+ years experience building and optimizing ‘big data’ data pipelines, architectures and data sets. Specific experience working with COSMOS and Scope is required for this role. Experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases is a plus. Experience with investigating and on-boarding new data sources in a big-data environment, including forming relationships with data engineers cross-functionally to permission, mine and reformat new data sets. Strong analytic skills related to working with unstructured data sets. 
A successful history of manipulating, processing and extracting value from large disconnected datasets.</code>                                                                                                          |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
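
  With these parameters, MultipleNegativesRankingLoss scores each query against every positive in the batch with scaled cosine similarity and treats the non-matching positives as in-batch negatives. A minimal sketch of that computation in plain PyTorch (the batch size and embedding dimension are illustrative, not taken from this model):

  ```python
  import torch
  import torch.nn.functional as F

  def mnrl(anchors: torch.Tensor, positives: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
      # Cosine similarity between every anchor and every positive in the batch,
      # scaled by the "scale" parameter listed above (20.0).
      a = F.normalize(anchors, dim=1)
      p = F.normalize(positives, dim=1)
      scores = a @ p.T * scale
      # The matching positive sits on the diagonal; all other columns act as
      # in-batch negatives, so the loss is cross-entropy against the diagonal.
      labels = torch.arange(scores.size(0))
      return F.cross_entropy(scores, labels)

  anchors = torch.randn(16, 384)    # illustrative: batch of 16 query embeddings
  positives = torch.randn(16, 384)  # illustrative: the paired positive embeddings
  loss = mnrl(anchors, positives)
  ```

  This is why the `no_duplicates` batch sampler below matters: a duplicate positive in the batch would be a false negative for some other anchor.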

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 1e-05
- `num_train_epochs`: 6
- `warmup_ratio`: 0.1
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 6
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch  | Step | Training Loss | Validation Loss | ai-job-validation_cosine_accuracy | ai-job-test_cosine_accuracy |
|:------:|:----:|:-------------:|:---------------:|:---------------------------------:|:---------------------------:|
| -1     | -1   | -             | -               | 0.8614                            | -                           |
| 1.9608 | 100  | 0.848         | 0.3421          | 0.9802                            | -                           |
| 3.9216 | 200  | 0.3142        | 0.3138          | 0.9802                            | -                           |
| 5.8824 | 300  | 0.1828        | 0.3009          | 0.9802                            | -                           |
| -1     | -1   | -             | -               | 0.9802                            | 0.9709                      |


### Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.1.0
- Transformers: 4.55.4
- PyTorch: 2.8.0
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.21.4

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
