---
license: mit
language:
- en
metrics:
- accuracy
- recall
base_model:
- BAAI/bge-base-en-v1.5

widget:
- source_sentence: 'Represent this sentence for searching relevant passages: RCW 36.75.190'
  sentences:
  - 'RCW 36.75.190 - Engineer''s report—Hearing—Order.

    Upon report by the examining engineer for the erection and construction upon any
    county road, or for acquisition by purchase, gift or condemnation of any bridge,
    trestle, or any other structure crossing any stream, body of water, gulch, navigable
    water, swamp or other topographical formation, which constitutes a boundary, publication
    shall be made and joint hearing had upon such report in the same manner and upon
    the same procedure as in the case of resolution or petition for the laying out
    and establishing of county roads. If upon the hearing the governing authorities
    jointly order the erection and construction or acquisition of such bridge, trestle,
    or other structure, they may jointly acquire land necessary therefor by purchase,
    gift, or condemnation in the manner as provided for acquiring land for county
    roads, and shall advertise calls for bids, require contractor''s deposit and bond,
    award contracts, and supervise construction as by law provided and in the same
    manner as required in the case of the construction of county roads. Any such bridges,
    trestles or other structures may be operated free, or may be operated as toll
    bridges, trestles, or other structures under the provisions of the laws of this
    state relating thereto.

    [ 1963 c 4 s 36.75.190 . Prior: 1937 c 187 s 29 ; RRS s 6450-29.]'
  - 'RCW 28B.30.285 - State treasurer receiving agent of certain federal aid—Trust
    funds not subject to appropriation.

    All federal grants received by the state treasurer pursuant to RCW 28B.30.270
    shall be deemed trust funds under the control of the state treasurer and not subject
    to appropriation by the legislature.

    [ 1969 ex.s. c 223 s 28B.30.285 . Prior: 1955 c 66 s 4 . Formerly RCW 28.80.224
    .]'
  - 'RCW 48.09.160 - Directors—Disqualification.

    No individual shall be a director of a domestic mutual insurer by reason of his
    or her holding public office. Adjudication as a bankrupt or taking the benefit
    of any insolvency law or making a general assignment for the benefit of creditors
    disqualifies an individual from being or acting as a director.

    [ 2009 c 549 s 7037 ; 1947 c 79 s .09.16; Rem. Supp. 1947 s 45.09.16.]'
- source_sentence: 'Represent this sentence for searching relevant passages: RCW disclosure
    suspect identity civil redress'
  sentences:
  - 'RCW 49.60.525 - Review of existing recorded covenants and deed restrictions to
    identify documents that include racial or other unlawful restrictions on property
    ownership.(Expires July 1, 2027.)

    (1) Subject to the availability of amounts appropriated for this specific purpose,
    the University of Washington and Eastern Washington University shall review existing
    recorded covenants and deed restrictions to identify those recorded documents
    that include racial or other restrictions on property ownership or use against
    protected classes that are unlawful under RCW 49.60.224 . For properties subject
    to such racial and other unlawful restrictions, the universities shall provide
    notice to the property owner and to the county auditor of the county in which
    the property is located. The universities shall provide information to the property
    owner on how such provisions can be struck pursuant to RCW 49.60.227 . The universities
    may contract with other public and private not-for-profit higher education institutions
    that are regionally accredited to carry out the review and notification requirements
    of this section. (2) This section expires July 1, 2027.

    [ 2021 c 256 s 2 .]

    Findings — Intent — 2021 c 256: "The legislature finds that the existence of racial,
    religious, or ethnic-based property restrictions or covenants on a deed or chain
    of title for real property is like having a monument to racism on that property
    and is repugnant to the tenets of equality. Furthermore, such restrictions and
    covenants may cause mental anguish and tarnish a property owner''s sense of ownership
    in the property because the owner feels as though they have participated in a
    racist act themselves. It is the intent of the legislature that the owner, occupant,
    or tenant or homeowners'' association board of the property which is subject to
    an unlawful deed restriction or covenant pursuant to RCW 49.60.224 is entitled
    to have discriminatory covenants and restrictions that are contrary to public
    policy struck from their chain of title. The legislature has presented two ways
    this can be accomplished through RCW 49.60.227 (1) (a) and (b). If the owner,
    occupant, or tenant or homeowners'' association board of the property elects to
    pursue a judicial remedy, the legislature intends that the court issue a declaratory
    judgment ordering the county auditor, or in charter counties the county official
    charged with the responsibility for recording instruments in the county records,
    to entirely strike the racist or otherwise discriminatory covenants from the chain
    of title. Striking the language does not prevent preservation of the original
    record, outside of the chain of title, for historical or archival purposes. The
    legislature finds that striking racist, religious, and ethnic restrictions or
    covenants from the chain of title is no different than having an offensive statutory
    monument which the owner may entirely remove. So too should the owner be able
    to entirely remove the offensive written monument to racism or other unconstitutional
    discrimination." [ 2021 c 256 s 1 .]

    Application — 2021 c 256: "This act applies to real estate transactions entered
    into on or after January 1, 2022." [ 2021 c 256 s 5 .]'
  - 'RCW 10.97.070 - Disclosure of suspect''s identity to victim.

    (1) Criminal justice agencies may, in their discretion, disclose to persons who
    have suffered physical loss, property damage, or injury compensable through civil
    action, the identity of persons suspected as being responsible for such loss,
    damage, or injury together with such information as the agency reasonably believes
    may be of assistance to the victim in obtaining civil redress. Such disclosure
    may be made without regard to whether the suspected offender is an adult or a
    juvenile, whether charges have or have not been filed, or a prosecuting authority
    has declined to file a charge or a charge has been dismissed. (2) Unless the agency
    determines release would interfere with an ongoing criminal investigation, in
    any action brought pursuant to this chapter, criminal justice agencies shall disclose
    identifying information, including photographs of suspects, if the acts are alleged
    by the plaintiff or victim to be a violation of RCW 9A.50.020 . (3) The disclosure
    by a criminal justice agency of investigative information pursuant to subsection
    (1) of this section shall not establish a duty to disclose any additional information
    concerning the same incident or make any subsequent disclosure of investigative
    information, except to the extent an additional disclosure is compelled by legal
    process.

    [ 1993 c 128 s 10 ; 1977 ex.s. c 314 s 7 .]

    Effective date — 1993 c 128: See RCW 9A.50.902 .'
  - 'RCW 65.16.110 - Affidavit to cover payment of fees.

    The affidavit of publication of all notices required by law to be published shall
    state the full amount of the fee charged for such publication and that the fee
    has been paid in full.

    [ 1921 c 99 s 7 ; RRS s 253-7.]'
- source_sentence: 'Represent this sentence for searching relevant passages: RCW 87.80
    form and contents of notice'
  sentences:
  - 'RCW 36.32.270 - Competitive bids—Exemptions.

    The county legislative authority may waive the competitive bidding requirements
    of this chapter pursuant to RCW 39.04.280 if an exemption contained within that
    section applies to the purchase or public work.

    [ 1998 c 278 s 4 ; 1963 c 4 s 36.32.270 . Prior: 1961 c 169 s 3 ; 1945 c 61 s
    4 ; Rem. Supp. 1945 s 10322-18.]'
  - 'RCW 87.80.060 - Form and contents of notice.

    The notice of the hearing on the petition shall state that a petition requesting
    the creation of a board of joint control to administer the facilities and activities,
    naming them if named in the petition, has been filed with the board of county
    commissioners of the county, naming the county; that the board of joint control,
    if it is created, will have authority to provide for apportionment of costs to
    carry out the objects of its creation among the member irrigation entities (naming
    them); shall state the day, hour, and place of the hearing on the petition; shall
    state that any person interested in the creation of the board of joint control
    may appear on or before the day of hearing on the petition, and show cause in
    writing, if any, why the same should not be granted, and the notice shall be over
    the name of the clerk of the board of county commissioners.

    [ 1996 c 320 s 6 ; 1949 c 56 s 6 ; Rem. Supp. 1949 s 7505-25.]'
  - 'RCW 18.88B.090 - Reinstatement of certification.

    (1) A certificate that has been expired for five years or less may be reinstated
    if the person holding the expired certificate: (a) Completes an abbreviated application
    form; (b) Pays any necessary fees, including the current certification fee, late
    renewal fees, and expired credential reissuance fees, unless exempt pursuant to
    *RCW 18.88B.091 ; (c) Provides a written declaration that no action has been taken
    by a state or federal jurisdiction or hospital which would prevent or restrict
    the person holding the expired certificate from practicing as a home care aide;
    (d) Provides a written declaration that the person holding the expired certificate
    has not voluntarily given up any credential or privilege or has not been restricted
    from practicing as a home care aide in lieu of or to avoid formal action; and
    (e) Submits to a state and federal background check as required by RCW 74.39A.056
    , if the certificate has been expired for more than one year. (2) In addition
    to meeting the requirements of subsection (1) of this section, a certificate that
    has been expired for more than five years may be reinstated if the person holding
    the expired certificate demonstrates competence to the standards established by
    the secretary and meets other requirements established by the secretary.

    [ 2023 c 424 s 3 .]

    *Reviser''s note: RCW 18.88B.091 expired July 1, 2025.'
- source_sentence: 'Represent this sentence for searching relevant passages: RCW 48.30A.055'
  sentences:
  - 'RCW 48.30A.055 - Insurance antifraud plan—Review—Disapproval—Notice—Audit to
    ensure compliance.

    If after review of an insurer''s antifraud plan, the commissioner finds that the
    plan does not comply with RCW 48.30A.050 , the commissioner may disapprove the
    antifraud plan. Notice of disapproval must include a statement of the specific
    reasons for disapproval. The insurer shall refile a plan disapproved by the commissioner
    within sixty days of the date of the notice of disapproval. The commissioner may
    audit insurers to ensure compliance with antifraud plans.

    [ 1995 c 285 s 11 .]'
  - 'RCW 18.160.090 - Surety bond—Security deposit—Venue and time limit for actions
    upon bonds—Limit of liability of surety—Payment of claims.

    (1) Before granting a license under this chapter, the director of fire protection
    shall require that the applicant file with the state director of fire protection
    a surety bond issued by a surety insurer who meets the requirements of chapter
    48.28 RCW in a form acceptable to the director of fire protection running to the
    state of Washington in the penal sum of ten thousand dollars. However, the surety
    bond for a fire protection sprinkler system contractor whose business is restricted
    solely to NFPA 13-D or NFPA 13-R systems shall be in the penal sum of six thousand
    dollars. The bond shall be conditioned that the applicant will pay all purchasers
    of fire protection sprinkler systems with whom the applicant has a contract for
    the applicant to install, inspect, maintain, or service a fire protection sprinkler
    system, and who have obtained a judgment against the applicant for the breach
    of such a contract. The term "purchaser" means an owner of property who has entered
    into a contract for the installation of a fire protection sprinkler system on
    that property, or a contractor who contracts to install, inspect, maintain, or
    service such a system with an owner of property and subcontracts the work to the
    applicant. No other person, including, but not limited to, persons who supply
    labor, materials, or rental equipment to the applicant, shall have any rights
    against the bond. (2) In lieu of the surety bond required by this section the
    applicant may file with the director of fire protection a deposit consisting of
    cash or other security acceptable to the director of fire protection in an amount
    equal to the penal sum of the required bond. The director of fire protection may
    adopt rules necessary for the proper administration of the security. (3) Before
    granting renewal of a fire protection sprinkler system contractor''s license to
    any applicant, the director of fire protection shall require that the applicant
    file with the director satisfactory evidence that the surety bond or cash deposit
    is in full force. (4) Any purchaser of a fire protection sprinkler system having
    a claim against the licensee for the breach of a contract for the licensee to
    install, inspect, maintain, or service a fire protection sprinkler system may
    bring suit upon such bond in superior court of the county in which the work was
    done or of any county in which jurisdiction of the licensee may be had. Any such
    action must be brought not later than one year after the expiration of the licensee''s
    license or renewal license then in effect at the time of the alleged breach of
    contract. (5) The bond shall be considered one continuous obligation, and the
    surety upon the bond shall not be liable in aggregate or cumulative amount exceeding
    ten thousand dollars, or six thousand dollars if the bond was issued to a licensee
    whose business is restricted solely to NFPA 13-D or NFPA 13-R systems, regardless
    of the number of years the bond is in effect, or whether it is reinstated, renewed,
    reissued, or otherwise continued, and regardless of the year in which any claim
    accrued. The bond shall not be liable for any liability of the licensee for tortious
    acts, whether or not such liability is imposed by statute or common law, or is
    imposed by contract. The bond shall not be a substitute or supplemental to any
    liability or other insurance required by law or by the contract. (6) If the surety
    desires to make payment without awaiting court action against it, the amount of
    the bond shall be reduced to the extent of any payment made by the surety in good
    faith under the bond. Any payment shall be based on final judgments received by
    the surety. (7) Claims against the bond shall be satisfied from the bond in the
    following order: (a) Claims by a purchaser of a fire protection sprinkler system
    for the breach of a contract for the licensee to install, inspect, maintain, or
    service a fire protection sprinkler system; (b) Any court costs, interest, and
    attorneys'' fees the plaintiff may be entitled to recover by contract, statute,
    or court rule. A condition precedent to the surety being liable to any claimant
    is a final judgment against the licensee, unless the surety desires to make payment
    without awaiting court action. In the event of a dispute regarding the apportionment
    of the bond proceeds among claimants, the surety may bring an action for interpleader
    against all claimants upon the bond. (8) Any purchaser of a fire protection sprinkler
    system having an unsatisfied final judgment against the licensee for the breach
    of a contract for the licensee to install, inspect, maintain, or service a fire
    protection sprinkler system may execute upon the security held by the director
    of fire protection by serving a certified copy of the unsatisfied final judgment
    by registered or certified mail upon the director within one year of the date
    of entry of such judgment. Upon the receipt of service of such certified copy
    the director shall pay or order paid from the deposit, through the registry of
    the court which rendered judgment, towards the amount of the unsatisfied judgment.
    The priority of payment by the director shall be the order of receipt by the director,
    but the director shall have no liability for payment in excess of the amount of
    the deposit.

    [ 1991 sp.s. c 6 s 1 .]'
  - 'RCW 18.100.010 - Legislative intent.

    It is the legislative intent to provide for the incorporation of an individual
    or group of individuals to render the same professional service to the public
    for which such individuals are required by law to be licensed or to obtain other
    legal authorization.

    [ 1969 c 122 s 1 .]'
- source_sentence: 'Represent this sentence for searching relevant passages: washington
    RCW nonprofit canon law'
  sentences:
  - 'RCW 43.21C.220 - Incorporation of city or town exempt from chapter.

    The incorporation of a city or town is exempted from compliance with this chapter.

    [ 1982 c 220 s 6 .]

    Severability — 1982 c 220: See note following RCW 36.93.100 .

    Incorporation proceedings exempt from chapter: RCW 36.93.170 .'
  - 'RCW 79A.05.085 - Lease of parklands for television stations—Lease rental rates,
    terms—Attachment of antennae.

    The commission shall determine the fair market value for television station leases
    based upon independent appraisals and existing leases for television stations
    shall be extended at said fair market rental for at least one period of not more
    than twenty years: PROVIDED, That the rates in said leases shall be renegotiated
    at five year intervals: PROVIDED FURTHER, That said stations shall permit the
    attachment of antennae of publicly operated broadcast and microwave stations where
    electronically practical to combine the towers: PROVIDED FURTHER, That notwithstanding
    any term to the contrary in any lease, this section shall not preclude the commission
    from prescribing new and reasonable lease terms relating to the modification,
    placement, or design of facilities operated by or for a station, and any extension
    of a lease granted under this section shall be subject to this proviso: PROVIDED
    FURTHER, That notwithstanding any other provision of law the director in his or
    her discretion may waive any requirement that any environmental impact statement
    or environmental assessment be submitted as to any lease negotiated and signed
    between January 1, 1974, and December 31, 1974.

    [ 2013 c 23 s 265 ; 1974 ex.s. c 151 s 1 . Formerly RCW 43.51.063 .]'
  - 'RCW 24.03A.050 - Subordination to canon law.

    To the extent religious doctrine or canon law governing the internal affairs of
    a nonprofit corporation is inconsistent with this chapter, the religious doctrine
    or canon law controls to the extent required by the United States Constitution,
    the state Constitution, or both.

    [ 2021 c 176 s 1110 .]

    Effective date — 2021 c 176: See note following RCW 24.03A.005 .'

pipeline_tag: sentence-similarity
library_name: sentence-transformers

tags:
- legal
- law
- WA
- sentence-transformers
- feature-extraction
- sentence-similarity
- dense
- loss:MultipleNegativesRankingLoss

model-index:
- name: washington-state-law-embedding-model-Base
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: RCW Validation
      type: rcw-validation
    metrics:
      - type: accuracy_at_10
        value: 0.8441
        name: Accuracy@10
    
      - type: precision_at_10
        value: 0.0844
        name: Precision@10
    
      - type: recall_at_10
        value: 0.8441
        name: Recall@10
      
      - type: accuracy_at_1
        value: 0.0891
        name: Accuracy@1

      - type: accuracy_at_3
        value: 0.2595
        name: Accuracy@3
    
      - type: accuracy_at_5
        value: 0.4318
        name: Accuracy@5
    
      - type: ndcg_at_10
        value: 0.3876
        name: NDCG@10
    
      - type: mrr_at_10
        value: 0.2524
        name: MRR@10
    
      - type: map_at_100
        value: 0.2595
        name: MAP@100
datasets:
- CSI-lab/RCW_2025_Positive_Query_Pairs
---

# Washington-state-law-embedding-model-Base 

**Washington-state-law-embedding-model-Base** is an embedding model fine-tuned for legal information retrieval (IR) over the statutes of the State of Washington.

Generic embedding models often underperform on legal text because of the semantic gap between natural-language questions (e.g., "What dollar amount makes a theft a first degree felony?") and formal statutory language. This model is trained to narrow that gap, so that plain-English queries, legal scenarios, and document drafts map to the corresponding Washington State statutes (Revised Code of Washington, RCW).

## Available Models

| Model | Language | Description | Query Prefix |
|:------|:---------|:------------|:-------------|
| [CSI-lab/Washington-state-law-embedding-model-Large](https://huggingface.co/CSI-lab/Washington-state-law-embedding-model-Large) | English | Fine-tuned `large` model (1024d) for WA State RCWs. Best performance. | `Represent this sentence for searching relevant passages: ` |
| [CSI-lab/Washington-state-law-embedding-model-Base](https://huggingface.co/CSI-lab/Washington-state-law-embedding-model-Base) | English | Fine-tuned `base` model (768d) for WA State RCWs. Faster inference. | `Represent this sentence for searching relevant passages: ` |
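
As a minimal sketch of how the query prefix in the table might be applied, the helper below prepends it to queries only. The names `QUERY_PREFIX` and `with_query_prefix` are hypothetical and not part of any library; leaving passages unprefixed mirrors the widget examples in this card's metadata.

```python
# Prefix string copied from the "Query Prefix" column above.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def with_query_prefix(query: str) -> str:
    """Return the query with the retrieval prefix prepended exactly once."""
    query = query.strip()
    if query.startswith(QUERY_PREFIX):
        return query
    return QUERY_PREFIX + query


queries = ["RCW 87.80 form and contents of notice"]
prefixed = [with_query_prefix(q) for q in queries]
print(prefixed[0])
```

Passages (the statute texts being searched) would be encoded as-is, without the prefix.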

## Model Overview
* **Base Model:** `BAAI/bge-base-en-v1.5`
* **Task:** Semantic Search / Information Retrieval / Legal Preemption Analysis
* **Language:** English (Legal Domain)
* **Max Sequence Length:** 512 tokens
* **Output Dimensionality:** 768 dimensions
* **Similarity Function:** Cosine Similarity
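
To make the similarity function concrete, here is a small pure-Python sketch of cosine similarity. The toy 4-dimensional vectors stand in for the model's 768-dimensional outputs, and the helper names `l2_normalize` and `cosine_similarity` are illustrative assumptions rather than library functions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Toy 4-dimensional stand-ins for the model's 768-dimensional outputs.
query_vec = [1.0, 2.0, 0.0, 1.0]
passage_vec = [2.0, 4.0, 0.0, 2.0]   # same direction, different length
other_vec = [0.0, 0.0, 3.0, 0.0]     # orthogonal to the query

print(round(cosine_similarity(query_vec, passage_vec), 6))
print(round(cosine_similarity(query_vec, other_vec), 6))
```

Because cosine similarity ignores vector length, embeddings that are normalized once at indexing time can be compared with a plain dot product afterwards.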

## Key Features
- Fine-tuned for Washington State legal domain (RCW)
- Optimized for semantic search and retrieval tasks
- Supports natural language legal queries
- Designed for RAG-based legal assistants
- Higher retrieval accuracy than the base BGE embeddings (Recall@10 of 0.8441 versus 0.5314 on the validation set)

## Intended Use Cases
This model is optimized to act as the retriever component in legal Retrieval-Augmented Generation (RAG) pipelines. Primary use cases include:
1. **Statutory Cross-Referencing:** Mapping natural language legal questions to specific RCWs.
2. **Preemption Checking:** Automatically retrieving state laws that may preempt or conflict with proposed municipal ordinances.
3. **Legal Research Automation:** Clustering and searching local agency drafts against established state frameworks.
4. **AI Legal Assistants:** Powering chatbots and research tools that require accurate retrieval of Washington State laws before generating an answer.
5. **Automated Compliance:** Scanning contracts or external drafts against established state legislative frameworks.
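
The retriever role described above can be sketched as a top-k search over precomputed passage embeddings. This is only an illustration: the function `top_k`, the in-memory `index` dictionary, and the 2-dimensional vectors are assumptions made for the example, and the pairing of statute identifiers with these vectors is invented.

```python
import heapq


def top_k(query_vec, index, k=10):
    """Return the k (score, statute_id) pairs with the highest dot product.

    Assumes every vector is already L2-normalized, so the dot product
    equals cosine similarity.
    """
    scored = (
        (sum(q * p for q, p in zip(query_vec, passage_vec)), statute_id)
        for statute_id, passage_vec in index.items()
    )
    return heapq.nlargest(k, scored)


# Hypothetical 2-dimensional unit vectors standing in for passage embeddings.
index = {
    "RCW 10.97.070": [1.0, 0.0],
    "RCW 65.16.110": [0.0, 1.0],
    "RCW 49.60.525": [0.6, 0.8],
}
query_vec = [0.8, 0.6]  # toy query embedding, also unit length

for score, statute_id in top_k(query_vec, index, k=2):
    print(f"{statute_id}: {score:.2f}")
```

In a RAG pipeline, the statutes returned by a step like this would be passed to the generator as context. A production system would more likely use a vector database or an approximate nearest-neighbor index than a Python dictionary.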

## Technical Details & Training Methodology

### The Semantic Gap
A general-purpose dense retriever often fails on legal tasks because it tends to reward surface vocabulary overlap rather than the underlying legal concept. To address this, `Washington-state-law-embedding-model` was fine-tuned on a synthetic, high-variance dataset.

### Training Data
The model was fine-tuned on synthetic legal query–passage pairs generated from Washington State RCW statutes.

The dataset contains 455,424 training samples and includes:
- Natural language paraphrases of legal questions
- Hypothetical legal scenarios
- Statute-grounded positive document matches

The dataset spans 500+ legal categories derived from RCW structure.

### Hyperparameters & Architecture
* **Loss Function:** Multiple Negatives Ranking (MNR) Loss
* **Batch Size:** 256
* **Epochs:** 4
* **fp16:** True
* **batch_sampler:** no_duplicates
* **multi_dataset_batch_sampler:** round_robin
* **Learning Rate Decay:** Linear
* **Infrastructure:** High-Performance Computing (HPC) Cluster
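
For intuition about the loss listed above, the following is a pure-Python sketch of the Multiple Negatives Ranking objective: each query's paired passage is treated as the correct class, and the other passages in the batch act as negatives. It is an illustration rather than the training code; `mnr_loss` is a hypothetical name, the similarity values are made up, and the `scale` of 20.0 is an assumption (the card does not state the value used).

```python
import math


def mnr_loss(similarities, scale=20.0):
    """Mean cross-entropy over rows, with the diagonal as the correct class.

    similarities[i][j] is the cosine similarity between query i and passage j;
    passage i is the positive for query i and every other passage in the
    batch serves as a negative.
    """
    total = 0.0
    for i, row in enumerate(similarities):
        logits = [scale * s for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(similarities)


# Toy 3x3 similarity matrices (made-up values, not model outputs).
well_separated = [[0.9, 0.1, 0.2], [0.0, 0.8, 0.1], [0.2, 0.1, 0.9]]
confused = [[0.5, 0.6, 0.5], [0.6, 0.5, 0.5], [0.5, 0.5, 0.4]]

print(mnr_loss(well_separated) < mnr_loss(confused))
```

With a batch size of 256, each query is contrasted against 255 in-batch negatives. The `no_duplicates` batch sampler matters here because a duplicated passage in the same batch would be scored as a negative for its own query.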

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 256
- `per_device_eval_batch_size`: 256
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

## Evaluation Metrics

The model was evaluated on a held-out validation set of synthetic municipal drafts mapped 1-to-1 against Washington State RCWs. Fine-tuning raised **Recall@10 by 31.27 percentage points** over the base model (0.5314 to 0.8441).

| Metric | Base Model (Untrained) | Fine-Tuned (Epoch 4) | Absolute Improvement (percentage points) |
|:-------|:-----------------------|:---------------------|:------------|
| **Recall@10** | 0.5314 | **0.8441** | +31.27 |
| **Recall@5** | 0.2636 | **0.4318** | +16.82 |
| **NDCG@10** | 0.2341 | **0.3876** | +15.35 |
| **MRR@10** | 0.1462 | **0.2524** | +10.62 |

*Interpretation: On the validation set, the exact governing state law was returned in the top 10 search results for 84.4% of queries, compared with 53.1% for the base model. For queries that resemble the validation data, this is the expected top-10 hit rate.*
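
As a rough illustration of how Recall@k and MRR@k are computed when each query has exactly one relevant statute (the 1-to-1 mapping described above), here is a minimal sketch in plain Python. The function names and the toy rankings are hypothetical and are not taken from the actual evaluation pipeline.

```python
from typing import Dict, List, Tuple


def recall_at_k(ranked_ids: List[str], relevant_id: str, k: int) -> float:
    """Return 1.0 if the single relevant document is in the top k, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


def mrr_at_k(ranked_ids: List[str], relevant_id: str, k: int) -> float:
    """Return 1/rank of the relevant document if it is in the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


# Hypothetical toy data: query id -> (ranked results, the one relevant statute).
rankings: Dict[str, Tuple[List[str], str]] = {
    "q1": (["RCW 9A.56.030", "RCW 46.61.502", "RCW 9A.36.011"], "RCW 9A.56.030"),
    "q2": (["RCW 46.61.502", "RCW 9A.36.011", "RCW 9A.56.030"], "RCW 9A.36.011"),
    "q3": (["RCW 9A.36.011", "RCW 46.61.502"], "RCW 9A.56.030"),
}

k = 2
# Average the per-query scores over the whole query set.
mean_recall = sum(recall_at_k(r, rel, k) for r, rel in rankings.values()) / len(rankings)
mean_mrr = sum(mrr_at_k(r, rel, k) for r, rel in rankings.values()) / len(rankings)

print(f"Recall@{k}: {mean_recall:.4f}")
print(f"MRR@{k}: {mean_mrr:.4f}")
```

With one relevant document per query, Recall@k reduces to the fraction of queries whose governing statute appears anywhere in the top k, while MRR@k also rewards placing it nearer the top.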

## Limitations

- This model does not provide legal advice.
- Performance is limited to Washington State law (RCW) and may not generalize to other jurisdictions.
- Outputs depend on the quality of the underlying document corpus.
- Should be used as a retrieval tool, not a final decision-making system.

## Usage Examples

### Semantic Search with `sentence-transformers`
<div style="padding:10px; border-left:4px solid #ff4d4f; background-color:#fff1f0;">

**Warning:** Because this model is built on the BGE architecture, you **must** prepend the specific instruction prefix  
`"Represent this sentence for searching relevant passages:"`  
to your search queries to achieve optimal performance.  

**Do not** add this prefix to the database documents.

</div>

```python
from sentence_transformers import SentenceTransformer, util

# 1. Load the fine-tuned model
model = SentenceTransformer('CSI-lab/Washington-state-law-embedding-model-Base')

# 2. Define the laws (Your Vector Database)
laws = [
    "RCW 9A.56.030: Theft in the first degree. A person is guilty of theft in the first degree if he or she commits theft of property or services which exceed(s) five thousand dollars in value.",
    "RCW 46.61.502: Driving under the influence. A person is guilty of driving while under the influence of intoxicating liquor...",
    "RCW 9A.36.011: Assault in the first degree. A person is guilty of assault in the first degree if he or she..."
]

# 3. Define the user's search query
user_query = "What dollar amount makes a theft a first degree felony?"

# 4. CRITICAL: Add the required BGE prefix to the query ONLY
query_prefix = "Represent this sentence for searching relevant passages: "
formatted_query = query_prefix + user_query

# 5. Encode the documents and the query
law_embeddings = model.encode(laws, convert_to_tensor=True)
query_embedding = model.encode(formatted_query, convert_to_tensor=True)

# 6. Calculate Cosine Similarity
cosine_scores = util.cos_sim(query_embedding, law_embeddings)

# 7. Print the top result
best_idx = cosine_scores.argmax().item()
print(f"Top Match: {laws[best_idx]}")
print(f"Similarity Score: {cosine_scores[0][best_idx]:.4f}")
```
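
When queries come from several places in an application, it can help to centralize the prefix handling so that it is applied exactly once and never to documents. The helper below is a minimal sketch under that assumption; `format_query` and `format_queries` are hypothetical names, not part of `sentence-transformers`.

```python
from typing import Iterable, List

# The instruction prefix described in the warning above (queries only).
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def format_query(text: str) -> str:
    """Prepend the query prefix unless the text already starts with it."""
    text = text.strip()
    if text.startswith(QUERY_PREFIX.strip()):
        return text  # already prefixed; avoid adding it twice
    return QUERY_PREFIX + text


def format_queries(queries: Iterable[str]) -> List[str]:
    """Apply format_query to every query in a batch."""
    return [format_query(q) for q in queries]


formatted = format_queries([
    "What dollar amount makes a theft a first degree felony?",
    QUERY_PREFIX + "What counts as driving under the influence?",
])
for q in formatted:
    print(q)
```

The formatted strings can then be passed to `model.encode(...)` in place of the raw queries, while the law texts are encoded unchanged.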

## Model Citation
```bibtex
@misc{washington_state_law_embedding_base_2026,
  title={Washington-state-law-embedding-model-Base: Fine-Tuned Dense Retrieval for Washington State Law},
  author={Tomar, Shlok},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/CSI-lab/Washington-state-law-embedding-model-Base}},
  note={Hugging Face Model Repository}
}
```

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```