ayushexel's picture
Add new SentenceTransformer model
c65a2be verified
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:44284
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: >-
      What are three games, in addition to Brick, which have been included with
      the iPod?
    sentences:
      - >-
        Video games are playable on various versions of iPods. The original iPod
        had the game Brick (originally invented by Apple's co-founder Steve
        Wozniak) included as an easter egg hidden feature; later firmware
        versions added it as a menu option. Later revisions of the iPod added
        three more games: Parachute, Solitaire, and Music Quiz.
      - >-
        Video games are playable on various versions of iPods. The original iPod
        had the game Brick (originally invented by Apple's co-founder Steve
        Wozniak) included as an easter egg hidden feature; later firmware
        versions added it as a menu option. Later revisions of the iPod added
        three more games: Parachute, Solitaire, and Music Quiz.
      - >-
        A very high resolution source may require more bandwidth than available
        in order to be transmitted without loss of fidelity. The lossy
        compression that is used in all digital HDTV storage and transmission
        systems will distort the received picture, when compared to the
        uncompressed source.
  - source_sentence: Willett's first suggestion was to change clocks by 20 minutes how often?
    sentences:
      - >-
        In 1977, the Supreme Court's Coker v. Georgia decision barred the death
        penalty for rape of an adult woman, and implied that the death penalty
        was inappropriate for any offense against another person other than
        murder. Prior to the decision, the death penalty for rape of an adult
        had been gradually phased out in the United States, and at the time of
        the decision, the State of Georgia and the U.S. Federal government were
        the only two jurisdictions to still retain the death penalty for that
        offense. However, three states maintained the death penalty for child
        rape, as the Coker decision only imposed a ban on executions for the
        rape of an adult woman. In 2008, the Kennedy v. Louisiana decision
        barred the death penalty for child rape. The result of these two
        decisions means that the death penalty in the United States is largely
        restricted to cases where the defendant took the life of another human
        being. The current federal kidnapping statute, however, may be exempt
        because the death penalty applies if the victim dies in the
        perpetrator's custody, not necessarily by his hand, thus stipulating a
        resulting death, which was the wording of the objection. In addition,
        the Federal government retains the death penalty for non-murder offenses
        that are considered crimes against the state, including treason,
        espionage, and crimes under military jurisdiction.
      - >-
        Some clock-shift problems could be avoided by adjusting clocks
        continuously or at least more gradually—for example, Willett at first
        suggested weekly 20-minute transitions—but this would add complexity and
        has never been implemented.
      - >-
        Some clock-shift problems could be avoided by adjusting clocks
        continuously or at least more gradually—for example, Willett at first
        suggested weekly 20-minute transitions—but this would add complexity and
        has never been implemented.
  - source_sentence: >-
      What sport Schwarzenegger played led to a trip to the gym that sparked his
      love of weightlifting?
    sentences:
      - >-
        Other regions host festivities of smaller extent, focused on the
        reenactment of traditional carnevalic customs, such as Tyrnavos
        (Thessaly), Kozani (West Macedonia), Rethymno (Crete) and in Xanthi
        (East Macedonia and Thrace). Tyrnavos holds an annual Phallus festival,
        a traditional "phallkloric" event in which giant, gaudily painted
        effigies of phalluses made of papier maché are paraded, and which women
        are asked to touch or kiss. Their reward for so doing is a shot of the
        famous local tsipouro alcohol spirit. Every year, from 1 to 8 January,
        mostly in regions of Western Macedonia, Carnival fiestas and festivals
        erupt. The best known is the Kastorian Carnival or "Ragoutsaria" (Gr.
        "Ραγκουτσάρια") [tags: Kastoria, Kastorian Carnival, Ragoutsaria,
        Ραγκουτσαρια, Καστοριά]. It takes place from 6 to 8 January with mass
        participation serenaded by brass bands, pipises, Macedonian and grand
        casa drums. It is an ancient celebration of nature's rebirth (fiestas
        for Dionysus (Dionysia) and Kronos (Saturnalia)), which ends the third
        day in a dance in the medieval square Ntoltso where the bands play at
        the same time.
      - >-
        As a boy, Schwarzenegger played several sports, heavily influenced by
        his father. He picked up his first barbell in 1960, when his soccer
        coach took his team to a local gym. At the age of 14, he chose
        bodybuilding over soccer as a career. Schwarzenegger has responded to a
        question asking if he was 13 when he started weightlifting: "I actually
        started weight training when I was 15, but I'd been participating in
        sports, like soccer, for years, so I felt that although I was slim, I
        was well-developed, at least enough so that I could start going to the
        gym and start Olympic lifting." However, his official website biography
        claims: "At 14, he started an intensive training program with Dan
        Farmer, studied psychology at 15 (to learn more about the power of mind
        over body) and at 17, officially started his competitive career." During
        a speech in 2001, he said, "My own plan formed when I was 14 years old.
        My father had wanted me to be a police officer like he was. My mother
        wanted me to go to trade school." Schwarzenegger took to visiting a gym
        in Graz, where he also frequented the local movie theaters to see
        bodybuilding idols such as Reg Park, Steve Reeves, and Johnny
        Weissmuller on the big screen. When Reeves died in 2000, Schwarzenegger
        fondly remembered him: "As a teenager, I grew up with Steve Reeves. His
        remarkable accomplishments allowed me a sense of what was possible, when
        others around me didn't always understand my dreams. Steve Reeves has
        been part of everything I've ever been fortunate enough to achieve." In
        1961, Schwarzenegger met former Mr. Austria Kurt Marnul, who invited him
        to train at the gym in Graz. He was so dedicated as a youngster that he
        broke into the local gym on weekends, when it was usually closed, so
        that he could train. "It would make me sick to miss a workout... I knew
        I couldn't look at myself in the mirror the next morning if I didn't do
        it." When Schwarzenegger was asked about his first movie experience as a
        boy, he replied: "I was very young, but I remember my father taking me
        to the Austrian theaters and seeing some newsreels. The first real movie
        I saw, that I distinctly remember, was a John Wayne movie."
      - >-
        As a boy, Schwarzenegger played several sports, heavily influenced by
        his father. He picked up his first barbell in 1960, when his soccer
        coach took his team to a local gym. At the age of 14, he chose
        bodybuilding over soccer as a career. Schwarzenegger has responded to a
        question asking if he was 13 when he started weightlifting: "I actually
        started weight training when I was 15, but I'd been participating in
        sports, like soccer, for years, so I felt that although I was slim, I
        was well-developed, at least enough so that I could start going to the
        gym and start Olympic lifting." However, his official website biography
        claims: "At 14, he started an intensive training program with Dan
        Farmer, studied psychology at 15 (to learn more about the power of mind
        over body) and at 17, officially started his competitive career." During
        a speech in 2001, he said, "My own plan formed when I was 14 years old.
        My father had wanted me to be a police officer like he was. My mother
        wanted me to go to trade school." Schwarzenegger took to visiting a gym
        in Graz, where he also frequented the local movie theaters to see
        bodybuilding idols such as Reg Park, Steve Reeves, and Johnny
        Weissmuller on the big screen. When Reeves died in 2000, Schwarzenegger
        fondly remembered him: "As a teenager, I grew up with Steve Reeves. His
        remarkable accomplishments allowed me a sense of what was possible, when
        others around me didn't always understand my dreams. Steve Reeves has
        been part of everything I've ever been fortunate enough to achieve." In
        1961, Schwarzenegger met former Mr. Austria Kurt Marnul, who invited him
        to train at the gym in Graz. He was so dedicated as a youngster that he
        broke into the local gym on weekends, when it was usually closed, so
        that he could train. "It would make me sick to miss a workout... I knew
        I couldn't look at myself in the mirror the next morning if I didn't do
        it." When Schwarzenegger was asked about his first movie experience as a
        boy, he replied: "I was very young, but I remember my father taking me
        to the Austrian theaters and seeing some newsreels. The first real movie
        I saw, that I distinctly remember, was a John Wayne movie."
  - source_sentence: Of the languages of the are, what was the only approved language?
    sentences:
      - >-
        One language, Interlingua, was developed so that the languages of
        Western civilization would act as its dialects. Drawing from such
        concepts as the international scientific vocabulary and Standard Average
        European, linguists[who?] developed a theory that the modern Western
        languages were actually dialects of a hidden or latent
        language.[citation needed] Researchers at the International Auxiliary
        Language Association extracted words and affixes that they considered to
        be part of Interlingua's vocabulary. In theory, speakers of the Western
        languages would understand written or spoken Interlingua immediately,
        without prior study, since their own languages were its dialects. This
        has often turned out to be true, especially, but not solely, for
        speakers of the Romance languages and educated speakers of English.
        Interlingua has also been found to assist in the learning of other
        languages. In one study, Swedish high school students learning
        Interlingua were able to translate passages from Spanish, Portuguese,
        and Italian that students of those languages found too difficult to
        understand. It should be noted, however, that the vocabulary of
        Interlingua extends beyond the Western language families.
      - >-
        At age eight, Beyoncé and childhood friend Kelly Rowland met LaTavia
        Roberson while in an audition for an all-girl entertainment group. They
        were placed into a group with three other girls as Girl's Tyme, and
        rapped and danced on the talent show circuit in Houston. After seeing
        the group, R&B producer Arne Frager brought them to his Northern
        California studio and placed them in Star Search, the largest talent
        show on national TV at the time. Girl's Tyme failed to win, and Beyoncé
        later said the song they performed was not good. In 1995 Beyoncé's
        father resigned from his job to manage the group. The move reduced
        Beyoncé's family's income by half, and her parents were forced to move
        into separated apartments. Mathew cut the original line-up to four and
        the group continued performing as an opening act for other established
        R&B girl groups. The girls auditioned before record labels and were
        finally signed to Elektra Records, moving to Atlanta Records briefly to
        work on their first recording, only to be cut by the company. This put
        further strain on the family, and Beyoncé's parents separated. On
        October 5, 1995, Dwayne Wiggins's Grass Roots Entertainment signed the
        group. In 1996, the girls began recording their debut album under an
        agreement with Sony Music, the Knowles family reunited, and shortly
        after, the group got a contract with Columbia Records.
      - >-
        During the dictatorships of Miguel Primo de Rivera (1923–1930) and
        especially of Francisco Franco (1939–1975), all regional cultures were
        suppressed. All of the languages spoken in Spanish territory, except
        Spanish (Castilian) itself, were officially banned. Symbolising the
        Catalan people's desire for freedom, Barça became 'More than a club'
        (Més que un club) for the Catalans. According to Manuel Vázquez
        Montalbán, the best way for the Catalans to demonstrate their identity
        was by joining Barça. It was less risky than joining a clandestine
        anti-Franco movement, and allowed them to express their dissidence.
        During Franco's regime, however, the blaugrana team was granted profit
        due to its good relationship with the dictator at management level, even
        giving two awards to him.
  - source_sentence: How many professional sports leagues have their headquarters in New York?
    sentences:
      - >-
        Biological anthropologists are interested in both human variation and in
        the possibility of human universals (behaviors, ideas or concepts shared
        by virtually all human cultures). They use many different methods of
        study, but modern population genetics, participant observation and other
        techniques often take anthropologists "into the field," which means
        traveling to a community in its own setting, to do something called
        "fieldwork." On the biological or physical side, human measurements,
        genetic samples, nutritional data may be gathered and published as
        articles or monographs.
      - >-
        New York City is home to the headquarters of the National Football
        League, Major League Baseball, the National Basketball Association, the
        National Hockey League, and Major League Soccer. The New York
        metropolitan area hosts the most sports teams in these five professional
        leagues. Participation in professional sports in the city predates all
        professional leagues, and the city has been continuously hosting
        professional sports since the birth of the Brooklyn Dodgers in 1882. The
        city has played host to over forty major professional teams in the five
        sports and their respective competing leagues, both current and
        historic. Four of the ten most expensive stadiums ever built worldwide
        (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and
        Citi Field) are located in the New York metropolitan area. Madison
        Square Garden, its predecessor, as well as the original Yankee Stadium
        and Ebbets Field, are some of the most famous sporting venues in the
        world, the latter two having been commemorated on U.S. postage stamps.
      - >-
        New York City is home to the headquarters of the National Football
        League, Major League Baseball, the National Basketball Association, the
        National Hockey League, and Major League Soccer. The New York
        metropolitan area hosts the most sports teams in these five professional
        leagues. Participation in professional sports in the city predates all
        professional leagues, and the city has been continuously hosting
        professional sports since the birth of the Brooklyn Dodgers in 1882. The
        city has played host to over forty major professional teams in the five
        sports and their respective competing leagues, both current and
        historic. Four of the ten most expensive stadiums ever built worldwide
        (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and
        Citi Field) are located in the New York metropolitan area. Madison
        Square Garden, its predecessor, as well as the original Yankee Stadium
        and Ebbets Field, are some of the most famous sporting venues in the
        world, the latter two having been commemorated on U.S. postage stamps.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: gooqa dev
          type: gooqa-dev
        metrics:
          - type: cosine_accuracy
            value: 0.4129999876022339
            name: Cosine Accuracy

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/embed-all-MiniLM-L6-v2-squad-5-epochs")
# Run inference
sentences = [
    'How many professional sports leagues have their headquarters in New York?',
    'New York City is home to the headquarters of the National Football League, Major League Baseball, the National Basketball Association, the National Hockey League, and Major League Soccer. The New York metropolitan area hosts the most sports teams in these five professional leagues. Participation in professional sports in the city predates all professional leagues, and the city has been continuously hosting professional sports since the birth of the Brooklyn Dodgers in 1882. The city has played host to over forty major professional teams in the five sports and their respective competing leagues, both current and historic. Four of the ten most expensive stadiums ever built worldwide (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and Citi Field) are located in the New York metropolitan area. Madison Square Garden, its predecessor, as well as the original Yankee Stadium and Ebbets Field, are some of the most famous sporting venues in the world, the latter two having been commemorated on U.S. postage stamps.',
    'New York City is home to the headquarters of the National Football League, Major League Baseball, the National Basketball Association, the National Hockey League, and Major League Soccer. The New York metropolitan area hosts the most sports teams in these five professional leagues. Participation in professional sports in the city predates all professional leagues, and the city has been continuously hosting professional sports since the birth of the Brooklyn Dodgers in 1882. The city has played host to over forty major professional teams in the five sports and their respective competing leagues, both current and historic. Four of the ten most expensive stadiums ever built worldwide (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and Citi Field) are located in the New York metropolitan area. Madison Square Garden, its predecessor, as well as the original Yankee Stadium and Ebbets Field, are some of the most famous sporting venues in the world, the latter two having been commemorated on U.S. postage stamps.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.413

Training Details

Training Dataset

Unnamed Dataset

  • Size: 44,284 training samples
  • Columns: question, context, and negative
  • Approximate statistics based on the first 1000 samples:
    question context negative
    type string string string
    details
    • min: 6 tokens
    • mean: 14.54 tokens
    • max: 33 tokens
    • min: 28 tokens
    • mean: 148.41 tokens
    • max: 256 tokens
    • min: 32 tokens
    • mean: 150.29 tokens
    • max: 256 tokens
  • Samples:
    question context negative
    What percent were born outside of Switzerland? As of 2008[update], the population was 47.5% male and 52.5% female. The population was made up of 44,032 Swiss men (35.4% of the population) and 15,092 (12.1%) non-Swiss men. There were 51,531 Swiss women (41.4%) and 13,726 (11.0%) non-Swiss women. Of the population in the municipality, 39,008 or about 30.3% were born in Bern and lived there in 2000. There were 27,573 or 21.4% who were born in the same canton, while 25,818 or 20.1% were born somewhere else in Switzerland, and 27,812 or 21.6% were born outside of Switzerland. The city of Bern or Berne (German: Bern, pronounced [bɛrn] ( listen); French: Berne [bɛʁn]; Italian: Berna [ˈbɛrna]; Romansh: Berna [ˈbɛrnɐ] (help·info); Bernese German: Bärn [b̥æːrn]) is the de facto capital of Switzerland, referred to by the Swiss as their (e.g. in German) Bundesstadt, or "federal city".[note 1] With a population of 140,634 (November 2015), Bern is the fifth most populous city in Switzerland. The Bern agglomeration, which includes 36 municipalities, had a population of 406,900 in 2014. The metropolitan area had a population of 660,000 in 2000. Bern is also the capital of the Canton of Bern, the second most populous of Switzerland's cantons.
    Where do the dark-eyed junco migrate? The typical image of migration is of northern landbirds, such as swallows (Hirundinidae) and birds of prey, making long flights to the tropics. However, many Holarctic wildfowl and finch (Fringillidae) species winter in the North Temperate Zone, in regions with milder winters than their summer breeding grounds. For example, the pink-footed goose Anser brachyrhynchus migrates from Iceland to Britain and neighbouring countries, whilst the dark-eyed junco Junco hyemalis migrates from subarctic and arctic climates to the contiguous United States and the American goldfinch from taiga to wintering grounds extending from the American South northwestward to Western Oregon. Migratory routes and wintering grounds are traditional and learned by young during their first migration with their parents. Some ducks, such as the garganey Anas querquedula, move completely or partially into the tropics. The European pied flycatcher Ficedula hypoleuca also follows this migratory trend, breeding in Asia and... These advantages offset the high stress, physical exertion costs, and other risks of the migration. Predation can be heightened during migration: Eleonora's falcon Falco eleonorae, which breeds on Mediterranean islands, has a very late breeding season, coordinated with the autumn passage of southbound passerine migrants, which it feeds to its young. A similar strategy is adopted by the greater noctule bat, which preys on nocturnal passerine migrants. The higher concentrations of migrating birds at stopover sites make them prone to parasites and pathogens, which require a heightened immune response.
    When did United States begin to provide foreign aid to Israel? In July 2007, American business magnate and investor Warren Buffett's holding company Berkshire Hathaway bought an Israeli company, Iscar, its first non-U.S. acquisition, for $4 billion. Since the 1970s, Israel has received military aid from the United States, as well as economic assistance in the form of loan guarantees, which now account for roughly half of Israel's external debt. Israel has one of the lowest external debts in the developed world, and is a net lender in terms of net external debt (the total value of assets vs. liabilities in debt instruments owed abroad), which in December 2015[update] stood at a surplus of US$118 billion. In 1992, Yitzhak Rabin became Prime Minister following an election in which his party called for compromise with Israel's neighbors. The following year, Shimon Peres on behalf of Israel, and Mahmoud Abbas for the PLO, signed the Oslo Accords, which gave the Palestinian National Authority the right to govern parts of the West Bank and the Gaza Strip. The PLO also recognized Israel's right to exist and pledged an end to terrorism. In 1994, the Israel–Jordan Treaty of Peace was signed, making Jordan the second Arab country to normalize relations with Israel. Arab public support for the Accords was damaged by the continuation of Israeli settlements and checkpoints, and the deterioration of economic conditions. Israeli public support for the Accords waned as Israel was struck by Palestinian suicide attacks. Finally, while leaving a peace rally in November 1995, Yitzhak Rabin was assassinated by a far-right-wing Jew who opposed the Accords.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Evaluation Dataset

Unnamed Dataset

  • Size: 5,000 evaluation samples
  • Columns: question, context, and negative_1
  • Approximate statistics based on the first 1000 samples:
    question context negative_1
    type string string string
    details
    • min: 7 tokens
    • mean: 14.54 tokens
    • max: 52 tokens
    • min: 32 tokens
    • mean: 147.83 tokens
    • max: 256 tokens
    • min: 28 tokens
    • mean: 144.67 tokens
    • max: 256 tokens
  • Samples:
    question context negative_1
    When did Massamba-Debat lose power in the Congo? Under the 1963 constitution, Massamba-Débat was elected President for a five-year term. During Massamba-Débat's term in office the regime adopted "scientific socialism" as the country's constitutional ideology. In 1965, Congo established relations with the Soviet Union, the People's Republic of China, North Korea and North Vietnam. Massamba-Débat's regime also invited several hundred Cuban army troops into the country to train his party's militia units and these troops helped his government survive a coup in 1966 led by paratroopers loyal to future President Marien Ngouabi. Nevertheless, Massamba-Débat was unable to reconcile various institutional, tribal and ideological factions within the country and his regime ended abruptly with a bloodless coup d'état in September 1968. Under the 1963 constitution, Massamba-Débat was elected President for a five-year term. During Massamba-Débat's term in office the regime adopted "scientific socialism" as the country's constitutional ideology. In 1965, Congo established relations with the Soviet Union, the People's Republic of China, North Korea and North Vietnam. Massamba-Débat's regime also invited several hundred Cuban army troops into the country to train his party's militia units and these troops helped his government survive a coup in 1966 led by paratroopers loyal to future President Marien Ngouabi. Nevertheless, Massamba-Débat was unable to reconcile various institutional, tribal and ideological factions within the country and his regime ended abruptly with a bloodless coup d'état in September 1968.
    I2a1b1 is found being highest where? On the other hand, I2a1b1 (P41.2) is typical of the South Slavic populations, being highest in Bosnia-Herzegovina (>50%). Haplogroup I2a2 is also commonly found in north-eastern Italians. There is also a high concentration of I2a2a in the Moldavian region of Romania, Moldova and western Ukraine. According to original studies, Hg I2a2 was believed to have arisen in the west Balkans sometime after the LGM, subsequently spreading from the Balkans through Central Russian Plain. Recently, Ken Nordtvedt has split I2a2 into two clades – N (northern) and S (southern), in relation where they arose compared to Danube river. He proposes that N is slightly older than S. He recalculated the age of I2a2 to be ~ 2550 years and proposed that the current distribution is explained by a Slavic expansion from the area north-east of the Carpathians. On the other hand, I2a1b1 (P41.2) is typical of the South Slavic populations, being highest in Bosnia-Herzegovina (>50%). Haplogroup I2a2 is also commonly found in north-eastern Italians. There is also a high concentration of I2a2a in the Moldavian region of Romania, Moldova and western Ukraine. According to original studies, Hg I2a2 was believed to have arisen in the west Balkans sometime after the LGM, subsequently spreading from the Balkans through Central Russian Plain. Recently, Ken Nordtvedt has split I2a2 into two clades – N (northern) and S (southern), in relation where they arose compared to Danube river. He proposes that N is slightly older than S. He recalculated the age of I2a2 to be ~ 2550 years and proposed that the current distribution is explained by a Slavic expansion from the area north-east of the Carpathians.
    What is a well-known lake of Hangzhou? Valleys and plains are found along the coastline and rivers. The north of the province lies just south of the Yangtze Delta, and consists of plains around the cities of Hangzhou, Jiaxing, and Huzhou, where the Grand Canal of China enters from the northern border to end at Hangzhou. Another relatively flat area is found along the Qu River around the cities of Quzhou and Jinhua. Major rivers include the Qiangtang and Ou Rivers. Most rivers carve out valleys in the highlands, with plenty of rapids and other features associated with such topography. Well-known lakes include the West Lake of Hangzhou and the South Lake of Jiaxing. Longjing tea (also called dragon well tea), originating in Hangzhou, is one of the most prestigious, if not the most prestigious Chinese tea. Hangzhou is also renowned for its silk umbrellas and hand fans. Zhejiang cuisine (itself subdivided into many traditions, including Hangzhou cuisine) is one of the eight great traditions of Chinese cuisine.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss gooqa-dev_cosine_accuracy
-1 -1 - - 0.3284
0.2890 100 0.4443 0.7861 0.3900
0.5780 200 0.3909 0.7772 0.3950
0.8671 300 0.3944 0.7668 0.3932
1.1561 400 0.3208 0.7474 0.4066
1.4451 500 0.2724 0.7490 0.4034
1.7341 600 0.2778 0.7490 0.4056
2.0231 700 0.28 0.7495 0.3986
2.3121 800 0.205 0.7429 0.4030
2.6012 900 0.2101 0.7365 0.4100
2.8902 1000 0.2066 0.7348 0.4086
3.1792 1100 0.181 0.7382 0.4104
3.4682 1200 0.1686 0.7363 0.4114
3.7572 1300 0.1749 0.7438 0.4036
4.0462 1400 0.164 0.7396 0.4072
4.3353 1500 0.1463 0.7419 0.4098
4.6243 1600 0.1449 0.7389 0.4146
4.9133 1700 0.1475 0.7388 0.4102
-1 -1 - - 0.4130

Framework Versions

  • Python: 3.11.0
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}