Add new SentenceTransformer model


Files changed (12)

.gitattributes +1 -0
1_Pooling/config.json +10 -0
README.md +734 -0
config.json +27 -0
config_sentence_transformers.json +10 -0
model.safetensors +3 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
sentencepiece.bpe.model +3 -0
special_tokens_map.json +51 -0
tokenizer.json +3 -0
tokenizer_config.json +63 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+tokenizer.json filter=lfs diff=lfs merge=lfs -text

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 1024,
+  "pooling_mode_cls_token": true,
+  "pooling_mode_mean_tokens": false,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
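
The pooling config above enables only `pooling_mode_cls_token`, so each sentence embedding is the hidden state of the first token; the architecture printout in the README also lists a `Normalize()` module after pooling. The sketch below illustrates those two steps with NumPy on random stand-in data. The function name `cls_pool_and_normalize` and the fake transformer outputs are assumptions for illustration, not code from this repository.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Keep the first-token vector of each sequence and scale it to unit length.

    token_embeddings has shape (batch, seq_len, hidden).
    """
    cls_vectors = token_embeddings[:, 0, :]  # CLS pooling: first token only
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)  # avoid division by zero


# Random stand-in for transformer outputs: 2 sequences, 5 tokens, 1024 dims.
rng = np.random.default_rng(0)
fake_outputs = rng.normal(size=(2, 5, 1024))
sentence_vectors = cls_pool_and_normalize(fake_outputs)
print(sentence_vectors.shape)
```

Because the vectors are unit length, the dot product of two embeddings equals their cosine similarity.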

README.md ADDED

	@@ -0,0 +1,734 @@

+---
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:82169
+- loss:MultipleNegativesRankingLoss
+base_model: BAAI/bge-m3
+widget:
+- source_sentence: can beef help reduce emissions
+  sentences:
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals These studies each shed light on the quantitative effects
+    of shifting production or sourcing from a conventional system to an alternative
+    system.
+    Because Poore and Nemecek’s (2018) database only captured studies published between
+    2000 and June 2016, we performed a literature review using similar search terms
+    and study inclusion criteria to capture additional studies that were published
+    through 2022. As Poore and Nemecek (2018) did, in some instances we performed
+    adjustments to fill data gaps or make results more comparable between studies
+    (e.g., estimating land use using data included in a study, making assumptions
+    to estimate impacts from the animals’ full life cycle). See Appendix A for more
+    details on our approach to adding in more recent studies and Appendix B for the
+    full list of “paired studies” included in our analysis below, as well as all adjustments
+    made. The Glossary provides definitions of the various production systems.
+    For each quantitative environmental indicator (e.g., GHG emissions, land use)
+    in each “paired study,” we calculated the percent changes that occurred when shifting
+    from the conventional system to the alternative production system.'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals Finally, there are more complex nutrient quality indices
+    that could be used as denominators (FAO 2021; Katz-Rosene et al. 2023), but, since
+    no consensus exists about which one is “best,” we have used the simpler denominator
+    of protein. In sum, use of any of these alternative numerators and denominators
+    would not change the main findings and recommendations of this report.
+    4.  For GHG emissions, we removed land-use-change emissions from the estimates
+    in Poore and Nemecek (2018), so as not to double-count with the “carbon opportunity
+    costs” of agricultural land use.'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals Shift toward lower-emissions foods. As noted elsewhere
+    in this report, because beef is an emissions-intensive food, shifting purchases
+    and sales toward lower-emissions foods can help companies reduce scope 3 emissions.
+    There is growing interest in improving grazing management to increase the amount
+    of carbon sequestered in pasturelands, a practice often called “regenerative grazing.”
+    Some proponents of regenerative grazing even suggest that by removing carbon from
+    the atmosphere, soil carbon sequestration could fully offset GHG emissions from
+    beef production, suggesting potentially “carbon neutral” or “carbon negative”
+    beef. And while traditional life cycle assessments assumed that soil carbon stocks
+    on agricultural lands were in equilibrium and did not include soil carbon stock
+    changes in studies on agriculture’s environmental impacts, more recent studies
+    have begun to incorporate soil carbon measurements, including several beef studies
+    included in our review (Buratti et al. 2017; Eldesouky et al. 2018; Stanley et
+    al. 2018).'
+- source_sentence: what is the npv for land restoration in latin america
+  sentences:
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals Overall, this places fish and seafood at the lower end
+    of the environmental impact spectrum for animal proteins (Gephart et al. 2021)
+    but usually still higher than plant-based proteins.
+    Similarly to terrestrial animal proteins, life cycle assessments of aquaculture
+    (fish farming) have found that there are environmental trade-offs with intensification.
+    When finfish and crustacean aquaculture systems move along the spectrum from more
+    traditional extensive systems to more industrialized intensive systems, land use
+    and water use per kilogram of fish declines, but water pollution and energy use
+    per kilogram of fish grow (Bohnes et al. 2018; Waite et al. 2014; Hall et al.
+    2011). Effects on GHG emissions can be mixed under intensification due to the
+    growth in energy use and land use for feeds balanced by the reduction in land
+    use for ponds (Searchinger et al. 2019), and translation of land use into “carbon
+    opportunity costs” can help better weigh these trade-offs. Aquaculture is also
+    a significant user of wild fish as feed; more than 20 percent of total wild-caught
+    fish catch in 2020 went to “nonfood” uses—mostly for fishmeal and fish oil used
+    in aquaculture operations (FAO 2022c).'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals ▪ The company first simulates a pure “less meat” strategy
+    to reduce scope 3 emissions and carbon opportunity costs by a combined 25 percent.
+    To do so, it finds that sourcing 50 percent less beef, 20 percent less of other
+    meats, and 15 percent less dairy—and shifting the purchases toward pulses, soy,
+    and vegetables—achieves this 25 percent reduction in climate impacts.
+    ▪ The company then explores a plausible scenario of shifting all chicken and egg
+    purchases toward higherwelfare products. It uses Table 7 and selects points within
+    the impact ranges to assume that free-range chicken and eggs could lead to 15
+    percent higher GHG emissions and 25 percent higher land use (carbon opportunity
+    costs) than conventional chicken. The company estimates that this would increase
+    total climate impacts, but only slightly, since chicken and eggs represent a small
+    amount of the company’s total climate impact. Under this scenario, total climate
+    impacts are reduced versus the base year by “only” 24 percent instead of 25 percent.
+    ▪'
+  - The Economic Case for Landscape Restoration in Latin America This implies an underestimation
+    of benefits given that, in this form, the restoration scenario equation does not
+    account for the remaining annual difference in net flow values between the degraded
+    hectare that is restored and the same hectare left degraded for the years between
+    full restoration and the end of the study’s overall assumed 50-year time horizon.
+    The NPVs of all target hectares would have to be calculated for all 50 years,
+    particularly in the cases of lightly and moderately degraded lands which have
+    recovery periods under restoration (delimited in this equation by t, which are
+    only 7 and 15 years, respectively).
+- source_sentence: what is meat sourcing strategy
+  sentences:
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals CHAPTER 1 Introduction and context Meat and dairy production
+    are responsible for a large proportion of global greenhouse gas (GHG) emissions.
+    According to one widely cited estimate by the Food and Agriculture Organization
+    of the United Nations (FAO), animal agriculture (including the agricultural production
+    process and related land-use change) accounted for 14.5 percent of global GHG
+    emissions in 2005, with beef production alone accounting for 6 percent of global
+    emissions (Gerber and FAO 2013). Toward “Better” Meat? Aligning meat sourcing
+    strategies with corporate climate and sustainability goals | 11
+    More recent estimates for animal agriculture’s contribution to global emissions
+    in 2010–15 are of a similar magnitude, ranging from 11 to 20 percent (e.g., Poore
+    and Nemecek 2018; Twine 2021; Xu et al. 2021; FAO 2022a). Animal agriculture also
+    accounted for more than 30 percent of global methane emissions in 2017 (CCAC and
+    UNEP 2021).'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals Further work is necessary to gather publicly available
+    data on other environmental, social, and economic attributes of “better meat,”
+    such as for soil health, on-farm biodiversity, and agricultural livelihoods, to
+    inform corporate decision-making. Similarly, better data are needed on alternative
+    systems and practices related to fish and seafood production; these “blue foods”
+    are important contributors to global food and nutrition security, but data are
+    even scarcer for these food production systems than for terrestrial animal agriculture.
+    In an ideal world, “better meat” production could lead to improvements across
+    all sustainability goals; however, our analysis shows that companies with quantitative
+    sustainability goals need to consider both co-benefits and trade-offs across all
+    goals when designing their meat sourcing strategies. We also show that balancing
+    these goals is eminently possible. This analysis also confirms the critical importance
+    of shifting diets high in animal-based foods toward plant-based foods and alternative
+    proteins to improve both environmental and animal welfare outcomes.'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals It is true that poultry has a lower climate impact per
+    kilogram of protein than beef and lamb, and climate strategies may consider a
+    shift in purchasing from beef toward chicken to continue to provide the same amount
+    of meat to consumers while reducing GHG emissions. However, an important trade-off
+    to consider from an animal welfare perspective is the number of animal lives per
+    unit of protein produced. While alternative systems thought of as “better” might
+    improve the quality of life of the animals to some degree, animal welfare experts
+    also recognize the inherent value of all animals, and companies might choose to
+    factor the number of animals slaughtered into their decision-making as a simple
+    and easily understood indicator of animal welfare.
+    Figure 5 shows the trade-off between climate and animal welfare indicators when
+    shifting between animal-based foods, showing that the foods with the highest climate
+    impact per kg of protein also require the fewest animals to be killed, and vice
+    versa. For example, to produce a kg of protein, more than 100 times as many chickens
+    need to be slaughtered compared to cows.'
+- source_sentence: cost of restoring landscape
+  sentences:
+  - 'The Economic Case for Landscape Restoration in Latin America This report assesses
+    the economic costs and benefits of landscape restoration in Latin America and
+    the Caribbean by monetizing a set of benefits that could flow from 20 million
+    hectares of restored lands. The introduction highlights some of the drivers and
+    impacts of degradation in the Latin America and Caribbean region. The section
+    that follows presents an overview of the method used to monetize the benefits
+    of landscape restoration; detailed descriptions of the methodology and modeling
+    approach are available in the annexes. Next, we present the results—the estimation
+    of net economic benefits from restoration and the different values for biomes
+    and degree of restoration. Finally, we suggest areas where future analysis could
+    provide more location-specific financial estimates.
+    Agriculture and forestry play an important role in the economy and social fabric
+    of Latin America and the Caribbean'
+  - 'Getting Ready Include a more comprehensive analysis of the legal framework for
+    tenure and existing conditions on-the-ground
+    Discuss how tenure conflicts might be addressed as part of the REDD+ strategy
+    Discusses the ability of forest agencies to plan and implement forest management
+    activities Considers the role of non-government stakeholders, including communities,
+    in forest management Links identified governance challenges to proposed REDD+
+    strategy options and implementation framework
+    The NPD provides an overview of recent efforts to improve forest management in
+    RoC, e.g., through the FLEGT Voluntary Partnership Agreement (VPA), certification
+    schemes, and improving the coverage and management of protected areas. According
+    to the NPD, the FLEGT process identified numerous forest sector challenges that
+    should be addressed as part of a REDD+ program, notably lack of forest administration
+    capacity and the need to strengthen involvement of local populations in forest
+    management decision-making. According to the NPD, over 4 million hectares of concessions
+    have been developed since 2001, but the NPD does not discuss the role of the private
+    sector in forest management activities in detail.'
+  - The Economic Case for Landscape Restoration in Latin America Nevertheless, because
+    E&M activities will always require more than a single year to be fully implemented,
+    the full per hectare cost should not be assigned to the first year of restoration
+    alone, but rather to a number of initial years along the restoration time horizon.
+    In the case of lightly degraded landscapes, the total cost/ha (from Tables 7 and
+    8) has been divided and assigned equally to the first four years (or roughly the
+    first half) of the restoration time horizon. In the case of moderately degraded
+    lands, the total cost has been subtracted from annual benefit flow values in equal
+    annual tranches over the first 8 years (again, roughly the first half of the restoration
+    time horizon).  Finally, total costs for severely degraded lands are subtracted
+    in equal annual amounts over the first 25 years of the restoration time horizon.
+    Allocating costs over a 25-year time horizon has the effect of discounting costs
+    relative to the benefits.
+- source_sentence: what is the wri meat initiative?
+  sentences:
+  - 'Pilot analysis of global ecosystems: Grassland ecosystems Although GLASOD was
+    by necessity a somewhat subjective assessment it was extremely carefully prepared
+    by leading experts in the field. It remains the only global database on the status
+    of human-induced soil degradation, and no other data set comes as close to defining
+    the extent of desertification at the global scale (UNEP 1997: V).'
+  - 'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate
+    and Sustainability Goals Toward “Better” Meat? Aligning meat sourcing strategies
+    with corporate climate and sustainability goals
+    WOR L D WOR L D R E S O U R C E S R E S O U R C E S I NS T I T U T E I NS T I
+    T U T E
+    RICHARD WAITE is the Acting Director for Agriculture Initiatives at WRI.
+    is a doctoral student with Oxford University’s Environmental Change Institute
+    and a former Research Analyst for WRI’s Food and Climate Programs.
+    CLARA CHO is the Data Analyst for the Coolfood initiative at WRI. Contact: clara.cho@wri.org.
+    We are pleased to acknowledge our institutional strategic partners that provide
+    core funding to WRI: the Netherlands Ministry of Foreign Affairs, Royal Danish
+    Ministry of Foreign Affairs, and Swedish International Development Cooperation
+    Agency.
+    The authors acknowledge the following individuals for their valuable guidance
+    and critical reviews:'
+  - 'Getting Ready THE IMPORTANCE OF FOREST GOVERNANCE TO THE REDD+ READINESS PROCESS
+    Strengthening forest governance will be an essential component of the activities
+    implemented by countries seeking to achieve significant and lasting emission reductions
+    through REDD+. Poor forest governance is often characterized by weak capacity
+    to manage natural resources, lack of decision-maker accountability to impacted
+    stakeholders, and lack of public access to information about the status and use
+    of forest resources. Potential drivers of deforestation and forest degradation—such
+    as illegal logging, unplanned forest conversion, and conflicts over access to
+    land and resources—are often symptoms of weak forest governance. To develop effective
+    national REDD+ strategies, governments need to better understand these challenges
+    and develop measures to strengthen forest governance in ways that build the trust
+    of domestic and international stakeholders.'
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+metrics:
+- cosine_accuracy@1
+- cosine_accuracy@3
+- cosine_accuracy@5
+- cosine_accuracy@10
+- cosine_precision@1
+- cosine_precision@3
+- cosine_precision@5
+- cosine_precision@10
+- cosine_recall@1
+- cosine_recall@3
+- cosine_recall@5
+- cosine_recall@10
+- cosine_ndcg@10
+- cosine_mrr@10
+- cosine_map@100
+model-index:
+- name: SentenceTransformer based on BAAI/bge-m3
+  results:
+  - task:
+      type: information-retrieval
+      name: Information Retrieval
+    dataset:
+      name: ir eval
+      type: ir-eval
+    metrics:
+    - type: cosine_accuracy@1
+      value: 0.34030612244897956
+      name: Cosine Accuracy@1
+    - type: cosine_accuracy@3
+      value: 0.5389030612244898
+      name: Cosine Accuracy@3
+    - type: cosine_accuracy@5
+      value: 0.6211734693877551
+      name: Cosine Accuracy@5
+    - type: cosine_accuracy@10
+      value: 0.7122448979591837
+      name: Cosine Accuracy@10
+    - type: cosine_precision@1
+      value: 0.34030612244897956
+      name: Cosine Precision@1
+    - type: cosine_precision@3
+      value: 0.17963435374149658
+      name: Cosine Precision@3
+    - type: cosine_precision@5
+      value: 0.12423469387755101
+      name: Cosine Precision@5
+    - type: cosine_precision@10
+      value: 0.07122448979591836
+      name: Cosine Precision@10
+    - type: cosine_recall@1
+      value: 0.34030612244897956
+      name: Cosine Recall@1
+    - type: cosine_recall@3
+      value: 0.5389030612244898
+      name: Cosine Recall@3
+    - type: cosine_recall@5
+      value: 0.6211734693877551
+      name: Cosine Recall@5
+    - type: cosine_recall@10
+      value: 0.7122448979591837
+      name: Cosine Recall@10
+    - type: cosine_ndcg@10
+      value: 0.5191028810993514
+      name: Cosine Ndcg@10
+    - type: cosine_mrr@10
+      value: 0.458020782717851
+      name: Cosine Mrr@10
+    - type: cosine_map@100
+      value: 0.46727356494811056
+      name: Cosine Map@100
+---
+# SentenceTransformer based on BAAI/bge-m3
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) <!-- at revision 5617a9f61b028005a4858fdac845db406aefb181 -->
+- **Maximum Sequence Length:** 8192 tokens
+- **Output Dimensionality:** 1024 dimensions
+- **Similarity Function:** Cosine Similarity
+<!-- - **Training Dataset:** Unknown -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("collaborativeearth/bge-m3_wri")
+# Run inference
+sentences = [
+    'what is the wri meat initiative?',
+    'Toward "Better" Meat? Aligning Meat Sourcing Strategies with Corporate Climate and Sustainability Goals Toward “Better” Meat? Aligning meat sourcing strategies with corporate climate and sustainability goals\n\nWOR L D WOR L D R E S O U R C E S R E S O U R C E S I NS T I T U T E I NS T I T U T E\n\nRICHARD WAITE is the Acting Director for Agriculture Initiatives at WRI.\n\nis a doctoral student with Oxford University’s Environmental Change Institute and a former Research Analyst for WRI’s Food and Climate Programs.\n\nCLARA CHO is the Data Analyst for the Coolfood initiative at WRI. Contact: clara.cho@wri.org.\n\nWe are pleased to acknowledge our institutional strategic partners that provide core funding to WRI: the Netherlands Ministry of Foreign Affairs, Royal Danish Ministry of Foreign Affairs, and Swedish International Development Cooperation Agency.\n\nThe authors acknowledge the following individuals for their valuable guidance and critical reviews:',
+    'Pilot analysis of global ecosystems: Grassland ecosystems Although GLASOD was by necessity a somewhat subjective assessment it was extremely carefully prepared by leading experts in the field. It remains the only global database on the status of human-induced soil degradation, and no other data set comes as close to defining the extent of desertification at the global scale (UNEP 1997: V).',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 1024)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
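The usage snippet stops at a similarity matrix. For retrieval, the usual next step is to rank candidate passages by their similarity to the query. A minimal sketch follows, using small hand-written vectors instead of real model output so that it runs without downloading anything; the helper `rank_passages` is hypothetical and not part of the Sentence Transformers API.

```python
import numpy as np


def rank_passages(query_vec: np.ndarray, passage_vecs: np.ndarray):
    """Return passage indices and scores sorted by descending cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q  # cosine similarity of each passage to the query
    order = np.argsort(-scores)  # highest score first
    return order, scores[order]


# Toy 3-dimensional vectors standing in for 1024-dimensional embeddings.
query = np.array([1.0, 0.0, 0.0])
passages = np.array([
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.1, 0.0],  # nearly parallel to the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
])
order, scores = rank_passages(query, passages)
print(order.tolist())
```

With real embeddings, `query` would be the encoded question and `passages` the encoded document chunks.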
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+## Evaluation
+### Metrics
+#### Information Retrieval
+* Dataset: `ir-eval`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.3403     |
+| cosine_accuracy@3   | 0.5389     |
+| cosine_accuracy@5   | 0.6212     |
+| cosine_accuracy@10  | 0.7122     |
+| cosine_precision@1  | 0.3403     |
+| cosine_precision@3  | 0.1796     |
+| cosine_precision@5  | 0.1242     |
+| cosine_precision@10 | 0.0712     |
+| cosine_recall@1     | 0.3403     |
+| cosine_recall@3     | 0.5389     |
+| cosine_recall@5     | 0.6212     |
+| cosine_recall@10    | 0.7122     |
+| **cosine_ndcg@10**  | **0.5191** |
+| cosine_mrr@10       | 0.458      |
+| cosine_map@100      | 0.4673     |
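In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every evaluation query has exactly one relevant passage; this is an inference from the numbers, not something the card states. The sketch below defines the three metrics on a toy ranking and then checks the reported values against that relationship. The helper name and the toy data are assumptions for illustration.

```python
import math


def mean_metrics_at_k(rankings, relevants, k):
    """Mean accuracy@k, precision@k and recall@k over all queries."""
    acc = prec = rec = 0.0
    for query_id, ranked in rankings.items():
        hits = sum(1 for doc in ranked[:k] if doc in relevants[query_id])
        acc += 1.0 if hits > 0 else 0.0
        prec += hits / k
        rec += hits / len(relevants[query_id])
    n = len(rankings)
    return acc / n, prec / n, rec / n


# Toy data: two queries, each with a single relevant document.
rankings = {"q1": ["d3", "d1", "d2"], "q2": ["d2", "d3", "d1"]}
relevants = {"q1": {"d1"}, "q2": {"d1"}}
acc, prec, rec = mean_metrics_at_k(rankings, relevants, k=2)
print(acc, prec, rec)

# Values copied from the model-index metadata of this card.
reported_accuracy = {3: 0.5389030612244898, 5: 0.6211734693877551, 10: 0.7122448979591837}
reported_precision = {3: 0.17963435374149658, 5: 0.12423469387755101, 10: 0.07122448979591836}
consistent = all(
    math.isclose(reported_accuracy[k] / k, reported_precision[k], rel_tol=1e-9)
    for k in reported_accuracy
)
print(consistent)
```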
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### Unnamed Dataset
+* Size: 82,169 training samples
+* Columns: <code>question</code> and <code>answer</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | question                                                                          | answer                                                                               |
+  |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+  | type    | string                                                                            | string                                                                               |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 10.62 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 53 tokens</li><li>mean: 232.17 tokens</li><li>max: 337 tokens</li></ul> |
+* Samples:
+  | question                                                             | answer                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+  |:---------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>what is the economic case of restoration</code>                | <code>The Economic Case for Landscape Restoration in Latin America The Economic Case for Landscape Restoration in Latin America<br><br>THE ECONOMIC CASE FOR LANDSCAPE RESTORATION IN LATIN AMERICA<br><br>WALTER VERGARA, LUCIANA GALLARDO LOMELI, ANA R. RIOS, PAUL ISBELL, STEVEN PRAGER, RONNIE DE CAMINO<br><br>Land use and land-use change are central to the economic and social fabric of Latin America and the Caribbean, and essential to the region’s prospects for sustainable development. Countries are realizing that now, more than ever, is the time for action. Eleven countries, three Brazilian states and several regional programs have already committed to restoring more than 27 million hectares of degraded land in Latin America—but can these ambitions become a reality while supporting good living standards and economic development?</code>                                                                                                                                                                           |
+  | <code>economic case of landscape restoration in latin america</code> | <code>The Economic Case for Landscape Restoration in Latin America The Economic Case for Landscape Restoration in Latin America<br><br>THE ECONOMIC CASE FOR LANDSCAPE RESTORATION IN LATIN AMERICA<br><br>WALTER VERGARA, LUCIANA GALLARDO LOMELI, ANA R. RIOS, PAUL ISBELL, STEVEN PRAGER, RONNIE DE CAMINO<br><br>Land use and land-use change are central to the economic and social fabric of Latin America and the Caribbean, and essential to the region’s prospects for sustainable development. Countries are realizing that now, more than ever, is the time for action. Eleven countries, three Brazilian states and several regional programs have already committed to restoring more than 27 million hectares of degraded land in Latin America—but can these ambitions become a reality while supporting good living standards and economic development?</code>                                                                                                                                                                           |
+  | <code>what is lata-american landscape</code>                         | <code>The Economic Case for Landscape Restoration in Latin America Agriculture and forestry exports from Latin America represent about 13 percent of the global trade of food, feed, and fiber and account for a majority of employment outside large urban areas—numbers only expected to grow as Latin America is called upon to meet an increasing global demand for food. Yet, since the turn of the century, about 37 million hectares of natural forests, savannas and wetlands have been transformed to expand agriculture. Cumulative, unsustainable land-use practices have led to the degradation of about 300 million hectares, resulting in a reduction in yields and quality of production, and in losses in biomass content, soil quality, surface water hydrology, and biodiversity. Deforestation, land-use change, and unsustainable agricultural activities are also currently the largest drivers of climate change in the region, accounting for 56 percent of all greenhouse gas emissions. Today, while some progress ha...</code> |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
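Reviewer note: the loss parameters above can be read as in-batch softmax cross-entropy over scaled cosine similarities. The sketch below is a minimal pure-Python illustration of that idea, not the library's implementation. The function names `cos_sim` and `mnr_loss` and the toy vectors are assumptions made for this example.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i].

    Every other positive in the batch acts as a negative for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)

# Toy batch (hypothetical values): each anchor is close to its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.2, 0.8]]
aligned_loss = mnr_loss(anchors, positives)
# Swapping the positives makes every pair mismatched, so the loss is large.
swapped_loss = mnr_loss(anchors, positives[::-1])
```

With `scale` at 20.0, small differences in cosine similarity become large differences in logits, which is why the aligned batch gives a loss near zero.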
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 32
+- `learning_rate`: 1e-06
+- `num_train_epochs`: 2
+- `warmup_ratio`: 0.1
+- `fp16`: True
+- `gradient_checkpointing`: True
+- `batch_sampler`: no_duplicates
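Reviewer note: a rough sketch of the step budget and learning-rate schedule these settings imply. It assumes a single device, `gradient_accumulation_steps` of 1, no dropped last batch, and that the `no_duplicates` sampler does not change the batch count. The dataset size comes from the `dataset_size:82169` tag in the README front matter. `linear_lr` is a hypothetical helper that mirrors a linear warmup followed by linear decay, not a call into the trainer.

```python
import math

DATASET_SIZE = 82169  # from the `dataset_size:82169` tag
BATCH_SIZE = 32       # per_device_train_batch_size
EPOCHS = 2            # num_train_epochs
WARMUP_RATIO = 0.1    # warmup_ratio
PEAK_LR = 1e-06       # learning_rate

steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def linear_lr(step):
    """Linear warmup to PEAK_LR, then linear decay to zero (illustrative)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = total_steps - step
    return PEAK_LR * max(0.0, remaining / (total_steps - warmup_steps))
```

Under these assumptions, 100 steps cover about 0.039 of an epoch, which agrees with the `Epoch` column of the training log.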
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 32
+- `per_device_eval_batch_size`: 8
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 1e-06
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 2
+- `max_steps`: -1
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: True
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `tp_size`: 0
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `gradient_checkpointing`: True
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+</details>
+### Training Logs
+| Epoch  | Step | Training Loss | ir-eval_cosine_ndcg@10 |
+|:------:|:----:|:-------------:|:----------------------:|
+| -1     | -1   | -             | 0.4718                 |
+| 0.0389 | 100  | 0.7439        | -                      |
+| 0.0779 | 200  | 0.6208        | -                      |
+| 0.1168 | 300  | 0.4568        | -                      |
+| 0.1558 | 400  | 0.3713        | -                      |
+| 0.1947 | 500  | 0.3263        | 0.5004                 |
+| 0.2336 | 600  | 0.2722        | -                      |
+| 0.2726 | 700  | 0.2521        | -                      |
+| 0.3115 | 800  | 0.2541        | -                      |
+| 0.3505 | 900  | 0.2348        | -                      |
+| 0.3894 | 1000 | 0.2321        | 0.5090                 |
+| 0.4283 | 1100 | 0.2313        | -                      |
+| 0.4673 | 1200 | 0.2195        | -                      |
+| 0.5062 | 1300 | 0.2286        | -                      |
+| 0.5452 | 1400 | 0.2188        | -                      |
+| 0.5841 | 1500 | 0.2166        | 0.5115                 |
+| 0.6231 | 1600 | 0.2194        | -                      |
+| 0.6620 | 1700 | 0.2006        | -                      |
+| 0.7009 | 1800 | 0.1954        | -                      |
+| 0.7399 | 1900 | 0.2157        | -                      |
+| 0.7788 | 2000 | 0.2059        | 0.5154                 |
+| 0.8178 | 2100 | 0.2030        | -                      |
+| 0.8567 | 2200 | 0.1949        | -                      |
+| 0.8956 | 2300 | 0.1943        | -                      |
+| 0.9346 | 2400 | 0.2060        | -                      |
+| 0.9735 | 2500 | 0.2015        | 0.5175                 |
+| 1.0125 | 2600 | 0.1801        | -                      |
+| 1.0514 | 2700 | 0.1867        | -                      |
+| 1.0903 | 2800 | 0.1914        | -                      |
+| 1.1293 | 2900 | 0.1827        | -                      |
+| 1.1682 | 3000 | 0.1899        | 0.5165                 |
+| 1.2072 | 3100 | 0.1707        | -                      |
+| 1.2461 | 3200 | 0.1872        | -                      |
+| 1.2850 | 3300 | 0.1943        | -                      |
+| 1.3240 | 3400 | 0.1854        | -                      |
+| 1.3629 | 3500 | 0.1747        | 0.5182                 |
+| 1.4019 | 3600 | 0.1764        | -                      |
+| 1.4408 | 3700 | 0.1866        | -                      |
+| 1.4798 | 3800 | 0.1855        | -                      |
+| 1.5187 | 3900 | 0.1782        | -                      |
+| 1.5576 | 4000 | 0.1744        | 0.5181                 |
+| 1.5966 | 4100 | 0.1793        | -                      |
+| 1.6355 | 4200 | 0.1870        | -                      |
+| 1.6745 | 4300 | 0.1907        | -                      |
+| 1.7134 | 4400 | 0.1781        | -                      |
+| 1.7523 | 4500 | 0.1825        | 0.5185                 |
+| 1.7913 | 4600 | 0.1981        | -                      |
+| 1.8302 | 4700 | 0.1751        | -                      |
+| 1.8692 | 4800 | 0.1824        | -                      |
+| 1.9081 | 4900 | 0.1866        | -                      |
+| 1.9470 | 5000 | 0.1880        | -                      |
+| 1.9860 | 5100 | 0.1838        | -                      |
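Reviewer note: the change in `ir-eval_cosine_ndcg@10` between the pre-training evaluation (the row with step `-1`) and the last logged evaluation at step 5000 can be worked out directly from the table. The variable names below are only for illustration.

```python
# Values copied from the training log table.
baseline_ndcg = 0.4718  # evaluation before training (step -1)
final_ndcg = 0.5191     # evaluation at step 5000

absolute_gain = final_ndcg - baseline_ndcg
relative_gain = absolute_gain / baseline_ndcg

print(f"absolute gain: {absolute_gain:.4f}")
print(f"relative gain: {relative_gain:.1%}")
```

Most of that gain appears within the first epoch; the evaluations after step 2500 move by less than 0.002.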
+### Framework Versions
+- Python: 3.11.12
+- Sentence Transformers: 4.1.0
+- Transformers: 4.51.3
+- PyTorch: 2.6.0+cu124
+- Accelerate: 1.6.0
+- Datasets: 2.14.4
+- Tokenizers: 0.21.1
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->

config.json ADDED Viewed

	@@ -0,0 +1,27 @@

+{
+  "architectures": [
+    "XLMRobertaModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "eos_token_id": 2,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 8194,
+  "model_type": "xlm-roberta",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "output_past": true,
+  "pad_token_id": 1,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.51.3",
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 250002
+}

config_sentence_transformers.json ADDED Viewed

	@@ -0,0 +1,10 @@

+{
+  "__version__": {
+    "sentence_transformers": "4.1.0",
+    "transformers": "4.51.3",
+    "pytorch": "2.6.0+cu124"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": "cosine"
+}

model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a37e65d7dbd0677dbc3752d7b8727c91b525519648e042cc02c3cddc58a3a04e
+size 2271064456
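Reviewer note: as a sanity check, the checkpoint size can be compared with a parameter count derived from `config.json`. The sketch below assumes the standard BERT/RoBERTa-style layer layout (biases on every linear layer, two LayerNorms per block, one embedding LayerNorm) and assumes the pooler head is stored in the file. Neither assumption is confirmed by this diff.

```python
# Values from config.json.
vocab_size = 250002
max_position_embeddings = 8194
type_vocab_size = 1
hidden = 1024
intermediate = 4096
layers = 24

layer_norm = 2 * hidden  # weight + bias

embeddings = (
    vocab_size * hidden
    + max_position_embeddings * hidden
    + type_vocab_size * hidden
    + layer_norm
)

attention = 4 * (hidden * hidden + hidden)  # Q, K, V, output projections
feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
per_layer = attention + layer_norm + feed_forward + layer_norm

pooler = hidden * hidden + hidden  # assumed to be present in the checkpoint

total_params = embeddings + layers * per_layer + pooler
float32_bytes = total_params * 4  # torch_dtype is float32

file_size = 2271064456  # from the LFS pointer above
overhead = file_size - float32_bytes  # header and metadata, if the estimate is right
```

The estimate comes to roughly 568M parameters, and the remainder of a few tens of kilobytes is plausible for a safetensors header. The close match is suggestive, not proof of the exact layout.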

modules.json ADDED Viewed

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
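Reviewer note: `modules.json` chains a Transformer, a Pooling module, and a Normalize module. With `pooling_mode_cls_token` set to true in `1_Pooling/config.json`, the last two stages amount to taking the first token's vector and scaling it to unit length. The sketch below illustrates that in pure Python with toy 2-dimensional vectors; the helper names are assumptions for this example, not library functions.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector (the <s> position)."""
    return token_embeddings[0]

def l2_normalize(vec):
    """Scale a vector to unit L2 norm; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy transformer outputs (hypothetical): one row per token, 2 dimensions each.
tokens_a = [[3.0, 4.0], [9.0, 9.0]]
tokens_b = [[4.0, 3.0], [0.0, 1.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# On unit vectors the dot product equals cosine similarity.
similarity = dot(emb_a, emb_b)
```

Because the output is already normalized, the `cosine` similarity named in `config_sentence_transformers.json` and a plain dot product give the same ranking.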

sentence_bert_config.json ADDED Viewed

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 8192,
+  "do_lower_case": false
+}
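Reviewer note: `max_seq_length` is 8192 while `max_position_embeddings` in `config.json` is 8194. The sketch below shows one way to reconcile the two, assuming the RoBERTa-family convention in which position ids for real tokens start at `pad_token_id + 1`. `position_ids` is a hypothetical helper written for this example.

```python
max_position_embeddings = 8194  # config.json
pad_token_id = 1                # config.json
max_seq_length = 8192           # sentence_bert_config.json

def position_ids(num_tokens, padding_idx=pad_token_id):
    """Position ids for a sequence without padding: padding_idx + 1, +2, ..."""
    return [padding_idx + 1 + i for i in range(num_tokens)]

ids = position_ids(max_seq_length)
first_position = ids[0]
last_position = ids[-1]

# The largest id must still index into the position embedding table.
fits = last_position < max_position_embeddings
```

Under that convention the two lowest position slots are never used for real tokens, which leaves exactly 8192 usable positions.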

sentencepiece.bpe.model ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
+size 5069051

special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,51 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "<mask>",
+    "lstrip": true,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e4f7e21bec3fb0044ca0bb2d50eb5d4d8c596273c422baef84466d2c73748b9c
+size 17083053

tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,63 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "250001": {
+      "content": "<mask>",
+      "lstrip": true,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "<s>",
+  "eos_token": "</s>",
+  "extra_special_tokens": {},
+  "mask_token": "<mask>",
+  "max_length": 8192,
+  "model_max_length": 8192,
+  "pad_to_multiple_of": null,
+  "pad_token": "<pad>",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "</s>",
+  "sp_model_kwargs": {},
+  "stride": 0,
+  "tokenizer_class": "XLMRobertaTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "<unk>"
+}
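Reviewer note: the tokenizer settings above imply that each input is wrapped as `<s> ... </s>`, truncated on the right, and padded on the right with `<pad>`. The sketch below illustrates that with the ids from `added_tokens_decoder`. It assumes the 8192-token limit includes the two special tokens; `build_inputs` and `pad_right` are hypothetical helpers, not tokenizer methods.

```python
BOS_ID = 0  # <s>, also used as cls_token
PAD_ID = 1  # <pad>
EOS_ID = 2  # </s>, also used as sep_token
MODEL_MAX_LENGTH = 8192

def build_inputs(content_ids, max_length=MODEL_MAX_LENGTH):
    """Wrap content ids in <s> ... </s>, truncating content from the right."""
    budget = max_length - 2  # reserve room for the two special tokens
    return [BOS_ID] + list(content_ids[:budget]) + [EOS_ID]

def pad_right(ids, length):
    """Right-pad a sequence with <pad> up to the given length."""
    return list(ids) + [PAD_ID] * (length - len(ids))

short = build_inputs([5, 6, 7])
long = build_inputs(list(range(10, 10010)))  # 10000 content ids
batch = [pad_right(short, 8), pad_right(build_inputs([9]), 8)]
```

Since `<s>` always sits at index 0, CLS pooling reads the same position regardless of how much right padding a batch needs.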