AwAppp/benchmarks_4bit_batch_size30

Transformers

AwAppp committed on Mar 10, 2024

Commit 36a6b96 (verified), 1 parent: f853ccf

Upload TextGenerationReport

Files changed (2)

README.md +199 -0
benchmark_report.json +203 -0

README.md ADDED

	@@ -0,0 +1,199 @@

+---
+library_name: transformers
+tags: []
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]

benchmark_report.json ADDED

	@@ -0,0 +1,203 @@

+{
+    "prefill": {
+        "memory": {
+            "unit": "MB",
+            "max_ram": 3377.37728,
+            "max_vram": 6058.672128,
+            "max_reserved": 5576.327168,
+            "max_allocated": 5355.380736
+        },
+        "latency": {
+            "unit": "s",
+            "mean": 0.24580386203672833,
+            "stdev": 0.00022219677637899774,
+            "values": [
+                0.24691427612304687,
+                0.24614707946777345,
+                0.24578355407714844,
+                0.24561459350585937,
+                0.2458961944580078,
+                0.24560025024414062,
+                0.2456575927734375,
+                0.24577023315429689,
+                0.24575999450683594,
+                0.24567091369628907,
+                0.2458787841796875,
+                0.24583171081542968,
+                0.24586341857910157,
+                0.24571495056152343,
+                0.24590336608886718,
+                0.24583168029785157,
+                0.24580096435546875,
+                0.24568524169921874,
+                0.24596377563476562,
+                0.24577023315429689,
+                0.24546406555175782,
+                0.24596377563476562,
+                0.24557772827148439,
+                0.24591360473632812,
+                0.24575999450683594,
+                0.2457108154296875,
+                0.24578866577148437,
+                0.2458992614746094,
+                0.24572927856445312,
+                0.2458675231933594,
+                0.24559309387207032,
+                0.2457733154296875,
+                0.24587980651855468,
+                0.24561048889160156,
+                0.24586239624023437,
+                0.2460037078857422,
+                0.24556236267089843,
+                0.24560127258300782,
+                0.24583680725097656,
+                0.2457057342529297,
+                0.24579583740234376
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 1952.7764780533748
+        },
+        "energy": null,
+        "efficiency": null
+    },
+    "decode": {
+        "memory": {
+            "unit": "MB",
+            "max_ram": 3377.37728,
+            "max_vram": 7197.425664,
+            "max_reserved": 6712.983552,
+            "max_allocated": 6292.99456
+        },
+        "latency": {
+            "unit": "s",
+            "mean": 14.316073944091789,
+            "stdev": 0,
+            "values": [
+                14.316073944091789
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 207.45911285445072
+        },
+        "energy": null,
+        "efficiency": null
+    },
+    "per_token": {
+        "memory": null,
+        "latency": {
+            "unit": "s",
+            "mean": 0.14460680751607868,
+            "stdev": 0.002633980375935467,
+            "values": [
+                0.14012416076660156,
+                0.14033407592773436,
+                0.14042930603027343,
+                0.14048460388183595,
+                0.14043341064453124,
+                0.14061260986328125,
+                0.14064947509765624,
+                0.14090342712402343,
+                0.14090138244628905,
+                0.14114303588867189,
+                0.14110208129882812,
+                0.14144717407226562,
+                0.14132838439941406,
+                0.14151271057128906,
+                0.14150758361816407,
+                0.14181170654296876,
+                0.14139903259277345,
+                0.14184550476074217,
+                0.14165708923339843,
+                0.14212300109863282,
+                0.14184141540527342,
+                0.14210560607910155,
+                0.14203392028808592,
+                0.1424148406982422,
+                0.1421895751953125,
+                0.1424486389160156,
+                0.14228172302246095,
+                0.14273228454589842,
+                0.14244146728515625,
+                0.14266983032226563,
+                0.14270361328125,
+                0.1431470031738281,
+                0.14275686645507812,
+                0.14305381774902343,
+                0.14302105712890625,
+                0.14357913208007814,
+                0.14316645812988282,
+                0.14344499206542968,
+                0.14331596374511718,
+                0.1438586883544922,
+                0.1435484161376953,
+                0.14384640502929688,
+                0.14373989868164064,
+                0.14430105590820314,
+                0.14400717163085938,
+                0.1441095733642578,
+                0.1440245819091797,
+                0.14447718811035157,
+                0.14414540100097656,
+                0.14444032287597655,
+                0.14435224914550782,
+                0.1449502716064453,
+                0.14458265686035157,
+                0.14494207763671876,
+                0.14483045959472657,
+                0.14541413879394532,
+                0.14501580810546874,
+                0.14532608032226563,
+                0.14521241760253906,
+                0.14594866943359375,
+                0.14552268981933594,
+                0.14566502380371094,
+                0.14550834655761719,
+                0.14620364379882814,
+                0.14571417236328124,
+                0.14605722045898437,
+                0.14588825988769533,
+                0.14663066101074218,
+                0.14609100341796874,
+                0.14640538024902344,
+                0.1462906951904297,
+                0.14695526123046876,
+                0.1464596405029297,
+                0.14691635131835937,
+                0.14672076416015625,
+                0.14758912658691406,
+                0.1468211212158203,
+                0.1471631317138672,
+                0.14699417114257812,
+                0.14770176696777343,
+                0.14713037109375,
+                0.14749798583984375,
+                0.14733413696289063,
+                0.1480693817138672,
+                0.14749900817871095,
+                0.1478656005859375,
+                0.14795468139648438,
+                0.14866021728515624,
+                0.148168701171875,
+                0.14849331665039062,
+                0.1483356170654297,
+                0.14925926208496093,
+                0.14847488403320314,
+                0.14881587219238282,
+                0.14865408325195312,
+                0.1494906921386719,
+                0.1488107452392578,
+                0.14924082946777345,
+                0.1488486328125
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 207.45911285445072
+        },
+        "energy": null,
+        "efficiency": null
+    }
+}
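
The report does not say how the throughput figures were derived, but the numbers are consistent with throughput = tokens processed / mean latency. The sketch below is a sanity check under that assumption, not the benchmark tool's own code: it multiplies each throughput by its mean latency to recover the implied token count, and checks that the 99 per-token latencies sum to the single decode latency. The names `implied_tokens` and `summarize` are hypothetical. Reading 480 as 30 sequences × 16 prompt tokens and 2970 as 30 sequences × 99 generated tokens is likewise an assumption, based on the repository name `batch_size30` and the number of per-token values.

```python
# Values copied from benchmark_report.json in this commit.
REPORT = {
    "prefill": {
        "mean_latency_s": 0.24580386203672833,
        "throughput_tps": 1952.7764780533748,
    },
    "decode": {
        "mean_latency_s": 14.316073944091789,
        "throughput_tps": 207.45911285445072,
    },
    "per_token": {
        "mean_latency_s": 0.14460680751607868,
        "num_values": 99,  # length of the per_token latency list
    },
}


def implied_tokens(mean_latency_s: float, throughput_tps: float) -> float:
    """Token count implied by a throughput figure.

    Assumes throughput = tokens / mean latency (an assumption, not
    something the report states).
    """
    return throughput_tps * mean_latency_s


def summarize(report: dict = REPORT) -> dict:
    """Recover implied token counts and the total per-token decode time."""
    prefill = report["prefill"]
    decode = report["decode"]
    per_token = report["per_token"]
    return {
        "prefill_tokens": implied_tokens(
            prefill["mean_latency_s"], prefill["throughput_tps"]
        ),
        "decode_tokens": implied_tokens(
            decode["mean_latency_s"], decode["throughput_tps"]
        ),
        # mean * count equals the sum of the per-token latencies
        "per_token_total_s": per_token["mean_latency_s"] * per_token["num_values"],
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.6f}")
```

Under these assumptions the prefill pass works out to about 480 tokens and the decode pass to about 2970 tokens, and the per-token latencies add up to the reported decode latency of roughly 14.316 s.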