Update README.md

README.md (CHANGED)

@@ -16,16 +16,6 @@ pipeline_tag: summarization
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

- # InstructTweetSummarizer
-
- This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.3548
- - Rouge1: 47.5134
- - Rouge2: 24.7121
- - Rougel: 35.7366
- - Rougelsum: 35.6499
- - Gen Len: 111.96

## Model description
@@ -33,56 +23,210 @@ More information needed

## Intended uses & limitations

- ## …
- More information needed
- ## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- …
### Training results

- …
### How to use

Here is how to use this model with the [pipeline API](https://huggingface.co/transformers/main_classes/pipelines.html):

```python
# …
```
- ### Framework versions
-
- - Transformers 4.34.1
- - Pytorch 2.1.0
- - Datasets 2.14.7
- - Tokenizers 0.14.1
README.md (after):

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

## Model description

## Intended uses & limitations

Runs on a consumer-grade GPU.

## GPU

Tesla M60, 16 GB VRAM
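The dtype matters on a card this old: bf16 needs an Ampere-class GPU (compute capability 8.0+), and Maxwell parts such as the M60 lack fast fp16 math, so the example further below falls back to fp32. A small probe sketch (not from the original card; `pick_dtype` is a hypothetical helper):

```python
import torch

def pick_dtype() -> torch.dtype:
    """Pick a safe compute dtype for the current GPU."""
    if not torch.cuda.is_available():
        return torch.float32
    if torch.cuda.is_bf16_supported():    # Ampere (8.x) and newer
        return torch.bfloat16
    major, _ = torch.cuda.get_device_capability()
    if major >= 7:                        # Volta/Turing: usable fp16 tensor cores
        return torch.float16
    return torch.float32                  # Maxwell-era cards like the Tesla M60

print(pick_dtype())
```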
### Training hyperparameters

The following hyperparameters were used during training:

- per_device_train_batch_size: int = 2
- per_device_eval_batch_size: int = 2
- gradient_accumulation_steps: int = 16
- learning_rate: float = 2e-4
- weight_decay: float = 0.01
- warmup_ratio: float = 0.03
- logging_steps: int = 1
- save_steps: int = 200
- eval_steps: int = 200
- max_seq_length: int = 1024
- num_train_epochs: int = 1
- max_grad_norm: float = 0.3
- num_epochs: 5
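These values line up with the fields of `transformers.TrainingArguments`. A hedged sketch of the corresponding configuration: `output_dir` is an assumption, and since the list above gives both `num_train_epochs = 1` and `num_epochs = 5`, the sketch uses 5, which is consistent with the ~7700 logged steps:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen3-0.6b-alpaca-lora",  # hypothetical name
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=16,       # effective batch size: 2 * 16 = 32
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_ratio=0.03,
    logging_steps=1,
    save_steps=200,
    eval_strategy="steps",                # "evaluation_strategy" on older transformers
    eval_steps=200,
    num_train_epochs=5,                   # the card also lists num_train_epochs = 1
    max_grad_norm=0.3,
)
# max_seq_length=1024 is not a TrainingArguments field; it would be applied
# at tokenization time or via an SFT trainer's own argument.
```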
### Training results

| Step | Training Loss | Validation Loss | Entropy | Num Tokens | Mean Token Accuracy |
|-----:|--------------:|----------------:|--------:|-----------:|--------------------:|
| 50   | 1.399500 | 1.379406 | 1.427977 | 166273  | 0.684534 |
| 100  | 1.350000 | 1.272701 | 1.238351 | 331643  | 0.698206 |
| 150  | 1.391500 | 1.252361 | 1.240065 | 497339  | 0.701490 |
| 200  | 1.175000 | 1.243332 | 1.248332 | 664364  | 0.701699 |
| 250  | 1.357100 | 1.235908 | 1.209817 | 830792  | 0.703880 |
| 300  | 1.341700 | 1.226673 | 1.196961 | 995955  | 0.705412 |
| 350  | 1.211000 | 1.223105 | 1.219540 | 1161755 | 0.705137 |
| 400  | 1.414100 | 1.219148 | 1.218188 | 1330892 | 0.706035 |
| 450  | 1.088200 | 1.214209 | 1.244467 | 1494009 | 0.707179 |
| 500  | 1.302800 | 1.210984 | 1.203838 | 1659876 | 0.707986 |
| 550  | 1.192800 | 1.208378 | 1.201593 | 1828355 | 0.708459 |
| 600  | 1.302300 | 1.206382 | 1.212914 | 1989352 | 0.708516 |
| 650  | 1.177800 | 1.205050 | 1.245975 | 2155580 | 0.708198 |
| 700  | 1.156600 | 1.201754 | 1.201212 | 2323534 | 0.709032 |
| 750  | 1.271000 | 1.201216 | 1.218800 | 2488415 | 0.708988 |
| 800  | 1.264100 | 1.198175 | 1.182730 | 2655756 | 0.710219 |
| 850  | 1.324600 | 1.196617 | 1.189218 | 2822068 | 0.710231 |
| 900  | 1.159400 | 1.198235 | 1.207774 | 2988438 | 0.708831 |
| 950  | 1.294200 | 1.194295 | 1.211270 | 3153113 | 0.709955 |
| 1000 | 1.370000 | 1.192295 | 1.215226 | 3321550 | 0.710322 |
| 1050 | 1.157300 | 1.190316 | 1.214881 | 3485313 | 0.710768 |
| 1100 | 1.124000 | 1.189019 | 1.210650 | 3651712 | 0.711739 |
| 1150 | 1.139700 | 1.188874 | 1.209716 | 3815535 | 0.711151 |
| 1200 | 1.293600 | 1.187840 | 1.198137 | 3980373 | 0.710808 |
| 1250 | 1.199800 | 1.186739 | 1.226214 | 4146077 | 0.711442 |
| …    | …        | …        | …        | …       | …        |

(Table truncated; training ran for 7700 steps in total.)
### How to use

Here is how to use this model with 🤗 [Transformers](https://huggingface.co/docs/transformers) and [PEFT](https://huggingface.co/docs/peft):
```python
import time

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen3-0.6B"                      # swap in another base (Phi, MPT, ...) if desired
ADAPTER_PATH = "Sidharthkr/Qwen3_0.6B_alpaca_lora"  # your LoRA output_dir
DEVICE_MAP = "auto"                                 # or "cuda" on a single GPU
DTYPE = torch.float32                               # Tesla M60 (Maxwell): no bf16, so fp32 is the safe default


def create_alpaca_prompt(instruction: str, inp: str = "") -> str:
    """Format a prompt in Alpaca style."""
    if inp.strip():
        return (
            "Below is an instruction that describes a task, paired with an input that provides further context. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction.strip()}\n\n"
            f"### Input:\n{inp.strip()}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:\n"
    )


def load_model_and_tokenizer():
    print(f"Loading base model: {BASE_MODEL}")
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, use_fast=True)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "left"  # left padding is required for batched generation
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        torch_dtype=DTYPE,
        device_map=DEVICE_MAP,
    )
    print(f"Loading LoRA adapter from {ADAPTER_PATH}")
    model = PeftModel.from_pretrained(base_model, ADAPTER_PATH)
    print("Merging LoRA weights into base model for speed...")
    model = model.merge_and_unload()
    model.eval()
    torch.backends.cuda.matmul.allow_tf32 = False  # for safety on older GPUs
    return model, tokenizer


def _extract_response(full_text: str) -> str:
    """Strip the prompt part, keeping only the generated response."""
    if "### Response:" in full_text:
        return full_text.split("### Response:")[-1].strip()
    return full_text.strip()


@torch.no_grad()
def generate_single(model, tokenizer, instruction: str, inp: str = "", max_new_tokens: int = 256):
    prompt = create_alpaca_prompt(instruction, inp)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding: avoids torch.multinomial, which can misbehave on old GPUs
        pad_token_id=tokenizer.eos_token_id,
        use_cache=True,
    )
    return _extract_response(tokenizer.decode(output_ids[0], skip_special_tokens=True))


@torch.no_grad()
def generate_batch(model, tokenizer, instructions, inputs=None, max_new_tokens: int = 256):
    if inputs is None:
        inputs = [""] * len(instructions)
    prompts = [create_alpaca_prompt(inst, inp) for inst, inp in zip(instructions, inputs)]
    tokenized = tokenizer(
        prompts,
        return_tensors="pt",
        padding=True,      # required: prompts in a batch have different lengths
        truncation=True,
    ).to(model.device)
    output_ids = model.generate(
        **tokenized,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, as above
        pad_token_id=tokenizer.eos_token_id,
    )
    return [
        _extract_response(tokenizer.decode(output_ids[i], skip_special_tokens=True))
        for i in range(len(prompts))
    ]


model, tokenizer = load_model_and_tokenizer()

t1 = time.time()
# ---------- Example: single prediction ----------
instruction = "Explain what a GPU is to a 15 year old."
response = generate_single(model, tokenizer, instruction, max_new_tokens=512)
t2 = time.time()
print(f"Total time: {t2 - t1:.2f} seconds")
print("=== Single prediction ===")
print(response)

# Output:
# Total time: 4.42 seconds
# === Single prediction ===
# A GPU (Graphics Processing Unit) is a type of computer processor used to generate
# images and videos. It is used in computers and other devices to create visual
# content, such as games and movies. It is much faster than a CPU (Central
# Processing Unit) and can process more data in less time.
```
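`generate_batch` works the same way on a list of prompts (left padding was configured above for exactly this case); the two instructions below are made-up examples:

```python
questions = [
    "Give three tips for staying healthy.",
    "Summarize what a LoRA adapter is in one sentence.",
]
for answer in generate_batch(model, tokenizer, questions, max_new_tokens=128):
    print(answer)
    print("---")
```

Note that `merge_and_unload()` folds the LoRA weights into the base model for faster inference; skip the merge if you would rather keep the adapter separate, e.g. to swap adapters at runtime.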