Model Card for FiLLM-POSDEPSUM

Model Details

Model Description

FiLLM is a Filipino-optimized Large Language Model designed to improve natural language processing (NLP) in the Filipino language. Built on the SeaLLM-7B-v2.5 model, FiLLM uses Low-Rank Adaptation (LoRA) fine-tuning to reduce memory use while maintaining task-specific performance. The model was trained and evaluated on diverse Filipino datasets to address key NLP tasks: Named Entity Recognition (NER), Part-of-Speech (POS) tagging, Dependency Parsing, and Text Summarization.

FiLLM-POSDEPSUM is the model that handles POS tagging, dependency parsing, and text summarization. NER is handled by a separate model, FiLLM-NER.
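
To illustrate why LoRA fine-tuning saves memory, the sketch below compares the number of trainable parameters in a full update of one weight matrix with a rank-r LoRA update of the form W + BA, where W stays frozen. The layer dimensions and rank are hypothetical values chosen for illustration; this card does not state FiLLM's actual LoRA configuration.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in); W itself is frozen.
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, NOT taken from the FiLLM training setup.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_update_params(D_OUT, D_IN)
lora = lora_update_params(D_OUT, D_IN, RANK)
print(f"full update: {full:,} trainable parameters")
print(f"LoRA update: {lora:,} trainable parameters ({lora / full:.2%} of full)")
```

Because gradients and optimizer state are kept only for the small A and B matrices, the training memory footprint shrinks along with the trainable parameter count.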

Developed by: Isaiah Job Cuenca Enriquez, Carlos Jude Maminta, and Deandre Nigel Corpuz Nuñez
Funded by: Self-funded
Model type: Transformer-based large language model (LoRA fine-tune)
Language(s) (NLP): Filipino
License: [More Information Needed]
Finetuned from model: SeaLLM-7B-v2.5, which is itself fine-tuned from Gemma 7B

For more information, see the paper listed under Model Sources below.

Model Sources [optional]

Repository: [More Information Needed]
Paper: FiLLM - A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM)

Uses

This model performs POS (part-of-speech) tagging and dependency parsing on Filipino text, and summarizes Filipino text.

Sample usage:

POS tagging and dependency parsing:

Prompt: Ano ang POS at dependency parsing ng teksto? "Maraming salamat po!" sabi ko.
Output: POS: " (PUNCT), Marami (DET), -ng (PART), salamat (NOUN), po (PART), ! (PUNCT), " (PUNCT), sabi (VERB), ko (PRON), . (PUNCT) Dependency Parsing: " (punct, head: salamat), Marami (det, head: salamat), -ng (mark, head: Marami), salamat (compound, head: Marami), po (discourse, head: salamat), ! (punct, head: salamat), " (punct, head: salamat), sabi (root), ko (obj, head: sabi), . (punct, head: sabi)
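
The sample output above follows a regular pattern, so it can be turned into structured data. The following is a minimal sketch that assumes every response has the same layout as this single sample (`token (TAG)` items, then `Dependency Parsing:`, then `token (relation, head: word)` items). The names `parse_output`, `DepArc`, and `SAMPLE` are illustrative and not part of the model or any library.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class DepArc(NamedTuple):
    token: str
    relation: str
    head: Optional[str]  # None when no head is given (the root token)


# "token (TAG)" items, e.g. 'Marami (DET)'
POS_ITEM = re.compile(r"(\S+) \(([A-Z]+)\)")
# "token (relation, head: word)" or "token (root)" items
DEP_ITEM = re.compile(r"(\S+) \(([a-z:]+)(?:, head: (\S+?))?\)")


def parse_output(output: str) -> Tuple[List[Tuple[str, str]], List[DepArc]]:
    """Split a response into (token, POS tag) pairs and dependency arcs."""
    pos_part, _, dep_part = output.partition("Dependency Parsing:")
    pos_tags = POS_ITEM.findall(pos_part)
    arcs = [
        DepArc(token, relation, head or None)
        for token, relation, head in DEP_ITEM.findall(dep_part)
    ]
    return pos_tags, arcs


# The sample output from this card, copied verbatim.
SAMPLE = (
    'POS: " (PUNCT), Marami (DET), -ng (PART), salamat (NOUN), po (PART), '
    '! (PUNCT), " (PUNCT), sabi (VERB), ko (PRON), . (PUNCT) '
    'Dependency Parsing: " (punct, head: salamat), Marami (det, head: salamat), '
    '-ng (mark, head: Marami), salamat (compound, head: Marami), '
    'po (discourse, head: salamat), ! (punct, head: salamat), '
    '" (punct, head: salamat), sabi (root), ko (obj, head: sabi), '
    '. (punct, head: sabi)'
)

pos_tags, arcs = parse_output(SAMPLE)
for token, tag in pos_tags:
    print(token, tag)
for arc in arcs:
    print(arc.token, arc.relation, arc.head)
```

Heads are reported by word form rather than by position, so a sentence that repeats a word cannot be resolved unambiguously from this output alone.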

Text summarization:

Prompt: Ano ang summary ng teksto? Ang stand-up na komidyanteng si Mitch Benn ay sinulat ang I'm Proud of the BBC bilang tugon sa mga kritisismo na tinatanggap ng korporasyong pinupondohan ng bayad ng lisensya mula sa mga komersyal na kakumpitensya at ang right-wing press.
Output: Si Mitch Benn ay gumawa ng awit na "I'm Proud of the BBC" bilang tugon sa mga kritisismo sa korporasyon

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

The model may hallucinate, especially when the prompt does not end in a period.
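
One way to guard against this is to normalize prompts before sending them to the model. The helper below is a minimal sketch; `ensure_final_period` is a hypothetical name, and the card does not say whether other closing punctuation (such as `!` or `?`) avoids the problem, so the helper checks only for a period.

```python
def ensure_final_period(prompt: str) -> str:
    """Strip trailing whitespace and append a period if the prompt lacks one."""
    stripped = prompt.rstrip()
    if stripped.endswith("."):
        return stripped
    return stripped + "."


print(ensure_final_period('Ano ang POS at dependency parsing ng teksto? "Maraming salamat po!" sabi ko'))
```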

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
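
A minimal sketch of loading the model with the Hugging Face `transformers` library, assuming the repository `jobenriquez/FiLLM-POSDEPSUM` hosts full model weights loadable through `AutoModelForCausalLM`. The prompt layout follows the sample prompts in this card. Whether the model expects a chat template is not stated here, so passing the plain prompt is an assumption, as are the helper names and the `max_new_tokens` default.

```python
REPO_ID = "jobenriquez/FiLLM-POSDEPSUM"  # repository name as shown on the model page

# Task questions copied from the sample prompts in this card.
POS_DEP_QUESTION = "Ano ang POS at dependency parsing ng teksto?"
SUMMARY_QUESTION = "Ano ang summary ng teksto?"


def build_prompt(question: str, text: str) -> str:
    """Join the task question and the input text, as in the sample prompts."""
    return f"{question} {text.strip()}"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model, run generation, and return only the newly generated text."""
    # Imported lazily so build_prompt works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID,
        torch_dtype=torch.bfloat16,  # BF16 is the tensor type listed for this checkpoint
        device_map="auto",  # requires the accelerate package
    )
    # Assumption: the plain prompt is tokenized directly, with no chat template.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt(POS_DEP_QUESTION, '"Maraming salamat po!" sabi ko.')
print(prompt)
```

Calling `generate(prompt)` downloads the weights on first use and needs enough GPU or CPU memory for a 9B-parameter model in BF16. Use `SUMMARY_QUESTION` in place of `POS_DEP_QUESTION` for summarization.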

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]
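
Once the fields above are known, a rough estimate can be computed as energy used multiplied by the carbon intensity of the grid. The sketch below simplifies what such calculators do and does not reproduce the Machine Learning Impact calculator itself. The function name and every input value are hypothetical placeholders, not measurements from FiLLM training.

```python
def estimate_co2_kg(
    power_watts: float,
    hours: float,
    kg_co2_per_kwh: float,
    pue: float = 1.0,
) -> float:
    """Estimate emissions as energy (kWh) times grid carbon intensity.

    pue is the data-center power usage effectiveness; 1.0 means no overhead.
    """
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * kg_co2_per_kwh


# Placeholder inputs for illustration only.
estimate = estimate_co2_kg(power_watts=300, hours=10, kg_co2_per_kwh=0.5)
print(f"estimated emissions: {estimate:.2f} kg CO2eq")
```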

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Safetensors

Model size: 9B params
Tensor type: BF16

Model tree for jobenriquez/FiLLM-POSDEPSUM

Base model: SeaLLMs/SeaLLM-7B-v2.5
Finetuned (7): this model
Quantizations: 2 models

Paper for jobenriquez/FiLLM-POSDEPSUM

Quantifying the Carbon Emissions of Machine Learning (arXiv:1910.09700, published Oct 21, 2019)