Communicative Engagement Classification API

Overview

This project deploys an XGBoost machine learning model as a cloud-based inference API using FastAPI. The model predicts participant engagement behavior during online meetings based on Recall.ai participant event data extracted from Zoom meetings.

The API is designed to support a larger event-driven engagement analytics pipeline that processes participant activity after meetings end.

Purpose

The purpose of this model is to classify meeting participants into behavioral engagement groups using engineered participation features derived from Recall.ai event streams.

The model predicts one of three engagement labels:

Label	Meaning
Silent Observer	Participant attended but rarely or never verbally engaged
Occasional Participant	Participant engaged intermittently
Active Participant	Participant frequently engaged verbally and behaviorally

Input Features

The model uses the following engineered features:

Feature	Description
total_time	Total time the participant remained in the meeting
was_webcam_on	Binary indicator (1 = webcam was used, 0 = not used)
screenshare_usage	Number of screenshare events triggered
never_spoke	Binary indicator (1 = participant never spoke, 0 = spoke at least once)
speech_turns	Number of speaking sessions detected

Data Source

The input data is generated from Recall.ai participant event logs collected from Zoom meetings.

Examples of participant events include:

join
leave
speech_on
speech_off
webcam_on
webcam_off
screenshare_on
screenshare_off

These events are processed into participant-level behavioral features before inference.
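
The aggregation step described above might look roughly like the following sketch. The event schema used here (dicts with "participant", "type" and "t" in seconds) is an assumption made for illustration and is not the Recall.ai payload format; the function name `build_features` and the rule of closing an open join at the end of the meeting are likewise hypothetical.

```python
from collections import defaultdict


def build_features(events, meeting_end):
    """Aggregate a flat event list into one feature row per participant.

    `events` is a list of dicts with hypothetical keys "participant",
    "type" and "t" (seconds since meeting start). This is an assumed
    schema, not the Recall.ai one.
    """
    rows = defaultdict(lambda: {
        "total_time": 0.0,
        "was_webcam_on": 0,
        "screenshare_usage": 0,
        "never_spoke": 1,
        "speech_turns": 0,
    })
    joined_at = {}  # participant -> timestamp of the currently open join

    for ev in sorted(events, key=lambda e: e["t"]):
        who, kind, t = ev["participant"], ev["type"], ev["t"]
        row = rows[who]
        if kind == "join":
            joined_at[who] = t
        elif kind == "leave" and who in joined_at:
            row["total_time"] += t - joined_at.pop(who)
        elif kind == "webcam_on":
            row["was_webcam_on"] = 1
        elif kind == "screenshare_on":
            row["screenshare_usage"] += 1
        elif kind == "speech_on":
            row["speech_turns"] += 1
            row["never_spoke"] = 0

    # Assumption: a participant with no leave event stayed until the end.
    for who, start in joined_at.items():
        rows[who]["total_time"] += meeting_end - start

    return dict(rows)
```

Each value in the returned dict carries the five feature names the model expects, so adding `student_name` and `meeting_id` to it yields a row in the request format used by the prediction endpoint.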

Model Architecture

Component	Value
Model Type	XGBoost Classifier
Task	Multi-class Classification
Output Classes	3
Training Data	Recall.ai participant meeting features
Framework	xgboost
API Framework	FastAPI

API Endpoints

Health Check

GET /

Returns API health status.

Example Response

{
  "status": "running"
}

Prediction Endpoint

POST /predict

Runs engagement classification on participant feature rows.

Request Format

{
  "rows": [
    {
      "student_name": "John",
      "meeting_id": "meeting_123",
      "total_time": 3200,
      "was_webcam_on": 1,
      "screenshare_usage": 0,
      "never_spoke": 0,
      "speech_turns": 5
    }
  ]
}
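
The service as written fills any missing feature column with 0, so a caller that would rather catch gaps before sending can check rows first. The sketch below is one dependency-free way to do that; `validate_row` and its rules (non-negative numbers, 0/1 for the binary flags) are assumptions inferred from the feature descriptions, not checks the API itself performs.

```python
REQUIRED_FEATURES = [
    "total_time",
    "was_webcam_on",
    "screenshare_usage",
    "never_spoke",
    "speech_turns",
]
BINARY_FEATURES = {"was_webcam_on", "never_spoke"}


def validate_row(row):
    """Return a list of problems found in one request row (empty if valid)."""
    problems = []
    for name in REQUIRED_FEATURES:
        if name not in row:
            problems.append(f"missing feature: {name}")
            continue
        value = row[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be numeric")
        elif value < 0:
            problems.append(f"{name} must be non-negative")
        elif name in BINARY_FEATURES and value not in (0, 1):
            problems.append(f"{name} must be 0 or 1")
    # The response echoes these two identifiers back, so they must be present.
    for name in ("student_name", "meeting_id"):
        if not isinstance(row.get(name), str):
            problems.append(f"{name} must be a string")
    return problems
```

Inside the service, the same constraints could instead be expressed as typed fields on the Pydantic request model, which would make FastAPI reject malformed rows before they reach the classifier.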

Response Format

[
  {
    "student_name": "John",
    "meeting_id": "meeting_123",
    "cluster_label": "Occasional Participant"
  }
]

Deployment Purpose

This API is intended to serve as the inference layer for a cloud-based communicative engagement analytics pipeline.

The larger architecture consists of:

Recall.ai
    ↓
Webhook Trigger
    ↓
FastAPI Cloud Server
    ↓
Participant Event Processing
    ↓
XGBoost Inference API
    ↓
Google Sheets / Analytics Storage
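
The flow in the diagram can be read as three stages run in sequence once a meeting ends. The sketch below wires them together with injected callables; `run_post_meeting_pipeline` and its three parameters are hypothetical names, and how the actual pipeline connects its stages is not specified here.

```python
def run_post_meeting_pipeline(events, extract_features, classify, store):
    """Run the post-meeting stages in the order shown in the diagram.

    All three callables are hypothetical stand-ins:
      extract_features(events) -> list of feature rows
      classify(rows)           -> list of prediction records,
                                  e.g. by calling POST /predict
      store(records)           -> persists the records,
                                  e.g. to a spreadsheet
    """
    rows = extract_features(events)
    if not rows:
        return []  # nothing to classify for an empty meeting
    records = classify(rows)
    store(records)
    return records
```

Passing the stages in as arguments keeps the inference API independent of both the event source and the storage backend, so either can be swapped out or replaced with a fake in tests.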

File Structure

.
├── app.py
├── engagement_xgb_model.json
├── requirements.txt
└── README.md

requirements.txt

fastapi
uvicorn
xgboost
pandas
scikit-learn

app.py

from fastapi import FastAPI
from pydantic import BaseModel

import xgboost as xgb
import pandas as pd

app = FastAPI()

# =========================
# LOAD MODEL
# =========================

model = xgb.XGBClassifier()

model.load_model(
    "engagement_xgb_model.json"
)

# =========================
# FEATURES
# =========================

FEATURES = [

    "total_time",

    "was_webcam_on",

    "screenshare_usage",

    "never_spoke",

    "speech_turns"
]

# =========================
# LABELS
# =========================

LABEL_MAP = {

    0: "Silent Observer",

    1: "Occasional Participant",

    2: "Active Participant"
}

# =========================
# REQUEST MODEL
# =========================

class PredictionRequest(BaseModel):

    rows: list

# =========================
# HEALTH CHECK
# =========================

@app.get("/")
def home():

    return {
        "status": "running"
    }

# =========================
# PREDICTION ENDPOINT
# =========================

@app.post("/predict")
def predict(request: PredictionRequest):

    df = pd.DataFrame(request.rows)

    # Fill any missing feature column with 0 so the model always
    # receives every column listed in FEATURES
    for col in FEATURES:
        if col not in df:
            df[col] = 0

    X = df[FEATURES]

    preds = model.predict(X)

    # Cast each prediction to a plain int before the label lookup
    df["cluster_label"] = [LABEL_MAP[int(p)] for p in preds]

    output = df[[

        "student_name",

        "meeting_id",

        "cluster_label"
    ]]

    return output.to_dict(
        orient="records"
    )

Deployment Instructions

Step 1

Create a new Space on Hugging Face.

Use:

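SDK: Docker (Spaces has no dedicated FastAPI SDK; a FastAPI app runs inside a Docker Space, which also needs a Dockerfile that starts uvicorn)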
Visibility: Public or Private

Step 2

Upload:

app.py
requirements.txt
engagement_xgb_model.json
README.md

Step 3

Wait for Hugging Face to build the API.

Example API URL

https://your-space-name.hf.space/predict

Example cURL Request

curl -X POST \
  https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{
    "rows": [
      {
        "student_name": "John",
        "meeting_id": "meeting_123",
        "total_time": 3200,
        "was_webcam_on": 1,
        "screenshare_usage": 0,
        "never_spoke": 0,
        "speech_turns": 5
      }
    ]
}'
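
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the URL is the placeholder from the example above, and `build_request` and `predict` are illustrative helper names rather than part of the service.

```python
import json
import urllib.request

# Placeholder URL from this README; replace with the real Space URL.
API_URL = "https://your-space-name.hf.space/predict"


def build_request(rows, url=API_URL):
    """Build the POST request for /predict without sending it."""
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=API_URL, timeout=30):
    """Send rows to /predict and return the decoded JSON list."""
    request = build_request(rows, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `predict([...])` with rows in the request format should return a list of records shaped like the response example, one per input row.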

Notes

This API performs inference only.
Training is not performed inside the deployed service.
The model is optimized for lightweight CPU inference.
The API is intended for integration into event-driven engagement analytics systems.
