Communicative Engagement Classification API

Overview

This project deploys an XGBoost machine learning model as a cloud-based inference API using FastAPI. The model predicts participant engagement behavior during online meetings based on Recall.ai participant event data extracted from Zoom meetings.

The API is designed to support a larger event-driven engagement analytics pipeline that processes participant activity after meetings end.


Purpose

The purpose of this model is to classify meeting participants into behavioral engagement groups using engineered participation features derived from Recall.ai event streams.

The model predicts one of three engagement labels:

Label                    Meaning
Silent Observer          Participant attended but rarely or never verbally engaged
Occasional Participant   Participant engaged intermittently
Active Participant       Participant frequently engaged verbally and behaviorally

Input Features

The model uses the following engineered features:

Feature              Description
total_time           Total amount of time the participant remained in the meeting
was_webcam_on        Binary indicator for whether the webcam was used
screenshare_usage    Number of screenshare events triggered
never_spoke          Binary indicator for whether the participant never spoke
speech_turns         Number of speaking sessions detected

Data Source

The input data is generated from Recall.ai participant event logs collected from Zoom meetings.

Examples of participant events include:

  • join
  • leave
  • speech_on
  • speech_off
  • webcam_on
  • webcam_off
  • screenshare_on
  • screenshare_off

These events are processed into participant-level behavioral features before inference.
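The feature-engineering code itself is not part of this repository. As an illustration only, a minimal sketch of how one participant's event stream might be aggregated into the model's five features (assuming each event is a dict with a `type` and a Unix timestamp `ts`, and that the hypothetical helper is named `build_features`) could look like:

```python
from typing import Dict, List


def build_features(events: List[Dict]) -> Dict:
    """Aggregate one participant's Recall.ai-style events into model features.

    Illustrative sketch only: assumes events look like
    {"type": "speech_on", "ts": 1700000000} and are sorted by timestamp.
    """
    join_ts = None
    leave_ts = None
    speech_turns = 0       # count of speech_on events (speaking sessions)
    screenshare_usage = 0  # count of screenshare_on events
    was_webcam_on = 0      # flips to 1 if the webcam was ever turned on

    for ev in events:
        kind = ev["type"]
        if kind == "join":
            join_ts = ev["ts"]
        elif kind == "leave":
            leave_ts = ev["ts"]
        elif kind == "speech_on":
            speech_turns += 1
        elif kind == "webcam_on":
            was_webcam_on = 1
        elif kind == "screenshare_on":
            screenshare_usage += 1

    # Time in the meeting is only computable if both join and leave were seen.
    if join_ts is not None and leave_ts is not None:
        total_time = leave_ts - join_ts
    else:
        total_time = 0

    return {
        "total_time": total_time,
        "was_webcam_on": was_webcam_on,
        "screenshare_usage": screenshare_usage,
        "never_spoke": 1 if speech_turns == 0 else 0,
        "speech_turns": speech_turns,
    }
```

The resulting dict matches the feature columns the API expects, so one such row per participant can be placed directly into the `rows` array of a prediction request.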


Model Architecture

Component       Value
Model Type      XGBoost Classifier
Task            Multi-class Classification
Output Classes  3
Training Data   Recall.ai participant meeting features
Framework       xgboost
API Framework   FastAPI

API Endpoints

Health Check

GET /

Returns API health status.

Example Response

{
  "status": "running"
}

Prediction Endpoint

POST /predict

Runs engagement classification on participant feature rows.

Request Format

{
  "rows": [
    {
      "student_name": "John",
      "meeting_id": "meeting_123",
      "total_time": 3200,
      "was_webcam_on": 1,
      "screenshare_usage": 0,
      "never_spoke": 0,
      "speech_turns": 5
    }
  ]
}

Response Format

[
  {
    "student_name": "John",
    "meeting_id": "meeting_123",
    "cluster_label": "Occasional Participant"
  }
]
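The same request can be issued programmatically. A small Python sketch using the standard `json` module (the Space URL is a placeholder, and the commented-out call assumes the third-party `requests` library is installed):

```python
import json

# Placeholder URL; substitute your deployed Space's hostname.
API_URL = "https://your-space-name.hf.space/predict"

payload = {
    "rows": [
        {
            "student_name": "John",
            "meeting_id": "meeting_123",
            "total_time": 3200,
            "was_webcam_on": 1,
            "screenshare_usage": 0,
            "never_spoke": 0,
            "speech_turns": 5,
        }
    ]
}

# With `requests` installed, the call would be:
#   import requests
#   labels = requests.post(API_URL, json=payload).json()
# which returns a list of records, each containing student_name,
# meeting_id, and cluster_label.

body = json.dumps(payload)  # the JSON body sent over the wire
```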

Deployment Purpose

This API is intended to serve as the inference layer for a cloud-based communicative engagement analytics pipeline.

The larger architecture consists of:

Recall.ai
    ↓
Webhook Trigger
    ↓
FastAPI Cloud Server
    ↓
Participant Event Processing
    ↓
XGBoost Inference API
    ↓
Google Sheets / Analytics Storage

File Structure

.
β”œβ”€β”€ app.py
β”œβ”€β”€ engagement_xgb_model.json
β”œβ”€β”€ requirements.txt
└── README.md

requirements.txt

fastapi
uvicorn
xgboost
pandas
scikit-learn

app.py

from fastapi import FastAPI
from pydantic import BaseModel

import pandas as pd
import xgboost as xgb

app = FastAPI()

# =========================
# LOAD MODEL
# =========================

# Load the trained classifier once at startup.
model = xgb.XGBClassifier()
model.load_model("engagement_xgb_model.json")

# =========================
# FEATURES
# =========================

# Feature columns, in the order the model was trained on.
FEATURES = [
    "total_time",
    "was_webcam_on",
    "screenshare_usage",
    "never_spoke",
    "speech_turns",
]

# =========================
# LABELS
# =========================

# Map numeric class indices back to human-readable engagement labels.
LABEL_MAP = {
    0: "Silent Observer",
    1: "Occasional Participant",
    2: "Active Participant",
}

# =========================
# REQUEST MODEL
# =========================

class PredictionRequest(BaseModel):
    # Each row is a dict of participant metadata plus feature values.
    rows: list

# =========================
# HEALTH CHECK
# =========================

@app.get("/")
def home():
    return {"status": "running"}

# =========================
# PREDICTION ENDPOINT
# =========================

@app.post("/predict")
def predict(request: PredictionRequest):
    df = pd.DataFrame(request.rows)

    # Ensure required feature columns exist; default missing ones to 0.
    for col in FEATURES:
        if col not in df:
            df[col] = 0

    X = df[FEATURES]
    preds = model.predict(X)

    # Attach the human-readable label to each row
    # (cast to int since predict returns numpy integers).
    df["cluster_label"] = [LABEL_MAP[int(p)] for p in preds]

    output = df[["student_name", "meeting_id", "cluster_label"]]
    return output.to_dict(orient="records")

Deployment Instructions

Step 1

Create a new Space on Hugging Face.

Use:

  • SDK: Docker (FastAPI apps are served through the Docker SDK)
  • Visibility: Public or Private

Step 2

Upload:

  • app.py
  • requirements.txt
  • engagement_xgb_model.json
  • README.md

Step 3

Wait for Hugging Face to build the Space and start the API.


Example API URL

https://your-space-name.hf.space/predict

Example cURL Request

curl -X POST \
  https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{
    "rows": [
      {
        "student_name": "John",
        "meeting_id": "meeting_123",
        "total_time": 3200,
        "was_webcam_on": 1,
        "screenshare_usage": 0,
        "never_spoke": 0,
        "speech_turns": 5
      }
    ]
}'

Notes

  • This API performs inference only.
  • Training is not performed inside the deployed service.
  • The model is optimized for lightweight CPU inference.
  • The API is intended for integration into event-driven engagement analytics systems.