Communicative Engagement Classification API
Overview
This project deploys an XGBoost machine learning model as a cloud-based inference API using FastAPI. The model predicts participant engagement behavior during online meetings based on Recall.ai participant event data extracted from Zoom meetings.
The API is designed to support a larger event-driven engagement analytics pipeline that processes participant activity after meetings end.
Purpose
The model classifies meeting participants into behavioral engagement groups using engineered participation features derived from Recall.ai event streams.
The model predicts one of three engagement labels:
| Label | Meaning |
|---|---|
| Silent Observer | Participant attended but rarely or never verbally engaged |
| Occasional Participant | Participant engaged intermittently |
| Active Participant | Participant frequently engaged verbally and behaviorally |
Input Features
The model uses the following engineered features:
| Feature | Description |
|---|---|
| total_time | Total time the participant remained in the meeting |
| was_webcam_on | Binary indicator: 1 if the participant's webcam was used, 0 otherwise |
| screenshare_usage | Number of screenshare events the participant triggered |
| never_spoke | Binary indicator: 1 if the participant never spoke, 0 otherwise |
| speech_turns | Number of speaking turns detected |
Data Source
The input data is generated from Recall.ai participant event logs collected from Zoom meetings.
Examples of participant events include:
- join
- leave
- speech_on
- speech_off
- webcam_on
- webcam_off
- screenshare_on
- screenshare_off
These events are processed into participant-level behavioral features before inference.
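The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual processing code: the event shape (dicts with `participant`, `type`, and `timestamp` keys, timestamps in seconds) and the function name `build_features` are assumptions made for the example, and it counts only `screenshare_on` events toward `screenshare_usage`.

```python
from collections import defaultdict


def build_features(events):
    """Aggregate raw events into one feature row per participant.

    Assumption: each event is a dict with hypothetical keys
    'participant', 'type', and 'timestamp' (seconds).
    """
    by_participant = defaultdict(list)
    for event in events:
        by_participant[event["participant"]].append(event)

    rows = []
    for name, evs in by_participant.items():
        evs.sort(key=lambda e: e["timestamp"])

        # Sum the length of every join -> leave interval.
        total_time = 0
        joined_at = None
        for e in evs:
            if e["type"] == "join" and joined_at is None:
                joined_at = e["timestamp"]
            elif e["type"] == "leave" and joined_at is not None:
                total_time += e["timestamp"] - joined_at
                joined_at = None
        if joined_at is not None:
            # No matching leave: close the interval at the last event seen.
            total_time += evs[-1]["timestamp"] - joined_at

        speech_turns = sum(1 for e in evs if e["type"] == "speech_on")
        rows.append({
            "student_name": name,
            "total_time": total_time,
            "was_webcam_on": int(any(e["type"] == "webcam_on" for e in evs)),
            "screenshare_usage": sum(1 for e in evs if e["type"] == "screenshare_on"),
            "never_spoke": int(speech_turns == 0),
            "speech_turns": speech_turns,
        })
    return rows
```

Each returned row carries the five model features plus `student_name`; a `meeting_id` would still need to be attached before the row is sent for inference.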
Model Architecture
| Component | Value |
|---|---|
| Model Type | XGBoost Classifier |
| Task | Multi-class Classification |
| Output Classes | 3 |
| Training Data | Recall.ai participant meeting features |
| Framework | xgboost |
| API Framework | FastAPI |
API Endpoints
Health Check
GET /
Returns API health status.
Example Response
{
  "status": "running"
}
Prediction Endpoint
POST /predict
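Runs engagement classification on one or more participant feature rows. The student_name and meeting_id fields are passed through to the response and are not used as model features.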
Request Format
{
  "rows": [
    {
      "student_name": "John",
      "meeting_id": "meeting_123",
      "total_time": 3200,
      "was_webcam_on": 1,
      "screenshare_usage": 0,
      "never_spoke": 0,
      "speech_turns": 5
    }
  ]
}
Response Format
[
  {
    "student_name": "John",
    "meeting_id": "meeting_123",
    "cluster_label": "Occasional Participant"
  }
]
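A downstream consumer will often want a per-meeting summary rather than individual rows. The sketch below tallies labels from a response shaped like the one above; the helper name `label_counts_by_meeting` is hypothetical and not part of the API.

```python
from collections import Counter, defaultdict


def label_counts_by_meeting(predictions):
    """Count how many participants received each label, per meeting.

    `predictions` is a list of dicts with 'meeting_id' and 'cluster_label'
    keys, as returned by POST /predict.
    """
    counts = defaultdict(Counter)
    for row in predictions:
        counts[row["meeting_id"]][row["cluster_label"]] += 1
    return {meeting: dict(labels) for meeting, labels in counts.items()}
```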
Deployment Purpose
This API is intended to serve as the inference layer for a cloud-based communicative engagement analytics pipeline.
The larger architecture consists of:
Recall.ai
↓
Webhook Trigger
↓
FastAPI Cloud Server
↓
Participant Event Processing
↓
XGBoost Inference API
↓
Google Sheets / Analytics Storage
File Structure
.
├── app.py
├── engagement_xgb_model.json
├── requirements.txt
└── README.md
requirements.txt
fastapi
uvicorn
xgboost
pandas
scikit-learn
app.py
from fastapi import FastAPI
from pydantic import BaseModel
import xgboost as xgb
import pandas as pd

app = FastAPI()

# =========================
# LOAD MODEL
# =========================
# Loaded once at startup; the model file must sit next to app.py.
model = xgb.XGBClassifier()
model.load_model("engagement_xgb_model.json")

# =========================
# FEATURES
# =========================
# Columns passed to the model, in this order.
FEATURES = [
    "total_time",
    "was_webcam_on",
    "screenshare_usage",
    "never_spoke",
    "speech_turns",
]

# =========================
# LABELS
# =========================
LABEL_MAP = {
    0: "Silent Observer",
    1: "Occasional Participant",
    2: "Active Participant",
}

# =========================
# REQUEST MODEL
# =========================
class PredictionRequest(BaseModel):
    # Each item is a dict of participant fields (see Request Format).
    rows: list

# =========================
# HEALTH CHECK
# =========================
@app.get("/")
def home():
    return {"status": "running"}

# =========================
# PREDICTION ENDPOINT
# =========================
@app.post("/predict")
def predict(request: PredictionRequest):
    df = pd.DataFrame(request.rows)

    # Ensure required columns exist; a feature absent from every row defaults to 0.
    for col in FEATURES:
        if col not in df:
            df[col] = 0

    X = df[FEATURES]
    preds = model.predict(X)

    # Cast NumPy integers to plain ints before the dictionary lookup.
    df["cluster_label"] = [LABEL_MAP[int(p)] for p in preds]

    # Return only the identifying fields and the predicted label.
    output = df[["student_name", "meeting_id", "cluster_label"]]
    return output.to_dict(orient="records")
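The endpoint fills a feature column with 0 only when that column is absent from every row, and a zero is not always a neutral default: never_spoke = 0 reads as "the participant spoke". A caller may therefore prefer to detect incomplete rows before sending them. The following is a minimal sketch of such a check; `find_missing_features` is a hypothetical helper, not part of app.py.

```python
FEATURES = [
    "total_time",
    "was_webcam_on",
    "screenshare_usage",
    "never_spoke",
    "speech_turns",
]


def find_missing_features(rows):
    """Return {row_index: [missing feature names]} for incomplete rows."""
    problems = {}
    for index, row in enumerate(rows):
        missing = [name for name in FEATURES if name not in row]
        if missing:
            problems[index] = missing
    return problems
```

An empty result means every row carries all five features; otherwise the caller can drop, repair, or log the listed rows before calling /predict.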
Deployment Instructions
Step 1
Create a new Space on Hugging Face.
Use:
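- SDK: Docker (Spaces has no dedicated FastAPI SDK; a FastAPI app runs as a Docker Space, which also needs a Dockerfile in the repository)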
- Visibility: Public or Private
Step 2
Upload:
- app.py
- requirements.txt
- engagement_xgb_model.json
- README.md
Step 3
Wait for Hugging Face to build the API.
Example API URL
https://your-space-name.hf.space/predict
Example cURL Request
curl -X POST \
  https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{
    "rows": [
      {
        "student_name": "John",
        "meeting_id": "meeting_123",
        "total_time": 3200,
        "was_webcam_on": 1,
        "screenshare_usage": 0,
        "never_spoke": 0,
        "speech_turns": 5
      }
    ]
  }'
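The same request can be made from Python using only the standard library. This is a sketch under the assumption that the Space is reachable at the placeholder URL above; the function names `build_predict_request` and `predict` are illustrative.

```python
import json
import urllib.request

# Placeholder URL from the example above; replace with the real Space URL.
API_URL = "https://your-space-name.hf.space/predict"


def build_predict_request(rows, url=API_URL):
    """Build (but do not send) a POST request carrying the feature rows."""
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=API_URL, timeout=30):
    """Send the rows to the API and return the decoded JSON response."""
    request = build_predict_request(rows, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or unit-test without a live deployment.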
Notes
- This API performs inference only.
- Training is not performed inside the deployed service.
- The model is optimized for lightweight CPU inference.
- The API is intended for integration into event-driven engagement analytics systems.