Spaces:

jkbennitt
/

felix-framework

Paused

App Files Files Community

felix-framework / docs /architecture /core /hypothesis_mathematics.md

jkbennitt

Clean hf-space branch and prepare for HuggingFace Spaces deployment

fb867c3 7 months ago

preview code

raw

history blame contribute delete

12.1 kB

A newer version of the Gradio SDK is available: 6.13.0

Upgrade

Mathematical Formulation of Research Hypotheses

Document Version: 1.0
Date: 2025-08-18
Status: Formal Specification
Related: research/initial_hypothesis.md, docs/architecture/core/mathematical_model.md

Abstract

This document provides rigorous mathematical formulations for the three primary research hypotheses of the Felix Framework. Each hypothesis is translated from empirical predictions into testable mathematical statements with formal proofs, statistical tests, and measurable criteria.

Mathematical Foundations

Notation and Definitions

Agent count: $N \in \mathbb{N}$ (total number of agents)
Time parameter: $\tau \in [0, T]$ (global system time)
Path parameter: $t \in [0,1]$ (position along helix)
Agent $i$ spawn time: $T_i \sim \mathcal{U}(0,1)$
Agent $i$ workload: $W_i(\tau) \in \mathbb{R}^+$
System completion time: $T_c \in \mathbb{R}^+$

Agent State Functions

For agent $i$ at time $\tau$:

Position: $\mathbf{r}_i(\tau) = \mathbf{r}(T_i + (\tau - T_i))$ if $\tau \geq T_i$
Activity: $A_i(\tau) = \mathbb{I}[\tau \geq T_i \text{ and } \tau \leq T_i + P_i]$
Progress: $p_i(\tau) = \min(1, \frac{\tau - T_i}{P_i})$ if $\tau \geq T_i$

Where $P_i$ is the processing duration for agent $i$.

Hypothesis H1: Helical Agent Paths Improve Task Distribution

H1.1 Mathematical Statement

Null Hypothesis ($H_{1,0}$): The coefficient of variation in agent workload for helix architecture is greater than or equal to that of linear pipeline architecture.

$CV_{\text{helix}} \geq CV_{\text{linear}}$

Alternative Hypothesis ($H_{1,1}$): Helical architecture provides better workload distribution.

$CV_{\text{helix}} < CV_{\text{linear}}$

Where the coefficient of variation is: $CV = \frac{\sigma_W}{\mu_W} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^N (W_i - \bar{W})^2}}{\frac{1}{N}\sum_{i=1}^N W_i}$

H1.2 Theoretical Analysis

Helix Architecture Workload Distribution

In the helix architecture, agent workload is influenced by:

Spawn time distribution: $T_i \sim \mathcal{U}(0,1)$
Geometric constraints: Available processing space $\propto 2\pi R(t)$
Natural load balancing: Tapering radius creates bottlenecks

The expected workload for agent $i$ is: $\mathbb{E}[W_i] = \int_0^1 \lambda(t) \cdot \mathbb{P}(\text{agent } i \text{ at position } t) \, dt$

Where $\lambda(t)$ is the workload density function: $\lambda(t) = \frac{\text{Total Work}}{2\pi R(t) \cdot \rho(t)}$

And $\rho(t)$ is the expected agent density at position $t$.

Linear Pipeline Workload Distribution

In linear architecture, workload follows sequential processing: $W_i^{\text{linear}} = \frac{\text{Total Work}}{N} + \epsilon_i$

Where $\epsilon_i$ represents load imbalance due to task heterogeneity.

H1.3 Statistical Test Design

Test Statistic: Two-sample F-test for variance equality $F = \frac{s_{\text{linear}}^2}{s_{\text{helix}}^2}$

Rejection Region: $F > F_{\alpha, N-1, N-1}$ where $\alpha = 0.05$

Power Analysis: For effect size $\delta = \frac{|CV_{\text{helix}} - CV_{\text{linear}}|}{\sigma_{CV}}$, required sample size: $N = \frac{2(z_{\alpha/2} + z_\beta)^2}{\delta^2}$

H1.4 Measurable Criteria

Primary Metric: $CV_{\text{helix}} < 0.2$ and $CV_{\text{linear}} > 0.4$
Secondary Metric: $0.9 \leq \frac{T_c^{\text{helix}}}{T_c^{\text{linear}}} \leq 1.1$
Statistical Significance: $p < 0.05$ for F-test

Hypothesis H2: Spoke Communication Reduces Coordination Overhead

H2.1 Mathematical Statement

Null Hypothesis ($H_{2,0}$): Spoke-based communication overhead is greater than or equal to mesh-based communication.

$O_{\text{spoke}} \geq O_{\text{mesh}}$

Alternative Hypothesis ($H_{2,1}$): Spoke-based communication provides lower overhead.

$O_{\text{spoke}} < O_{\text{mesh}}$

H2.2 Communication Complexity Analysis

Spoke Architecture

Message Count: Each agent communicates only with central post $M_{\text{spoke}} = \sum_{i=1}^N m_i = O(N)$

Where $m_i$ is the number of messages sent by agent $i$.

Latency Model: Message latency is distance-dependent $L_i = \alpha + \beta \cdot d_i + \epsilon_i$

Where:

$d_i = R(t_i)$ is the spoke length (distance to central post)
$\alpha$ is base processing latency
$\beta$ is transmission coefficient
$\epsilon_i \sim \mathcal{N}(0, \sigma_\epsilon^2)$ is random noise

Total Communication Cost: $C_{\text{spoke}} = \sum_{i=1}^N (m_i \cdot L_i + s_i)$

Where $s_i$ is storage overhead for agent $i$.

Mesh Architecture

Message Count: Each agent potentially communicates with all others $M_{\text{mesh}} = \sum_{i=1}^N \sum_{j \neq i} m_{ij} = O(N^2)$

Average Distance: Between agents in mesh topology $\bar{d}_{\text{mesh}} = \mathbb{E}[|\mathbf{r}_i - \mathbf{r}_j|]$

Total Communication Cost: $C_{\text{mesh}} = \sum_{i=1}^N \sum_{j \neq i} (m_{ij} \cdot L_{ij} + s_{ij})$

H2.3 Theoretical Proof

Theorem: For fixed task complexity and $N$ agents, spoke architecture has lower asymptotic communication complexity.

Proof:

Message complexity: $O(N) < O(N^2)$ for $N > 1$
Maximum distance: $\max_i d_i = R_{\text{top}} < \max_{i,j} |\mathbf{r}_i - \mathbf{r}j| \leq 2R{\text{top}} + H$
Storage complexity: Central post requires $O(N)$ connections vs $O(N^2)$ in mesh

Therefore: $\lim_{N \to \infty} \frac{C_{\text{spoke}}}{C_{\text{mesh}}} = \lim_{N \to \infty} \frac{O(N)}{O(N^2)} = 0$ ∎

H2.4 Performance Metrics

Message Count Ratio: $R_M = \frac{M_{\text{spoke}}}{M_{\text{mesh}}} = \frac{N}{\frac{N(N-1)}{2}} = \frac{2}{N-1}$

Latency Distribution:

Spoke: $L_{\text{spoke}} \sim \mathcal{N}(\alpha + \beta \bar{R}, \sigma_L^2)$
Mesh: $L_{\text{mesh}} \sim \mathcal{N}(\alpha + \beta \bar{d}_{\text{mesh}}, \sigma_L^2)$

Statistical Test: Welch's t-test for unequal variances $t = \frac{\bar{L}_{\text{mesh}} - \bar{L}_{\text{spoke}}}{\sqrt{\frac{s_{\text{mesh}}^2}{n_{\text{mesh}}} + \frac{s_{\text{spoke}}^2}{n_{\text{spoke}}}}}$

H2.5 Measurable Criteria

Message Scaling: $M_{\text{spoke}} = O(N)$, $M_{\text{mesh}} = O(N^2)$
Latency Targets: $L_{95,\text{spoke}} < 50ms$, $L_{95,\text{mesh}} > 100ms$
Memory Overhead: $S_{\text{spoke}} = O(N)$, $S_{\text{mesh}} = O(N^2)$

Hypothesis H3: Geometric Tapering Implements Natural Attention Focusing

H3.1 Mathematical Statement

Null Hypothesis ($H_{3,0}$): Agent density does not increase toward the narrow end of the helix.

$\frac{d\rho(t)}{dt} \leq 0 \text{ for } t \in [0.5, 1]$

Alternative Hypothesis ($H_{3,1}$): Agent density increases naturally toward the narrow end.

$\frac{d\rho(t)}{dt} > 0 \text{ for } t \in [0.5, 1]$

H3.2 Attention Focusing Mechanism

Geometric Attention Density

The attention density at parameter $t$ is inversely proportional to available circumferential space:

$A(t) = \frac{k}{2\pi R(t)} = \frac{k}{2\pi R_{\text{bottom}} \left(\frac{R_{\text{top}}}{R_{\text{bottom}}}\right)^t}$

Where $k$ is a normalization constant.

Derivative Analysis

$\frac{dA(t)}{dt} = -\frac{k \ln\left(\frac{R_{\text{top}}}{R_{\text{bottom}}}\right)}{2\pi R_{\text{bottom}}} \left(\frac{R_{\text{top}}}{R_{\text{bottom}}}\right)^{t-1}$

Since $R_{\text{top}} > R_{\text{bottom}}$, we have $\ln\left(\frac{R_{\text{top}}}{R_{\text{bottom}}}\right) > 0$.

Therefore: $\frac{dA(t)}{dt} > 0$ for all $t \in [0,1]$ ∎

Agent Density Evolution

The expected agent density follows: $\rho(t, \tau) = \sum_{i: T_i \leq \tau} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(t - p_i(\tau))^2}{2\sigma^2}\right)$

Where agents are distributed around their current progress positions with variance $\sigma^2$.

H3.3 Bottleneck Theory

Theorem: The tapering helix creates a natural processing bottleneck that concentrates computational effort.

Proof:

Capacity constraint: Processing capacity at position $t$ is $C(t) \propto R(t)$
Flow conservation: Agent throughput must satisfy $\rho(t) \cdot v(t) \leq C(t)$
Velocity adaptation: As $R(t)$ decreases, $v(t)$ must decrease, causing $\rho(t)$ to increase

This creates natural queuing at narrow sections, focusing processing power. ∎

H3.4 Quality Improvement Model

Processing Quality: Assume quality improves with agent density: $Q(t) = Q_0 + \alpha \cdot \rho(t) + \beta \cdot A(t) + \epsilon$

Where:

$Q_0$ is baseline quality
$\alpha$ measures collaboration benefit
$\beta$ measures attention focusing benefit
$\epsilon \sim \mathcal{N}(0, \sigma_Q^2)$ is random variation

Expected Quality Gain: At position $t$ vs linear baseline: $\Delta Q(t) = \alpha \cdot (\rho_{\text{helix}}(t) - \rho_{\text{linear}}) + \beta \cdot A(t)$

H3.5 Statistical Validation

Regression Model: $Q_i = \beta_0 + \beta_1 \rho(t_i) + \beta_2 A(t_i) + \beta_3 X_i + \epsilon_i$

Where $X_i$ are control variables (agent type, task difficulty, etc.).

Hypothesis Test:

$H_0: \beta_1 = \beta_2 = 0$ (no focusing effect)
$H_1: \beta_1 > 0$ or $\beta_2 > 0$ (focusing improves quality)

Test Statistic: F-test for joint significance: $F = \frac{(RSS_0 - RSS_1)/2}{RSS_1/(n-k-1)}$

H3.6 Measurable Criteria

Agent Density: $\rho(t=1) > 1.5 \cdot \rho(t=0)$ (50% increase at narrow end)
Quality Improvement: $Q_{\text{final}} > 1.15 \cdot Q_{\text{baseline}}$ (15% improvement)
Natural Focusing: No explicit prioritization code required
Statistical Significance: $p < 0.05$ for regression coefficients

Integrated Statistical Framework

Experimental Design

Factorial Design: $2^3$ experiment testing:

Architecture type: {Helix, Linear}
Communication: {Spoke, Mesh}
Task complexity: {Low, High}

Response Variables:

Workload coefficient of variation ($CV$)
Communication latency ($L_{95}$)
Agent density gradient ($d\rho/dt$)
Processing quality ($Q$)

Sample Size Calculation: For detecting medium effect size ($\delta = 0.5$) with power $1-\beta = 0.8$: $n = \frac{2(z_{\alpha/2} + z_\beta)^2}{\delta^2} \approx \frac{2(1.96 + 0.84)^2}{0.25} \approx 63$

Multiple Testing Correction

Bonferroni Correction: For $k=3$ primary hypotheses: $\alpha_{\text{adjusted}} = \frac{\alpha}{k} = \frac{0.05}{3} \approx 0.017$

False Discovery Rate: Using Benjamini-Hochberg procedure with $q = 0.05$.

Power Analysis

Effect Size Estimates:

H1: $\delta_1 = \frac{|CV_{\text{helix}} - CV_{\text{linear}}|}{\sigma_{CV}} = 0.8$ (large effect)
H2: $\delta_2 = \frac{|L_{\text{helix}} - L_{\text{linear}}|}{\sigma_L} = 1.2$ (large effect)
H3: $\delta_3 = \frac{|\rho'{\text{helix}} - \rho'{\text{linear}}|}{\sigma_{\rho'}} = 0.6$ (medium effect)

Required Sample Sizes:

H1: $n_1 = 26$ (per group)
H2: $n_2 = 15$ (per group)
H3: $n_3 = 45$ (per group)

Overall Study: $n = \max(n_1, n_2, n_3) = 45$ per experimental condition.

Conclusion

This mathematical framework provides:

Rigorous hypothesis formulations with null and alternative statements
Theoretical proofs for key claims about communication complexity and attention focusing
Statistical test designs with appropriate power calculations
Measurable criteria for empirical validation
Multiple testing corrections for statistical reliability

The framework supports both theoretical analysis and empirical validation of the Felix Framework's advantages over traditional multi-agent architectures.

References

Mathematical model: docs/architecture/core/mathematical_model.md
Initial hypotheses: research/initial_hypothesis.md
Implementation: src/core/helix_geometry.py, src/communication/
Test framework: tests/unit/

Note: This mathematical framework provides the theoretical foundation for rigorous testing of the Felix Framework's research claims and supports peer-reviewed publication of results.