new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Jan 15

Multicell-Fold: geometric learning in folding multicellular life

During developmental processes such as embryogenesis, how a group of cells fold into specific structures, is a central question in biology that defines how living organisms form. Establishing tissue-level morphology critically relies on how every single cell decides to position itself relative to its neighboring cells. Despite its importance, it remains a major challenge to understand and predict the behavior of every cell within the living tissue over time during such intricate processes. To tackle this question, we propose a geometric deep learning model that can predict multicellular folding and embryogenesis, accurately capturing the highly convoluted spatial interactions among cells. We demonstrate that multicellular data can be represented with both granular and foam-like physical pictures through a unified graph data structure, considering both cellular interactions and cell junction networks. We successfully use our model to achieve two important tasks, interpretable 4-D morphological sequence alignment, and predicting local cell rearrangements before they occur at single-cell resolution. Furthermore, using an activation map and ablation studies, we demonstrate that cell geometries and cell junction networks together regulate local cell rearrangement which is critical for embryo morphogenesis. This approach provides a novel paradigm to study morphogenesis, highlighting a unified data structure and harnessing the power of geometric deep learning to accurately model the mechanisms and behaviors of cells during development. It offers a pathway toward creating a unified dynamic morphological atlas for a variety of developmental processes such as embryogenesis.

  • 5 authors
·
Jul 9, 2024

STAGED: A Multi-Agent Neural Network for Learning Cellular Interaction Dynamics

The advent of single-cell technology has significantly improved our understanding of cellular states and subpopulations in various tissues under normal and diseased conditions by employing data-driven approaches such as clustering and trajectory inference. However, these methods consider cells as independent data points of population distributions. With spatial transcriptomics, we can represent cellular organization, along with dynamic cell-cell interactions that lead to changes in cell state. Still, key computational advances are necessary to enable the data-driven learning of such complex interactive cellular dynamics. While agent-based modeling (ABM) provides a powerful framework, traditional approaches rely on handcrafted rules derived from domain knowledge rather than data-driven approaches. To address this, we introduce Spatio Temporal Agent-Based Graph Evolution Dynamics(STAGED) integrating ABM with deep learning to model intercellular communication, and its effect on the intracellular gene regulatory network. Using graph ODE networks (GDEs) with shared weights per cell type, our approach represents genes as vertices and interactions as directed edges, dynamically learning their strengths through a designed attention mechanism. Trained to match continuous trajectories of simulated as well as inferred trajectories from spatial transcriptomics data, the model captures both intercellular and intracellular interactions, enabling a more adaptive and accurate representation of cellular dynamics.

  • 9 authors
·
Jul 15, 2025

Zyxin is all you need: machine learning adherent cell mechanics

Cellular form and function emerge from complex mechanochemical systems within the cytoplasm. No systematic strategy currently exists to infer large-scale physical properties of a cell from its many molecular components. This is a significant obstacle to understanding biophysical processes such as cell adhesion and migration. Here, we develop a data-driven biophysical modeling approach to learn the mechanical behavior of adherent cells. We first train neural networks to predict forces generated by adherent cells from images of cytoskeletal proteins. Strikingly, experimental images of a single focal adhesion protein, such as zyxin, are sufficient to predict forces and generalize to unseen biological regimes. This protein field alone contains enough information to yield accurate predictions even if forces themselves are generated by many interacting proteins. We next develop two approaches - one explicitly constrained by physics, the other more agnostic - that help construct data-driven continuum models of cellular forces using this single focal adhesion field. Both strategies consistently reveal that cellular forces are encoded by two different length scales in adhesion protein distributions. Beyond adherent cell mechanics, our work serves as a case study for how to integrate neural networks in the construction of predictive phenomenological models in cell biology, even when little knowledge of the underlying microscopic mechanisms exist.

  • 8 authors
·
Feb 28, 2023

Limits on the accuracy of contact inhibition of locomotion

Cells that collide with each other repolarize away from contact, in a process called contact inhibition of locomotion (CIL), which is necessary for correct development of the embryo. CIL can occur even when cells make a micron-scale contact with a neighbor - much smaller than their size. How precisely can a cell sense cell-cell contact and repolarize in the correct direction? What factors control whether a cell recognizes it has contacted a neighbor? We propose a theoretical model for the limits of CIL where cells recognize the presence of another cell by binding the protein ephrin with the Eph receptor. This recognition is made difficult by the presence of interfering ligands that bind nonspecifically. Both theoretical predictions and simulation results show that it becomes more difficult to sense cell-cell contact when it is difficult to distinguish ephrin from the interfering ligands, or when there are more interfering ligands, or when the contact width decreases. However, the error of estimating contact position remains almost constant when the contact width changes. This happens because the cell gains spatial information largely from the boundaries of cell-cell contact. We study using statistical decision theory the likelihood of a false positive CIL event in the absence of cell-cell contact, and the likelihood of a false negative where CIL does not occur when another cell is present. Our results suggest that the cell is more likely to make incorrect decisions when the contact width is very small or so large that it nears the cell's perimeter. However, in general, we find that cells have the ability to make reasonably reliable CIL decisions even for very narrow (micron-scale) contacts, even if the concentration of interfering ligands is ten times that of the correct ligands.

  • 2 authors
·
Oct 31, 2023

CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning

High-content screening (HCS) assays based on high-throughput microscopy techniques such as Cell Painting have enabled the interrogation of cells' morphological responses to perturbations at an unprecedented scale. The collection of such data promises to facilitate a better understanding of the relationships between different perturbations and their effects on cellular state. Towards achieving this goal, recent advances in cross-modal contrastive learning could, in theory, be leveraged to learn a unified latent space that aligns perturbations with their corresponding morphological effects. However, the application of such methods to HCS data is not straightforward due to substantial differences in the semantics of Cell Painting images compared to natural images, and the difficulty of representing different classes of perturbations (e.g., small molecule vs CRISPR gene knockout) in a single latent space. In response to these challenges, here we introduce CellCLIP, a cross-modal contrastive learning framework for HCS data. CellCLIP leverages pre-trained image encoders coupled with a novel channel encoding scheme to better capture relationships between different microscopy channels in image embeddings, along with natural language encoders for representing perturbations. Our framework outperforms current open-source models, demonstrating the best performance in both cross-modal retrieval and biologically meaningful downstream tasks while also achieving significant reductions in computation time.

  • 4 authors
·
May 16, 2025

Homogenized C. elegans Neural Activity and Connectivity Data

There is renewed interest in modeling and understanding the nervous system of the nematode Caenorhabditis elegans (C. elegans), as this small model system provides a path to bridge the gap between nervous system structure (connectivity) and function (physiology). However, existing physiology datasets, whether involving passive recording or stimulation, are in distinct formats, and connectome datasets require preprocessing before analysis can commence. Here we compile and homogenize datasets of neural activity and connectivity. Our neural activity dataset is derived from 11 C. elegans neuroimaging experiments, while our connectivity dataset is compiled from 9 connectome annotations based on 3 primary electron microscopy studies and 1 signal propagation study. Physiology datasets, collected under varying protocols, measure calcium fluorescence in labeled subsets of the worm's 300 neurons. Our preprocessing pipeline standardizes these datasets by consistently ordering labeled neurons and resampling traces to a common sampling rate, yielding recordings from approximately 900 worms and 250 uniquely labeled neurons. The connectome datasets, collected from electron microscopy reconstructions, represent the entire nervous system as a graph of connections. Our collection is accessible on HuggingFace, facilitating analysis of the structure-function relationship in biology using modern neural network architectures and enabling cross-lab and cross-animal comparisons.

  • 4 authors
·
Nov 18, 2024

Learning dynamic representations of the functional connectome in neurobiological networks

The static synaptic connectivity of neuronal circuits stands in direct contrast to the dynamics of their function. As in changing community interactions, different neurons can participate actively in various combinations to effect behaviors at different times. We introduce an unsupervised approach to learn the dynamic affinities between neurons in live, behaving animals, and to reveal which communities form among neurons at different times. The inference occurs in two major steps. First, pairwise non-linear affinities between neuronal traces from brain-wide calcium activity are organized by non-negative tensor factorization (NTF). Each factor specifies which groups of neurons are most likely interacting for an inferred interval in time, and for which animals. Finally, a generative model that allows for weighted community detection is applied to the functional motifs produced by NTF to reveal a dynamic functional connectome. Since time codes the different experimental variables (e.g., application of chemical stimuli), this provides an atlas of neural motifs active during separate stages of an experiment (e.g., stimulus application or spontaneous behaviors). Results from our analysis are experimentally validated, confirming that our method is able to robustly predict causal interactions between neurons to generate behavior. Code is available at https://github.com/dyballa/dynamic-connectomes.

  • 5 authors
·
Feb 21, 2024

NuClick: A Deep Learning Framework for Interactive Segmentation of Microscopy Images

Object segmentation is an important step in the workflow of computational pathology. Deep learning based models generally require large amount of labeled data for precise and reliable prediction. However, collecting labeled data is expensive because it often requires expert knowledge, particularly in medical imaging domain where labels are the result of a time-consuming analysis made by one or more human experts. As nuclei, cells and glands are fundamental objects for downstream analysis in computational pathology/cytology, in this paper we propose a simple CNN-based approach to speed up collecting annotations for these objects which requires minimum interaction from the annotator. We show that for nuclei and cells in histology and cytology images, one click inside each object is enough for NuClick to yield a precise annotation. For multicellular structures such as glands, we propose a novel approach to provide the NuClick with a squiggle as a guiding signal, enabling it to segment the glandular boundaries. These supervisory signals are fed to the network as auxiliary inputs along with RGB channels. With detailed experiments, we show that NuClick is adaptable to the object scale, robust against variations in the user input, adaptable to new domains, and delivers reliable annotations. An instance segmentation model trained on masks generated by NuClick achieved the first rank in LYON19 challenge. As exemplar outputs of our framework, we are releasing two datasets: 1) a dataset of lymphocyte annotations within IHC images, and 2) a dataset of segmented WBCs in blood smear images.

  • 4 authors
·
May 29, 2020

Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines

High-throughput screening techniques, such as microscopy imaging of cellular responses to genetic and chemical perturbations, play a crucial role in drug discovery and biomedical research. However, robust perturbation screening for de novo cell lines remains challenging due to the significant morphological and biological heterogeneity across cell lines. To address this, we propose a novel framework that integrates external biological knowledge into existing pretraining strategies to enhance microscopy image profiling models. Our approach explicitly disentangles perturbation-specific and cell line-specific representations using external biological information. Specifically, we construct a knowledge graph leveraging protein interaction data from STRING and Hetionet databases to guide models toward perturbation-specific features during pretraining. Additionally, we incorporate transcriptomic features from single-cell foundation models to capture cell line-specific representations. By learning these disentangled features, our method improves the generalization of imaging models to de novo cell lines. We evaluate our framework on the RxRx database through one-shot fine-tuning on an RxRx1 cell line and few-shot fine-tuning on cell lines from the RxRx19a dataset. Experimental results demonstrate that our method enhances microscopy image profiling for de novo cell lines, highlighting its effectiveness in real-world phenotype-based drug discovery applications.

  • 4 authors
·
Jul 14, 2025

OmniCellTOSG: The First Cell Text-Omic Signaling Graphs Dataset for Joint LLM and GNN Modeling

Complex cell signaling systems -- governed by varying protein abundances and interactions -- generate diverse cell types across organs. These systems evolve under influences such as age, sex, diet, environmental exposures, and diseases, making them challenging to decode given the involvement of tens of thousands of genes and proteins. Recently, hundreds of millions of single-cell omics data have provided a robust foundation for understanding these signaling networks within various cell subpopulations and conditions. Inspired by the success of large foundation models (for example, large language models and large vision models) pre-trained on massive datasets, we introduce OmniCellTOSG, the first dataset of cell text-omic signaling graphs (TOSGs). Each TOSG represents the signaling network of an individual or meta-cell and is labeled with information such as organ, disease, sex, age, and cell subtype. OmniCellTOSG offers two key contributions. First, it introduces a novel graph model that integrates human-readable annotations -- such as biological functions, cellular locations, signaling pathways, related diseases, and drugs -- with quantitative gene and protein abundance data, enabling graph reasoning to decode cell signaling. This approach calls for new joint models combining large language models and graph neural networks. Second, the dataset is built from single-cell RNA sequencing data of approximately 120 million cells from diverse tissues and conditions (healthy and diseased) and is fully compatible with PyTorch. This facilitates the development of innovative cell signaling models that could transform research in life sciences, healthcare, and precision medicine. The OmniCellTOSG dataset is continuously expanding and will be updated regularly. The dataset and code are available at https://github.com/FuhaiLiAiLab/OmniCellTOSG.

  • 13 authors
·
Apr 2, 2025

Simplicial Closure and higher-order link prediction

Networks provide a powerful formalism for modeling complex systems by using a model of pairwise interactions. But much of the structure within these systems involves interactions that take place among more than two nodes at once; for example, communication within a group rather than person-to person, collaboration among a team rather than a pair of coauthors, or biological interaction between a set of molecules rather than just two. Such higher-order interactions are ubiquitous, but their empirical study has received limited attention, and little is known about possible organizational principles of such structures. Here we study the temporal evolution of 19 datasets with explicit accounting for higher-order interactions. We show that there is a rich variety of structure in our datasets but datasets from the same system types have consistent patterns of higher-order structure. Furthermore, we find that tie strength and edge density are competing positive indicators of higher-order organization, and these trends are consistent across interactions involving differing numbers of nodes. To systematically further the study of theories for such higher-order structures, we propose higher-order link prediction as a benchmark problem to assess models and algorithms that predict higher-order structure. We find a fundamental differences from traditional pairwise link prediction, with a greater role for local rather than long-range information in predicting the appearance of new interactions.

  • 5 authors
·
Feb 19, 2018

GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models

Foundation models for single-cell RNA sequencing (scRNA-seq) have shown promising capabilities in capturing gene expression patterns. However, current approaches face critical limitations: they ignore biological prior knowledge encoded in gene regulatory relationships and fail to leverage multi-omics signals that could provide complementary regulatory insights. In this paper, we propose GRNFormer, a new framework that systematically integrates multi-scale Gene Regulatory Networks (GRNs) inferred from multi-omics data into RNA foundation model training. Our framework introduces two key innovations. First, we introduce a pipeline for constructing hierarchical GRNs that capture regulatory relationships at both cell-type-specific and cell-specific resolutions. Second, we design a structure-aware integration framework that addresses the information asymmetry in GRNs through two technical advances: (1) A graph topological adapter using multi-head cross-attention to weight regulatory relationships dynamically, and (2) a novel edge perturbation strategy that perturb GRNs with biologically-informed co-expression links to augment graph neural network training. Comprehensive experiments have been conducted on three representative downstream tasks across multiple model architectures to demonstrate the effectiveness of GRNFormer. It achieves consistent improvements over state-of-the-art (SoTA) baselines: 3.6% increase in drug response prediction correlation, 9.6% improvement in single-cell drug classification AUC, and 1.1% average gain in gene perturbation prediction accuracy.

  • 9 authors
·
Mar 3, 2025

Neuro-inspired Ensemble-to-Ensemble Communication Primitives for Sparse and Efficient ANNs

The structure of biological neural circuits-modular, hierarchical, and sparsely interconnected-reflects an efficient trade-off between wiring cost, functional specialization, and robustness. These principles offer valuable insights for artificial neural network (ANN) design, especially as networks grow in depth and scale. Sparsity, in particular, has been widely explored for reducing memory and computation, improving speed, and enhancing generalization. Motivated by systems neuroscience findings, we explore how patterns of functional connectivity in the mouse visual cortex-specifically, ensemble-to-ensemble communication, can inform ANN design. We introduce G2GNet, a novel architecture that imposes sparse, modular connectivity across feedforward layers. Despite having significantly fewer parameters than fully connected models, G2GNet achieves superior accuracy on standard vision benchmarks. To our knowledge, this is the first architecture to incorporate biologically observed functional connectivity patterns as a structural bias in ANN design. We complement this static bias with a dynamic sparse training (DST) mechanism that prunes and regrows edges during training. We also propose a Hebbian-inspired rewiring rule based on activation correlations, drawing on principles of biological plasticity. G2GNet achieves up to 75% sparsity while improving accuracy by up to 4.3% on benchmarks, including Fashion-MNIST, CIFAR-10, and CIFAR-100, outperforming dense baselines with far fewer computations.

  • 3 authors
·
Aug 19, 2025

CellForge: Agentic Design of Virtual Cell Models

Virtual cell modeling represents an emerging frontier at the intersection of artificial intelligence and biology, aiming to predict quantities such as responses to diverse perturbations quantitatively. However, autonomously building computational models for virtual cells is challenging due to the complexity of biological systems, the heterogeneity of data modalities, and the need for domain-specific expertise across multiple disciplines. Here, we introduce CellForge, an agentic system that leverages a multi-agent framework that transforms presented biological datasets and research objectives directly into optimized computational models for virtual cells. More specifically, given only raw single-cell multi-omics data and task descriptions as input, CellForge outputs both an optimized model architecture and executable code for training virtual cell models and inference. The framework integrates three core modules: Task Analysis for presented dataset characterization and relevant literature retrieval, Method Design, where specialized agents collaboratively develop optimized modeling strategies, and Experiment Execution for automated generation of code. The agents in the Design module are separated into experts with differing perspectives and a central moderator, and have to collaboratively exchange solutions until they achieve a reasonable consensus. We demonstrate CellForge's capabilities in single-cell perturbation prediction, using six diverse datasets that encompass gene knockouts, drug treatments, and cytokine stimulations across multiple modalities. CellForge consistently outperforms task-specific state-of-the-art methods. Overall, CellForge demonstrates how iterative interaction between LLM agents with differing perspectives provides better solutions than directly addressing a modeling challenge. Our code is publicly available at https://github.com/gersteinlab/CellForge.

  • 15 authors
·
Aug 4, 2025 2

Higher-Order Knowledge Representations for Agentic Scientific Reasoning

Scientific inquiry requires systems-level reasoning that integrates heterogeneous experimental data, cross-domain knowledge, and mechanistic evidence into coherent explanations. While Large Language Models (LLMs) offer inferential capabilities, they often depend on retrieval-augmented contexts that lack structural depth. Traditional Knowledge Graphs (KGs) attempt to bridge this gap, yet their pairwise constraints fail to capture the irreducible higher-order interactions that govern emergent physical behavior. To address this, we introduce a methodology for constructing hypergraph-based knowledge representations that faithfully encode multi-entity relationships. Applied to a corpus of ~1,100 manuscripts on biocomposite scaffolds, our framework constructs a global hypergraph of 161,172 nodes and 320,201 hyperedges, revealing a scale-free topology (power law exponent ~1.23) organized around highly connected conceptual hubs. This representation prevents the combinatorial explosion typical of pairwise expansions and explicitly preserves the co-occurrence context of scientific formulations. We further demonstrate that equipping agentic systems with hypergraph traversal tools, specifically using node-intersection constraints, enables them to bridge semantically distant concepts. By exploiting these higher-order pathways, the system successfully generates grounded mechanistic hypotheses for novel composite materials, such as linking cerium oxide to PCL scaffolds via chitosan intermediates. This work establishes a "teacherless" agentic reasoning system where hypergraph topology acts as a verifiable guardrail, accelerating scientific discovery by uncovering relationships obscured by traditional graph methods.

  • 2 authors
·
Jan 8

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g. the dynamics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level - they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depends on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrating along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions unlike previously proposed methods. We demonstrate the ability of MFM to improve prediction of individual treatment responses on a large scale multi-patient single-cell drug screen dataset.

  • 8 authors
·
Aug 26, 2024 2

LangCell: Language-Cell Pre-training for Cell Identity Understanding

Cell identity encompasses various semantic aspects of a cell, including cell type, pathway information, disease information, and more, which are essential for biologists to gain insights into its biological characteristics. Understanding cell identity from the transcriptomic data, such as annotating cell types, has become an important task in bioinformatics. As these semantic aspects are determined by human experts, it is impossible for AI models to effectively carry out cell identity understanding tasks without the supervision signals provided by single-cell and label pairs. The single-cell pre-trained language models (PLMs) currently used for this task are trained only on a single modality, transcriptomics data, lack an understanding of cell identity knowledge. As a result, they have to be fine-tuned for downstream tasks and struggle when lacking labeled data with the desired semantic labels. To address this issue, we propose an innovative solution by constructing a unified representation of single-cell data and natural language during the pre-training phase, allowing the model to directly incorporate insights related to cell identity. More specifically, we introduce LangCell, the first Language-Cell pre-training framework. LangCell utilizes texts enriched with cell identity information to gain a profound comprehension of cross-modal knowledge. Results from experiments conducted on different benchmarks show that LangCell is the only single-cell PLM that can work effectively in zero-shot cell identity understanding scenarios, and also significantly outperforms existing models in few-shot and fine-tuning cell identity understanding scenarios.

  • 5 authors
·
May 9, 2024

UMMAN: Unsupervised Multi-graph Merge Adversarial Network for Disease Prediction Based on Intestinal Flora

The abundance of intestinal flora is closely related to human diseases, but diseases are not caused by a single gut microbe. Instead, they result from the complex interplay of numerous microbial entities. This intricate and implicit connection among gut microbes poses a significant challenge for disease prediction using abundance information from OTU data. Recently, several methods have shown potential in predicting corresponding diseases. However, these methods fail to learn the inner association among gut microbes from different hosts, leading to unsatisfactory performance. In this paper, we present a novel architecture, Unsupervised Multi-graph Merge Adversarial Network (UMMAN). UMMAN can obtain the embeddings of nodes in the Multi-Graph in an unsupervised scenario, so that it helps learn the multiplex association. Our method is the first to combine Graph Neural Network with the task of intestinal flora disease prediction. We employ complex relation-types to construct the Original-Graph and disrupt the relationships among nodes to generate corresponding Shuffled-Graph. We introduce the Node Feature Global Integration (NFGI) module to represent the global features of the graph. Furthermore, we design a joint loss comprising adversarial loss and hybrid attention loss to ensure that the real graph embedding aligns closely with the Original-Graph and diverges from the Shuffled-Graph. Comprehensive experiments on five classical OTU gut microbiome datasets demonstrate the effectiveness and stability of our method. (We will release our code soon.)

  • 5 authors
·
Jul 31, 2024

SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images

Traditional fluorescence staining is phototoxic to live cells, slow, and expensive; thus, the subcellular structure prediction (SSP) from transmitted light (TL) images is emerging as a label-free, faster, low-cost alternative. However, existing approaches utilize 3D networks for one-to-one voxel level dense prediction, which necessitates a frequent and time-consuming Z-axis imaging process. Moreover, 3D convolutions inevitably lead to significant computation and GPU memory overhead. Therefore, we propose an efficient framework, SparseSSP, predicting fluorescent intensities within the target voxel grid in an efficient paradigm instead of relying entirely on 3D topologies. In particular, SparseSSP makes two pivotal improvements to prior works. First, SparseSSP introduces a one-to-many voxel mapping paradigm, which permits the sparse TL slices to reconstruct the subcellular structure. Secondly, we propose a hybrid dimensions topology, which folds the Z-axis information into channel features, enabling the 2D network layers to tackle SSP under low computational cost. We conduct extensive experiments to validate the effectiveness and advantages of SparseSSP on diverse sparse imaging ratios, and our approach achieves a leading performance compared to pure 3D topologies. SparseSSP reduces imaging frequencies compared to previous dense-view SSP (i.e., the number of imaging is reduced up to 87.5% at most), which is significant in visualizing rapid biological dynamics on low-cost devices and samples.

  • 6 authors
·
Jul 2, 2024

Random Spatial Networks: Small Worlds without Clustering, Traveling Waves, and Hop-and-Spread Disease Dynamics

Random network models play a prominent role in modeling, analyzing and understanding complex phenomena on real-life networks. However, a key property of networks is often neglected: many real-world networks exhibit spatial structure, the tendency of a node to select neighbors with a probability depending on physical distance. Here, we introduce a class of random spatial networks (RSNs) which generalizes many existing random network models but adds spatial structure. In these networks, nodes are placed randomly in space and joined in edges with a probability depending on their distance and their individual expected degrees, in a manner that crucially remains analytically tractable. We use this network class to propose a new generalization of small-world networks, where the average shortest path lengths in the graph are small, as in classical Watts-Strogatz small-world networks, but with close spatial proximity of nodes that are neighbors in the network playing the role of large clustering. Small-world effects are demonstrated on these spatial small-world networks without clustering. We are able to derive partial integro-differential equations governing susceptible-infectious-recovered disease spreading through an RSN, and we demonstrate the existence of traveling wave solutions. If the distance kernel governing edge placement decays slower than exponential, the population-scale dynamics are dominated by long-range hops followed by local spread of traveling waves. This provides a theoretical modeling framework for recent observations of how epidemics like Ebola evolve in modern connected societies, with long-range connections seeding new focal points from which the epidemic locally spreads in a wavelike manner.

  • 4 authors
·
Feb 4, 2017

Living Capillary Bridges

Biological tissues exhibit complex behaviors with their dynamics often resembling inert soft matter such as liquids, polymers, colloids, and liquid crystals. These analogies enable physics-based approaches for investigations of emergent behaviors in biological processes. A well-studied case is the spreading of cellular aggregates on solid surfaces, where they display dynamics similar to viscous droplets. In vivo, however, cells and tissues are in a confined environment with varying geometries and mechanical properties to which they need to adapt. In this work, we compressed cellular aggregates between two solid surfaces and studied their dynamics using microscopy, and computer simulations. The confined cellular aggregates transitioned from compressed spheres into dynamic living capillary bridges exhibiting bridge thinning and a convex-to-concave meniscus curvature transition. We found that the stability of the bridge is determined by the interplay between cell growth and cell spreading on the confining surfaces. This interaction leads to bridge rupture at a critical length scale determined by the distance between the plates. The force distributions, formation and stability regimes of the living capillary bridges were characterized with full 3D computer simulations that included cell division, migration and growth dynamics, directly showing how mechanical principles govern the behavior of the living bridges; cellular aggregates display jamming and stiffening analogously to granular matter, and cell division along the long axis enhances thinning. Based on our results, we propose a new class of active soft matter behavior, where cellular aggregates exhibit liquid-like adaptation to confinement, but with self-organized rupturing driven by biological activity.

  • 8 authors
·
Oct 16, 2025

From Cities to Series: Complex Networks and Deep Learning for Improved Spatial and Temporal Analytics*

Graphs have often been used to answer questions about the interaction between real-world entities by taking advantage of their capacity to represent complex topologies. Complex networks are known to be graphs that capture such non-trivial topologies; they are able to represent human phenomena such as epidemic processes, the dynamics of populations, and the urbanization of cities. The investigation of complex networks has been extrapolated to many fields of science, with particular emphasis on computing techniques, including artificial intelligence. In such a case, the analysis of the interaction between entities of interest is transposed to the internal learning of algorithms, a paradigm whose investigation is able to expand the state of the art in Computer Science. By exploring this paradigm, this thesis puts together complex networks and machine learning techniques to improve the understanding of the human phenomena observed in pandemics, pendular migration, and street networks. Accordingly, we contribute with: (i) a new neural network architecture capable of modeling dynamic processes observed in spatial and temporal data with applications in epidemics propagation, weather forecasting, and patient monitoring in intensive care units; (ii) a machine-learning methodology for analyzing and predicting links in the scope of human mobility between all the cities of Brazil; and, (iii) techniques for identifying inconsistencies in the urban planning of cities while tracking the most influential vertices, with applications over Brazilian and worldwide cities. We obtained results sustained by sound evidence of advances to the state of the art in artificial intelligence, rigorous formalisms, and ample experimentation. Our findings rely upon real-world applications in a range of domains, demonstrating the applicability of our methodologies.

  • 2 authors
·
Jun 1, 2022

How connectivity structure shapes rich and lazy learning in neural circuits

In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.

  • 6 authors
·
Oct 12, 2023

Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs

Cellular sheaves equip graphs with a "geometrical" structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain competitive results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields.

  • 5 authors
·
Feb 9, 2022

Landscaping Linear Mode Connectivity

The presence of linear paths in parameter space between two different network solutions in certain cases, i.e., linear mode connectivity (LMC), has garnered interest from both theoretical and practical fronts. There has been significant research that either practically designs algorithms catered for connecting networks by adjusting for the permutation symmetries as well as some others that more theoretically construct paths through which networks can be connected. Yet, the core reasons for the occurrence of LMC, when in fact it does occur, in the highly non-convex loss landscapes of neural networks are far from clear. In this work, we take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC (or the lack thereof) to manifest. Concretely, we present a `mountainside and ridge' perspective that helps to neatly tie together different geometric features that can be spotted in the loss landscape along the training runs. We also complement this perspective by providing a theoretical analysis of the barrier height, for which we provide empirical support, and which additionally extends as a faithful predictor of layer-wise LMC. We close with a toy example that provides further intuition on how barriers arise in the first place, all in all, showcasing the larger aim of the work -- to provide a working model of the landscape and its topography for the occurrence of LMC.

  • 6 authors
·
Jun 23, 2024

Astrocyte-Enabled Advancements in Spiking Neural Networks for Large Language Modeling

Within the complex neuroarchitecture of the brain, astrocytes play crucial roles in development, structure, and metabolism. These cells regulate neural activity through tripartite synapses, directly impacting cognitive processes such as learning and memory. Despite the growing recognition of astrocytes' significance, traditional Spiking Neural Network (SNN) models remain predominantly neuron-centric, overlooking the profound influence of astrocytes on neural dynamics. Inspired by these biological insights, we have developed an Astrocyte-Modulated Spiking Unit (AM-SU), an innovative framework that integrates neuron-astrocyte interactions into the computational paradigm, demonstrating wide applicability across various hardware platforms. Our Astrocyte-Modulated Spiking Neural Network (AstroSNN) exhibits exceptional performance in tasks involving memory retention and natural language generation, particularly in handling long-term dependencies and complex linguistic structures. The design of AstroSNN not only enhances its biological authenticity but also introduces novel computational dynamics, enabling more effective processing of complex temporal dependencies. Furthermore, AstroSNN shows low latency, high throughput, and reduced memory usage in practical applications, making it highly suitable for resource-constrained environments. By successfully integrating astrocytic dynamics into intelligent neural networks, our work narrows the gap between biological plausibility and neural modeling, laying the groundwork for future biologically-inspired neural computing research that includes both neurons and astrocytes.

  • 7 authors
·
Dec 12, 2023

MIPHEI-ViT: Multiplex Immunofluorescence Prediction from H&E Images using ViT Foundation Models

Histopathological analysis is a cornerstone of cancer diagnosis, with Hematoxylin and Eosin (H&E) staining routinely acquired for every patient to visualize cell morphology and tissue architecture. On the other hand, multiplex immunofluorescence (mIF) enables more precise cell type identification via proteomic markers, but has yet to achieve widespread clinical adoption due to cost and logistical constraints. To bridge this gap, we introduce MIPHEI (Multiplex Immunofluorescence Prediction from H&E), a U-Net-inspired architecture that integrates state-of-the-art ViT foundation models as encoders to predict mIF signals from H&E images. MIPHEI targets a comprehensive panel of markers spanning nuclear content, immune lineages (T cells, B cells, myeloid), epithelium, stroma, vasculature, and proliferation. We train our model using the publicly available ORION dataset of restained H&E and mIF images from colorectal cancer tissue, and validate it on two independent datasets. MIPHEI achieves accurate cell-type classification from H&E alone, with F1 scores of 0.88 for Pan-CK, 0.57 for CD3e, 0.56 for SMA, 0.36 for CD68, and 0.30 for CD20, substantially outperforming both a state-of-the-art baseline and a random classifier for most markers. Our results indicate that our model effectively captures the complex relationships between nuclear morphologies in their tissue context, as visible in H&E images and molecular markers defining specific cell types. MIPHEI offers a promising step toward enabling cell-type-aware analysis of large-scale H&E datasets, in view of uncovering relationships between spatial cellular organization and patient outcomes.

  • 5 authors
·
May 15, 2025

Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy

Accurate detection and segmentation of cell nuclei in volumetric (3D) fluorescence microscopy datasets is an important step in many biomedical research projects. Although many automated methods for these tasks exist, they often struggle for images with low signal-to-noise ratios and/or dense packing of nuclei. It was recently shown for 2D microscopy images that these issues can be alleviated by training a neural network to directly predict a suitable shape representation (star-convex polygon) for cell nuclei. In this paper, we adopt and extend this approach to 3D volumes by using star-convex polyhedra to represent cell nuclei and similar shapes. To that end, we overcome the challenges of 1) finding parameter-efficient star-convex polyhedra representations that can faithfully describe cell nuclei shapes, 2) adapting to anisotropic voxel sizes often found in fluorescence microscopy datasets, and 3) efficiently computing intersections between pairs of star-convex polyhedra (required for non-maximum suppression). Although our approach is quite general, since star-convex polyhedra include common shapes like bounding boxes and spheres as special cases, our focus is on accurate detection and segmentation of cell nuclei. Finally, we demonstrate on two challenging datasets that our approach (StarDist-3D) leads to superior results when compared to classical and deep learning based methods.

  • 5 authors
·
Aug 9, 2019

Generating Drug Repurposing Hypotheses through the Combination of Disease-Specific Hypergraphs

The drug development pipeline for a new compound can last 10-20 years and cost over 10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on biomedical knowledge graph representations have recently yielded new drug repurposing hypotheses. In this study, we present a novel, disease-specific hypergraph representation learning technique to derive contextual embeddings of biological pathways of various lengths but that all start at any given drug and all end at the disease of interest. Further, we extend this method to multi-disease hypergraphs. To determine the repurposing potential of each of the 1,522 drugs, we derive drug-specific distributions of cosine similarity values and ultimately consider the median for ranking. Cosine similarity values are computed between (1) all biological pathways starting at the considered drug and ending at the disease of interest and (2) all biological pathways starting at drugs currently prescribed against that disease and ending at the disease of interest. We illustrate our approach with Alzheimer's disease (AD) and two of its risk factors: hypertension (HTN) and type 2 diabetes (T2D). We compare each drug's rank across four hypergraph settings (single- or multi-disease): AD only, AD + HTN, AD + T2D, and AD + HTN + T2D. Notably, our framework led to the identification of two promising drugs whose repurposing potential was significantly higher in hypergraphs combining two diseases: dapagliflozin (antidiabetic; moved up, from top 32% to top 7%, across all considered drugs) and debrisoquine (antihypertensive; moved up, from top 76% to top 23%). Our approach serves as a hypothesis generation tool, to be paired with a validation pipeline relying on laboratory experiments and semi-automated parsing of the biomedical literature.

  • 5 authors
·
Nov 16, 2023

From time-series to complex networks: Application to the cerebrovascular flow patterns in atrial fibrillation

A network-based approach is presented to investigate the cerebrovascular flow patterns during atrial fibrillation (AF) with respect to normal sinus rhythm (NSR). AF, the most common cardiac arrhythmia with faster and irregular beating, has been recently and independently associated with the increased risk of dementia. However, the underlying hemodynamic mechanisms relating the two pathologies remain mainly undetermined so far; thus the contribution of modeling and refined statistical tools is valuable. Pressure and flow rate temporal series in NSR and AF are here evaluated along representative cerebral sites (from carotid arteries to capillary brain circulation), exploiting reliable artificially built signals recently obtained from an in silico approach. The complex network analysis evidences, in a synthetic and original way, a dramatic signal variation towards the distal/capillary cerebral regions during AF, which has no counterpart in NSR conditions. At the large artery level, networks obtained from both AF and NSR hemodynamic signals exhibit elongated and chained features, which are typical of pseudo-periodic series. These aspects are almost completely lost towards the microcirculation during AF, where the networks are topologically more circular and present random-like characteristics. As a consequence, all the physiological phenomena at microcerebral level ruled by periodicity - such as regular perfusion, mean pressure per beat, and average nutrient supply at cellular level - can be strongly compromised, since the AF hemodynamic signals assume irregular behaviour and random-like features. Through a powerful approach which is complementary to the classical statistical tools, the present findings further strengthen the potential link between AF hemodynamic and cognitive decline.

  • 3 authors
·
Sep 26, 2017

K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

Drug discovery is a complex and time-intensive process that requires identifying and validating new therapeutic candidates. Computational approaches using large-scale biomedical knowledge graphs (KGs) offer a promising solution to accelerate this process. However, extracting meaningful insights from large-scale KGs remains challenging due to the complexity of graph traversal. Existing subgraph-based methods are tailored to graph neural networks (GNNs), making them incompatible with other models, such as large language models (LLMs). We introduce K-Paths, a retrieval framework that extracts structured, diverse, and biologically meaningful paths from KGs. Integrating these paths enables LLMs and GNNs to effectively predict unobserved drug-drug and drug-disease interactions. Unlike traditional path-ranking approaches, K-Paths retrieves and transforms paths into a structured format that LLMs can directly process, facilitating explainable reasoning. K-Paths employs a diversity-aware adaptation of Yen's algorithm to retrieve the K shortest loopless paths between entities in an interaction query, prioritizing biologically relevant and diverse relationships. Our experiments on benchmark datasets show that K-Paths improves the zero-shot performance of Llama 8.1B's F1-score by 12.45 points on drug repurposing and 13.42 points on interaction severity prediction. We also show that Llama 70B achieves F1-score gains of 6.18 and 8.46 points, respectively. K-Paths also improves the supervised training efficiency of EmerGNN, a state-of-the-art GNN, by reducing KG size by 90% while maintaining strong predictive performance. Beyond its scalability and efficiency, K-Paths uniquely bridges the gap between KGs and LLMs, providing explainable rationales for predicted interactions. These capabilities show that K-Paths is a valuable tool for efficient data-driven drug discovery.

  • 7 authors
·
Feb 18, 2025

Gene Regulatory Network Inference in the Presence of Dropouts: a Causal View

Gene regulatory network inference (GRNI) is a challenging problem, particularly owing to the presence of zeros in single-cell RNA sequencing data: some are biological zeros representing no gene expression, while some others are technical zeros arising from the sequencing procedure (aka dropouts), which may bias GRNI by distorting the joint distribution of the measured gene expressions. Existing approaches typically handle dropout error via imputation, which may introduce spurious relations as the true joint distribution is generally unidentifiable. To tackle this issue, we introduce a causal graphical model to characterize the dropout mechanism, namely, Causal Dropout Model. We provide a simple yet effective theoretical result: interestingly, the conditional independence (CI) relations in the data with dropouts, after deleting the samples with zero values (regardless if technical or not) for the conditioned variables, are asymptotically identical to the CI relations in the original data without dropouts. This particular test-wise deletion procedure, in which we perform CI tests on the samples without zeros for the conditioned variables, can be seamlessly integrated with existing structure learning approaches including constraint-based and greedy score-based methods, thus giving rise to a principled framework for GRNI in the presence of dropouts. We further show that the causal dropout model can be validated from data, and many existing statistical models to handle dropouts fit into our model as specific parametric instances. Empirical evaluation on synthetic, curated, and real-world experimental transcriptomic data comprehensively demonstrate the efficacy of our method.

  • 6 authors
·
Mar 21, 2024

ProteinRPN: Towards Accurate Protein Function Prediction with Graph-Based Region Proposals

Protein function prediction is a crucial task in bioinformatics, with significant implications for understanding biological processes and disease mechanisms. While the relationship between sequence and function has been extensively explored, translating protein structure to function continues to present substantial challenges. Various models, particularly, CNN and graph-based deep learning approaches that integrate structural and functional data, have been proposed to address these challenges. However, these methods often fall short in elucidating the functional significance of key residues essential for protein functionality, as they predominantly adopt a retrospective perspective, leading to suboptimal performance. Inspired by region proposal networks in computer vision, we introduce the Protein Region Proposal Network (ProteinRPN) for accurate protein function prediction. Specifically, the region proposal module component of ProteinRPN identifies potential functional regions (anchors) which are refined through the hierarchy-aware node drop pooling layer favoring nodes with defined secondary structures and spatial proximity. The representations of the predicted functional nodes are enriched using attention mechanisms and subsequently fed into a Graph Multiset Transformer, which is trained with supervised contrastive (SupCon) and InfoNCE losses on perturbed protein structures. Our model demonstrates significant improvements in predicting Gene Ontology (GO) terms, effectively localizing functional residues within protein structures. The proposed framework provides a robust, scalable solution for protein function annotation, advancing the understanding of protein structure-function relationships in computational biology.

  • 3 authors
·
Sep 1, 2024

BioGraphFusion: Graph Knowledge Embedding for Biological Completion and Reasoning

Motivation: Biomedical knowledge graphs (KGs) are crucial for drug discovery and disease understanding, yet their completion and reasoning are challenging. Knowledge Embedding (KE) methods capture global semantics but struggle with dynamic structural integration, while Graph Neural Networks (GNNs) excel locally but often lack semantic understanding. Even ensemble approaches, including those leveraging language models, often fail to achieve a deep, adaptive, and synergistic co-evolution between semantic comprehension and structural learning. Addressing this critical gap in fostering continuous, reciprocal refinement between these two aspects in complex biomedical KGs is paramount. Results: We introduce BioGraphFusion, a novel framework for deeply synergistic semantic and structural learning. BioGraphFusion establishes a global semantic foundation via tensor decomposition, guiding an LSTM-driven mechanism to dynamically refine relation embeddings during graph propagation. This fosters adaptive interplay between semantic understanding and structural learning, further enhanced by query-guided subgraph construction and a hybrid scoring mechanism. Experiments across three key biomedical tasks demonstrate BioGraphFusion's superior performance over state-of-the-art KE, GNN, and ensemble models. A case study on Cutaneous Malignant Melanoma 1 (CMM1) highlights its ability to unveil biologically meaningful pathways. Availability and Implementation: Source code and all training data are freely available for download at https://github.com/Y-TARL/BioGraphFusion. Supplementary information: Supplementary data are available at Bioinformatics online.

  • 6 authors
·
Jul 19, 2025

Equivariant Graph Attention Networks with Structural Motifs for Predicting Cell Line-Specific Synergistic Drug Combinations

Cancer is the second leading cause of death, with chemotherapy as one of the primary forms of treatment. As a result, researchers are turning to drug combination therapy to decrease drug resistance and increase efficacy. Current methods of drug combination screening, such as in vivo and in vitro, are inefficient due to stark time and monetary costs. In silico methods have become increasingly important for screening drugs, but current methods are inaccurate and generalize poorly to unseen anticancer drugs. In this paper, I employ a geometric deep-learning model utilizing a graph attention network that is equivariant to 3D rotations, translations, and reflections with structural motifs. Additionally, the gene expression of cancer cell lines is utilized to classify synergistic drug combinations specific to each cell line. I compared the proposed geometric deep learning framework to current state-of-the-art (SOTA) methods, and the proposed model architecture achieved greater performance on all 12 benchmark tasks performed on the DrugComb dataset. Specifically, the proposed framework outperformed other SOTA methods by an accuracy difference greater than 28%. Based on these results, I believe that the equivariant graph attention network's capability of learning geometric data accounts for the large performance improvements. The model's ability to generalize to foreign drugs is thought to be due to the structural motifs providing a better representation of the molecule. Overall, I believe that the proposed equivariant geometric deep learning framework serves as an effective tool for virtually screening anticancer drug combinations for further validation in a wet lab environment. The code for this work is made available online at: https://github.com/WeToTheMoon/EGAT_DrugSynergy.

  • 1 authors
·
Nov 7, 2024

PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs

Deep learning-based computational methods have achieved promising results in predicting protein-protein interactions (PPIs). However, existing benchmarks predominantly focus on isolated pairwise evaluations, overlooking a model's capability to reconstruct biologically meaningful PPI networks, which is crucial for biology research. To address this gap, we introduce PRING, the first comprehensive benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions, with well-designed strategies to address both data redundancy and leakage. Building on this golden-standard dataset, we establish two complementary evaluation paradigms: (1) topology-oriented tasks, which assess intra and cross-species PPI network construction, and (2) function-oriented tasks, including protein complex pathway prediction, GO module analysis, and essential protein justification. These evaluations not only reflect the model's capability to understand the network topology but also facilitate protein function annotation, biological module detection, and even disease mechanism analysis. Extensive experiments on four representative model categories, consisting of sequence similarity-based, naive sequence-based, protein language model-based, and structure-based approaches, demonstrate that current PPI models have potential limitations in recovering both structural and functional properties of PPI networks, highlighting the gap in supporting real-world biological applications. We believe PRING provides a reliable platform to guide the development of more effective PPI prediction models for the community. The dataset and source code of PRING are available at https://github.com/SophieSarceau/PRING.

  • 12 authors
·
Jul 7, 2025 1

FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures

Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

  • 9 authors
·
Mar 29, 2024

MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention

Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. Multi-modal self-supervised learning has demonstrated remarkable potential in learning pathological representations by integrating diverse data sources. Conventional multi-modal integration methods primarily emphasize modality alignment, while paying insufficient attention to retaining the modality-specific structures. However, unlike conventional scenarios where multi-modal inputs share highly overlapping features, histopathology and transcriptomics exhibit pronounced heterogeneity, offering orthogonal yet complementary insights. Histopathology provides morphological and spatial context, elucidating tissue architecture and cellular topology, whereas transcriptomics delineates molecular signatures through gene expression patterns. This inherent disparity introduces a major challenge in aligning them while maintaining modality-specific fidelity. To address these challenges, we present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. MIRROR employs dedicated encoders to extract comprehensive features for each modality, which is further complemented by a modality alignment module to achieve seamless integration between phenotype patterns and molecular profiles. Furthermore, a modality retention module safeguards unique attributes from each modality, while a style clustering module mitigates redundancy and enhances disease-relevant information by modeling and aligning consistent pathological signatures within a clustering space. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance, demonstrating its effectiveness in constructing comprehensive oncological feature representations and benefiting the cancer diagnosis.

  • 7 authors
·
Mar 1, 2025

Modeling Edge-Specific Node Features through Co-Representation Neural Hypergraph Diffusion

Hypergraphs are widely being employed to represent complex higher-order relations in real-world applications. Most existing research on hypergraph learning focuses on node-level or edge-level tasks. A practically relevant and more challenging task, edge-dependent node classification (ENC), is still under-explored. In ENC, a node can have different labels across different hyperedges, which requires the modeling of node features unique to each hyperedge. The state-of-the-art ENC solution, WHATsNet, only outputs single node and edge representations, leading to the limitations of entangled edge-specific features and non-adaptive representation sizes when applied to ENC. Additionally, WHATsNet suffers from the common oversmoothing issue in most HGNNs. To address these limitations, we propose CoNHD, a novel HGNN architecture specifically designed to model edge-specific features for ENC. Instead of learning separate representations for nodes and edges, CoNHD reformulates within-edge and within-node interactions as a hypergraph diffusion process over node-edge co-representations. We develop a neural implementation of the proposed diffusion process, leveraging equivariant networks as diffusion operators to effectively learn the diffusion dynamics from data. Extensive experiments demonstrate that CoNHD achieves the best performance across all benchmark ENC datasets and several downstream tasks without sacrificing efficiency. Our implementation is available at https://github.com/zhengyijia/CoNHD.

  • 2 authors
·
May 23, 2024

Unraveling the cognitive patterns of Large Language Models through module communities

Large Language Models (LLMs) have reshaped our world with significant advancements in science, engineering, and society through applications ranging from scientific discoveries and medical diagnostics to Chatbots. Despite their ubiquity and utility, the underlying mechanisms of LLM remain concealed within billions of parameters and complex structures, making their inner architecture and cognitive processes challenging to comprehend. We address this gap by adopting approaches to understanding emerging cognition in biology and developing a network-based framework that links cognitive skills, LLM architectures, and datasets, ushering in a paradigm shift in foundation model analysis. The skill distribution in the module communities demonstrates that while LLMs do not strictly parallel the focalized specialization observed in specific biological systems, they exhibit unique communities of modules whose emergent skill patterns partially mirror the distributed yet interconnected cognitive organization seen in avian and small mammalian brains. Our numerical results highlight a key divergence from biological systems to LLMs, where skill acquisition benefits substantially from dynamic, cross-regional interactions and neural plasticity. By integrating cognitive science principles with machine learning, our framework provides new insights into LLM interpretability and suggests that effective fine-tuning strategies should leverage distributed learning dynamics rather than rigid modular interventions.

  • 3 authors
·
Aug 25, 2025 2

Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements

Graphs are essential data structures for modeling complex interactions in domains such as social networks, molecular structures, and biological systems. Graph-level tasks, which predict properties or classes for the entire graph, are critical for applications, such as molecular property prediction and subgraph counting. Graph Neural Networks (GNNs) have shown promise in these tasks, but their evaluations are often limited to narrow datasets, tasks, and inconsistent experimental setups, restricting their generalizability. To address these limitations, we propose a unified evaluation framework for graph-level GNNs. This framework provides a standardized setting to evaluate GNNs across diverse datasets, various graph tasks (e.g., graph classification and regression), and challenging scenarios, including noisy, imbalanced, and few-shot graphs. Additionally, we propose a novel GNN model with enhanced expressivity and generalization capabilities. Specifically, we enhance the expressivity of GNNs through a k-path rooted subgraph approach, enabling the model to effectively count subgraphs (e.g., paths and cycles). Moreover, we introduce a unified graph contrastive learning algorithm for graphs across diverse domains, which adaptively removes unimportant edges to augment graphs, thereby significantly improving generalization performance. Extensive experiments demonstrate that our model achieves superior performance against fourteen effective baselines across twenty-seven graph datasets, establishing it as a robust and generalizable model for graph-level tasks.

  • 6 authors
·
Jan 1, 2025

Fine-Tuning and Training of DenseNet for Histopathology Image Representation Using TCGA Diagnostic Slides

Feature vectors provided by pre-trained deep artificial neural networks have become a dominant source for image representation in recent literature. Their contribution to the performance of image analysis can be improved through finetuning. As an ultimate solution, one might even train a deep network from scratch with the domain-relevant images, a highly desirable option which is generally impeded in pathology by lack of labeled images and the computational expense. In this study, we propose a new network, namely KimiaNet, that employs the topology of the DenseNet with four dense blocks, fine-tuned and trained with histopathology images in different configurations. We used more than 240,000 image patches with 1000x1000 pixels acquired at 20x magnification through our proposed "highcellularity mosaic" approach to enable the usage of weak labels of 7,126 whole slide images of formalin-fixed paraffin-embedded human pathology samples publicly available through the The Cancer Genome Atlas (TCGA) repository. We tested KimiaNet using three public datasets, namely TCGA, endometrial cancer images, and colorectal cancer images by evaluating the performance of search and classification when corresponding features of different networks are used for image representation. As well, we designed and trained multiple convolutional batch-normalized ReLU (CBR) networks. The results show that KimiaNet provides superior results compared to the original DenseNet and smaller CBR networks when used as feature extractor to represent histopathology images.

  • 22 authors
·
Jan 19, 2021

A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Large language models excel at interpreting complex natural language instructions, enabling them to perform a wide range of tasks. In the life sciences, single-cell RNA sequencing (scRNA-seq) data serves as the "language of cellular biology", capturing intricate gene expression patterns at the single-cell level. However, interacting with this "language" through conventional tools is often inefficient and unintuitive, posing challenges for researchers. To address these limitations, we present InstructCell, a multi-modal AI copilot that leverages natural language as a medium for more direct and flexible single-cell analysis. We construct a comprehensive multi-modal instruction dataset that pairs text-based instructions with scRNA-seq profiles from diverse tissues and species. Building on this, we develop a multi-modal cell language architecture capable of simultaneously interpreting and processing both modalities. InstructCell empowers researchers to accomplish critical tasks-such as cell type annotation, conditional pseudo-cell generation, and drug sensitivity prediction-using straightforward natural language commands. Extensive evaluations demonstrate that InstructCell consistently meets or exceeds the performance of existing single-cell foundation models, while adapting to diverse experimental conditions. More importantly, InstructCell provides an accessible and intuitive tool for exploring complex single-cell data, lowering technical barriers and enabling deeper biological insights.

  • 8 authors
·
Jan 14, 2025 2

Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges

Estimating single-cell responses across various perturbations facilitates the identification of key genes and enhances drug screening, significantly boosting experimental efficiency. However, single-cell sequencing is a destructive process, making it impossible to capture the same cell's phenotype before and after perturbation. Consequently, data collected under perturbed and unperturbed conditions are inherently unpaired. Existing methods either attempt to forcibly pair unpaired data using random sampling, or neglect the inherent relationship between unperturbed and perturbed cells during the modeling. In this work, we propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions, effectively addressing the challenge of unpaired data. We further interpret this framework as a form of data augmentation. We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way, and further incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles. Moreover, gene expression under the same perturbation often varies significantly across cells, frequently exhibiting a bimodal distribution that reflects intrinsic heterogeneity. To capture this, we introduce a more suitable evaluation metric. We propose Unlasting, dual conditional diffusion models that overcome the problem of unpaired single-cell perturbation data and strengthen the model's insight into perturbations under the guidance of the GRN, with a dedicated mask model designed to improve generation quality by predicting silent genes. In addition, we introduce a biologically grounded evaluation metric that better reflects the inherent heterogeneity in single-cell responses.

  • 8 authors
·
Jun 26, 2025

ChAda-ViT : Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images

Unlike color photography images, which are consistently encoded into RGB channels, biological images encompass various modalities, where the type of microscopy and the meaning of each channel varies with each experiment. Importantly, the number of channels can range from one to a dozen and their correlation is often comparatively much lower than RGB, as each of them brings specific information content. This aspect is largely overlooked by methods designed out of the bioimage field, and current solutions mostly focus on intra-channel spatial attention, often ignoring the relationship between channels, yet crucial in most biological applications. Importantly, the variable channel type and count prevent the projection of several experiments to a unified representation for large scale pre-training. In this study, we propose ChAda-ViT, a novel Channel Adaptive Vision Transformer architecture employing an Inter-Channel Attention mechanism on images with an arbitrary number, order and type of channels. We also introduce IDRCell100k, a bioimage dataset with a rich set of 79 experiments covering 7 microscope modalities, with a multitude of channel types, and channel counts varying from 1 to 10 per experiment. Our proposed architecture, trained in a self-supervised manner, outperforms existing approaches in several biologically relevant downstream tasks. Additionally, it can be used to bridge the gap for the first time between assays with different microscopes, channel numbers or types by embedding various image and experimental modalities into a unified biological image representation. The latter should facilitate interdisciplinary studies and pave the way for better adoption of deep learning in biological image-based analyses. Code and Data to be released soon.

  • 7 authors
·
Nov 26, 2023

CPP-Net: Context-aware Polygon Proposal Network for Nucleus Segmentation

Nucleus segmentation is a challenging task due to the crowded distribution and blurry boundaries of nuclei. Recent approaches represent nuclei by means of polygons to differentiate between touching and overlapping nuclei and have accordingly achieved promising performance. Each polygon is represented by a set of centroid-to-boundary distances, which are in turn predicted by features of the centroid pixel for a single nucleus. However, using the centroid pixel alone does not provide sufficient contextual information for robust prediction and thus degrades the segmentation accuracy. To handle this problem, we propose a Context-aware Polygon Proposal Network (CPP-Net) for nucleus segmentation. First, we sample a point set rather than one single pixel within each cell for distance prediction. This strategy substantially enhances contextual information and thereby improves the robustness of the prediction. Second, we propose a Confidence-based Weighting Module, which adaptively fuses the predictions from the sampled point set. Third, we introduce a novel Shape-Aware Perceptual (SAP) loss that constrains the shape of the predicted polygons. Here, the SAP loss is based on an additional network that is pre-trained by means of mapping the centroid probability map and the pixel-to-boundary distance maps to a different nucleus representation. Extensive experiments justify the effectiveness of each component in the proposed CPP-Net. Finally, CPP-Net is found to achieve state-of-the-art performance on three publicly available databases, namely DSB2018, BBBC06, and PanNuke. Code of this paper is available at \url{https://github.com/csccsccsccsc/cpp-net

  • 5 authors
·
Feb 13, 2021

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

The relationship between computing systems and the brain has served as motivation for pioneering theoreticians since John von Neumann and Alan Turing. Uniform, scale-free biological networks, such as the brain, have powerful properties, including generalizing over time, which is the main barrier for Machine Learning on the path to Universal Reasoning Models. We introduce `Dragon Hatchling' (BDH), a new Large Language Model architecture based on a scale-free biologically inspired network of \n locally-interacting neuron particles. BDH couples strong theoretical foundations and inherent interpretability without sacrificing Transformer-like performance. BDH is a practical, performant state-of-the-art attention-based state space sequence learning architecture. In addition to being a graph model, BDH admits a GPU-friendly formulation. It exhibits Transformer-like scaling laws: empirically BDH rivals GPT2 performance on language and translation tasks, at the same number of parameters (10M to 1B), for the same training data. BDH can be represented as a brain model. The working memory of BDH during inference entirely relies on synaptic plasticity with Hebbian learning using spiking neurons. We confirm empirically that specific, individual synapses strengthen connection whenever BDH hears or reasons about a specific concept while processing language inputs. The neuron interaction network of BDH is a graph of high modularity with heavy-tailed degree distribution. The BDH model is biologically plausible, explaining one possible mechanism which human neurons could use to achieve speech. BDH is designed for interpretability. Activation vectors of BDH are sparse and positive. We demonstrate monosemanticity in BDH on language tasks. Interpretability of state, which goes beyond interpretability of neurons and model parameters, is an inherent feature of the BDH architecture.

pathwaycom Pathway
·
Sep 30, 2025 28

A Topological Perspective on Demystifying GNN-Based Link Prediction Performance

Graph Neural Networks (GNNs) have shown great promise in learning node embeddings for link prediction (LP). While numerous studies aim to improve the overall LP performance of GNNs, none have explored its varying performance across different nodes and its underlying reasons. To this end, we aim to demystify which nodes will perform better from the perspective of their local topology. Despite the widespread belief that low-degree nodes exhibit poorer LP performance, our empirical findings provide nuances to this viewpoint and prompt us to propose a better metric, Topological Concentration (TC), based on the intersection of the local subgraph of each node with the ones of its neighbors. We empirically demonstrate that TC has a higher correlation with LP performance than other node-level topological metrics like degree and subgraph density, offering a better way to identify low-performing nodes than using cold-start. With TC, we discover a novel topological distribution shift issue in which newly joined neighbors of a node tend to become less interactive with that node's existing neighbors, compromising the generalizability of node embeddings for LP at testing time. To make the computation of TC scalable, We further propose Approximated Topological Concentration (ATC) and theoretically/empirically justify its efficacy in approximating TC and reducing the computation complexity. Given the positive correlation between node TC and its LP performance, we explore the potential of boosting LP performance via enhancing TC by re-weighting edges in the message-passing and discuss its effectiveness with limitations. Our code is publicly available at https://github.com/YuWVandy/Topo_LP_GNN.

  • 7 authors
·
Oct 6, 2023

Reconstructing commuters network using machine learning and urban indicators

Human mobility has a significant impact on several layers of society, from infrastructural planning and economics to the spread of diseases and crime. Representing the system as a complex network, in which nodes are assigned to regions (e.g., a city) and links indicate the flow of people between two of them, physics-inspired models have been proposed to quantify the number of people migrating from one city to the other. Despite the advances made by these models, our ability to predict the number of commuters and reconstruct mobility networks remains limited. Here, we propose an alternative approach using machine learning and 22 urban indicators to predict the flow of people and reconstruct the intercity commuters network. Our results reveal that predictions based on machine learning algorithms and urban indicators can reconstruct the commuters network with 90.4% of accuracy and describe 77.6% of the variance observed in the flow of people between cities. We also identify essential features to recover the network structure and the urban indicators mostly related to commuting patterns. As previously reported, distance plays a significant role in commuting, but other indicators, such as Gross Domestic Product (GDP) and unemployment rate, are also driven-forces for people to commute. We believe that our results shed new lights on the modeling of migration and reinforce the role of urban indicators on commuting patterns. Also, because link-prediction and network reconstruction are still open challenges in network science, our results have implications in other areas, like economics, social sciences, and biology, where node attributes can give us information about the existence of links connecting entities in the network.

  • 4 authors
·
Aug 9, 2019

The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures

Segmenting cells and tracking their motion over time is a common task in biomedical applications. However, predicting accurate instance-wise segmentation and cell motions from microscopy imagery remains a challenging task. Using microstructured environments for analyzing single cells in a constant flow of media adds additional complexity. While large-scale labeled microscopy datasets are available, we are not aware of any large-scale dataset, including both cells and microstructures. In this paper, we introduce the trapped yeast cell (TYC) dataset, a novel dataset for understanding instance-level semantics and motions of cells in microstructures. We release 105 dense annotated high-resolution brightfield microscopy images, including about 19k instance masks. We also release 261 curated video clips composed of 1293 high-resolution microscopy images to facilitate unsupervised understanding of cell motions and morphology. TYC offers ten times more instance annotations than the previously largest dataset, including cells and microstructures. Our effort also exceeds previous attempts in terms of microstructure variability, resolution, complexity, and capturing device (microscopy) variability. We facilitate a unified comparison on our novel dataset by introducing a standardized evaluation strategy. TYC and evaluation code are publicly available under CC BY 4.0 license.

  • 3 authors
·
Aug 23, 2023

Critical Nodes Identification in Complex Networks: A Survey

Complex networks have become essential tools for understanding diverse phenomena in social systems, traffic systems, biomolecular systems, and financial systems. Identifying critical nodes is a central theme in contemporary research, serving as a vital bridge between theoretical foundations and practical applications. Nevertheless, the intrinsic complexity and structural heterogeneity characterizing real-world networks, with particular emphasis on dynamic and higher-order networks, present substantial obstacles to the development of universal frameworks for critical node identification. This paper provides a comprehensive review of critical node identification techniques, categorizing them into seven main classes: centrality, critical nodes deletion problem, influence maximization, network control, artificial intelligence, higher-order and dynamic methods. Our review bridges the gaps in existing surveys by systematically classifying methods based on their methodological foundations and practical implications, and by highlighting their strengths, limitations, and applicability across different network types. Our work enhances the understanding of critical node research by identifying key challenges, such as algorithmic universality, real-time evaluation in dynamic networks, analysis of higher-order structures, and computational efficiency in large-scale networks. The structured synthesis consolidates current progress and highlights open questions, particularly in modeling temporal dynamics, advancing efficient algorithms, integrating machine learning approaches, and developing scalable and interpretable metrics for complex systems.

  • 8 authors
·
Jul 8, 2025

From Graphs to Hypergraphs: Hypergraph Projection and its Remediation

We study the implications of the modeling choice to use a graph, instead of a hypergraph, to represent real-world interconnected systems whose constituent relationships are of higher order by nature. Such a modeling choice typically involves an underlying projection process that maps the original hypergraph onto a graph, and is common in graph-based analysis. While hypergraph projection can potentially lead to loss of higher-order relations, there exists very limited studies on the consequences of doing so, as well as its remediation. This work fills this gap by doing two things: (1) we develop analysis based on graph and set theory, showing two ubiquitous patterns of hyperedges that are root to structural information loss in all hypergraph projections; we also quantify the combinatorial impossibility of recovering the lost higher-order structures if no extra help is provided; (2) we still seek to recover the lost higher-order structures in hypergraph projection, and in light of (1)'s findings we propose to relax the problem into a learning-based setting. Under this setting, we develop a learning-based hypergraph reconstruction method based on an important statistic of hyperedge distributions that we find. Our reconstruction method is evaluated on 8 real-world datasets under different settings, and exhibits consistently good performance. We also demonstrate benefits of the reconstructed hypergraphs via use cases of protein rankings and link predictions.

  • 2 authors
·
Jan 16, 2024

SyNDock: N Rigid Protein Docking via Learnable Group Synchronization

The regulation of various cellular processes heavily relies on the protein complexes within a living cell, necessitating a comprehensive understanding of their three-dimensional structures to elucidate the underlying mechanisms. While neural docking techniques have exhibited promising outcomes in binary protein docking, the application of advanced neural architectures to multimeric protein docking remains uncertain. This study introduces SyNDock, an automated framework that swiftly assembles precise multimeric complexes within seconds, showcasing performance that can potentially surpass or be on par with recent advanced approaches. SyNDock possesses several appealing advantages not present in previous approaches. Firstly, SyNDock formulates multimeric protein docking as a problem of learning global transformations to holistically depict the placement of chain units of a complex, enabling a learning-centric solution. Secondly, SyNDock proposes a trainable two-step SE(3) algorithm, involving initial pairwise transformation and confidence estimation, followed by global transformation synchronization. This enables effective learning for assembling the complex in a globally consistent manner. Lastly, extensive experiments conducted on our proposed benchmark dataset demonstrate that SyNDock outperforms existing docking software in crucial performance metrics, including accuracy and runtime. For instance, it achieves a 4.5% improvement in performance and a remarkable millionfold acceleration in speed.

  • 5 authors
·
May 23, 2023

One-hot Generalized Linear Model for Switching Brain State Discovery

Exposing meaningful and interpretable neural interactions is critical to understanding neural circuits. Inferred neural interactions from neural signals primarily reflect functional interactions. In a long experiment, subject animals may experience different stages defined by the experiment, stimuli, or behavioral states, and hence functional interactions can change over time. To model dynamically changing functional interactions, prior work employs state-switching generalized linear models with hidden Markov models (i.e., HMM-GLMs). However, we argue they lack biological plausibility, as functional interactions are shaped and confined by the underlying anatomical connectome. Here, we propose a novel prior-informed state-switching GLM. We introduce both a Gaussian prior and a one-hot prior over the GLM in each state. The priors are learnable. We will show that the learned prior should capture the state-constant interaction, shedding light on the underlying anatomical connectome and revealing more likely physical neuron interactions. The state-dependent interaction modeled by each GLM offers traceability to capture functional variations across multiple brain states. Our methods effectively recover true interaction structures in simulated data, achieve the highest predictive likelihood with real neural datasets, and render interaction structures and hidden states more interpretable when applied to real neural data.

  • 5 authors
·
Oct 23, 2023

scE^2TM: Toward Interpretable Single-Cell Embedding via Topic Modeling

Recent advances in sequencing technologies have enabled researchers to explore cellular heterogeneity at single-cell resolution. Meanwhile, interpretability has gained prominence parallel to the rapid increase in the complexity and performance of deep learning models. In recent years, topic models have been widely used for interpretable single-cell embedding learning and clustering analysis, which we refer to as single-cell embedded topic models. However, previous studies evaluated the interpretability of the models mainly through qualitative analysis, and these single-cell embedded topic models suffer from the potential problem of interpretation collapse. Furthermore, their neglect of external biological knowledge constrains analytical performance. Here, we present scE2TM, an external knowledge-guided single-cell embedded topic model that provides a high-quality cell embedding and strong interpretation, contributing to comprehensive scRNA-seq data analysis. Our comprehensive evaluation across 20 scRNA-seq datasets demonstrates that scE2TM achieves significant clustering performance gains compared to 7 state-of-the-art methods. In addition, we propose a new interpretability evaluation benchmark that introduces 10 metrics to quantitatively assess the interpretability of single-cell embedded topic models. The results show that the interpretation provided by scE2TM performs encouragingly in terms of diversity and consistency with the underlying biological signals, contributing to a better revealing of the underlying biological mechanisms.

  • 6 authors
·
Jul 11, 2025

Multi-marginal Schrödinger Bridges with Iterative Reference Refinement

Practitioners frequently aim to infer an unobserved population trajectory using sample snapshots at multiple time points. For instance, in single-cell sequencing, scientists would like to learn how gene expression evolves over time. But sequencing any cell destroys that cell. So we cannot access any cell's full trajectory, but we can access snapshot samples from many cells. Stochastic differential equations are commonly used to analyze systems with full individual-trajectory access; since here we have only sample snapshots, these methods are inapplicable. The deep learning community has recently explored using Schr\"odinger bridges (SBs) and their extensions to estimate these dynamics. However, these methods either (1) interpolate between just two time points or (2) require a single fixed reference dynamic within the SB, which is often just set to be Brownian motion. But learning piecewise from adjacent time points can fail to capture long-term dependencies. And practitioners are typically able to specify a model class for the reference dynamic but not the exact values of the parameters within it. So we propose a new method that (1) learns the unobserved trajectories from sample snapshots across multiple time points and (2) requires specification only of a class of reference dynamics, not a single fixed one. In particular, we suggest an iterative projection method inspired by Schr\"odinger bridges; we alternate between learning a piecewise SB on the unobserved trajectories and using the learned SB to refine our best guess for the dynamics within the reference class. We demonstrate the advantages of our method via a well-known simulated parametric model from ecology, simulated and real data from systems biology, and real motion-capture data.

  • 3 authors
·
Aug 12, 2024

How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites

We propose a formal mathematical model for sparse representations and active dendrites in neocortex. Our model is inspired by recent experimental findings on active dendritic processing and NMDA spikes in pyramidal neurons. These experimental and modeling studies suggest that the basic unit of pattern memory in the neocortex is instantiated by small clusters of synapses operated on by localized non-linear dendritic processes. We derive a number of scaling laws that characterize the accuracy of such dendrites in detecting activation patterns in a neuronal population under adverse conditions. We introduce the union property which shows that synapses for multiple patterns can be randomly mixed together within a segment and still lead to highly accurate recognition. We describe simulation results that provide further insight into sparse representations as well as two primary results. First we show that pattern recognition by a neuron with active dendrites can be extremely accurate and robust with high dimensional sparse inputs even when using a tiny number of synapses to recognize large patterns. Second, equations representing recognition accuracy of a dendrite predict optimal NMDA spiking thresholds under a generous set of assumptions. The prediction tightly matches NMDA spiking thresholds measured in the literature. Our model matches many of the known properties of pyramidal neurons. As such the theory provides a mathematical framework for understanding the benefits and limits of sparse representations in cortical networks.

  • 2 authors
·
Jan 4, 2016