Kolmogorov-Arnold causal generative models

[ICLR 2025] KaCGM: Bridging the Gap Between Deep Causal Generative Models and Functional Transparency

Abstract

The paper introduces KaCGM (Kolmogorov-Arnold Causal Generative Model), a framework that parameterizes Structural Causal Models (SCMs) using Kolmogorov-Arnold Networks (KANs). It achieves SOTA performance in tabular causal inference, enabling joint observational, interventional, and counterfactual queries while providing transparent, symbolic structural equations.

TL;DR

KaCGM is a new class of causal generative models that replaces "black-box" neural structural equations with Kolmogorov-Arnold Networks (KANs). Unlike previous deep causal models that only offer structural interpretability (the graph), KaCGM provides functional interpretability, allowing users to extract closed-form symbolic equations (e.g., $y = x^{2} + \sin(x)$) for every causal mechanism while remaining competitive with SOTA models in counterfactual accuracy.

Problem & Motivation: The "Black Box" of Causal Logic

In high-stakes domains like personalized medicine or public policy, knowing that an intervention works is insufficient; one must know why and how. While Structural Causal Models (SCMs) provide the mathematical framework for this, deep learning implementations of SCMs, such as Causal Flows or diffusion-based causal generative models (CGMs), essentially hide the "physics" of the system inside thousands of weights passed through non-linear layers.

The authors identify a critical gap: modern CGMs are query-agnostic (good), meaning one model can answer observational, interventional, and counterfactual queries, but functionally opaque (bad). Regulatory frameworks like the GDPR "Right to Explanation" demand a level of transparency that standard MLPs cannot provide.

Methodology: KANs as Structural Causal Mechanisms

The core innovation is the KaCGM architecture. Each endogenous variable $x_{j}$ in the causal graph is modeled as $x_{j} = \mathrm{KAN}(\mathrm{pa}(j)) + u_{j}$, where $\mathrm{pa}(j)$ denotes the parents of $x_{j}$ and $u_{j}$ is the exogenous noise.
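Because the noise enters additively, it can be recovered from any observation as $u_{j} = x_{j} - \mathrm{KAN}(\mathrm{pa}(j))$, which is what makes counterfactual queries tractable. The following is a minimal sketch of the usual abduction, action, prediction recipe on an assumed three-variable chain. The closed-form mechanisms, the variable names, and the helper functions are hypothetical stand-ins for trained KANs, not the authors' code.

```python
import numpy as np

# Hypothetical stand-ins for trained KAN mechanisms on a chain x1 -> x2 -> x3.
# In KaCGM each mechanism would be a learned KAN; fixed closed forms are used
# here purely for illustration.
MECHANISMS = {
    "x1": (lambda pa: 0.0, []),
    "x2": (lambda pa: pa["x1"] ** 2, ["x1"]),
    "x3": (lambda pa: np.sin(pa["x2"]), ["x2"]),
}
ORDER = ["x1", "x2", "x3"]  # topological order of the assumed graph


def generate(noise, do=None):
    """Push noise through x_j = f_j(pa(j)) + u_j; `do` clamps variables."""
    do = do or {}
    x = {}
    for name in ORDER:
        f, parents = MECHANISMS[name]
        if name in do:
            x[name] = do[name]  # intervention: ignore mechanism and noise
        else:
            x[name] = f({p: x[p] for p in parents}) + noise[name]
    return x


def abduct(x):
    """Recover the noise of one observed unit: u_j = x_j - f_j(pa(j))."""
    noise = {}
    for name in ORDER:
        f, parents = MECHANISMS[name]
        noise[name] = x[name] - f({p: x[p] for p in parents})
    return noise


def counterfactual(x_obs, do):
    """Abduction, action, prediction for a single observed unit."""
    return generate(abduct(x_obs), do=do)


factual = generate({"x1": 1.0, "x2": 0.5, "x3": -0.2})
cf = counterfactual(factual, do={"x1": 2.0})
print(factual, cf)
```

In this toy run the unit keeps its own noise ($u_{2} = 0.5$), so setting $x_{1} = 2$ moves $x_{2}$ from 1.5 to 4.5. Observational sampling uses the same `generate` function with freshly drawn noise and no `do` argument.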

Why KANs?

Unlike standard MLPs that have fixed activation functions on neurons, KANs have learnable univariate functions on edges. This allows the model to:

Prune: Automatically silence irrelevant parent-child influences.
Symbolic Regression: Substitute a learned spline with a human-readable atom (like $\exp$, $\log$, or polynomials).
Handle Mixed Data: Through "KaCGM-mix," using Logistic-KANs to model categorical probability mass functions.
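To make the pruning and symbolic-regression steps concrete, here is a simplified sketch of a single KAN-style layer: one univariate function per incoming edge, summed at the child node. It is an illustration under loud assumptions. Real KANs use B-splines trained by gradient descent, whereas this sketch uses piecewise-linear hat functions fitted by least squares; the pruning threshold, the atom library, and all helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def hat_basis(x, grid):
    """Piecewise-linear 'hat' functions on a uniform grid, shape (n, len(grid))."""
    h = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[:, None] - grid[None, :]) / h, 0.0, None)


def r2_affine(g, target):
    """R^2 of the best affine fit a * g + b to target."""
    A = np.column_stack([g, np.ones_like(g)])
    w, *_ = np.linalg.lstsq(A, target, rcond=None)
    return 1.0 - (target - A @ w).var() / target.var()


# Toy child variable: depends on parents 0 and 1; parent 2 is irrelevant.
n, grid = 2000, np.linspace(-2.0, 2.0, 9)
X = rng.uniform(-2.0, 2.0, size=(n, 3))
y = X[:, 0] ** 2 + np.sin(X[:, 1]) + 0.05 * rng.normal(size=n)

# One univariate function per edge, all fitted jointly by least squares.
bases = [hat_basis(X[:, i], grid) for i in range(3)]
coef, *_ = np.linalg.lstsq(np.hstack(bases), y, rcond=None)
k = len(grid)
phi = [bases[i] @ coef[i * k:(i + 1) * k] for i in range(3)]

# Prune: drop edges whose learned function barely varies over the data.
importance = [float(p.std()) for p in phi]
kept = [i for i in range(3) if importance[i] > 0.1]  # threshold is an assumption

# Symbolic step: match each surviving edge function to the closest atom.
atoms = {"x": lambda v: v, "x^2": lambda v: v ** 2, "sin": np.sin, "exp": np.exp}
best = {
    i: max(atoms, key=lambda name: r2_affine(atoms[name](X[:, i]), phi[i]))
    for i in kept
}
print(importance, best)
```

The standard deviation of each edge function is used as the importance score because the constant offset of an individual edge is not identifiable in an additive model; only its variation matters.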

Fig 1 (KaCGM architecture): The KaCGM pipeline, from the causal graph to the symbolic structural equations.

Experiments & Results

The authors tested KaCGM against strong baselines such as DBCM (diffusion-based) and Causal Flows.

1. The Additivity Trade-off

KaCGM excels when the underlying data-generating process (DGP) is additive. In synthetic tests on 11 different graph structures (including chains, colliders, and forks), KaCGM reached near-zero counterfactual mean absolute error (MAE), outperforming universal density approximators (Causal Flows) when sample sizes were small (N = 100 to 1,000).

2. Sensitivity Analysis

A standout contribution is the "Validation Pipeline." Since we don't have ground truth in real-world data, the authors use HSIC (Hilbert-Schmidt Independence Criterion) to test whether the inferred noise $u_{j}$ is truly independent of the parents. This acts as a "falsification" test: if the noise isn't independent, the model (and its explanations) shouldn't be trusted.
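A residual-independence check of this kind can be sketched as follows. This is a generic illustration, not the paper's exact procedure: it uses the biased HSIC estimator with RBF kernels, a median-heuristic bandwidth, and a permutation test for the p-value, all of which are assumptions, as are the function names and the toy data.

```python
import numpy as np


def rbf_gram(v):
    """RBF Gram matrix of a 1-D sample, bandwidth set by the median heuristic."""
    d2 = (v[:, None] - v[None, :]) ** 2
    bandwidth2 = 0.5 * np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * bandwidth2))


def hsic_test(x, y, n_perm=200, seed=0):
    """Biased HSIC statistic trace(K H L H) / n^2 and a permutation p-value."""
    rng = np.random.default_rng(seed)
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ rbf_gram(x) @ H  # centred Gram matrix of x
    L = rbf_gram(y)

    def stat(M):
        # sum(Kc * M) equals trace(K H M H) for symmetric M
        return float(np.sum(Kc * M)) / n ** 2

    observed = stat(L)
    null = []
    for _ in range(n_perm):
        idx = rng.permutation(n)  # shuffling y breaks any dependence on x
        null.append(stat(L[np.ix_(idx, idx)]))
    p_value = (1 + sum(s >= observed for s in null)) / (1 + n_perm)
    return observed, p_value


rng = np.random.default_rng(1)
parent = rng.normal(size=300)
u = rng.normal(size=300)

# Case 1: the additive model is correct, so the inferred noise is just u.
stat_good, p_good = hsic_test(parent, u)

# Case 2: the true noise is heteroscedastic, so the additive residual
# (parent * u) still depends on the parent through its variance.
stat_bad, p_bad = hsic_test(parent, parent * u)
print(p_good, p_bad)
```

A small p-value in the second case is the signal to distrust the fitted mechanism. A large p-value does not prove the model is right; it only means the test failed to falsify it.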

Fig 2 (performance visuals): Sensitivity analysis showing how HSIC and MMD metrics can detect when the additive noise assumption is violated.

Real-World Case Study: Cardiovascular Risk

Applying KaCGM to a real cardiovascular dataset, the model didn't just predict the risk of Ischemia; it extracted the "Law of Ischemia": $\text{Pr}(\text{Ischemia}) = \sigma(0.055\,\text{Age}^{2} + 0.173\,\text{Age} + \dots - 1.871)$. The quadratic dependence of the log-odds on age matches clinical intuition: cardiovascular risk accelerates non-linearly with age.
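One way to read the "accelerates" claim is through the slope of the log-odds with respect to age. The short derivation below uses only the two age coefficients quoted above and assumes that the elided terms do not themselves depend on Age, which this summary does not confirm. The scale of Age (raw years or standardized) is also not stated here.

```latex
\operatorname{logit}\Pr(\text{Ischemia})
  = 0.055\,\text{Age}^{2} + 0.173\,\text{Age} + \dots - 1.871

\frac{\partial \operatorname{logit}\Pr(\text{Ischemia})}{\partial\,\text{Age}}
  = 2(0.055)\,\text{Age} + 0.173
  = 0.110\,\text{Age} + 0.173
```

Under that assumption, each additional unit of Age raises the log-odds by more than the previous unit did, since the slope itself grows linearly in Age.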

Fig 3 (interpretability tools): Probability Radar Plots (PRP) and Partial Dependence Plots (PDP) allowing clinicians to audit specific patient risks.

Critical Analysis & Conclusion

Takeaway: KaCGM shows that interpretability need not come at the cost of performance in tabular causal tasks, at least when the additive-noise assumption holds. By using KANs, the model becomes a "glass-box" that can be symbolically audited.

Limitations:

Additivity Bias: If the true world is highly "entangled" (non-additive or heteroscedastic), KaCGM will struggle compared to black-box Causal Flows.
Scalability: While powerful for tabular data (10-50 variables), extracting symbolic expressions for graphs with hundreds of nodes remains a challenge.

Future Work: The authors suggest integrating KANs into Normalizing Flow architectures, potentially creating a "Universal Interpretable Approximator" for even more complex causal systems.
