
KAN: Kolmogorov-Arnold Networks

Ziming Liu (Massachusetts Institute of Technology; The NSF Institute for Artificial Intelligence and Fundamental Interactions), Yixuan Wang (California Institute of Technology), Sachin Vaidya (Massachusetts Institute of Technology), Fabian Ruehle (Northeastern University; The NSF Institute for Artificial Intelligence and Fundamental Interactions), James Halverson (Northeastern University; The NSF Institute for Artificial Intelligence and Fundamental Interactions), Marin Soljačić (Massachusetts Institute of Technology; The NSF Institute for Artificial Intelligence and Fundamental Interactions), Thomas Y. Hou (California Institute of Technology), Max Tegmark (Massachusetts Institute of Technology; The NSF Institute for Artificial Intelligence and Fundamental Interactions) (2024)

Paper Information
arXiv ID
2404.19756
Venue
arXiv.org
Domain
Artificial Intelligence, Deep Learning, Scientific Computing
SOTA Claim
Yes
Reproducibility
8/10

Abstract

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all: every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability on small-scale AI + Science tasks. For accuracy, smaller KANs can achieve comparable or better accuracy than larger MLPs in function fitting tasks. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful "collaborators" helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs.
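
To make the edge-activation idea concrete, the sketch below shows one possible forward pass of a single KAN layer in NumPy/SciPy. This is an illustrative sketch, not the authors' pykan implementation; the class name, parameter names, and hyperparameters are invented for this example. Each edge carries its own learnable univariate function, parameterized here as a cubic B-spline plus a SiLU base term, and each output node simply sums its incoming edge activations.

```python
# Minimal sketch of a KAN layer (illustrative only, not the authors' pykan code).
# Every edge (input i -> output j) has its own learnable 1-D function phi_{j,i}(x),
# built from a cubic B-spline plus a SiLU "base" term. Nodes just sum incoming edges.

import numpy as np
from scipy.interpolate import BSpline

def silu(x):
    return x / (1.0 + np.exp(-x))

class KANLayerSketch:
    def __init__(self, n_in, n_out, grid_size=5, degree=3, x_range=(-1.0, 1.0)):
        # One shared clamped knot vector; each edge has its own spline coefficients.
        inner = np.linspace(*x_range, grid_size + 1)
        self.knots = np.concatenate([
            np.full(degree, inner[0]), inner, np.full(degree, inner[-1])
        ])
        self.degree = degree
        n_coef = len(self.knots) - degree - 1
        rng = np.random.default_rng(0)
        # Learnable parameters (a real implementation updates these by gradient descent):
        self.coef = 0.1 * rng.standard_normal((n_out, n_in, n_coef))  # spline coefficients
        self.w_base = np.ones((n_out, n_in))      # weight on the SiLU base term
        self.w_spline = np.ones((n_out, n_in))    # weight on the spline term

    def forward(self, x):
        # x: (batch, n_in) -> (batch, n_out)
        out = np.zeros((x.shape[0], self.coef.shape[0]))
        for j in range(self.coef.shape[0]):       # output node j
            for i in range(x.shape[1]):           # input node i
                spline = BSpline(self.knots, self.coef[j, i], self.degree,
                                 extrapolate=True)
                phi = (self.w_base[j, i] * silu(x[:, i])
                       + self.w_spline[j, i] * spline(x[:, i]))
                out[:, j] += phi                  # nodes only sum incoming edge outputs
        return out

layer = KANLayerSketch(n_in=2, n_out=3)
print(layer.forward(np.random.rand(4, 2)).shape)  # (4, 3)
```

A real implementation would train the spline coefficients and edge weights by gradient descent and refine the spline grid adaptively; this sketch only illustrates the forward pass that replaces an MLP's linear-weights-plus-fixed-activation structure.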

Summary

This paper proposes Kolmogorov-Arnold Networks (KANs) as an innovative alternative to traditional Multi-Layer Perceptrons (MLPs). Inspired by the Kolmogorov-Arnold representation theorem, KANs employ learnable activation functions situated on edges, in lieu of fixed activation functions at nodes as seen in MLPs. This modification allows KANs to outperform MLPs in both accuracy and interpretability for small-scale AI + Science tasks. The authors demonstrate that KANs can represent functions more efficiently than MLPs, especially in scenarios involving high-dimensional data. Extensive numerical experiments showcase the theoretical and empirical benefits of KANs, elucidating their potential in scientific inquiries across mathematics and physics. The paper details KAN architecture, scaling laws, interpretability features, and applications in solving partial differential equations and scientific discovery, establishing KANs as a promising tool in the intersection of AI and scientific research.
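
For reference, the Kolmogorov-Arnold representation theorem that motivates the architecture (stated here in its standard textbook form, not quoted from this page) says that any continuous multivariate function on a bounded domain can be written as a finite sum of compositions of continuous univariate functions and addition:

$$ f(x_1, \dots, x_n) \;=\; \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right) $$

where each $\phi_{q,p}$ and $\Phi_q$ is a continuous univariate function. KANs generalize this fixed two-layer structure to arbitrary width and depth, learning each univariate function as a spline.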

Methods

This paper employs the following methods:

  • KAN
  • MLP

Models Used

  • KAN
  • MLP

Datasets

The following datasets were used in this research:

  • Feynman_no_units
  • Knot Theory
  • Special Functions

Evaluation Metrics

  • Accuracy
  • RMSE
  • Pareto Frontier

Results

  • KANs outperform MLPs in accuracy and interpretability on small-scale AI + Science tasks.
  • KANs possess faster (better) neural scaling laws than MLPs; see the note after this list.
  • KANs can facilitate rediscovery of mathematical and physical laws.
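
Note on the scaling-law claim: per the paper's approximation theory (as summarized here, with $N$ the parameter count and $k$ the spline order), test loss is predicted to fall as

$$ \ell \;\propto\; N^{-(k+1)} \;=\; N^{-4} \quad \text{for cubic splines } (k = 3), $$

a steeper exponent than those typically predicted or observed for MLPs, and the paper reports empirical curves approaching this rate on its function-fitting examples.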

Limitations

The authors identified the following limitations:

  • KANs are slower to train than MLPs.
  • Current implementations are not yet efficient, particularly on larger or more complex problems.

Technical Requirements

  • Number of GPUs: None specified
  • GPU Type: None specified

Keywords

Kolmogorov-Arnold Networks, KAN, interpretability, scientific discovery, symbolic regression, neural scaling laws

Papers Using Similar Methods

External Resources