Ziming Liu ([email protected]; Massachusetts Institute of Technology; The NSF Institute for Artificial Intelligence and Fundamental Interactions, IAIFI), Yixuan Wang (California Institute of Technology), Sachin Vaidya (Massachusetts Institute of Technology), Fabian Ruehle (Northeastern University; IAIFI), James Halverson (Northeastern University; IAIFI), Marin Soljačić (Massachusetts Institute of Technology; IAIFI), Thomas Y. Hou (California Institute of Technology), Max Tegmark (Massachusetts Institute of Technology; IAIFI) (2024)
This paper proposes Kolmogorov-Arnold Networks (KANs) as an alternative to traditional Multi-Layer Perceptrons (MLPs). Inspired by the Kolmogorov-Arnold representation theorem, KANs place learnable activation functions on edges, in place of the fixed activation functions at nodes used in MLPs. This modification allows KANs to outperform MLPs in both accuracy and interpretability on small-scale AI + Science tasks. The authors show that KANs can represent functions more efficiently than MLPs, particularly for high-dimensional data with compositional structure. Extensive numerical experiments support the theoretical and empirical benefits of KANs and illustrate their potential for scientific applications in mathematics and physics. The paper details the KAN architecture, its scaling laws and interpretability features, and applications to solving partial differential equations and to scientific discovery, establishing KANs as a promising tool at the intersection of AI and scientific research.
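The core architectural idea above can be sketched in a few lines of code. The following is a minimal, illustrative KAN layer, not the paper's implementation: each edge (i, j) carries its own learnable univariate function, parameterized here by coefficients over a fixed Gaussian RBF grid (the paper uses B-splines), and each output node simply sums its incoming edge functions.

```python
import numpy as np

class KANLayer:
    """Minimal sketch of a Kolmogorov-Arnold layer.

    Each edge (i, j) has a learnable univariate function phi_ij; the j-th
    output is sum_i phi_ij(x_i). Here phi_ij is a linear combination of
    fixed Gaussian RBF basis functions with learnable coefficients
    (a simplification of the paper's B-spline parameterization).
    """

    def __init__(self, in_dim, out_dim, grid_size=8, x_range=(-2.0, 2.0), seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(*x_range, grid_size)   # shared basis grid
        self.width = (x_range[1] - x_range[0]) / grid_size
        # One coefficient vector per edge: shape (in_dim, out_dim, grid_size)
        self.coef = rng.normal(0.0, 0.1, (in_dim, out_dim, grid_size))

    def forward(self, x):
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, grid_size)
        basis = np.exp(-((x[..., None] - self.centers) / self.width) ** 2)
        # Apply each edge's function and sum over the input dimension
        return np.einsum('big,iog->bo', basis, self.coef)

layer = KANLayer(in_dim=3, out_dim=2)
y = layer.forward(np.ones((4, 3)))
print(y.shape)  # (4, 2)
```

Stacking such layers gives a deep KAN; in contrast, an MLP layer applies a fixed nonlinearity after a linear map, so all the learning happens in the weights rather than in the activation functions themselves.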