Generating Asset Pricing Model via Symbolic Modeling—a Machine Learning-based Approach

Published 27 March, 2025

In a breakthrough for artificial intelligence (AI) and finance, computer scientists from Texas A&M University have developed a machine-learning-based method called Symbolic Modeling to handle financial asset pricing tasks. Published in The Journal of Finance and Data Science, their approach reduces prediction errors compared to the widely used Fama-French 3-Factor model, while maintaining an acceptable model length.

Classic asset pricing models rely on linear combinations of manually selected financial factors, such as market volatility and company size. While effective, these models struggle with complex market dynamics. The proposed approach instead uses genetic programming and deep learning to automatically generate nonlinear expressions that adapt to multiple datasets simultaneously.
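
For reference, the Fama-French 3-Factor model is a typical example of such a linear combination. The form below is the standard textbook statement, written in conventional notation rather than notation taken from the paper: the excess return of asset i is explained by the market excess return, a size factor (SMB) and a value factor (HML), with an intercept alpha_i left over.

```latex
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,(R_{m,t} - R_{f,t}) + s_i \,\mathrm{SMB}_t + h_i \,\mathrm{HML}_t + \varepsilon_{i,t}
```

Every term on the right-hand side enters linearly, which is the restriction the nonlinear expressions described above are meant to relax.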

“Most of the current financial asset pricing models are like a fixed recipe: you combine predetermined ingredients in set proportions. Symbolic Modeling is more like a master chef who creates entirely new recipes optimized for each diner,” explains Xiangwu Zuo, first author of the study. “The expressions generated by our method take into consideration hidden relationships between market factors that humans might overlook.”

Zuo and the co-author, Andrew Jiang, generated an asset pricing model based on nearly four decades (1980–2018) of financial data from hundreds of companies, including well-known firms like Coca-Cola and ExxonMobil. The researchers tested their approach across both training and testing data, finding consistent improvements over traditional models.

“Our model achieves lower prediction errors than the classic asset pricing models,” shares Zuo. “Beyond just improved prediction accuracy, the model also reduces the ‘alpha’ value, a crucial metric in finance representing unexplained returns.”
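
To make the two metrics concrete, here is a minimal sketch of how alpha and prediction error are commonly measured for a linear factor model: alpha is the intercept of an ordinary least squares fit, and the prediction error is the mean squared residual. The data is synthetic, and the function name, factor columns and coefficient values are illustrative assumptions, not figures or code from the study.

```python
import numpy as np

def fit_linear_factor_model(excess_returns, factors):
    """Fit excess_returns ~ alpha + factors @ betas by ordinary least squares."""
    n = len(excess_returns)
    # The column of ones carries the intercept, i.e. the alpha term.
    design = np.column_stack([np.ones(n), factors])
    coef, *_ = np.linalg.lstsq(design, excess_returns, rcond=None)
    residuals = excess_returns - design @ coef
    mse = float(np.mean(residuals ** 2))
    return float(coef[0]), coef[1:], mse

# Synthetic data only: three hypothetical factor series over 480 periods.
rng = np.random.default_rng(0)
factors = rng.normal(size=(480, 3))
true_alpha = 0.05
true_betas = np.array([1.1, 0.4, -0.2])
noise = rng.normal(scale=0.01, size=480)
excess_returns = true_alpha + factors @ true_betas + noise

alpha, betas, mse = fit_linear_factor_model(excess_returns, factors)
print(f"alpha={alpha:.3f} betas={np.round(betas, 2)} mse={mse:.5f}")
```

An alpha close to zero means the factors account for most of the average return, which is why a smaller alpha is read as a better-specified pricing model.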

Notably, the models generated by the Symbolic Modeling method incorporated nonlinear factor combinations, which are absent in human-designed asset pricing models. “This allows better capture of nonlinear market behaviors,” adds Zuo.

What makes Symbolic Modeling particularly innovative is its ability to discover a unified mathematical model that can represent multiple assets simultaneously. Unlike traditional Symbolic Regression, which would create separate formulas for each company’s data, this new approach finds a single flexible expression that adapts to different datasets by adjusting its coefficients.
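
The idea of one shared expression with per-asset coefficients can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the expression structure (`c0 + c1*x1 + c2*x1*x2`) is fixed by hand here, whereas the method described above searches for it automatically, and the asset names and data are synthetic. The sketch shows only the scoring step, in which each dataset gets its own coefficients for the same structure and the structure is judged by its average error across all datasets.

```python
import numpy as np

def basis_terms(x):
    """Hypothetical shared structure: c0 + c1*x1 + c2*x1*x2."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), x1, x1 * x2])

def fit_coefficients(x, y):
    """Fit one asset's coefficients for the shared structure by least squares."""
    terms = basis_terms(x)
    coef, *_ = np.linalg.lstsq(terms, y, rcond=None)
    mse = float(np.mean((terms @ coef - y) ** 2))
    return coef, mse

def score_structure(datasets):
    """Average error of one structure across assets, each with its own coefficients."""
    fits = {name: fit_coefficients(x, y) for name, (x, y) in datasets.items()}
    avg_mse = float(np.mean([mse for _, mse in fits.values()]))
    return fits, avg_mse

# Synthetic data: two assets share the structure but differ in coefficients.
rng = np.random.default_rng(1)
true_coefs = {"asset_a": [0.1, 1.2, 0.5], "asset_b": [-0.3, 0.7, -0.9]}
datasets = {}
for name, c in true_coefs.items():
    x = rng.normal(size=(200, 2))
    y = basis_terms(x) @ np.array(c) + rng.normal(scale=0.01, size=200)
    datasets[name] = (x, y)

fits, avg_mse = score_structure(datasets)
for name, (coef, mse) in fits.items():
    print(name, np.round(coef, 2), f"mse={mse:.5f}")
```

Scoring a candidate by its error averaged over all assets is what favors a single formula that generalizes, rather than a separate formula tuned to each company.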

The authors see potential AI applications in financial asset pricing, portfolio optimization and development of trading strategies for complex market conditions. “We plan to expand our work by integrating the technique with more machine learning approaches to further enhance AI in finance,” says Zuo.

Figure: Comparison of functions discovered by Symbolic Regression and functions discovered by Symbolic Modeling.

Contact author details: Xiangwu Zuo, Department of Computer Science, Texas A&M University, College Station, Texas, USA, dkflame@tamu.edu

Funder: This research was supported in part by NSF Project CCF-2416361. 

Conflict of interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. 

See the article: Xiangwu Zuo and Anxiao (Andrew) Jiang, Symbolic Modeling for Financial Asset Pricing, in The Journal of Finance and Data Science, January 2025. doi: https://doi.org/10.1016/j.jfds.2025.100150
