Abstract • Financial fraud is a widespread problem that can cause significant economic losses. Traditional fraud detection methods often rely on manual audits and rule-based systems, which can be time-consuming and error-prone. In recent years, machine learning has emerged as a promising way to automate fraud detection through large-scale data analysis. This article explores the use of machine learning methods to detect financial fraud using tax data, invoice data, and big data. We first introduce the challenges and opportunities of using these data sources for fraud detection, and then survey the machine learning techniques that have been applied to the problem. We also discuss evaluation metrics and case studies for these methods, and highlight the potential benefits and limitations of machine learning for fraud detection. Finally, we identify future research directions and open challenges in this area. The article aims to provide a comprehensive overview of the state of the art in machine learning for financial fraud detection and to inspire further research and development in this important field. Our results show that the F1-score, AUC, and KS values of the model were ***, ***, and ***, respectively.
Keywords: machine learning, financial fraud, tax data, invoice data, big data, fraud detection