Journal of Applied Sciences ›› 2023, Vol. 41 ›› Issue (4): 657-668. doi: 10.3969/j.issn.0255-8297.2023.04.010

• Communication Engineering •

Gaussian Mixture Model Convolution Neural Network Based on Imbalanced Problem

XU Hong¹, JIAO Guie²,³, ZHANG Wenjun³

  1. School of Information, Shanghai Ocean University, Shanghai 201306, China;
  2. School of Information, Shanghai Jianqiao University, Shanghai 201306, China;
  3. Shanghai Film Academy, Shanghai University, Shanghai 200072, China
  • Received: 2021-09-25 Published: 2023-08-02

Abstract: Imbalanced data classification is a challenging task in big data mining. A skewed class distribution seriously degrades the classification performance of models, especially on minority classes. This paper proposes an expectation-maximum weighted resampling (EMWRS) algorithm and a weighted cross-entropy loss (WCELoss) function to improve classification performance on imbalanced data. The approach uses a Gaussian mixture model to preprocess the data, then combines weighted sampling with cost-sensitive learning to construct a convolutional neural network model. The network is evaluated with F1 and G-mean as indicators and compared with classic algorithms such as SMOTE (synthetic minority over-sampling technique) and ADASYN (adaptive synthetic sampling) on the Adult datasets from UCI (University of California, Irvine). The experimental results show that the proposed model outperforms ADASYN and the other classical algorithms in terms of F1 and G-mean on the UCI Adult datasets, indicating that it effectively improves classification accuracy for the minority class.
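
The abstract names a class-weighted cross-entropy loss and the F1 and G-mean indicators but does not define them. As a rough illustration only, the sketch below uses the common textbook forms: class weights inversely proportional to class frequency, a weight-normalized cross entropy, and G-mean as the square root of sensitivity times specificity. The function names and the inverse-frequency weighting scheme are assumptions made for this example and may differ from the paper's WCELoss and EMWRS definitions.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight of class c is n / (k * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Sum of -w[y] * log p[y] over samples, divided by the total weight.

    probs[i][c] is the predicted probability of class c for sample i.
    """
    eps = 1e-12  # avoids log(0)
    num = sum(-weights[y] * math.log(p[y] + eps) for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


def f1_and_g_mean(y_true, y_pred, positive=1):
    """Binary F1 and G-mean = sqrt(sensitivity * specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    return f1, math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    y_true = [0] * 8 + [1] * 2            # toy 8:2 imbalance
    y_pred = [0] * 7 + [1, 1, 0]          # one false positive, one false negative
    w = inverse_frequency_weights(y_true)
    probs = [[0.5, 0.5]] * len(y_true)    # uninformative classifier
    print(w, weighted_cross_entropy(probs, y_true, w))
    print(f1_and_g_mean(y_true, y_pred))
```

With these weights, a mistake on a minority sample costs four times as much as one on a majority sample in the toy 8:2 split, which is the general idea behind cost-sensitive learning. G-mean drops to zero whenever one class is entirely misclassified, so it penalizes models that ignore the minority class even when overall accuracy looks high.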

Key words: imbalanced data, Gaussian mixture model, sample weighting, cost loss, convolutional neural network
