A Dual-Stream Click-Through Rate Prediction Model Based on FinalBlock and JRC

doi:10.3969/j.issn.0255-8297.2025.05.004

Abstract

Click-through rate (CTR) prediction is one of the fundamental tasks in recommender systems. Dual-stream models are widely used in mainstream recommendation models because of their flexibility, scalability, and efficient information interaction and fusion. To further improve their CTR prediction performance, this paper proposes the FJ hybrid network (FinalBlock-JRC hybrid network, FJHN), which builds on the dual-stream structure and integrates the factorized interaction block (FinalBlock) with the joint ranking and calibration loss optimization algorithm (JRC). First, a feature gating layer provides differentiated feature inputs to the two streams and raises the weight of important features, and FinalBlock is combined with a multilayer perceptron (MLP) to strengthen the learning of high-order feature interactions. Second, an enhanced interaction aggregation layer fuses the stream-level outputs, further deepening the degree of feature interaction. Finally, an improved JRC model is used to compute the loss function, which improves the model's prediction accuracy and its adaptability across application scenarios. Experimental results on three public benchmark datasets show that, compared with several mainstream models including the self-attention model (SAM), FJHN achieves significant performance gains in CTR prediction.

Key words: feature gating, hierarchical interaction, stream-level fusion, ranking loss, cross-entropy loss
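
The abstract names a joint ranking and calibration loss but does not give its form. As a rough illustration only, the sketch below shows one common way such a loss can be written: each sample gets two logits (non-click, click), a pointwise cross-entropy term handles calibration, a listwise softmax term over samples sharing a context handles ranking, and a weight `alpha` mixes the two. The function names, the `alpha` parameter, and the exact formulation are assumptions made for this example; this is not the improved JRC variant used in FJHN.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def jrc_style_loss(logits, labels, alpha=0.5):
    """Hypothetical JRC-style loss for samples that share one context.

    logits: list of (f0, f1) pairs, f0 = non-click logit, f1 = click logit.
    labels: list of 0/1 click labels, same length as logits.
    alpha:  assumed mixing weight between calibration and ranking terms.
    """
    n = len(logits)
    calib = 0.0
    rank = 0.0
    for (f0, f1), y in zip(logits, labels):
        own = f1 if y == 1 else f0
        # Calibration term: softmax over the two logits of this sample.
        calib += log_sum_exp([f0, f1]) - own
        # Ranking term: softmax over the label-y logit of every sample
        # in the same context (listwise comparison).
        column = [pair[y] for pair in logits]
        rank += log_sum_exp(column) - own
    return (alpha * calib + (1.0 - alpha) * rank) / n


def predicted_ctr(f0, f1):
    """Click probability implied by the two logits: sigmoid(f1 - f0)."""
    return 1.0 / (1.0 + math.exp(-(f1 - f0)))


if __name__ == "__main__":
    # Two impressions in one context: the first was clicked, the second was not.
    context_logits = [(-2.0, 2.0), (2.0, -2.0)]
    print(jrc_style_loss(context_logits, [1, 0]))  # low loss: logits agree with labels
    print(jrc_style_loss(context_logits, [0, 1]))  # higher loss: logits contradict labels
    print(predicted_ctr(-2.0, 2.0))
```

In this sketch the ranking term only compares samples within the list passed in, so how impressions are grouped into a context (for example by session or by mini-batch) is a design choice left to the caller.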

CLC number: TP391

Cite this article: WU Chenwei, YU Suping, FAN Hong, XU Wujun. A Dual-Stream Click-Through Rate Prediction Model Based on FinalBlock and JRC[J]. Journal of Applied Sciences, 2025, 43(5): 757-770.
