This study addresses the shortage of text watermarking research in the Chinese language context by implementing two watermarking schemes, modification-based and generative, for both Chinese and English text. Using the BERT model for English and the WoBERT model for Chinese, a portable word-substitution watermarking module is designed that embeds watermark information by replacing selected tokens in the source text. For generative watermarking, an adversarial generative text watermarking model is adopted and adapted to a Chinese corpus with targeted modifications, so that it accommodates the semantic structure and linguistic conventions of Chinese text. Experiments are conducted on human-ChatGPT comparison corpora in both Chinese and English, and watermark quality is evaluated across the two datasets using both accuracy-based and semantics-based text watermarking metrics, demonstrating the effectiveness of the proposed watermarks across multiple corpora.
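The modification-based scheme above embeds watermark bits by choosing among substitute words for selected tokens. A minimal sketch of this idea follows; note that the paper's module uses BERT/WoBERT to propose context-aware substitutes, whereas here a fixed, hypothetical synonym table stands in for the language model, and the bit-per-word encoding is an illustrative assumption rather than the paper's exact scheme.

```python
# Toy sketch of modification-based (word-substitution) text watermarking.
# Each watermarkable word has two surface forms: one encodes bit 0, the
# other bit 1. A real system would draw candidates from a masked language
# model (BERT for English, WoBERT for Chinese); this table is a stand-in.
SYNONYMS = {
    "big":   ("big", "large"),
    "fast":  ("fast", "quick"),
    "smart": ("smart", "clever"),
}

# Reverse lookup: surface form -> (canonical word, encoded bit).
DECODE = {form: (word, bit)
          for word, forms in SYNONYMS.items()
          for bit, form in enumerate(forms)}

def embed(text: str, bits: str) -> str:
    """Embed watermark bits by picking the matching synonym candidate."""
    out, i = [], 0
    for tok in text.split():
        entry = DECODE.get(tok.lower())
        if entry is not None and i < len(bits):
            out.append(SYNONYMS[entry[0]][int(bits[i])])  # choose form for bit
            i += 1
        else:
            out.append(tok)  # leave non-watermarkable tokens untouched
    return " ".join(out)

def extract(text: str) -> str:
    """Recover embedded bits from the substituted surface forms."""
    return "".join(str(DECODE[t.lower()][1])
                   for t in text.split() if t.lower() in DECODE)
```

For example, `embed("a big and fast model", "10")` replaces "big" with "large" (bit 1) and keeps "fast" (bit 0), and `extract` reads the bits back from the surface forms. The design choice mirrored here is that detection needs no access to the original text, only to the candidate inventory.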