Enhanced Model Compression for Lipreading Recognition based on Knowledge Distillation Algorithm

Authors

  • Qian Hu, Department of Electrical and Electronics Engineering, Faculty of Engineering, University Malaysia Sarawak (UNIMAS), Kota Samarahan, Sarawak, Malaysia
  • Kuryati Kipli, Department of Electrical and Electronics Engineering, Faculty of Engineering, University Malaysia Sarawak (UNIMAS), Kota Samarahan, Sarawak, Malaysia
  • Tengku Mohd Afendi Zulcaffle, Department of Electrical and Electronics Engineering, Faculty of Engineering, University Malaysia Sarawak (UNIMAS), Kota Samarahan, Sarawak, Malaysia
  • Yuan Liu, Institute of Computer and Information Engineering, Qilu Institute of Technology, Jinan, Shandong, China
  • Xiangju Liu, Institute of Computer and Information Engineering, Qilu Institute of Technology, Jinan, Shandong, China
  • Bo Wang, Institute of Computer and Information Engineering, Qilu Institute of Technology, Jinan, Shandong, China

Keywords:

Lipreading, Deep learning, Model compression, Knowledge distillation

Abstract

Lipreading is the task of understanding what a speaker is saying by observing changes in the speaker's mouth. The lipreading recognition model proposed in this paper, LipPC-Net, is built with a large Chinese lipreading data set, based on Chinese phonetic rules and grammatical features, and consists of two main parts: the P2P sub-model and the P2C sub-model. The P2P sub-model identifies pinyin sequences from pictures, while the P2C sub-model identifies Chinese character sequences from pinyin. However, Chinese language features are rich and ambiguous, and training and optimizing a lipreading model requires substantial GPU computation and storage, which makes large-scale application difficult. Therefore, this paper proposes three knowledge distillation compression algorithms: an offline model compression algorithm based on multi-feature transfer (MTOF), an online model compression algorithm based on adversarial learning (ALON), and an online model compression algorithm based on consistent regularization (CRON). They are used to compress the part of the model that outputs Chinese character sequences. The three compression algorithms fit and learn the transformation between different features, so that portable mobile terminals with limited hardware resources can carry the model. The compressed model can thus be of practical value in assisting communication for people who are deaf or speech-impaired.
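
The abstract does not spell out the loss functions used by MTOF, ALON, or CRON. As background only, the following minimal sketch shows the generic soft-target form of knowledge distillation, in which a small student model is trained to match the temperature-softened outputs of a larger teacher. It is not the authors' method: the function names, the `temperature` and `alpha` parameters, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-target distillation loss (illustrative, not from the paper).

    Combines hard-label cross-entropy with the KL divergence between the
    teacher's and the student's temperature-softened distributions.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-target term: KL(teacher || student) at the given temperature.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)

    # The temperature**2 factor keeps the two terms on a comparable scale.
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl


# Hypothetical logits over three output classes (e.g. candidate characters).
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
loss = distillation_loss(student, teacher, label=0)
```

In this sketch the soft-target term vanishes when the student's logits equal the teacher's, so only the hard-label cross-entropy remains; a larger `temperature` spreads the teacher's probability mass over more classes.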

Published

2025-10-28

How to Cite

Hu, Q., Kipli, K., Zulcaffle, T. M. A., Liu, Y., Liu, X., & Wang, B. (2025). Enhanced Model Compression for Lipreading Recognition based on Knowledge Distillation Algorithm. Journal of Advanced Research Design, 145(1), 208–221. Retrieved from https://akademiabaru.com/submit/index.php/ard/article/view/6358

Section

Articles
