Ensemble of winning tickets: pruning bidirectional encoder from the transformers attention heads for enhanced model efficiency

Nyalalani Smarts, Rajalakshmi Selvaraj, Venu Madhav Kuthadi

Abstract


Advanced deep neural network models such as bidirectional encoder representations from transformers (BERT) pose challenges in terms of computational resources and model size. To tackle these issues, model pruning techniques have emerged as some of the most effective methods for managing model complexity. This paper explores pruning BERT attention heads across an ensemble of winning tickets to improve model efficiency without sacrificing performance. Experimental evaluations demonstrate the effectiveness of the approach, achieving significant model compression while maintaining competitive performance across different natural language processing tasks. Key findings include a 36% reduction in model size, with the ensemble model outperforming the baseline BERT model on both the Stanford Sentiment Treebank v2 (SST-2) and Corpus of Linguistic Acceptability (CoLA) datasets. The results show F1-scores of 94% and 96% and accuracy scores of 95% and 96% on the two datasets, respectively. These findings contribute to ongoing efforts to improve the efficiency of large-scale language models.
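To make the head-pruning step concrete, the sketch below shows one way to remove a chosen fraction of BERT attention heads using the Hugging Face Transformers prune_heads API. The importance scores, the 36% pruning fraction, and the bert-base-uncased checkpoint are illustrative assumptions only; the paper's ensemble-of-winning-tickets selection criterion is not reproduced here.

```python
# Minimal sketch: pruning low-scoring BERT attention heads with Hugging Face
# Transformers. The scores below are random placeholders standing in for a
# real importance criterion (e.g. the paper's winning-ticket selection).
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

num_layers = model.config.num_hidden_layers    # 12 for bert-base
num_heads = model.config.num_attention_heads   # 12 per layer for bert-base

# Placeholder importance scores, one per (layer, head); an actual run would
# derive these from task data (e.g. accumulated gradient-based importance).
torch.manual_seed(0)
scores = torch.rand(num_layers, num_heads)

# Prune roughly 36% of the heads; note this is the fraction of heads removed,
# not necessarily the overall model-size reduction reported in the abstract.
prune_fraction = 0.36
k = int(prune_fraction * num_layers * num_heads)
threshold = scores.flatten().kthvalue(k).values

heads_to_prune = {
    layer: [h for h in range(num_heads) if scores[layer, h] <= threshold]
    for layer in range(num_layers)
}
heads_to_prune = {layer: hs for layer, hs in heads_to_prune.items() if hs}

model.prune_heads(heads_to_prune)  # physically removes the selected heads
print(sum(len(hs) for hs in heads_to_prune.values()), "attention heads pruned")
```

An ensemble in this spirit could be built by repeating the selection with different importance estimates (different "tickets") and combining the resulting pruned models' predictions; that combination step is not shown here.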

Keywords


Attention heads; Bidirectional encoder representations from transformers; Lottery ticket hypothesis; Natural language processing; Pruning



DOI: http://doi.org/10.11591/ijece.v15i2.pp2070-2080

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).