Abstract
This research explores the application of Natural Language Processing (NLP) methods to better understand and improve customer service feedback in the telecom industry. We collect novel real-life datasets of customer reviews and social media comments from the British telecommunications industry over one year. We then apply three main modelling approaches. First, we study topic modelling for short text: we propose a new evaluation metric for the GSDMM model and demonstrate that it helps to select meaningful topics when texts are very brief. Second, we perform an extensive comparative study of word embedding models for text classification: we test nine word embedding and feature engineering methods, including Word2Vec, FastText, BERT, Doc2Vec, and TF-IDF, together with seven classifiers on small, medium, and large datasets. We propose a feature engineering method that uses the first principal component in place of the average of word embeddings, which has been commonplace in practical applications of Word2Vec and FastText. We also measure the energy consumption and training time of each model to assess the trade-off between accuracy and efficiency. Third, we study the same word embedding methods applied to text clustering: we compare Self-Organising Map (SOM), K-means, K-medoids, BIRCH, and Gaussian Mixture models on different embeddings, and study the effect of various feature engineering approaches on the clustering results. We also propose new, effective formulas for class-based TF-IDF for cluster representation. Our results show that the proposed hyperparameter tuning method for GSDMM achieves better topic coherence for short reviews than other clustering approaches. Moreover, the empirical studies demonstrate superior performance of our proposed PCA-based feature engineering method for Word2Vec and FastText in both short-text classification and text clustering.
The comparisons of word embedding models show that BERT often gives the highest classification accuracy, but at a much higher energy cost, while BIRCH and K-means are robust clustering choices across embedding models. Finally, we present practical guidelines for telecom analysts: choose parsimonious features for faster inference, balance model complexity against energy use, and adopt our evaluation metric when dealing with short feedback from telecom customers. This work therefore contributes both methodological improvements for short-text analytics and actionable insights for industry practitioners.
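As a rough illustration of the contrast between averaging word embeddings and using a first principal component, the sketch below pools one document's word vectors in both ways. The abstract does not specify how the principal component is computed, so this assumes one plausible reading (the leading principal direction of the document's centred word-vector matrix); the function names, the sign convention, the single-word fallback, and the random toy vectors are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def mean_pool(word_vectors: np.ndarray) -> np.ndarray:
    """Common baseline: average the word vectors of one document."""
    return word_vectors.mean(axis=0)


def first_pc_pool(word_vectors: np.ndarray) -> np.ndarray:
    """Assumed variant: leading principal direction of the word vectors.

    word_vectors has shape (n_words, dim). The result is a unit-length
    vector of shape (dim,), except for single-word documents.
    """
    if word_vectors.shape[0] < 2:
        # A single word has no variance to decompose; fall back to the
        # word vector itself (an assumption made for this sketch).
        return word_vectors.mean(axis=0)
    centered = word_vectors - word_vectors.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc = vt[0]
    # The sign of a principal direction is arbitrary; fix it so that the
    # largest-magnitude coordinate is positive, for reproducible features.
    if pc[np.argmax(np.abs(pc))] < 0:
        pc = -pc
    return pc


# Toy stand-in for the embeddings of a six-word review (dimension 50).
rng = np.random.default_rng(0)
doc = rng.normal(size=(6, 50))

avg_features = mean_pool(doc)
pca_features = first_pc_pool(doc)
print(avg_features.shape, pca_features.shape)
```

In practice the rows of `doc` would come from a trained Word2Vec or FastText model, and either feature vector could be passed to a downstream classifier or clustering algorithm.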
Awarding Institution(s)
University of Plymouth
Supervisor
Craig McNeile, Malgorzata Wojtys
Document Type
Thesis
Publication Date
2026
Embargo Period
2026-04-30
Deposit Date
April 2026
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Recommended Citation
Abdelmotaleb, H. (2026) Leveraging Natural Language Processing Methods for Next Generation Customer Service in Telecommunications Industry. Thesis. University of Plymouth. Retrieved from https://pearl.plymouth.ac.uk/secam-theses/568
