
Thesis

Sentiment Analysis from a Bengali Dataset Using Hybrid Deep Learning and Contextualized Word Embedding Techniques.

Abstract

The substantial increase in online content has significantly expanded the scope of sentiment analysis, offering deeper insights into user sentiments and perceptions. However, the Bangla NLP domain faces a critical shortage of standardized labeled data, hindering the development of accurate sentiment analysis models. Current Bangla research relies heavily on context-independent word embeddings such as Word2Vec, GloVe, and fastText, which limits the nuanced understanding of language.

In response to these challenges, this thesis introduces a novel approach that integrates BERT's transfer learning into a CNN-BiLSTM model to enhance Bangla sentiment analysis. Bangla BERT is used as an alternative to Word2Vec, GloVe, and fastText, offering improved performance in extracting contextual vectors. The CNN-BiLSTM model, which combines convolutional and bidirectional recurrent layers, serves as an alternative to traditional machine learning and deep learning models.
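As a rough illustration of how a CNN-BiLSTM head can sit on top of contextual embeddings, the following PyTorch sketch may be helpful. It is not the thesis's implementation: the class name `CnnBiLstmClassifier`, the layer sizes, the kernel size, the two-class output, and the 768-dimensional embedding width are all assumptions, and a random tensor stands in for the Bangla BERT output so that the example runs without downloading a model.

```python
import torch
import torch.nn as nn


class CnnBiLstmClassifier(nn.Module):
    """Hypothetical CNN-BiLSTM head over precomputed contextual embeddings."""

    def __init__(self, embed_dim=768, conv_channels=128, kernel_size=3,
                 lstm_hidden=64, num_classes=2):
        super().__init__()
        # 1D convolution over the token axis extracts local n-gram features.
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size,
                              padding=kernel_size // 2)
        # Bidirectional LSTM reads the convolved sequence in both directions.
        self.bilstm = nn.LSTM(conv_channels, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim)
        x = embeddings.transpose(1, 2)      # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))        # (batch, conv_channels, seq_len)
        x = x.transpose(1, 2)               # (batch, seq_len, conv_channels)
        _, (h_n, _) = self.bilstm(x)        # h_n: (2, batch, lstm_hidden)
        # Concatenate the final forward and backward hidden states.
        features = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.classifier(features)    # (batch, num_classes)


torch.manual_seed(0)
# Assumed stand-in for Bangla BERT output: 4 sentences, 32 tokens, 768 dims.
fake_bert_output = torch.randn(4, 32, 768)
model = CnnBiLstmClassifier()
logits = model(fake_bert_output)
print(logits.shape)
```

In a real pipeline the random tensor would be replaced by the last hidden states of a pretrained Bangla BERT encoder. Whether that encoder is frozen or fine-tuned is a design choice that this sketch does not address.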

Additionally, the thesis investigates the effectiveness of the proposed model by implementing eight machine learning models and comparing their performance. The accuracy results of the models are as follows: LR 88.96%, DT 80.58%, RF 86.23%, MNB 86.89%, KNN 79.92%, Linear SVM 89.08%, RBF SVM 90.53%, and SGD 89.26%, while the CNN-BiLSTM model achieves 97% accuracy.
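To make the comparison concrete, the reported figures can be ranked and the margin over the strongest baseline computed directly. The short sketch below uses only the accuracies stated above; the variable names are illustrative, and the third model's label is read here as RF (Random Forest), which is an assumption about the abbreviation.

```python
# Accuracies (in percent) as reported in the abstract.
# "RF" is an assumed reading of the third model's abbreviation.
reported_accuracy = {
    "LR": 88.96, "DT": 80.58, "RF": 86.23, "MNB": 86.89, "KNN": 79.92,
    "Linear SVM": 89.08, "RBF SVM": 90.53, "SGD": 89.26,
}
cnn_bilstm_accuracy = 97.0

# Rank the baselines from best to worst.
ranking = sorted(reported_accuracy, key=reported_accuracy.get, reverse=True)
best_baseline = ranking[0]

# Margin of the CNN-BiLSTM model over the best baseline, in percentage points.
gap = round(cnn_bilstm_accuracy - reported_accuracy[best_baseline], 2)

print(best_baseline, gap)  # RBF SVM 6.47
```

On these figures, the CNN-BiLSTM model exceeds the best classical baseline (RBF SVM) by about 6.47 percentage points.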

The thesis also explores a sentiment task based on scores, offering a nuanced understanding of sentiment intensity within sentences. This task enhances the interpretability and depth of sentiment analysis outcomes, providing insights into the intensity of sentiment within Bangla text. The integration of Bangla BERT with a CNN-BiLSTM model represents a significant advancement in Bangla sentiment analysis, addressing key challenges and introducing innovative methodologies. Through this research, sentiment analysis techniques in the Bangla language domain are poised to achieve greater accuracy and depth, facilitating improved understanding of user sentiments in online content.