Hotel review rating classification using NLP
Sentiment analysis, as the name suggests, is a machine learning technique that lets machines identify the emotions expressed in human-written text. The ability to read these emotions and extract useful insights from them is a valuable resource for businesses that want to grow and develop in their field. Hotel reviews collected from guests can be classified into three classes (positive, negative, or neutral), which lets us analyze customer sentiment. To weight the words in the reviews by how often they occur, we use the Term Frequency-Inverse Document Frequency (TF-IDF) approach.
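As a rough illustration of the TF-IDF weighting mentioned above, the following sketch computes weights for a few made-up reviews using only the Python standard library. The whitespace tokenizer, the sample reviews, and the function names are assumptions for illustration, not the project's actual code. It uses one common textbook variant, tf = count / document length and idf = log(N / document frequency); libraries often apply smoothed variants that give different numbers.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic assumption: lowercase and split on whitespace only.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample reviews, for illustration only.
reviews = [
    "great room great staff",
    "dirty room rude staff",
    "great location",
]
weights = tf_idf(reviews)
for review, w in zip(reviews, weights):
    print(review, w)
```

In this variant, a word that appears in every document gets a weight of zero, while words that are frequent in one review but rare across the collection get the highest weights.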
The existing system was built using data mining techniques and handles only small datasets. As a result, its accuracy is low, at only about 60 to 70%.
- Small-scale data
- Low accuracy score
Sentiment analysis techniques fall into three major categories: (1) statistical methods, (2) knowledge-based methods, and (3) hybrid techniques. Statistical methods, also known as evolutionary approaches, rest on finding the mutual relationship between words that share the same context, and they rely on some mathematical representation of the text corpus. A simple statistical approach is to pick, say, the 500 words that occur most frequently in positive texts but not in negative ones, and vice versa. We can then train a model and check whether a given text contains more positive or more negative words. This approach is statistical because it does not use linguistic insight, which is generally what distinguishes it from a linguistic or lexical model. Knowledge-based methods aim to classify text based on emotion categories that are explicitly expressed by words such as awesome, sad, happy, unfortunate, and poor. Knowledge bases also cover less obvious words, such as 'sympathy', that are associated with particular emotions.
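The simple statistical approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sample reviews and function names are hypothetical, tokenization is plain whitespace splitting, a direct word count stands in for the trained model, and treating a tie as "neutral" is an assumption made here to match the three classes.

```python
from collections import Counter


def top_exclusive_words(target_texts, other_texts, n=500):
    """Most frequent words in target_texts that never occur in other_texts."""
    other_vocab = {w for t in other_texts for w in t.lower().split()}
    counts = Counter(
        w
        for t in target_texts
        for w in t.lower().split()
        if w not in other_vocab
    )
    return {w for w, _ in counts.most_common(n)}


def classify(text, positive_words, negative_words):
    """Label a text by comparing its positive and negative word counts."""
    tokens = text.lower().split()
    pos = sum(t in positive_words for t in tokens)
    neg = sum(t in negative_words for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # assumption: a tie is treated as neutral


# Hypothetical labeled reviews, for illustration only.
positive_reviews = ["clean room friendly staff", "friendly staff lovely breakfast"]
negative_reviews = ["dirty room rude staff", "rude reception noisy room"]

positive_words = top_exclusive_words(positive_reviews, negative_reviews)
negative_words = top_exclusive_words(negative_reviews, positive_reviews)

print(classify("friendly staff but noisy room and lovely breakfast",
               positive_words, negative_words))
```

Words such as "room" and "staff" occur in both groups, so they end up in neither list and do not influence the label.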
- Improve the accuracy score
- Handle large datasets
SOFTWARE AND HARDWARE REQUIREMENTS:
- OS: Windows 7, 8, and 10 (32- and 64-bit)
- RAM: 4 GB