Sentiment Analysis of Medicine Reviews Using NLP with Machine Learning Models
The need to analyze user generated data over the web has recently gained importance due to the abundance of knowledge which can be acquired by careful analysis of such data. Majority of such data is available via online networking websites like Facebook, Twitter, LinkedIn, etc. The data available in such platforms are in the form of opinions and reviews of products, movies, medications, hotels, etc. Mining and analyzing of such data has become an important aspect for the companies to understand the people’s opinion on a particular subject. There has been enough research done in the application of sentiment analysis across domains like product reviews, movies, hotels, etc. However, utilization of such methodologies in the ﬁeld of medicine has to be given more importance as there are several studies conducted by United States Food and Drug Administration on the eﬀects of adverse drug reactions on patients. Studying the eﬀects of commonly used drugs on patients is important for the pharmaceutical companies to understand the positive and negative eﬀect of drugs on the patients. The motive of this research project is to apply machine learning models for the sentiment analysis of reviews posted by patients to determine the polarity of opinion expressed in the reviews which can be positive or negative.
There are few studies which focus on applying sentiment analysis techniques usingmachine learning models.proposed cross domain sentimentmining model where a model trained in a particular domain can be used as a classifierfor the different domain. Data in the form of reviews for different subjects like camera,laptops, summer camps, lawyers, drugs, radio, restaurant and television were chosen forthe study. Support Vector Machine (SVM) was used as a base classifier. To test theperformance of classifier trained in one domain and its classifying power in another do-main, a new SVM classification model was trained for each model and was used to testthe model on the rest of the dataset with a K-fold of 25. The classification accuracywas calculated as the average of K-fold tests. By building two different ensembles whereone used simple majority vote of its component models for each new classification andanother used weighted majority votes, the authors demonstrated that it is possible to im-prove the accuracy by selecting the cross-domain models with lexicons similar to targetdomain lexicon. This work demonstrated that it is possible to deploy a model trained inone domain to a different domain and achieve an acceptable accuracy in classifying thesentiment.
Onducted sentiment analysis on reviews posted on hearing loss for-ums using naive Bayes, logistic regression with Pipeline Technique. Further analysiswas carried out using traditional bag of words approach. A bag of words approach isthe process of extracting features from the text, it involves extraction of words and the infrequency of occurrence in text. The words are assigned a score based on their polarity.The two methods were compared; in each of the test cases, machine learning algorithmsoutperformed the bag of words approach. Algorithm gave the best performance with the over-all agreement of 0.64 which is a good agreement with the model. This study showcased that machine learning with NLP methodscan perform better when compared to existing traditional sentiment analysis methods.
Sentiment Analysis of Medicine Reviews Using NLP
- Operating System : Windows 7, 8 and 10 (32 and 64 bit)
- Front End : Python
- Packages : numpy, Pandas, itertools, matplotlib, sklearn
- Back End : DataSet
- Processor – Dual Core
- Speed – 1 GHz
- RAM – 4 GB
- Hard Disk – 200 GB
- Key Board – Standard Windows Keyboard
- Mouse – Two or Three Button Mouse
- Monitor – SVG
0.00 average based on 0 ratings
More Things You Might Like This
Abstract: Although the educational level of the Portuguese population has improved in the last decades, the statistics keep Portugal at Europe’s tail end due to its high student failure rates. In particular, lack of success in the core classes of Mathematics and the Portuguese language is extremely serious. On the other hand, the fields of
Abstract: Advances in natural language processing (NLP) and educational technology, as well as the availability of unprecedented amounts of educationally-relevant text and speech data, have led to an increasing interest in using NLP to address the needs of teachers and students. Educational applications differ in many ways, however, from the types of applications for which
Machine Learning based Regression Model for Prediction of Soil Surface Humidity over Moderately Vegetated Fields
Abstract: Agriculture is one of the major revenue producing sectors of India and a source of survival. Numerous seasonal, economic and biological patterns influence the crop production but unpredictable changes in these patterns lead to a great loss to farmers. These risks can be reduced when suitable approaches are employed on data related to soil