Donald Trump has a charismatic, sensational effect that keeps him in the newspapers and on social media all the time. People’s attitudes towards him are dramatic and polarized. The words used to describe him are either highly positive or highly negative, which makes them perfect material for text mining and sentiment analysis.

The goal of this workshop is to use a web scraper to read and pull tweets about Donald Trump. Then we will use a combination of text mining and visualization techniques to analyze the public voice about him.

This workshop is for you if you don’t know how to scrape content or comments on social media, and/or if you know Python but don’t know how to use it for sentiment analysis. Even if you don’t know anything about programming, you should feel comfortable reading this article. Feel free to copy the code and try it yourself. If you are a beginner, I recommend trying to write your own code first and then comparing it with the code in this workshop.

Let’s start with web scraping. I need an effective web scraping tool to do all the boring work for me. I recommend Octoparse since it is free with no limitation on the number of pages. I downloaded it from its official website and finished registration by following the instructions. After I logged in, I opened its built-in Twitter template. The scraping rule in the template is preset with data extraction fields including Name, ID, Content, Comments, etc. I entered “Donald Trump” in the parameter field to tell the crawler the keyword. It was just as simple as it seemed, and I got about 10k tweets. You can scrape as many tweets as you like. There are also other ways to crawl the data, and you can probably get a better result than mine. You are welcome to share your innovative crawling experience with me; I am always a passionate learner :)

After getting the tweets, export the data as a text file and name the file “data.txt”.

Text Mining:

Before getting started, make sure you have Python and a text editor installed on your computer. Then we use two opinion word lists to analyze the scraped tweets. These two lists contain positive and negative words (sentiment words) that were compiled by Minqing Hu and Bing Liu from a research study on opinion words in social media. The idea here is to take each opinion word from the lists, go back to the tweets, and count the frequency of each opinion word in the tweets. As a result, we collect the opinion words that appear in the tweets along with their counts.

First, I created a positive list and a negative list (lines 5 and 13 of my script) from the two downloaded word lists. They store all the words parsed from the text files.

Then I processed the text and cleaned the data by taking out all the punctuation, signs and numbers. As a result, the data consisted only of tokenized words, which makes it easier to analyze.
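Here is a minimal sketch of the cleaning and counting steps described above. It is not the exact code from this workshop, just one way it could look. The function names (`tokenize`, `load_word_list`, `count_opinion_words`) and the tiny sample data are made up for illustration, and the `;` comment prefix and `latin-1` encoding are assumptions about how the downloaded word lists are formatted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only runs of letters.

    Punctuation, signs and digits act as separators, so they never
    end up in the returned tokens.
    """
    return re.findall(r"[a-z]+", text.lower())


def load_word_list(path):
    """Read one opinion word per line, skipping blanks and ';' comment lines."""
    words = set()
    # latin-1 is an assumption about the downloaded files; adjust if needed.
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            word = line.strip().lower()
            if word and not word.startswith(";"):
                words.add(word)
    return words


def count_opinion_words(tokens, opinion_words):
    """Count how often each word from the opinion list occurs in the tokens."""
    return Counter(token for token in tokens if token in opinion_words)


# Tiny in-memory stand-ins for data.txt and the two word lists (made up).
sample_tweets = "Great rally today!!! 100% great. Terrible, terrible speech #sad"
positive_words = {"great", "good"}
negative_words = {"terrible", "sad", "bad"}

tokens = tokenize(sample_tweets)
positive_counts = count_opinion_words(tokens, positive_words)
negative_counts = count_opinion_words(tokens, negative_words)

print(positive_counts.most_common())  # [('great', 2)]
print(negative_counts.most_common())  # [('terrible', 2), ('sad', 1)]
```

To try it on real data, you could read “data.txt” into a string, pass it to `tokenize`, and build the two sets with `load_word_list` from wherever you saved the positive and negative word lists. Note that this simple tokenizer splits on apostrophes and hyphens, so an opinion word that contains one will not be matched.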