[Tutorial] How to analyze text data (e.g. speech, social media, abstract, news)

The file on the screen is a speech by President Donald Trump. I will now import the txt file of the speech into NetMiner. To import unstructured text data, all you need to do is select the file and choose the language for analysis.

You can also select additional text processing options. For example, you can extract only certain parts of speech, or you can define specific terms with the user dictionary. You can also use the dictionary to combine synonyms into one word, or to designate names of organizations and other proper nouns so that they are extracted without change. In the same way, you can easily get rid of unnecessary words by entering them into the dictionary. After selecting all the filtering options, click ‘OK’ to start the import.

Now the import is finished. In the upper-left corner, you can see the structured data set. In the lower-left corner, a new workfile has been created. You can check the words used in the speech and their frequency. You can also extract and save the data by sentence, paragraph, or document, and you can create and save network data from particular words, sentences, paragraphs, and documents. The co-occurrence of these words can be used to form a network.

Let’s try visualizing the most frequently used words! The size of each word in the word cloud is determined by its importance.

Now I will create a network using the relationships between keywords. With just one menu, you can easily create a co-occurrence network. Words that are often used together are treated as adjacent, and these adjacency relationships make up the network. In a matter of seconds, the co-occurrence network is created. Each pair of words that are often used together is shown in one row.

Now I will analyze the main keywords of the co-occurrence network. I used the Degree Centrality method to analyze the centrality of the network. Now I will check the words with the highest centrality. As you can see, words like wealth, movement, and power, which did not stand out in terms of frequency, can be discovered this way. Now I will check the keyword network map.
The important words that were extracted earlier are drawn as bigger dots. Now I will check the original text where the word ‘wealth’ was used.

The Script Workbench function that I’ve just run lets the user call NetMiner’s functions through a script. Instead of having to click multiple times, the user can run processes automatically with a few lines of script. Here I am using the keywords to find the cohesive groups within the network and create a network map. Now the whole script has been processed. The higher the centrality, the bigger the dot, and the clusters extracted earlier are shown through the different colors of the dots.

The clusters in the keyword network show which keywords are used together more often, and one cluster represents one topic. If you look closely at the blue cluster, you can see words like job, company, and factory, which were some of the keywords that President Donald Trump used to emphasize his talk about jobs.
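
For readers who want to see the idea behind the co-occurrence network and Degree Centrality outside of NetMiner, here is a minimal sketch in plain Python. It is not NetMiner’s implementation or its Script Workbench API. The stop-word list, the function names, and the three sample sentences are all made up for illustration, and the sketch assumes two words co-occur when they appear in the same sentence.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; it plays the role of the dictionary
# of unnecessary words described in the tutorial.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOP_WORDS]


def cooccurrence_edges(sentences):
    """Count, for each unordered word pair, how many sentences contain both words."""
    edges = Counter()
    for sentence in sentences:
        words = sorted(set(tokenize(sentence)))
        edges.update(combinations(words, 2))
    return edges


def degree_centrality(edges):
    """Degree centrality: number of distinct neighbours divided by (n - 1)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {w: 0.0 for w in neighbours}
    return {w: len(nb) / (n - 1) for w, nb in neighbours.items()}


# Made-up sample sentences, not taken from the actual speech.
sentences = [
    "Jobs and factories bring wealth.",
    "Wealth and power belong to the people.",
    "Factories create jobs.",
]

frequency = Counter(w for s in sentences for w in tokenize(s))
edges = cooccurrence_edges(sentences)
centrality = degree_centrality(edges)

# Rank words by centrality, highest first.
for word in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{word}: frequency={frequency[word]}, centrality={centrality[word]:.2f}")
```

In this toy data, ‘wealth’, ‘jobs’, and ‘factories’ each appear twice, but ‘wealth’ is connected to more distinct words, so it ranks highest on centrality. That is the same effect described above, where a word that does not stand out by frequency becomes visible through its position in the network.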
