Twitter, a popular microblogging service, has received much attention recently. It is an online social network used by millions of people around the world to stay connected to their friends, family members, and co-workers through their computers and mobile phones.
An important common characteristic of microblogging services is their real-time nature. Although blog users typically update their blogs once every several days, Twitter users write tweets several times in a single day. Because users can see how other users are doing, and often what they are thinking about at the moment, they repeatedly return to the site to check what other people are doing. The large number of updates results in numerous reports related to events. These include social events such as parties, baseball games, and presidential campaigns, as well as disastrous events such as storms, fires, traffic jams, riots, heavy rainfall, and earthquakes. Indeed, Twitter is already used for various real-time notifications, such as those needed to get help during a large-scale fire emergency and live traffic updates.
A brief overview of Twitter in Japan: the Japanese version of Twitter was launched in April 2008. In February 2008, Japan was the No. 2 country with respect to Twitter traffic. At the time of this writing, Japan has the 11th largest number of users (more than half a million users) in the world. Although event detection (particularly earthquake detection) is currently possible because of the high density of Twitter users and earthquakes in Japan, the study is useful for detecting events of various types throughout the world.
This study investigates the real-time nature of Twitter and proposes an event notification system that monitors tweets and delivers notifications promptly. To obtain tweets on the target event precisely, we apply semantic analysis to each tweet. For example, users might post tweets such as “Earthquake” or “Now it’s shaking”, so earthquake or shaking could serve as keywords; however, users might also post tweets such as “I’m attending an earthquake conference” or “someone is shaking hands with my boss”. We prepare training data and devise a classifier using a support vector machine based on features such as the keywords in a tweet, the number of words, and the context of target-event words.
We develop an earthquake reporting system using Japanese tweets. Because of the numerous earthquakes in Japan and the numerous, geographically dispersed Twitter users throughout the country, it is sometimes possible to detect an earthquake by monitoring tweets. In other words, Japan has both many earthquake events and many sensors distributed throughout the country. Our system detects an earthquake occurrence and sends an email, possibly before the earthquake actually arrives at a given location.
Location: An earthquake propagates at about 3-7 km/s. For that reason, a person who is 100 km from an earthquake has about 20 s before the arrival of the earthquake wave. Figure 1 portrays a map of Twitter users worldwide (obtained from the UMBC eBiquity Research Group). Figure 2 depicts a map of earthquake occurrences worldwide (using data from the Japan Meteorological Agency (JMA)). It is apparent that the only intersection of the two maps, that is, the only region with both many earthquakes and many Twitter users, is Japan (other regions such as Indonesia, Turkey, Iran, and Pacific US cities such as Los Angeles and San Francisco also roughly intersect, although the density is much lower than in Japan).
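The warning-time estimate above is a simple distance-over-speed calculation. The following minimal sketch reproduces it, assuming a constant propagation speed along a straight path; the function name `seconds_until_arrival` is introduced here for illustration only.

```python
def seconds_until_arrival(distance_km: float, wave_speed_km_s: float) -> float:
    """Travel time of a seismic wave over distance_km at a constant speed.

    Assumption: constant speed and a straight-line path, which is a
    simplification of real wave propagation.
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if wave_speed_km_s <= 0:
        raise ValueError("wave_speed_km_s must be positive")
    return distance_km / wave_speed_km_s


# Speeds spanning the 3-7 km/s range quoted in the text, for a person 100 km away.
for speed in (3.0, 5.0, 7.0):
    print(f"{speed} km/s: {seconds_until_arrival(100.0, speed):.1f} s of warning")
```

At the midpoint speed of 5 km/s the result is 20 s, matching the figure in the text; the slower and faster ends of the range give roughly 33 s and 14 s.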
- LITERATURE SURVEY
In this study, we target event detection. An event is an arbitrary classification of a space/time region. An event might have actively participating agents, passive factors, products, and a location in space/time. We target events such as earthquakes, typhoons, and traffic jams, which are visible through tweets. These events have several properties:
- They are of large scale (many users experience the event),
- They particularly influence people’s daily lives (for that reason, people are induced to tweet about them),
- They have both spatial and temporal regions (so that real-time location estimation would be possible).
Semantic Analysis on Tweet
To detect a target event from Twitter, we search Twitter and find useful tweets. Tweets that contain a query word might mention the target event without actually reporting it. To classify a tweet into a positive or negative class, we use a support vector machine (SVM), which is a widely used machine-learning algorithm. By preparing positive and negative examples as a training set, we can produce a model that classifies tweets automatically into positive and negative categories.
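As a rough illustration of training a classifier from positive and negative examples, the sketch below fits a linear SVM by full-batch subgradient descent on the L2-regularized hinge loss, using binary bag-of-words vectors. It is not the implementation used in this study: the training tweets, the hyperparameters (`lam`, `lr`, `epochs`), and all function names are assumptions chosen for a small, dependency-free example.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map each distinct word in the training texts to a feature index."""
    vocab = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Binary bag-of-words vector; words outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] = 1.0
    return vec


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # margin violated, so the hinge term is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(text, vocab, w, b):
    """Return +1 (tweet reports the target event) or -1 (it does not)."""
    score = sum(wj * xj for wj, xj in zip(w, vectorize(text, vocab))) + b
    return 1 if score >= 0 else -1


# Hypothetical toy training set, loosely modeled on the examples in the text.
train_texts = [
    "earthquake right now",
    "now it is shaking",
    "whoa earthquake shaking",
    "attending an earthquake conference",
    "someone is shaking hands with my boss",
    "reading a book about earthquake history",
]
train_labels = [1, 1, 1, -1, -1, -1]

vocab = build_vocabulary(train_texts)
X = [vectorize(t, vocab) for t in train_texts]
w, b = train_linear_svm(X, train_labels)
predictions = [predict(t, vocab, w, b) for t in train_texts]
print(predictions)
```

In practice an established SVM library and a much larger labeled set would be the natural choice; the hand-written optimizer here only makes the positive/negative training idea concrete.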
We prepare three groups of features for each tweet:
Features A (statistical features): the number of words in a tweet message, and the position of the query word within the tweet.
Features B (keyword features): the words in a tweet.
Features C (word context features): the words before and after the query word.
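The three feature groups can be sketched as a single extraction function. This is an illustrative sketch only: it assumes whitespace tokenization, lowercasing, and a zero-based word position, none of which are specified in the text, and the name `extract_features` and the dictionary keys are hypothetical.

```python
def extract_features(tweet: str, query_word: str) -> dict:
    """Compute Features A, B, and C for one tweet and one query word."""
    words = tweet.lower().split()  # assumption: simple whitespace tokenization
    query = query_word.lower()
    # Zero-based position of the first occurrence of the query word, or -1.
    position = words.index(query) if query in words else -1
    before = words[position - 1] if position > 0 else None
    after = words[position + 1] if 0 <= position < len(words) - 1 else None
    return {
        # Features A: statistical features
        "num_words": len(words),
        "query_position": position,
        # Features B: keyword features
        "keywords": words,
        # Features C: word context features
        "word_before": before,
        "word_after": after,
    }


features = extract_features("I am attending an earthquake conference", "earthquake")
print(features)
```

For this example tweet, the query word is the fifth of six words, with "an" before it and "conference" after it. A real pipeline would also need to handle punctuation and, for Japanese tweets, word segmentation.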
Tweet as a Sensory value
Assumption: Each Twitter user is regarded as a sensor. A sensor detects a target event and makes a report probabilistically.
Assumption: Each tweet is associated with a time and a location, where the location is a pair of latitude and longitude.
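Under these two assumptions, each classified tweet can be treated as one sensor reading with a timestamp and coordinates. The sketch below shows one possible representation; the class name `SensorReading`, its fields, and the sample values are hypothetical and serve only to make the assumptions concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SensorReading:
    """One tweet viewed as a sensor report (hypothetical structure)."""
    user_id: str           # the Twitter user acting as the sensor
    observed_at: datetime  # time associated with the tweet
    latitude: float        # location associated with the tweet
    longitude: float
    is_positive: bool      # classifier output: does the tweet report the event?


def positive_readings(readings):
    """Keep only positive reports, ordered by time."""
    return sorted((r for r in readings if r.is_positive), key=lambda r: r.observed_at)


# Made-up sample readings for illustration.
readings = [
    SensorReading("user_b", datetime(2009, 8, 11, 5, 8, 40, tzinfo=timezone.utc), 35.0, 138.4, True),
    SensorReading("user_a", datetime(2009, 8, 11, 5, 8, 10, tzinfo=timezone.utc), 34.8, 138.5, True),
    SensorReading("user_c", datetime(2009, 8, 11, 5, 8, 20, tzinfo=timezone.utc), 35.7, 139.7, False),
]
reports = positive_readings(readings)
print([r.user_id for r in reports])
```

Because a sensor reports only probabilistically, a single positive reading is weak evidence; the time-ordered list is the kind of input a downstream detection step would consume.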
The figure below illustrates the correspondence between sensory data detection and tweet processing.