変身|Metamorphosis: A turbulent ride towards justice


“There is no democracy in Japan”

Such were the last words I heard Jun Hori utter before he left for Japan, having spent a year at UCLA as a visiting scholar.  His words paint a picture reminiscent of the world of George Orwell’s dystopian classic “Nineteen Eighty-Four”.  Admittedly, Japan is a far cry from Oceania, the totalitarian state that gave us the notion of “Big Brother”, but some Orwellian factors may indeed be present in today’s society. Jun is a prominent TV personality working for NHK, Japan’s only public broadcaster.  His easygoing personality and his 「甘いマスク」 (sweet looks) belie a stubborn drive and determination to rectify all that he deems wrong with Japanese society.  The ongoing dilemma Jun faces stems from his unique position:  on the one hand, he works for NHK, the largest, most dominant TV presence in the country; on the other, he is an independent journalist and filmmaker, driven by his desire to cover stories that NHK refuses to cover.  This dual role constantly drew the ire of his employer, NHK, which finally decided to let him go just a few days ago.  Now, Jun is free from the clutches of the media giant that has kept him grounded, like so many others.  His newfound freedom will certainly enable him to pursue his passion for an “open” press, one where the people tell the stories.


Jun Hori’s last day at UCLA was spent putting the finishing touches to his movie

Metamorphosis is a movie, a product fueled by the turmoil Jun felt working for NHK while covering the disaster.  Media in Japan, he says, is a “one way” street. “We create the stories based on provided templates, and discard those that don’t fit”.  The results, he says, are carefully calibrated stories that reflect the desired views of the government, and not the people.  During the tense days that followed the 3.11 disasters, the numerous explosions at Fukushima’s Daiichi Nuclear Power Plant prompted the government and TEPCO to make questionable decisions in intensely high-pressure situations.  Many of these decisions involved the public’s safety, and many of the discussions behind them were withheld from the very public they were meant to keep safe.

Metamorphosis dives straight into this conundrum, as if to ask the question:  What really happened?  Not only does it touch upon the people’s fury and frustration over the many missteps of the government and TEPCO along the way, but it goes further, exploring nuclear catastrophes through a historical lens.  To do this, Jun himself journeyed across the globe to the United States and visited sites of nuclear significance:  Three Mile Island (1979) and the lesser known Santa Susana (1959).  There he interviewed local residents and joined them in their ongoing town hall meetings between community members and nuclear power plant employees, dialogues he claimed would “never” happen in Japan.

But the true essence of the movie comes about through the disclosure of some shocking revelations. Metamorphosis masterfully succeeds in “connecting the dots”, showing a sequence of interviews with key players:  the city council member put in charge of evacuating his 7,000 citizens in Naraha Town, the former secretary to then Prime Minister Kan, and the former TEPCO part-time employee who was forced to submit a false resume.  These individual stories connect to form a collective narrative that directly challenges and questions the intent, philosophy and character that were and continue to be present in the ongoing nuclear crisis.

It should also be noted that this movie is a triumph for Jun’s ultimate goal of realizing a “people’s” news channel.  Almost half of the material in the movie comes from, you got it, the “people” on the ground.  The grainy YouTube clips only add to the credibility and authenticity of the content.  To underscore the power of social media, the theme song for the opening of the movie was provided by none other than Ryuichi Sakamoto, who himself reached out to Jun through Twitter.  Here is the opening scene, beautifully shot by Atsushi Abe, one of Jun’s Twitter followers, and set to Ryuichi Sakamoto’s theme song.


Maybe Big Brother is watching, but if Jun is to succeed, the people will prevail in Japan.


Japan Earthquake: Emotions at a glance

What was the country “feeling” after the earthquake?  Was it engulfed in sorrow?  Anger?  Fear?  What effect did the hundreds of aftershocks have on the populace?  In an attempt to answer these questions, a social media analysis can provide a window into the sentiment that was prevalent at each phase of recovery by visualizing each emotion group over time.  The charts below stack the emotion groups, one on top of another.  They were generated using the Protovis JavaScript API.  Clicking into any of the emotion charts allows interaction with its values over time:

(view full screen)

In concordance with previous analyses, the most noticeable observation is the spike in the “fear” emotion on April 7th.  Looking at the “Earthquake magnitude” chart, one can see that the second largest aftershock occurred on that day, supporting the notion that “fear” was a predominant emotional reaction in an already stressed nation.  One can also see that while the April 7th magnitude 7.1 earthquake was the largest aftershock since the big one on March 11th, the country was rocked consistently throughout, averaging more than 10 earthquakes a day.  However, as the earthquake chart reveals, the number of quakes had tapered off considerably over time, which perhaps added to the shock of the “big” 7.1 quake, coming at a time when people were starting to feel a level of normality.

Creating a Twitter Infographic Using Gephi

This infographic was created through a painstaking process that utilized almost 10 different applications to generate the final result. The main application used to create the word cluster graphic was Gephi, an open source platform that lets you visualize complex networked data elements in a visually compelling and interactive environment.  However, coming up with this particular end result was complicated by various factors, one of which was the complexity that arose from using Japanese characters in its analysis.

The Workflow

Step 1

The first step in this Japan Twitter project was to collect and archive the Twitter data coming out of Japan after the earthquake.  For this, David Shepard, a member of the UCLA Digital Humanities Collaborative, wrote a PHP script that ran as a cron job. The script used the Twitter search API to find and filter tweets based on relevant hashtags, and dumped them into our own MySQL database.  The cron job ran every 3 minutes for 30 days, collecting over 650,000 tweets during this time period.

Once the Twitter data was safely in our MySQL database, I queried out and generated 30 separate text files, one for each day following the earthquake.  Each “day” file consisted of just the tweet text from the thousands of tweets that belonged to that day (on average there were about 20,000 tweets per day).
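The split into “day” files can be sketched in a few lines of Python. This is only an illustration of the idea, not the query I actually ran: the row format (a timestamp plus the tweet text), the function names and the file naming are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def group_tweets_by_day(rows):
    """Group (created_at, text) rows into {"YYYY-MM-DD": [text, ...]}."""
    days = defaultdict(list)
    for created_at, text in rows:
        day = created_at.strftime("%Y-%m-%d")
        # One tweet per line in the output, so embedded newlines are flattened.
        days[day].append(text.replace("\n", " "))
    return dict(days)


def write_day_files(days, out_dir):
    """Write one UTF-8 text file per day (file naming is hypothetical)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for day, texts in days.items():
        (out / f"{day}.txt").write_text("\n".join(texts), encoding="utf-8")


# Made-up rows standing in for the result of the database query.
rows = [
    (datetime(2011, 3, 11, 14, 50), "地震だ #jishin"),
    (datetime(2011, 3, 11, 15, 2), "大丈夫?\n#anpi"),
    (datetime(2011, 3, 12, 9, 0), "#eqjp test"),
]
days = group_tweets_by_day(rows)
```

Calling `write_day_files(days, "days")` would then produce one text file per day. Writing the files explicitly as UTF-8 matters here, since the tweet text is mostly Japanese.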

Here, you can see the number of tweets collected on an hourly basis:

Step 2

In order to capture the range of emotions through the different phases of recovery following the disaster, I followed a methodology employed by Eiji Aramaki from Tokyo University, who took the words from an Emotion Dictionary to extract emotion patterns in a set of text files.  Dr. Aramaki provided me with about 2000 of the most commonly used “emotion” words in the Japanese language, sub-divided into 10 different categories. A separate CSV file for each emotion was generated.

I then used WordSmith, an application that allows you to extract word patterns, to find occurrences of every emotion word in each “day” file.  Through WordSmith’s concordance tool, I was able to run a batch process that matched each of my 10 “emotion” files against each of my 30 “day” files.
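For readers without WordSmith, the matching step can be approximated with plain substring counting. The sketch below is a stand-in for the concordance tool, not a description of how WordSmith works internally; the function names and the sample text are invented for the example.

```python
def count_emotion_words(day_text, emotion_words):
    """Count non-overlapping occurrences of each emotion word in one day's text.

    Plain substring matching is used because Japanese text has no spaces
    between words. This is an assumption of the sketch, not WordSmith's method.
    """
    return {word: day_text.count(word) for word in emotion_words}


def build_matrix(day_texts, emotion_words):
    """Return {day: {word: count}} for every day and every emotion word."""
    return {
        day: count_emotion_words(text, emotion_words)
        for day, text in day_texts.items()
    }


# Made-up sample data for illustration only.
day_texts = {
    "2011-03-11": "不安で眠れない。不安だ。",
    "2011-03-12": "悲しみが深い",
}
matrix = build_matrix(day_texts, ["不安", "悲しみ"])
```

Substring matching will also count a word when it appears inside a longer word, so the counts from a sketch like this should be treated as rough.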

Here is a screenshot of WordSmith’s concordance function:

Step 3

The data generated from WordSmith was exported as a series of spreadsheets. These spreadsheets were combined, merged, analyzed, and recalculated to produce a single matrix of emotion words by day. While I was able to do most of the work in Excel, character encoding problems forced me to use Google Spreadsheets, mostly to generate the CSV file that Gephi requires as its input (Excel lost the Japanese text on CSV export, while Google did not).

In order to create an emotion “measure” for each day, the spreadsheet generated columns that counted the number of times each keyword was found in each of the 30 days, normalized per 10,000 tweets. For example, the word 悲しみ (sadness) was found 0.5 times for every 10,000 tweets on March 11th, 3.1 times on March 12th, 325 times on March 13th, and so on.
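The normalization itself is simple arithmetic: the number of hits for a word on a given day, divided by that day's tweet count, times 10,000. A minimal sketch follows; the function name and the input counts are hypothetical, chosen only to show the calculation.

```python
def per_10k(hits, total_tweets):
    """Rate of a keyword per 10,000 tweets for one day."""
    if total_tweets <= 0:
        raise ValueError("total_tweets must be positive")
    return hits * 10_000 / total_tweets


# Hypothetical counts: 1 hit in a day with 20,000 tweets.
rate = per_10k(1, 20_000)
```

Normalizing this way keeps days with very different tweet volumes comparable.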

Step 4

The heart of the word cluster analysis was conducted in Gephi.  Gephi requires you to define your data in two basic elements:  Nodes and Edges.  For this analysis, I chose to define these as follows:

Nodes:  Every emotion word and every day was defined as a Gephi node

Edges:  Every connection between a “word” and a “day” was defined as an edge, and weighted by how many times that word was found for every 10,000 tweets, for each day.
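The node and edge definitions above can be turned into the two CSV tables with a short script. This is a sketch under assumptions: the input structure, the function names and the extra `Type` column are mine, and the sample rates are illustrative. The `Id`/`Label` and `Source`/`Target`/`Weight` headers follow the column names that, as far as I know, Gephi's spreadsheet import recognizes.

```python
import csv
import io


def build_graph_tables(rates):
    """Turn {day: {word: rate_per_10k}} into Gephi-style node and edge rows."""
    nodes = {}
    edges = []
    for day, word_rates in rates.items():
        nodes[day] = "day"
        for word, rate in word_rates.items():
            nodes[word] = "word"
            if rate > 0:  # a zero rate means no connection, so no edge
                edges.append((word, day, rate))
    node_rows = [(name, name, kind) for name, kind in nodes.items()]
    return node_rows, edges


def to_csv(header, rows):
    """Serialize rows to CSV text; write it to disk as UTF-8 to keep Japanese intact."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative rates only.
rates = {
    "2011-03-11": {"悲しみ": 0.5, "不安": 0.0},
    "2011-03-12": {"悲しみ": 3.1, "不安": 2.0},
}
node_rows, edge_rows = build_graph_tables(rates)
nodes_csv = to_csv(["Id", "Label", "Type"], node_rows)
edges_csv = to_csv(["Source", "Target", "Weight"], edge_rows)
```

Generating the CSV from a script like this would also sidestep the export problem described in Step 3, since the encoding is under your control.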

Here is a screen shot of Gephi’s data view:

Once the data elements are defined, Gephi is ready to visualize (i.e., the fun part!).  Gephi comes with many layout templates that you can choose from.  Each layout has its own built-in algorithm that takes the nodes and edges from your data to generate a network diagram.  I chose to use a layout called “Parallel Force Atlas” (it sure sounds good).  You can choose to size and/or color each node by different data attributes, and do the same for the edges, which serve as the connectors between the nodes.  You then configure a few parameters (such as “gravity”), press a button, and voila! you are presented with a beautiful infographic.

Step 5

What I thought would be an easy step, exporting the graphic and creating a web viewer (for panning and zooming the huge image), turned out to be a much bigger task than I anticipated. First of all, the Gephi exporters failed to export the Japanese characters… with one exception: SVG format. For some reason, SVG was the only export format that allowed the Japanese characters to survive. Since I wanted to provide a web interface that allows for zooming and panning the graphic, I ended up choosing one that uses the OpenLayers JavaScript API, which is predominantly used for geospatial data visualizations but can also be used with images.  In order to get the image ready for OpenLayers, I used MapTiler, an application that generates the different image “tiles” that are needed for the different zoom levels.  You can see a full screen version of the final infographic here.
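To give a feel for what “tiles for the different zoom levels” means, here is a rough sketch of a generic image pyramid: the image is halved at each lower zoom level and cut into fixed-size squares. The 256 pixel tile size and the halving scheme are assumptions of this sketch; MapTiler's actual tiling scheme may differ in its details.

```python
import math


def tile_pyramid(width, height, tile_size=256):
    """Return (zoom, columns, rows) for each level of a simple image pyramid.

    Zoom 0 is the most zoomed-out level; the highest zoom is full resolution.
    """
    max_dim = max(width, height)
    max_zoom = max(0, math.ceil(math.log2(max_dim / tile_size)))
    levels = []
    for zoom in range(max_zoom + 1):
        scale = 2 ** (max_zoom - zoom)  # downscale factor at this zoom level
        level_w = math.ceil(width / scale)
        level_h = math.ceil(height / scale)
        cols = math.ceil(level_w / tile_size)
        rows = math.ceil(level_h / tile_size)
        levels.append((zoom, cols, rows))
    return levels


# Hypothetical image size, not the size of the actual infographic.
levels = tile_pyramid(4096, 2048)
```

The tile count roughly quadruples with each zoom level, which is why a large image is pre-cut into tiles rather than served whole.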

Japan Earthquake: What are they tweeting about?

What are they tweeting about?

One key feature of social media is that it provides a snapshot of a moment’s mood, reflected in what people are tweeting about in real time.  In order to analyze the emotional and psychological state of the nation in the days after the disaster, I have taken the tweet text in the UCLA archive and divided it into 30 text files, one for each day following the earthquake, starting on March 11, 2011.  To measure day-to-day fluctuations of emotions, I will use a methodology similar to the one employed by Eiji Aramaki, PhD (Tokyo University), which takes words from an “Emotion Dictionary” (感情表現辞典) and matches them against the tweet content.  The dictionary classifies different emotions into 10 groups:

  1. 喜び – Happiness
  2. 怒る – Anger
  3. 哀しい – Sad
  4. 怖い – Fear
  5. 恥 – Shame
  6. 好き – Like
  7. 厭 – Unpleasant
  8. 昻 – Nervous
  9. 安 – Relief
  10. 驚く – Surprise
In order to visualize the relationship between the emotion keywords and the different days following the earthquake, a visualization was generated using Gephi.  The words are color coded by emotion type, and the line thickness of each connector represents the strength of the connection between the word and the day.

(view full screen)

Top 20 emotion words:

Rank | Word | Emotion Category | Per 10,000 Tweets
1 |  | like | 3,242.58
2 |  | relief | 1,151.22
3 |  | nervous | 324.83
4 | 嬉し泣き | happy | 322.78
5 |  | sad | 322.78
6 | 誇る | happy | 292.07
7 | 心痛 | fear | 228.80
8 | 享楽 | happy | 121.40
9 |  | relief | 121.40
10 |  | like | 120.92
11 | 不安がる | fear | 104.83
12 | 傷付く | sad | 74.57
13 | 恐怖感 | fear | 53.37
14 | 悲しみ | sad | 51.65
15 | 愛情 | like | 47.32
16 | 難苦 | unpleasant | 45.29
17 | 怯れ | fear | 44.97
18 |  | like | 43.35
19 | 深謝 | happy | 38.77
20 | 驚愕 | surprise | 34.45

Emotions by Day

The following animated chart (press the play button to start it) shows the changes for each emotion category over the 30 days.

(view full screen)

Japan Earthquake: Collecting social media and Ushahidi data


In order to understand the impact that Twitter had in the post disaster relief efforts, I will look at two different data sources for analysis.

  1. UCLA’s Hypercities Japan Twitter Archive
    A team from UCLA’s Digital Humanities Group archived Twitter feeds for 30 days following the disaster, collecting more than 650,000 tweets.  Using Twitter’s public search API, tweets were selected based on the following criteria:

    1. User’s location is in Japan
    2. Included one of the following hashtags
      1. #earthquake
      2. #sendai
      3. #jishin
      4. #tsunami
      5. #eqjp
      6. #pray4japan
      7. #japan
      8. #j_j_helpme
      9. #hinan
      10. #anpi
      11. #daijyoubu
      12. #311care
  2. Sinsai.info database
    Courtesy of Makoto Inoue, administrator of the sinsai.info Ushahidi website, this database includes the official incident data of more than 20,000 reports curated and posted by hundreds of volunteers.  More than 80% of the reports came from Twitter.
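The selection criteria for the UCLA archive listed above (user located in Japan, plus at least one of the twelve hashtags) can be sketched as a simple filter. This is an illustration, not the actual collection script: the function name is made up, the “in Japan” decision is passed in as a boolean because the source does not say how it was made, and the case-insensitive hashtag matching is my assumption.

```python
import re

# The twelve hashtags listed above, lower-cased.
HASHTAGS = {
    "#earthquake", "#sendai", "#jishin", "#tsunami", "#eqjp", "#pray4japan",
    "#japan", "#j_j_helpme", "#hinan", "#anpi", "#daijyoubu", "#311care",
}
HASHTAG_RE = re.compile(r"#(\w+)")


def matches_criteria(text, user_in_japan):
    """True if the user is in Japan and the tweet carries a tracked hashtag.

    Case-insensitive matching is an assumption of this sketch.
    """
    tags = {"#" + tag.lower() for tag in HASHTAG_RE.findall(text)}
    return user_in_japan and bool(tags & HASHTAGS)


sample = "これ以上東北を苦しめないでくれ。胃が痛い。。 #sendai #jishin"
keep = matches_criteria(sample, user_in_japan=True)
```

Both conditions must hold, so a tweet with a tracked hashtag from a user outside Japan would be dropped.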

UCLA’s Twitter Archive

UCLA’s archive was collected over a 30 day period, from March 10 – April 11, via a cron job that queried Twitter’s search API every 3 minutes to collect relevant tweets.  The tweets were subsequently saved on UCLA’s own database server.  Although the archive’s more than 650,000 records are a small portion of the reported 700 million total tweets recorded during the same time period, they nevertheless represent a sampling of the sentiment expressed on the social web during this time.  It should be noted that the tweets were filtered by user location, focusing only on users based in Japan.

Here’s a look at the raw numbers:

  • 666,552 Total number of tweets collected
  • 232,914 Distinct users
  • 558,040 Retweets (with the word “RT” in the text)
  • 186,697 Distinct tweets
These numbers reveal some interesting Twitter usage statistics:
  • 2.86 Average number of tweets per user during this 30 day period
  • 84% Percentage of tweets that were “retweets”
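For anyone who wants to check the two derived figures, they follow directly from the raw counts above. The variable names in this small sketch are mine.

```python
# Raw counts from the list above.
total_tweets = 666_552
distinct_users = 232_914
retweets = 558_040

# Average number of tweets per user over the collection period.
avg_tweets_per_user = total_tweets / distinct_users

# Share of all collected tweets that were retweets, as a percentage.
retweet_percentage = retweets / total_tweets * 100

print(f"{avg_tweets_per_user:.2f} tweets per user")
print(f"{retweet_percentage:.0f}% retweets")
```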

The following chart shows a temporal display of the number of tweets per hour:

It is interesting to note that the highest number of tweets per hour comes about a month after the earthquake, on April 7th at 11:32pm.  This is likely due to the second largest aftershock, which shook Japan at magnitude 7.1 (a 7.9 earthquake actually followed 30 minutes after the main 9.0 earthquake on March 11th).  At a time when the psychological, emotional and physical state of the nation was still frayed, the spike reflects the fears and distress of the population, expressed in tweets like these:

これ以上東北を苦しめないでくれ。胃が痛い。。 #sendai #jishin

Don’t make Northeast Japan suffer anymore. My stomach hurts.

これって、余震なのかな?新たな別の地震なのかな? #saigai #jishin

Is this an aftershock?  Or a new, different earthquake?

もうなんなの‥?何でこんなに、皆が怖くて辛い思いをしなくちゃいけないの‥?もう十分過ぎる程揺れたじゃん‥皆が何したってゆうの(´;ω;`)もう揺れるのやめてよ(´;ω;`) #jishin

What’s going on?  Why are we made to suffer so much?  Haven’t you shaken us enough?  What have we done to deserve this?  Please stop shaking.

Where are the users from?

One of the criteria of the data collection was that a tweet’s user profile include a location.  Because of this, we are able to map the locations of the users in this sample set during the 30 day period following the earthquake.  Many users had the same location in their profiles, so there are only 14,607 distinct locations (out of a total of 666,507 tweets).  The following are the top 10 most “popular” user profile locations.  The location with the most users was in Shinjuku, Tokyo, with 24,169 users:

Rank | Location | Count
1 | 東京都新宿区市谷本村町5-1 | 24169
2 | 東京都千代田区大手町 | 16346
3 | 東京都渋谷区神南2−2−1 | 14981
4 | 島根県松江市 | 14857
5 | 東京都千代田区霞が関 中央合同庁舎5号館 | 13913
6 | 渋谷区, 東京都 JP | 9297
7 | 東京都新宿区(Tokyo Shinjuku) | 7845
8 | 東京都千代田区霞が関 | 7563
9 | 仙台市, 宮城県 JP | 7450
10 | Tokyo ときどき Kyoto | 7311

Out of the top 10 locations, only 3 are located outside of Tokyo.  Number 4, oddly, is Matsue in Shimane Prefecture.  Number 9 is Sendai, Miyagi, in the region most devastated by the tsunami.  Number 10 is “Tokyo, sometimes Kyoto”.
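A tally like the one in the table can be produced with a few lines of Python. This is a sketch rather than the query I used: the function name and the sample data are invented, and the exact-string matching is an assumption.

```python
from collections import Counter


def top_locations(locations, n=10):
    """Return (number of distinct locations, the n most common with counts).

    Locations are compared as exact strings after trimming whitespace;
    empty profile locations are skipped.
    """
    counts = Counter(loc.strip() for loc in locations if loc and loc.strip())
    return len(counts), counts.most_common(n)


# Made-up sample of profile location strings.
sample_locations = [
    "東京都千代田区大手町",
    "仙台市, 宮城県 JP",
    "東京都千代田区大手町",
    " 仙台市, 宮城県 JP ",
    "",
    "島根県松江市",
    "東京都千代田区大手町",
]
distinct, top = top_locations(sample_locations, n=2)
```

Because the strings are compared exactly, entries such as 東京都千代田区霞が関 and 東京都千代田区霞が関 中央合同庁舎5号館 count as different locations, which is also how they appear in the table.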