Creating a Twitter Infographic Using Gephi

This infographic was created through a painstaking process that used almost 10 different applications. The main application used to create the word cluster graphic was Gephi, an open source platform for visualizing complex networked data in a compelling, interactive environment. However, arriving at this particular end result was complicated by several factors, one of which was the use of Japanese characters in the analysis.

The Workflow

Step 1

The first step in this Japan Twitter project was to collect and archive the Twitter data coming out of Japan after the earthquake. For this, David Shepard, a member of the UCLA Digital Humanities Collaborative, wrote a PHP script that ran as a cron job. The script used the Twitter search API to find and filter tweets based on relevant hashtags and dumped them into our own MySQL database. The cron job ran every 3 minutes for 30 days, collecting over 650,000 tweets during this time period.
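For readers who want to try something similar, here is a minimal Python sketch of the "filter and archive" half of such a job. It is not David's PHP script: the hashtags, the table layout, and the sample batch are all made up for illustration, SQLite stands in for MySQL, and the call to the Twitter search API is left out entirely.

```python
import sqlite3

# Hypothetical hashtags; the list actually tracked by the project is not given.
HASHTAGS = {"#jishin", "#earthquake"}


def has_relevant_hashtag(text, hashtags=HASHTAGS):
    """Return True if the tweet text contains any tracked hashtag."""
    lowered = text.lower()
    return any(tag in lowered for tag in hashtags)


def store_tweets(conn, tweets):
    """Insert relevant tweets, skipping IDs that are already archived."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)"
    )
    rows = [
        (t["id"], t["created_at"], t["text"])
        for t in tweets
        if has_relevant_hashtag(t["text"])
    ]
    # INSERT OR IGNORE makes overlapping polls safe: duplicate IDs are dropped.
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


# Stand-in for one batch of search results (made-up data).
batch = [
    {"id": 1, "created_at": "2011-03-11 15:00:00", "text": "無事です #jishin"},
    {"id": 2, "created_at": "2011-03-11 15:01:00", "text": "lunch time"},
    {"id": 1, "created_at": "2011-03-11 15:00:00", "text": "無事です #jishin"},
]
conn = sqlite3.connect(":memory:")
total = store_tweets(conn, batch)
```

Because a job that polls every few minutes will often see the same tweet twice, keying the table on the tweet ID keeps repeated runs from inflating the counts.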

Once the Twitter data was safely in our MySQL database, I queried it to generate 30 separate text files, one for each day following the earthquake. Each “day” file consisted of just the tweet text from the thousands of tweets that belonged to that day (on average there were about 20,000 tweets per day).
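The same split can be sketched in a few lines of Python. This is only an illustration of the idea, assuming each row is a (timestamp, text) pair with timestamps formatted as "YYYY-MM-DD HH:MM:SS"; the function names and sample rows are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_by_day(rows):
    """Map 'YYYY-MM-DD' to the list of tweet texts posted that day."""
    days = defaultdict(list)
    for created_at, text in rows:
        days[created_at[:10]].append(text)
    return dict(days)


def write_day_files(days, out_dir):
    """Write one UTF-8 text file per day, one tweet per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for day, texts in sorted(days.items()):
        path = out / f"{day}.txt"
        path.write_text("\n".join(texts) + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Made-up rows standing in for the result of a database query.
rows = [
    ("2011-03-11 14:50:00", "地震だ"),
    ("2011-03-11 15:10:00", "怖い"),
    ("2011-03-12 08:00:00", "ありがとう"),
]
days = group_by_day(rows)
```

Writing the files explicitly as UTF-8 matters here, since the Japanese text has to survive every later step.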

Here, you can see the number of tweets collected on an hourly basis:

Step 2

In order to capture the range of emotions through the different phases of recovery following the disaster, I followed a methodology employed by Eiji Aramaki from Tokyo University, who used the words from an emotion dictionary to extract emotion patterns from a set of text files. Dr. Aramaki provided me with about 2,000 of the most commonly used “emotion” words in the Japanese language, subdivided into 10 different categories. A separate CSV file was generated for each emotion category.

I then used WordSmith, an application that allows you to extract word patterns, to find occurrences of every emotion word in each “day” file. Through WordSmith’s concordance tool, I was able to run a batch process that matched each of my 10 “emotion” files against each of my 30 “day” files.
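To give a feel for what this matching produces, here is a rough Python sketch. It uses plain substring counting, which is an assumption on my part and not how WordSmith's concordance tool works internally; the word list and sample text are made up.

```python
def count_emotion_words(words, text):
    """Count non-overlapping occurrences of each word in the text."""
    return {word: text.count(word) for word in words}


def category_total(words, text):
    """Total number of hits for a whole emotion category."""
    return sum(count_emotion_words(words, text).values())


# Two sample words for a hypothetical "sadness" category.
sadness_words = ["悲しみ", "悲しい"]
day_text = "悲しみが深い。とても悲しい。悲しみは消えない。"

counts = count_emotion_words(sadness_words, day_text)
total = category_total(sadness_words, day_text)
```

Substring matching is a natural fit for Japanese, which has no spaces between words, but it can overcount when a short word appears inside a longer, unrelated one.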

Here is a screenshot of WordSmith’s concordance function:

Step 3

The data generated from WordSmith was exported as a series of spreadsheets. These spreadsheets were combined, merged, analyzed, and recalculated to produce a single matrix of emotion words by day. While I was able to do most of the work in Excel, character encoding problems forced me to use Google Spreadsheets, mostly to generate the CSV file that Gephi requires as input (Excel lost the Japanese text on CSV export, while Google did not).

In order to create an emotion “measure” for each day, the spreadsheet generated columns that counted how often each keyword was found on each of the 30 days, normalized per 10,000 tweets. For example, the word 悲しみ (sadness) was found 0.5 times for every 10,000 tweets on March 11th, 3.1 times on March 12th, 325 times on March 13th, and so on.
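The normalization itself is simple arithmetic: divide the raw count by the number of tweets collected that day and scale to 10,000. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def per_10k(count, total_tweets):
    """Occurrences per 10,000 tweets; 0.0 when no tweets were collected."""
    if total_tweets == 0:
        return 0.0
    return count * 10_000 / total_tweets


# Made-up numbers: 1 hit in 20,000 tweets, and 62 hits in 20,000 tweets.
quiet_day = per_10k(1, 20_000)
busy_day = per_10k(62, 20_000)
```

Normalizing this way keeps days with very different tweet volumes comparable, since a raw count would mostly reflect how many tweets happened to be collected.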

Step 4

The heart of the word cluster analysis was conducted in Gephi.  Gephi requires you to define your data in two basic elements:  Nodes and Edges.  For this analysis, I chose to define these as follows:

Nodes:  Every emotion word and every day was defined as a Gephi node.

Edges:  Every connection between a “word” and a “day” was defined as an edge, and weighted by how many times that word was found for every 10,000 tweets, for each day.
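As a sketch of how these definitions translate into input files, the Python below builds a node table and an edge table from a word-by-day matrix. The column names (Id, Label, Source, Target, Weight) follow the headers Gephi's spreadsheet import conventionally recognizes, but treat them as an assumption and check them against your Gephi version. The "Kind" column, the 2011 dates, and the 不安 row are additions for illustration.

```python
import csv
import io


def build_graph_tables(rates):
    """Turn {(word, day): rate} into node rows and weighted edge rows."""
    words = sorted({w for w, _ in rates})
    days = sorted({d for _, d in rates})
    nodes = [(w, w, "word") for w in words] + [(d, d, "day") for d in days]
    # Skip zero-weight pairs: a word never seen on a day gets no edge.
    edges = [(w, d, r) for (w, d), r in sorted(rates.items()) if r > 0]
    return nodes, edges


def to_csv(header, rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


rates = {
    ("悲しみ", "2011-03-11"): 0.5,
    ("悲しみ", "2011-03-12"): 3.1,
    ("不安", "2011-03-11"): 0.0,  # made-up value
}
nodes, edges = build_graph_tables(rates)
nodes_csv = to_csv(["Id", "Label", "Kind"], nodes)
edges_csv = to_csv(["Source", "Target", "Weight"], edges)
```

When saving these strings to disk, writing them explicitly as UTF-8 avoids the kind of character loss described in Step 3.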

Here is a screenshot of Gephi’s data view:

Once the data elements were defined, Gephi was ready to visualize (i.e., the fun part!). Gephi comes with many layout templates to choose from. Each layout has its own built-in algorithm that takes the nodes and edges from your data to generate a network diagram. I chose a layout called “Parallel Force Atlas” (it sure sounds good). You can size and/or color each node by different data attributes, and do the same for the edges, which serve as the connectors between the nodes. You then configure a few parameters (such as “gravity”), press a button, and voila! You are presented with a beautiful infographic.

Step 5

What I thought would be an easy step, exporting the graphic and creating a web viewer (for panning and zooming the huge image), turned out to be a much bigger task than I anticipated. First of all, the Gephi exporters failed to export the Japanese characters, with one exception: SVG was the only export format that allowed the Japanese characters to survive. Since I wanted to provide a web interface for zooming and panning the graphic, I ended up choosing one that uses the OpenLayers JavaScript API, which is predominantly used for geospatial data visualizations but can also be used with plain images. To get the image ready for OpenLayers, I used MapTiler, an application that generates the image “tiles” needed for the different zoom levels. You can see a full screen version of the final infographic here.
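For a sense of what tiling involves, the sketch below computes how many tiles a generic image pyramid needs at each zoom level, halving the image at every step down from full size. This illustrates the general scheme only and is not MapTiler's actual output layout; the 256-pixel tile size and the image dimensions are assumptions.

```python
import math

TILE = 256  # assumed tile edge length in pixels


def tile_pyramid(width, height, tile=TILE):
    """Return [(zoom, columns, rows)] from the coarsest level to full size."""
    max_zoom = max(0, math.ceil(math.log2(max(width, height) / tile)))
    levels = []
    for zoom in range(max_zoom + 1):
        # Each level below the maximum halves the image dimensions again.
        scale = 2 ** (max_zoom - zoom)
        w = math.ceil(width / scale)
        h = math.ceil(height / scale)
        levels.append((zoom, math.ceil(w / tile), math.ceil(h / tile)))
    return levels


# Hypothetical export size; the real infographic's dimensions are not given.
levels = tile_pyramid(4096, 2048)
```

The viewer only ever loads the handful of tiles visible at the current zoom level, which is what makes panning a very large image practical in a browser.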

49 thoughts on “Creating a Twitter Infographic Using Gephi”

  1. Hello dear Yohman,

    I’m a graduate student in physics. I am currently attending a course on complex networks, and the purpose of the course is to produce a relatively “short” investigation of some interesting applications of the theory of complex networks, such as those found in the analysis of social media like Twitter, in my particular case.

    I’m specifically interested in trying to find a concrete measure of the impact of this social medium (Twitter) on real society, as the central objective of the project. I’m not a developer and I don’t know how to work with things like PHP, MySQL, Java, etc. (though I do write programs in other languages for standard physics problems). I’ve been trying, but it has been a very difficult and time-consuming task.

    That is why I am asking for your valuable help: I saw that you had a not so comfortable time with the mining process and the subsequent steps for processing your data.

    I would be very indebted to you if you could give me any more specific information about how I can do what you did. If possible, and I know this may be too much to ask, I would be content with a simple explanation in the form of a short guide to building any of your initial steps, like the cron job that collects the data, as this is the most difficult step for me to do in the short time since the work was assigned (a month and a half ago).

    Thanks for your attention, and thank you in advance for your friendly help.

    -Alvaro Diaz.
