usql/Examples/TweetAnalysis
Mike Rys 93b3d5bc88 Fix a typo in comment 2016-08-25 11:40:52 -07:00

# Tweet Analysis Sample

## Story Line

We receive a set of tweet files downloaded from http://tweetdownload.net and start by exploring the data.

First, we do a simple count of the tweets per author in a single file. Next, we investigate the mentions in the tweets. We then refactor the code into code-behind and make it available for reuse in a registered assembly. Finally, we run the analysis over all files and include more detailed information about the lineage of the data (who mentioned whom, and which files the tweets came from).
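
As a rough illustration of the first two steps (counting tweets per author and looking at mentions), a U-SQL script might look something like the sketch below. It is not the sample's actual script: the input and output paths, the four-column schema (`date`, `time`, `author`, `tweet`), and the rowset names are all assumptions made for this example.

```
// Sketch only. The file paths and column names below are assumptions,
// not taken from the sample; adjust them to the actual tweet files.
@t =
    EXTRACT date string,
            time string,
            author string,
            tweet string
    FROM "/Samples/Data/Tweets/MikeDoesBigDataTweets.csv"
    USING Extractors.Csv();

// Step 1: count the tweets per author in the single file.
@counts =
    SELECT author,
           COUNT(*) AS tweetcount
    FROM @t
    GROUP BY author;

OUTPUT @counts
TO "/Output/TweetAnalysis/TweetsPerAuthor.csv"
ORDER BY tweetcount DESC
USING Outputters.Csv();

// Step 2: collect every word that starts with "@" as a mention.
// The C# expression splits the tweet on spaces and keeps the @-words.
@m =
    SELECT new SQL.ARRAY<string>(
               tweet.Split(' ').Where(x => x.StartsWith("@"))) AS refs
    FROM @t;

// Turn each array into one row per mention and drop the leading "@".
@names =
    SELECT r.Substring(1) AS mentioned
    FROM @m CROSS APPLY EXPLODE(refs) AS Refs(r);

@mentions =
    SELECT mentioned,
           COUNT(*) AS mentioncount
    FROM @names
    GROUP BY mentioned;

OUTPUT @mentions
TO "/Output/TweetAnalysis/MentionsPerName.csv"
ORDER BY mentioncount DESC
USING Outputters.Csv();
```

The inline C# expression in the second step is the kind of logic that the story line later moves into code-behind and then into a registered assembly, so that other scripts can reuse it.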

Once we have decided on the schema, we make the processed data on tweet authors and their mentions available as a table and write some ad hoc analytical queries, which show that while Raghu is not a frequent tweeter, he is very influential :).

Note that versions of these sample queries were used in the U-SQL introduction and U-SQL UDF blog posts on the VS MSDN blog.