This data is derived from the MySpace real-time stream API. It contains all users in our dataset, around 11 million, with well-formed ZIP codes.
This data is derived from the MySpace real-time stream API. It contains all users in our dataset, around 11 million, with well-formed latitude/longitude.
This data is derived from the MySpace real-time stream API. The word count is from the free-form text fields MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months, the words from these fields have been extra …
This data is derived from the MySpace real-time stream API. The word count is from the free-form text fields MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months, the words from these fields have been extra …
This data is derived from the MySpace real-time stream API. The word count is from the free-form text fields MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months, the words from these fields have been extra …
This data is derived from the MySpace real-time stream API. It counts MySpace application adds by users aggregated by ZIP code. ZIP codes are user-supplied on MySpace, so there are many errors.
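Because the ZIP codes are user-supplied, anyone working with this data will probably want to filter them before aggregating. The following is a minimal sketch, not the method used to build the dataset. It assumes that a "well-formed" ZIP code means five ASCII digits, optionally followed by a ZIP+4 suffix, and that the input is a sequence of hypothetical `(zip, adds)` pairs; neither assumption comes from the dataset documentation.

```python
import re
from collections import Counter

# Assumption: "well-formed" means 5 digits, optionally followed by -NNNN (ZIP+4).
ZIP_RE = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def aggregate_by_zip(rows):
    """Sum application adds per 5-digit ZIP code, skipping malformed entries.

    `rows` is an iterable of (raw_zip, adds) pairs; this layout is
    hypothetical and may not match the actual file format.
    Returns (totals, rejected), where `rejected` counts the skipped rows.
    """
    totals = Counter()
    rejected = 0
    for raw_zip, adds in rows:
        candidate = str(raw_zip).strip()
        if ZIP_RE.fullmatch(candidate):
            # Fold ZIP+4 codes into their 5-digit prefix.
            totals[candidate[:5]] += adds
        else:
            rejected += 1
    return totals, rejected
```

Tracking the number of rejected rows gives a rough sense of how noisy the user-supplied field is.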
“WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical …
Continuing the ICWSM tradition, ICWSM 2009 is making a dataset available to researchers in the blog and social media fields. We invite you to download the dataset, explore it, learn something interesting about it, and submit a paper about it to ICWSM 2009.
Good research topics might include… …
This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.
This dataset is a mapping between the user IDs used for Twitter Search;…
9/11 tragedy pager intercepts.
The following are more than half a million national US pager intercepts released by wikileaks.org. This covers the September 11 tragedy from 3am on the same day (Tuesday) until 3am the following day, a 24-hour period surrounding the attacks in New York and Washing …
This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.
This dataset is a corpus of tokens collected from tweets sent between March 2006 a …
features_and_friends.csv
------------------------
This file contains 33 image features for 19,217 MySpace profile pictures. Also
included is the number of friends for each user in the sample.
The columns are (roughly):
n – Number of brightness levels
pn – A measure of the …
This data is derived from the MySpace real-time stream API. It counts total MySpace application adds by users during the period from December 2009 to February 2010.
This data is derived from the MySpace real-time stream API. It counts MySpace application adds by users aggregated by date.
A record of all bookmarking activity on delicious.com for a roughly 10-day period in September 2009. Format is JSON, one record per line. There are 1.25 million entries. Download size is 170 MB. Sample record:
{"updated": "Tue, 08 Sep 2009 08:45:00 +0000", "links": [{"href": "http://w …
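Since the format is one JSON record per line, the file can be processed as a stream without loading all 1.25 million entries at once. The sketch below counts records per day using the `updated` field seen in the sample record. It assumes every record is a JSON object and that `updated` always uses the RFC 2822 style date shown in the sample; records that do not fit are skipped. The function name and the file name in the usage comment are hypothetical.

```python
import json
from collections import Counter
from email.utils import parsedate_to_datetime


def count_bookmarks_per_day(lines):
    """Count records per calendar day from an iterable of JSON lines."""
    per_day = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # assumption: each record is a JSON object
        updated = record.get("updated")
        if not isinstance(updated, str):
            continue
        try:
            when = parsedate_to_datetime(updated)
        except (TypeError, ValueError):
            continue  # date string not in the expected format
        per_day[when.date().isoformat()] += 1
    return per_day


# Usage (file name is hypothetical):
# with open("delicious.json", encoding="utf-8") as f:
#     print(count_bookmarks_per_day(f))
```

Days are taken from the timestamp as written, with no time zone conversion.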
Twitter Haiti Earthquake Data (JSON dump) – raw tweets from the twitter.com social network concerning the devastating 2010 earthquake in Haiti.
A capture of all tweets from Twitter’s sample feed during the 2010 State of the Union address. Tweets are in JSON format. The feed is described here: http://apiwiki.twitter.com/Streaming-API-Documentation#statuses/sample.
> Identi.ca is a microblogging service. Users post short (140-character) notices which are broadcast to their friends and fans using the Web, RSS, or instant messages.
Bulk downloads not yet available.