Listing 7 datasets tagged with "bigdata"

Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (Smiley Counts) *****

Occurrence counts of tweet tokens: hashtags, URLs, & smileys by hour or month | Twitter Census | Added by MonkeywrenchConsultancy 4 months ago

This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.

This dataset is a corpus of tokens collected from tweets sent between March 2006 a …

Computers » Social Networks

Freebase Data Dump *****

Added by Infochimps about 1 year ago

A data dump of all the current facts and assertions in the Freebase system.

Freebase is an open database of the world's information, covering millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archi …

Encyclopedic » Encyclopedias

Freebase.com Wikipedia Extraction (WEX) *****

Added by Infochimps about 1 year ago

The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup for each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted intabul …

Encyclopedic » Encyclopedias

DBPedia Main *****

Added by Infochimps about 1 year ago

DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,0 …

Encyclopedic » Encyclopedias

Global Daily Weather Data from the National Climate Data Center (NCDC) *****

Federal Climate Complex GSOD (Global Surface Summary of Day) version 7 | Added by Infochimps 8 months ago

The GSOD (Global Daily) Data

The GSOD dataset comes from the National Climate Data Center and can be downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/gsod/

You can fetch your own copy with

wget -r -l3 --no-clobber --no-parent --no-verbos …

Science » Meteorology

Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (by Hour) *****

Occurrence counts of tweet tokens: hashtags, URLs, & smileys by hour or month | Twitter Census | Added by MonkeywrenchConsultancy 4 months ago

This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.

This dataset is a corpus of tokens collected from tweets sent between March 2006 a …

Computers » Social Networks

Austin Daily Weather (extracted from National Climate Data Center (NCDC) Data) **

Federal Climate Complex GSOD (Global Surface Summary of Day) version 7 | Added by Infochimps 8 months ago

About

This is an extract from the “Global Daily Weather Data from the National Climate Data Center (NCDC)” dataset, covering Austin only.

Science » Meteorology

The Open Library **

The Comprehensive Knowledge Archive Network (CKAN) Collection | Added by Infochimps 10 months ago

About

> One web page for every book ever published. It’s a lofty, but achievable, goal.

> To build it, we need hundreds of millions of book records, a brand new database infrastructure for handling huge amounts of dynamic information, a wiki interface, multi-language support, and people w …