Listing 23 datasets tagged with "words"

Moby Project Word Lists | Added by Infochimps

113,809 official crosswords: a list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes all forms of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spell …

Linguistics » Word Lists

Occurrence counts of tweet tokens: hashtags, URLs, & smileys by hour or month | Twitter Census | Added by Infochimps

This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.

This dataset is a corpus of tokens collected from tweets sent between March 2006 a …

Computers » Social Networks | Social Sciences » Communications | Social Sciences » Sociology | History » Modern History

MySpace Real-Time Stream | Added by Infochimps

This data is derived from the MySpace real-time stream API. The word count is drawn from the free-form text fields: MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months the words from these fields have been extra …

Computers » Social Networks | Linguistics

MySpace Real-Time Stream | Added by Infochimps

This data is derived from the MySpace real-time stream API. The word count is drawn from the free-form text fields: MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months the words from these fields have been extra …

Computers » Social Networks | Linguistics

Moby Project Word Lists | Added by Infochimps

113,809 official crosswords: a list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes all forms of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spell …

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

113,809 official crosswords: a list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes all forms of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spell …

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

Over 256,700 hyphenated or other entries containing more than one word as well as all capitalized words and acronyms. Phrases were considered ‘common’ if they or variations of them occur in standard dictionaries or thesauruses.

Linguistics » Word Lists

MySpace Real-Time Stream | Added by Infochimps

This data is derived from the MySpace real-time stream API. The word count is drawn from the free-form text fields: MySpace moods, forum topic titles, replies to forum topics, text from sharing a link or item, and status mood updates. For the last three months the words from these fields have been extra …

Computers » Social Networks | Linguistics

Moby Project Word Lists | Added by Infochimps

Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.

Linguistics » Word Lists

WordNet

Free

A large lexical database of English | The Comprehensive Knowledge Archive Network (CKAN) Collection | Added by Infochimps

“WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical …

Linguistics » Word Lists

Occurrence counts of tweet tokens: hashtags, URLs, & smileys by hour or month | Twitter Census | Added by Infochimps

This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.

This dataset is a corpus of tokens collected from tweets sent between March 2006 a …

Computers » Social Networks | Social Sciences » Communications | Social Sciences » Sociology | History » Modern History

Occurrence counts of tweet tokens: hashtags, URLs, & smileys by hour or month | Twitter Census | Added by Infochimps

This data comes from a scrape of the Twitter social network conducted by the Monkeywrench Consultancy. The full scrape consists of 35 million users, 500 million tweets, and 1 billion relationships between users.

This dataset is a corpus of tokens collected from tweets sent between March 2006 a …

Computers » Social Networks

Added by mrflip

List of summonable objects from the Nintendo DS game Scribblenauts, from AARDVARK, ABOMINABLE SNOWMAN and ABSCONDER to ZOMBIE, ZUNICERATOPS and ZYGOTE.

via the Scribblenauts Wikipedia entry:

Scribbl …

Computers | Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

This file consists of the 1,000 most frequently used English words as used on the Internet computer network in 1992.

Linguistics » Text Corpora

Moby Project Word Lists | Added by Infochimps

This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

74,550 common dictionary words: a list of words that appear in two or more published dictionaries. This gives the developer of a custom spelling checker a good beginning pool of relatively common words.

Linguistics » Word Lists

Moby Project Word Lists | Added by Infochimps

10,196 places (places.txt): a large selection of place names in the United States.

Geography » Geographical Names