300 Languages: A Parallel Speech Corpus Project

The 300 Languages Project is a special effort by The Rosetta Project, part of The Long Now Foundation, to begin building a universal corpus of human language by collecting parallel text and audio in the world's 300 most widely spoken languages. The resulting collection will contain thousands of volunteer-contributed public domain text documents and audio recordings, which will be made available to researchers and the public alike via The Internet Archive, a free online digital library.

Make a New Translation  Record an Existing Text

The 300 Languages Project will accept submissions of any document in any language. Languages and document types other than those listed will be added directly to The Rosetta Project's collection at the Internet Archive.

Questions? Email laine@longnow.org.

The Rosetta Disk

Fifty to ninety percent of the world's languages are predicted to disappear in the next century, many with little or no significant documentation.