Brewster Kahle: Universal Access to All Human Knowledge
Presented at Web 2.0, 10/6/04, San Francisco
Impressionistic transcript by Cory Doctorow
Universal access to all knowledge is possible, and it's not a
non-profit goal. Index the whole damn thing -- it's a business
for AMZN (let's sell all the books, let's sell everything),
Altavista, (let's index all the web), etc.
26MM books in the Library of Congress -- more than 50% out of
copyright, most out of print, a tiny sliver in print. A digitized
ASCII book is about 1MB, so this is about 26TB, which costs about
$60K and takes up one bookshelf.
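The storage estimate above can be checked with a few lines of arithmetic. This is a rough sketch using only the figures quoted in the talk; the decimal unit conversion (1 TB = 1,000,000 MB) and the variable names are assumptions made for illustration, and the per-terabyte price is simply what the quoted $60K implies.

```python
# Back-of-the-envelope storage estimate from the figures quoted in the talk.
BOOKS_IN_LOC = 26_000_000   # "26MM books in the Library of Congress"
MB_PER_BOOK = 1             # "A digitized ASCII book is about 1MB"
QUOTED_COST_USD = 60_000    # "costs about $60K"

# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000

total_tb = BOOKS_IN_LOC * MB_PER_BOOK / MB_PER_TB
cost_per_tb = QUOTED_COST_USD / total_tb

print(f"Total text: {total_tb:.0f} TB")
print(f"Implied storage price: ${cost_per_tb:,.0f} per TB")
```

The result matches the 26TB quoted in the talk; the implied price per terabyte is derived from the quoted total, not stated by the speaker.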
Google announced that it will digitize in-print material and
out-of-copyright works (like AMZN's thing).
It costs $10/book to scan -- they're digitizing all the books in
the Library of Alexandria, and they're doing this in China, too.
A group in Toronto is building a robot-scanner that will bring the
cost of scanning a book in the industrial world -- where labor is
more expensive -- down to $10. At $10 per, that's $260 Million to scan all
Brewster is scanning all the books that are out of copyright, and
is trying to get at all the stuff that's out of print but still
in copyright -- the orphans. It's 8MM books, most of the 20th
We're suing Ashcroft in the Supreme Court for the right to bring
out-of-print, in-copyright books to the net.
We can print a book for a dollar -- it costs Harvard Library $2
to loan a book.
We've got bookmobiles in India, Egypt, Uganda and elsewhere printing
books for a dollar each.
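The per-book figures quoted in the talk can be combined in a short sketch. It assumes the 26MM Library of Congress book count and the $10 scanning cost apply uniformly to every book, which the talk does not claim; the variable names are hypothetical.

```python
# Cost arithmetic using the per-book figures quoted in the talk.
BOOKS_IN_LOC = 26_000_000    # "26MM books in the Library of Congress"
SCAN_COST_USD = 10           # "It costs $10/book to scan"
PRINT_COST_USD = 1           # "We can print a book for a dollar"
HARVARD_LOAN_COST_USD = 2    # "it costs Harvard Library $2 to loan a book"

# One-time cost to scan every book, assuming a flat $10 per book.
total_scan_cost = BOOKS_IN_LOC * SCAN_COST_USD

# Printing a copy to give away versus lending a copy once.
saving_per_copy = HARVARD_LOAN_COST_USD - PRINT_COST_USD

print(f"Scanning everything: ${total_scan_cost:,}")
print(f"Printing is ${saving_per_copy} cheaper than one loan")
```

On these numbers the scanning total comes to the $260 million mentioned in the talk, and printing a copy costs half as much as a single loan.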
Scan a book for $10, put it on the net, download it and bind it
How much audio is there?
2-3MM discs (78s, LPs, CDs) produced in the history of the world.
Lots of people aren't well-served by music publishers. Some rock
bands sell records but allow tape-trading of their live
performance. We've got 700 bands' live performances online --
including all of the Grateful Dead.
Online record-labels need help: we offer unlimited
storage/bandwidth forever for free to anyone releasing material
under a CC license. There should be no penalty to giving stuff
Classical music: we need a good classical music collection. If
you know anyone in a symphony we're looking to digitize their
stuff at hi-rez.
100-200K theatrically released films in the history of the world,
half are Indian.
600 films in the US are not in copyright -- we've got 300 on the
web to download, watch, cut up, do what you will.
Thousands of non-theatrical films (educational films, etc) in the
We're recording 20 channels of TV 24h/day at full rez. We've got
a petabyte of TV from Russia, UK, Arab world, etc.
We've got a DMCA exemption that allows us to digitize and rip
software. It's a disgrace that the software industry opposed
We've got a web-archive going back to 1996.
This is growing at one Library of Congress per month.
Preservation and access
We've got copies of this in SF (on the San Andreas fault), with
mirrors in Egypt and Amsterdam.
We're adding cool search stuff, like Recall.
Will we do this?
Dunno -- lots of business oppos here. 4 companies have already
This requires coop between govt, nonprofit and for-profit
UNIVERSAL ACCESS TO ALL HUMAN KNOWLEDGE CAN BE OUR GREATEST