OpenLibrary and Universal Library: guys, work together!

August 3, 2008

**The OpenLibrary**

I wrote about the OpenLibrary previously, and since then they have only gotten better. They have added a lot of books, but more importantly, their website has turned into a real portal, where you can access all of their scanned books (over 234,000). This is only part of their quest, which is also to provide open information about every book ever published in the world, through an innovative new database/wiki that they have developed. Even so, it is still very easy to type in a search string and ask for only scanned books to be returned. They are then viewable in the flip-book format, which is still by far the best interface to scanned books that I have ever seen on the web. It makes reading online quite enjoyable (especially for old, beautifully decorated books), and the only thing missing is a zoom feature, which they have stated is coming.

**Universal Library**

At a conference in Shanghai, I was introduced to the amazing Universal Library project, which I cannot believe I had not heard of earlier. Run by Carnegie Mellon University in collaboration with the Chinese and Indian governments, the project has already scanned some 1.5 million books. What is exciting is not just the sheer numbers, but the incredible linguistic variety. Over one million books are available in Chinese, and thousands in Urdu, Hindi, Arabic, Telugu, Tamil and many other languages. They have book scanning centers around the world, and have apparently developed some very advanced new technology, both for book scanning and for OCR in different scripts.

**Technical issues**

Given all this, I was extremely enthusiastic to try out the project. However, it turns out that all their files are stored either as DjVu (mostly the Chinese contributions) or as TIFF files (everything else). Both of these require special viewers to be installed. After spending a lot of time trying to follow different instructions and downloading different files, I was finally able to display the Chinese books in Firefox on my MacBook, but I have still not been able to view the TIFF files. And even though I am able to display the Chinese pages, the solution is still very far from being as user-friendly and appealing as the OpenLibrary flip-book solution.

It does say on the Universal Library site that they will eventually also make their books available as PDFs etc., but my immediate thought was: why don’t they publish them through the OpenLibrary? The OpenLibrary already has the infrastructure and the technical solution. I don’t know the reasons that led to this not being an integrated part of the Open Content Alliance in the first place, but if it were possible to integrate all these sources into the OpenLibrary - which is already aiming to be a truly international multilingual solution (even the interface can be translated into many languages) - I think that would be a wonderful solution. (Apparently the Universal Library is also part of the Open Content Alliance, which runs the Open Library.)

**The amazement of diving into the collections**

Either way, I hope these technical issues are solved, because this is an incredibly revolutionary project. Already a million books are available in Chinese, both out-of-copyright works and ones they have been able to negotiate rights for… I could spend days doing all kinds of weird searches, finding Chinese books written a hundred and fifty years ago about Norwegian folk tales, the cultural system in China, or Esperanto. This is also giving me renewed motivation to learn classical Chinese (wenyan), because of course most of the books are written in both traditional characters and classical Chinese. And the availability of such a treasure trove of books in Hindi and Urdu is a huge incentive for my current efforts to learn Hindi!

**Vital projects for humanity**

I think these efforts are some of the most important going on right now, and they deserve a lot more financial support than they are getting! I would love to see these scanning efforts expand to other countries and languages, and especially to see them work hard to secure rights or permissions for newer books as well. One of the only books on the history of libraries in Indonesia, “Perpustakaan Indonesia dari zaman ke zaman” from 1966, is obligatory reading for anyone wanting to do research on the history of libraries and literacy in Indonesia. Yet it is not available online, and in the US it exists only in a few tattered copies that are sent back and forth between research libraries on inter-library loan (I got mine from a US institution on ILL). There is no way you can convince me that putting that online would constitute anything immoral!

What’s more, many of these collections are incredibly vulnerable and are disappearing - I have myself seen the results of high humidity and low maintenance budgets in Indian university libraries, with old books falling apart. Not to mention archives of millions of manuscripts written on palm leaves, etc. We cannot afford to lose this heritage!

**Keep us up to date!**

One thing that I find lacking in both projects is a good project blog that is kept up to date. As a big supporter, I would like to know how their work is going, what their current bottleneck is, and what they are working on. Not in annual reports, but in daily or weekly reports. Most of the “news” on the Universal Library site is not dated, and the statistics on scanned books are a year old. If people knew more about what was going on, they would be in a much better position to offer support, both direct and indirect.

Stian

Stian Håklev August 3, 2008 Toronto, Canada