![Page 1: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/1.jpg)
Digitization Practices in India: Issues and Challenges
V.N. Shukla
![Page 2: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/2.jpg)
C-DAC, NOIDA UNIT
C-DAC MISSION
NATURAL LANGUAGE PROCESSING AND INTERFACES
HUMAN RESOURCE DEVELOPMENT IN HITECH AREAS
INFRASTRUCTURE AND SUPPORT SERVICES
SPECIAL INDUSTRIAL APPLICATIONS
![Page 3: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/3.jpg)
AREAS OF COMPETENCE
Graphical Display System
Security Systems
Embedded System
System Engineering and Consultancy
NLP
Solar Energy System
E-Governance
Internet on CATV & E-Commerce
...
NOIDA
![Page 4: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/4.jpg)
Digital Library Activities: CDAC Noida
• Digital Library Projects
• Mega Centre for Digital Library
• Mobile Digital Library: Dware Dware Gyan Sampada
• Digital Library at President’s House
• Digital Library at Nagari Pracharini Sabha, Varanasi
• Digital Library at Uttaranchal
• GyanNidhi: Multilingual Parallel Corpus in Indian Languages
• Digital Library at Gujarat Vidyapeeth, Ahmedabad
• Digitization of Libraries
![Page 5: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/5.jpg)
Digital Library Mission
Online Content: Billions of web pages
Offline Content: Billions of items still unindexed
To organize the information and make it universally accessible and useful.
![Page 6: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/6.jpg)
DL Initiatives
~85% of books are out of print and/or out of copyright – these books are only found in libraries
GOAL: Create a comprehensive virtual card catalog of all books in all languages, while respecting publishers’ rights
Only ~15% of books are in print
Source: Google
![Page 7: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/7.jpg)
Metadata Search
DL creation & processes
Users
Traditional Libraries
Digital Libraries
Index
Hyperlinks
![Page 8: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/8.jpg)
A Typical Library Collection
The value is in the middle
• In print: ~15%
• Unclear copyright status: ~65% or more
    • May be in copyright, but not for sale
    • Rights may have reverted to author
    • May be in the public domain
• Public domain: less than 20%**
92% of the world's books are neither generating revenue for the copyright holder nor easily accessible to potential readers.*
*Source: Covey, Denise Troll. "Global Cooperation for Global Access: The Million Book Project"
**OCLC analysis of the Google Books Library Project: http://www.dlib.org/dlib/september05/lavoie/09lavoie.html
![Page 9: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/9.jpg)
Digital Library (DL) may be seen as “Collection of intelligent creations by human beings through their own language and culture. It also reflects cultural heritage besides providing archive and generating many research issues pertaining to Natural Language Processing”
![Page 10: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/10.jpg)
Digital Library?
According to another definition, digital libraries are
“Organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily available for use by a defined community or set of communities”.
Sun Microsystems defines a digital library as the electronic extension of functions users typically perform and the resources they access in a traditional library.
These information resources can be translated into digital form, stored in multimedia repositories, and made available through Web-based services.
![Page 11: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/11.jpg)
What is a Digital Library?
• A service?
• An architecture?
• A set of information resources?
• A set of tools to locate, search, retrieve information?
• Possibly the tools to create such resources and services also fall within the purview of DLs
• Digital face of traditional libraries
• Include both digital and traditional collections
• Backbone and nervous system of libraries
![Page 12: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/12.jpg)
Digital library vs. traditional library
• Efficient and high-quality services through collecting, organizing, storing, disseminating, retrieving and preserving information.
• Preservation benefits, besides making information retrieval and delivery more convenient.
• Online access to historical and cultural documents whose existence is endangered by physical decay.
Digital libraries necessarily include a strong focus on the management of digital content, just as traditional libraries have long focused on the management of content in physical forms.
![Page 13: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/13.jpg)
Digital Content Management
The major areas that can be exploited are:
• Information retrieval
• Multimedia
• Databases
• Data mining
• Data warehousing
• Online information repositories
• Image processing
• Hypertext
• World Wide Web and wide area information services (WAIS)
Most of the digital content being managed includes:
• Human language in various forms: character-coded electronic text, scanned images, printed or handwritten text, or human speech.
• Language technology helps in managing digital content.
• Learning from past experience also helps in managing content.
![Page 14: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/14.jpg)
A few advantages of digital libraries
• Access anywhere
• Reducing delays
• Distributed storage – central access
• Better cataloguing
• Cross references to other documents
• Full text search
• Protected information source
• Wide exploration and exploitation of the information
The information explosion, wide-bandwidth data networks and the potential of Internet-based technologies such as the Web make digital libraries one of the important application areas of computer science.
![Page 15: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/15.jpg)
Process of Digital Preservation
Centralized Server
Book scanning status
XML Meta File Creation using Dublin Core Std.
Scanned Image in TIFF format
S/w to divide even & odd pages
Batch cropping & Cleaning
OCR
Conversion to TXT/RTF/HTML
Yes
No
Uploading
Reject the Book
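The "XML Meta File Creation using Dublin Core Std." step in the flow above can be illustrated with a minimal Python sketch. It assumes a simple flat record: the wrapper element name `metadata`, the helper `build_dc_record`, and the sample field values are hypothetical and are not taken from the slides. Only the Dublin Core element namespace and the element names (`title`, `creator`, `language`, `format`, `identifier`) come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an XML string with one dc:<name> element per entry in `fields`.

    The root element name "metadata" is an illustrative choice, not a
    requirement of Dublin Core.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values below are placeholders for one scanned book.
record = build_dc_record({
    "title": "Sample Book Title",
    "creator": "Sample Author",
    "language": "hi",
    "format": "image/tiff",
    "identifier": "book-0001",
})
print(record)
```

One such file per book could be stored next to the scanned TIFF images, so that the metadata travels with the pages through cropping, OCR and uploading.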
![Page 16: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/16.jpg)
Goals of DL
• First-generation digital library: focused on digitization technology, metadata schemes, data management techniques, and digital preservation.
• Second-generation digital library: exploring new opportunities and developing new competencies.
• Third-generation digital library: focusing instead on fully integrating digital material into the library’s collections through a modular systems architecture.
![Page 17: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/17.jpg)
Ingredients for DLs
• Hardware: the minimum machinery to do the job
• Software: the programs for handling data
• Digital Objects: articles, conference papers, theses, ...
• Basic Skills: things one has to learn
![Page 18: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/18.jpg)
Hardware
• A server: you’ll need access to a web server
• A good PC
• Scanners
    • Flatbed – Auto feed, Back to back
    • MF
    • Book Scanner
![Page 19: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/19.jpg)
Software
• Open Source Software (OSS): DSpace, EPrints, Fedora, GSDL, ...
• Proprietary software you can’t avoid: image editing and optical character recognition software have to be purchased
![Page 20: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/20.jpg)
Content is King
The information content is more important than the systems used for its storage, management and retrieval
Objects should not be “locked” in specific DLs or archives
![Page 21: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/21.jpg)
Creating DLs …
Six steps
• Selecting
• Acquiring
• Digitization
• Creation of Metadata
• Organizing
• Archiving
• Providing Access
![Page 22: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/22.jpg)
![Page 23: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/23.jpg)
Possible Delivery Formats
• Pure image formats: TIFF, JPEG
• Open encoded formats: XML, HTML, ASCII, and Unicode
• Hybrid formats: PDF, DjVu (can contain both image and text)
• Proprietary formats: Microsoft Word, WordPerfect
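The four format families listed above can be captured in a small lookup table, which is handy when sorting a folder of digitized files. The sketch below is illustrative only: the file extensions mapped to each family and the helper name `format_family` are assumptions, not part of the slides.

```python
from pathlib import Path

# Assumed file extensions for each delivery-format family.
FORMAT_FAMILIES = {
    "pure image": {".tif", ".tiff", ".jpg", ".jpeg"},
    "open encoded": {".xml", ".html", ".htm", ".txt"},
    "hybrid": {".pdf", ".djvu"},
    "proprietary": {".doc", ".wpd"},
}


def format_family(filename):
    """Return the delivery-format family for a file name, or "unknown"."""
    suffix = Path(filename).suffix.lower()
    for family, extensions in FORMAT_FAMILIES.items():
        if suffix in extensions:
            return family
    return "unknown"


for name in ["page001.TIF", "book.djvu", "chapter1.html", "thesis.doc"]:
    print(name, "->", format_family(name))
```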
![Page 24: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/24.jpg)
Digitization: Issues
• Copyright
• Access copy and archive copy
• File size
• Storage media (CD, hard disk, ...)
• File format (TIFF, JPEG, ...)
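The trade-off between an archive copy and an access copy is largely a matter of file size, which for an uncompressed scan is width in pixels × height in pixels × bits per pixel ÷ 8. The sketch below works through that arithmetic. The page size (8 × 10 inches), the resolutions (600 and 150 dpi) and the bit depths are assumed values chosen for illustration; the slides do not specify any of them.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Size in bytes of an uncompressed scan of a page of the given size."""
    width_px = width_in * dpi
    height_px = height_in * dpi
    return width_px * height_px * bits_per_pixel // 8


# Assumed settings: a high-resolution colour archive copy and a
# low-resolution greyscale access copy of the same 8 x 10 inch page.
archive_bytes = uncompressed_size_bytes(8, 10, dpi=600, bits_per_pixel=24)
access_bytes = uncompressed_size_bytes(8, 10, dpi=150, bits_per_pixel=8)

print(f"archive copy: {archive_bytes / 1_000_000:.1f} MB per page")
print(f"access copy:  {access_bytes / 1_000_000:.1f} MB per page")
print(f"ratio: {archive_bytes // access_bytes}x")
```

Under these assumptions a single archive page is 48 times the size of its access copy before compression, which is why storage media and file format appear alongside file size in the list above.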
![Page 25: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/25.jpg)
Challenges in Digitization
• Building digital collections of national importance from existing texts, documents, images . . .
• Creating new digital documents & linking them
• Subject portals: selecting and maintaining open source digital resources
• Developing / adapting management tools for digital collections
• Providing access to digital collections
![Page 26: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/26.jpg)
Challenges..
• Integrating digital & other library collections
    • incl. integration of OPACs, subscribed e-resources and subject portals
• Establishing services for digital libraries
    • online access & offline support
    • education & training of users and librarians
• Addressing social, legal, policy issues
![Page 27: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/27.jpg)
Challenges in Publishing
Preservation of layout
Searchability of content and metadata
Efficient image compression
Easy browsing of books
Accommodating low-bandwidth users
Multilingual text support
Multipaging
![Page 28: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/28.jpg)
Digital Library Support in India
Funding
• Ministry of Communication & Information Technology (MIT)
• Ministry of Human Resource Development (MHRD)
• Manuscript Mission of India
• Department of Scientific & Industrial Research (DSIR-TRP)
• All India Council for Technical Education (AICTE)
• University Grants Commission (UGC)
![Page 29: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/29.jpg)
Digital Library Initiatives in India
• Library Consortium in India
• Scholarly Science Journals
• Theses & Dissertations
• Institutional E-Print Archives
• Books (out of copyright)
• Manuscripts
• Newspapers
• Online Courseware
• Open Access at Metadata Level
• Portal and Gateway Services
![Page 30: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/30.jpg)
Government of India
Min. of C&IT Min of Culture
INDEST-AICTE Consortium
Others
CSIR E-Journals Consortium
UGC Infonet Consortium
FORSA Consortium
National Manuscript Library
Universal Digital Library
IIM Libraries Consortium
![Page 31: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/31.jpg)
Digital Library of India
Participating centers of DLI
Mega Scanning Centres at IIITH, IIITA, CDAC Noida and Kolkata
IISc
IIIT-H
State & City Central Library
University of Hyderabad
MIDC Pune University
AKCE
SASTRA
ASR Melkote
Sringeri Mutt
Anna University
TTD Tirupati
IIIT-Allahabad
CDAC Noida
Rashtrapathi Bhavan
PTU-1, PTU-2, PTU-3
Goa University
Kanchi Mutt
IISc, IIAP, PoornaPragya
CDAC Kolkata
ERNET
![Page 32: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/32.jpg)
Digital Library Initiatives in India
Some Examples
![Page 33: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/33.jpg)
April 20, 2009 Workshop on Institutional Repositories 33
Digital Library of India
http://www.dli.ernet.in/
![Page 34: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/34.jpg)
![Page 35: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/35.jpg)
http://www.ias.ac.in/
![Page 36: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/36.jpg)
http://www.insa.ac.in/
![Page 37: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/37.jpg)
http://medind.nic.in/
![Page 38: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/38.jpg)
![Page 39: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/39.jpg)
![Page 40: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/40.jpg)
![Page 41: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/41.jpg)
Manuscripts
India has the largest collection of manuscripts in the world (approximately 5 million).
India is the repository of an astounding wealth of ancient knowledge belonging to different periods of history, going back thousands of years. Most of this knowledge, belonging to different areas of intellectual activity such as religion, philosophy, science, arts and literature, is preserved in the form of manuscripts. Composed in different Indian languages and scripts, they are preserved on materials such as birch bark, palm leaf, cloth, wood, stone and paper.
The National Manuscript Mission was launched as a five-year programme in Feb. 2003 by the Ministry of Human Resource Development, Govt. of India, to gather all the manuscripts and conserve them.
![Page 42: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/42.jpg)
http://namami.nic.in/
![Page 43: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/43.jpg)
Archives of Indian Labour
V.V. Giri National Labour Institute
Heritage of Indian Working Class
Commissions on Labour
Oral History Collections
Trade Union Collections
Regional Collections
Strike Collections
Powered by Greenstone Digital Library
http://www.indialabourarchives.org/
![Page 44: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/44.jpg)
Digital Library Benefits: Individual
Gain access to the holdings of libraries worldwide through automated catalogs. Locate both physical and digitized versions of scholarly articles and books.
Optimize searches, simultaneously search the Internet, commercial databases, and library collections.
Save search results and conduct additional processing to narrow or qualify results.
From search results, click through to access the digitized content or locate additional items of interest.
All of these capabilities are available from the desktop or other Web-enabled device such as a personal digital assistant or cellular telephone.
![Page 45: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/45.jpg)
Conclusion
Digital Libraries are redefining the role of libraries in society & the role of librarians & information specialists
A national-level mechanism is essential to promote and coordinate open access and public domain digital library systems
• Improve awareness of open access
• Regular training – tools, processes, standards
• Support setting up of working models, services
• National Resource Centre for open access publishing
International agencies like UNESCO, ICSU, ICSTI, CODATA need to actively promote and support developing country initiatives
![Page 46: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/46.jpg)
![Page 47: Dlindia](https://reader033.vdocuments.mx/reader033/viewer/2022060108/554dcf7cb4c905cc0e8b466f/html5/thumbnails/47.jpg)
Thank You