Digitalization Aids for Recognizing Tamil Palm Leaf and Rock Inscription Manuscripts

Dr. J. Porkodi

Assistant Professor of Chemistry

The Standard Fireworks Rajaratnam College for Women, Sivakasi

porkodi-che@sfrcollege.edu.in

Abstract

          Before the invention of paper, palm leaf and rock inscription manuscripts served as tools for recording valuable information about our culture and passing it on to the next generation. They are valuable sources of information about ancient Tamil literature, Ayurvedic medicine, astrological prediction, architectural skills, and the biographies of ancient kings. The Tamil text in these manuscripts is very hard to read without proper guidance. Today, studying Tamil palm leaf manuscripts involves a combination of traditional and modern tools. Digitalization not only helps to identify the Tamil text but also helps to preserve the priceless content of the manuscripts. Software tools and techniques such as digital imaging, OCR (Optical Character Recognition), text encoding, transcription, Unicode text editors, MySQL or SQLite databases, localization tools, digital library and archive platforms, Geographic Information Systems, GitHub or GitLab, and B-spline curves are used to recognize the vowels, consonants, and their combinations. Combining traditional scholarly methods with modern digital tools can enhance the study and preservation of Tamil palm leaf and rock inscription manuscripts.

Keywords: manuscript, versatile tool, digitalization, software, preservation.

Introduction

            Before the Egyptian invention of papyrus, knowledge was shared and communicated by inscribing on tree bark, animal hides, rocks, and leaves. Among these, palm leaves and rock inscriptions were widely used because of their abundant availability and their ability to withstand harsh conditions. Many organizations have made great efforts, on a war footing, to preserve these manuscripts because of the valuable information they contain, ranging from traditional medicine and land documents to astrology, astronomy, and much more. This urgency arises because palm leaves, being organic, have an average lifetime of 300–400 years [1].

            In olden days, palm leaf manuscripts and stone inscriptions were used for three different categories of records. Registrations of land and buildings donated by the kings to the people were inscribed on palm leaf manuscripts. Literary works, grammar, astrology, science and technology, and similar subjects were also inscribed on palm leaf manuscripts.

            However, for researchers and students with little knowledge of the old Tamil curves and letters, character recognition is one of the most difficult tasks.

            Image processing techniques face many difficulties, such as separating the characters during segmentation, recognizing an unlimited variety of character fonts and sculpting styles in noisy images, and distinguishing characters that have the same shape but different pronunciations.

            Many researchers have tried to apply various techniques to break through the complex problems of character recognition. If these problems persist, the valuable information given by our ancestors will not reach future generations. People find it more difficult to recognize Tamil characters sculpted in stone, clay pots, copper plates, etc., than to recognize characters from other sources.

            We can isolate, recognize, and preserve these manuscripts with the help of digitization, which converts material from analog format to digital form and assists in its preservation for future generations. Digitization is linked to all aspects of acquiring, converting, storing, and retaining information in a standardized and organized manner with technological support. This method makes an “electronic photograph” of the required document, which is converted to digital form and can be stored electronically and accessed via computer.

Software used to recognize the manuscripts

            Digital imaging, OCR (Optical Character Recognition), text encoding, transcription, Unicode text editors, MySQL or SQLite databases, localization tools, digital library and archive platforms, Geographic Information Systems, GitHub or GitLab, and B-spline curves.

Digital imaging

            Digital imaging represents an image in 2D form as a finite set of digital values known as pixels, or picture elements. This can be done with a scanner or a video camera. Once the manuscript images are digitized, digital image processing focuses on two main tasks: improving pictorial information for human interpretation, and processing image data for storage, representation, and transmission for autonomous machine perception [2].

            The main use of digital image processing is to find a representation of the intensity distribution of an image and to convert 3D images into 2D image values that are useful for quantitative morphological description and representation. Tools such as Adobe Photoshop and GIMP can enhance and analyze digital images of Tamil palm leaf and rock inscription manuscripts. Adjustments such as contrast and brightness can improve visibility.
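
The contrast and brightness adjustments mentioned above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows with values from 0 to 255; the function names and the sample pixel values are hypothetical and are not taken from any of the tools named in this paper.

```python
def stretch_contrast(pixels, low=0, high=255):
    """Linearly rescale gray values so the darkest pixel maps to `low`
    and the brightest pixel maps to `high`."""
    flat = [value for row in pixels for value in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in pixels]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (value - darkest) * scale) for value in row]
            for row in pixels]


def adjust_brightness(pixels, offset):
    """Add `offset` to every pixel, clamping the result to the 0-255 range."""
    return [[max(0, min(255, value + offset)) for value in row]
            for row in pixels]


# Hypothetical 3x3 patch of a faded manuscript scan (values are illustrative).
faded = [[100, 110, 120],
         [110, 130, 140],
         [100, 120, 150]]
enhanced = stretch_contrast(faded)
brighter = adjust_brightness(faded, 20)
```

In this sketch the narrow range of gray values in the faded patch is spread over the full 0 to 255 range, which is one simple way to make faint strokes easier to see.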

            Archivematica and DuraCloud implement digital preservation strategies to ensure the long-term accessibility and integrity of digital stone manuscript collections. ExifTool can add and manage metadata for images of stone manuscripts, providing context and information about each image.

OCR (Optical Character Recognition)

            Optical Character Recognition (OCR) is one of the well-known methods for transforming digital document images into machine-encoded text [3]. An OCR program scans a document and converts it to a word-processing document in three steps: image acquisition and pre-processing, feature extraction, and classification.

Step 1 cleans up and enhances the quality of the image through noise removal, binarization, color adjustment, and text segmentation.

Step 2 extracts and captures data from the acquired text image that can be used for classification.

The final step maps each segment of text in the document image to its equivalent textual character. This technique can also identify and validate handwritten text [4]. Character recognition systems widely use pattern recognition methods, which assign an unknown sample to one of a set of predefined character classes. The large number of character classes, differences in handwriting styles, unconstrained writing, and the presence of visually similar characters make it difficult to identify characters in manuscripts.

OCR software such as Tesseract and ABBYY FineReader converts scanned images of inscriptions on stone manuscripts into editable text for transcription and analysis.
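
The binarization part of Step 1 can be sketched as follows. The example uses Otsu's method, which is one common thresholding technique and is chosen here only for illustration; the paper does not state which method any particular OCR tool uses. The function names and the sample pixel values are hypothetical, and the image is assumed to be grayscale with dark ink on a lighter background.

```python
def otsu_threshold(pixels):
    """Return the gray level that maximizes the between-class variance
    of the dark and light pixel groups (Otsu's method)."""
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        # Pixels with value <= t form the dark class for this candidate.
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels):
    """Return a mask with 1 for ink (dark) pixels and 0 for background."""
    t = otsu_threshold(pixels)
    return [[1 if value <= t else 0 for value in row] for row in pixels]


# Hypothetical 3x3 patch: dark strokes (35-50) on a light leaf (198-210).
scan = [[200, 210, 40],
        [205, 35, 45],
        [198, 202, 50]]
threshold = otsu_threshold(scan)
ink_mask = binarize(scan)
```

The resulting mask separates stroke pixels from the background, which is the form of input that later segmentation and feature extraction stages typically expect.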

Text Encoding Software:

            This process assigns numbers to graphical characters, particularly the written characters of various languages, so that they can be stored, transmitted, and transformed using digital computers [5]. The numerical values that make up a character encoding are called “code points” and collectively comprise a “code space”, a “code page”, or a “character map”. Markup standards such as XML and the TEI (Text Encoding Initiative) guidelines can be used for encoding and annotating the content of Tamil palm leaf manuscripts and stone manuscripts. This aids in creating structured digital editions.
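
The idea of code points can be shown with a short sketch that lists the Unicode code points of a Tamil syllable and sorts each character into a broad class. The function names are hypothetical. The ranges follow the Unicode Tamil block (U+0B80 to U+0BFF) and, as a simplification, include a few unassigned code points that fall inside each range.

```python
import unicodedata


def describe_code_points(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
            for ch in text]


def classify_tamil_char(ch):
    """Classify one character by its position in the Unicode Tamil block.
    The ranges are approximate: they also cover some unassigned code points."""
    cp = ord(ch)
    if 0x0B85 <= cp <= 0x0B94:
        return "vowel"
    if 0x0B95 <= cp <= 0x0BB9:
        return "consonant"
    if 0x0BBE <= cp <= 0x0BCC:
        return "vowel sign"
    if cp == 0x0BCD:
        return "pulli"
    return "other"


# The syllable "ki": consonant KA (U+0B95) followed by vowel sign I (U+0BBF).
syllable = "\u0B95\u0BBF"
details = describe_code_points(syllable)
classes = [classify_tamil_char(ch) for ch in syllable]
```

A consonant-vowel combination is stored as two code points, a consonant followed by a dependent vowel sign, even though it is written and read as a single unit.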

Transcription Software:

            In general, transcription software assists in the conversion of human speech into a text transcript; audio or video files can be transcribed manually or automatically. For manuscripts, tools like Juxta Editions or FromThePage are designed for collaborative transcription of historical documents such as Tamil palm leaf and stone inscription manuscripts. They allow multiple users to work on transcribing and annotating manuscripts.

Unicode Text Editors

            A text editor is a computer program that edits plain text; such programs are sometimes called “notepad” software. Text editors are provided with operating systems and software development packages, and they can be used to change files such as configuration files, documentation files, and programming language source code [6]. Researchers must ensure that their text editor supports Unicode characters, including Tamil characters. Editors such as Notepad++, Sublime Text, and Visual Studio Code are examples.
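
Why Unicode support matters can be seen in a small sketch that checks whether Tamil text survives being saved and reopened in a given encoding. The function names are hypothetical, and the sample string is the word "Tamil" written in Tamil script using escape sequences.

```python
import os
import tempfile


def can_encode(text, encoding):
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def round_trips_as_utf8(text):
    """Write text to a temporary UTF-8 file, read it back, and report
    whether the content is unchanged."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        with open(path, "r", encoding="utf-8", newline="") as f:
            return f.read() == text
    finally:
        os.remove(path)


# "Tamil" in Tamil script: TA, MA, vowel sign I, LLLA, pulli.
sample = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"
utf8_ok = can_encode(sample, "utf-8")
latin1_ok = can_encode(sample, "latin-1")
saved_intact = round_trips_as_utf8(sample)
```

Saving a transcription in a legacy single-byte encoding would fail or lose the Tamil characters, whereas UTF-8 preserves them, so the editor's encoding setting should be confirmed before transcription work begins.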

Database Management Systems:

            Databases such as MySQL or SQLite can be used to organize and manage the metadata associated with Tamil palm leaf manuscripts, facilitating efficient retrieval and analysis.
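
A minimal sketch of such a metadata catalog, using the SQLite module from the Python standard library, might look like the following. The table layout, column names, function names, and sample records are all hypothetical and serve only to illustrate the idea.

```python
import sqlite3


def create_catalog(conn):
    """Create a simple table for manuscript metadata (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS manuscripts (
               id INTEGER PRIMARY KEY,
               title TEXT NOT NULL,
               material TEXT NOT NULL,
               subject TEXT,
               image_path TEXT
           )"""
    )


def add_manuscript(conn, title, material, subject, image_path):
    """Insert one record and return its row id."""
    cur = conn.execute(
        "INSERT INTO manuscripts (title, material, subject, image_path) "
        "VALUES (?, ?, ?, ?)",
        (title, material, subject, image_path),
    )
    conn.commit()
    return cur.lastrowid


def find_by_subject(conn, subject):
    """Return (title, material) pairs for all records on a given subject."""
    rows = conn.execute(
        "SELECT title, material FROM manuscripts WHERE subject = ? ORDER BY id",
        (subject,),
    )
    return rows.fetchall()


# An in-memory database with two illustrative records.
conn = sqlite3.connect(":memory:")
create_catalog(conn)
add_manuscript(conn, "Sample medical text", "palm leaf", "medicine", "scans/ms_001.tif")
add_manuscript(conn, "Sample land grant", "stone", "land records", "scans/ms_002.tif")
medical = find_by_subject(conn, "medicine")
```

Keeping the image path alongside descriptive fields lets a researcher retrieve every scan on a given subject with a single query.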

Localization Tools: If the palm leaf or rock inscription manuscripts contain region-specific terms or dialects, localization tools like Poedit can assist in managing translations.

Digital Libraries and Archives Platforms: Platforms like DSpace, Omeka, or Greenstone can be used to organize and archive digital collections of Tamil palm leaf manuscripts and transcriptions of stone manuscripts, and to provide easy access to them.

Geographic Information System (GIS) Software: If the palm leaf or rock inscription manuscripts contain geographical information, GIS tools like ArcGIS can help to map and analyze spatial data.

Collaboration Platforms: Platforms like GitHub or GitLab can facilitate collaborative work on digitized Tamil palm leaf and stone manuscripts, enabling version control and teamwork.

Virtual Reality (VR) Tools:

            Tools such as Unity and Blender can create immersive experiences for studying and exploring digital representations of stone manuscripts.

Conclusion

             Ancient people were devoted to nature and lived in green surroundings. They depended on nature for the essentials of life. Before the invention of paper, palm leaf manuscripts and rock inscriptions helped people share historical, medical, and astrological information, record agreements, and transmit knowledge. Today, with the help of current technology and digitalization techniques, the information in these manuscripts can be preserved, and the content can be easily accessed by future generations.

References 

  1. D Uday Kumar, G. V. Sreekumar, U. A. Athvankar, Traditional writing system in Southern India — Palm Leaf Manuscripts. Design Thoughts (2009).
  2. R. Narenthiran, G. Saravanan, K. Ramanujam, The digitization of palm leaf manuscripts (2012).
  3. Ali Farhat, Omar Hommos, Ali Al-Zawqari, Abdulhadi Al-Qahtani, Faycal Bensaali, Abbes Amira, Xiaojun Zhai, Optical character recognition on heterogeneous SoC for HD automatic number plate recognition system (2018).
  4. Jayashree Rajesh Prasad, Handwritten character recognition: A review (2014).
  5. R. S. Sabeenian, M. E. Paramasivam, R. Anand, P. M. Dinesh, Palm-Leaf Manuscript Character Recognition and Classification Using Convolutional Neural Networks. In: Peng, SL., Dey, N., Bundele, M. (eds) Computing and Network Sustainability. Lecture Notes in Networks and Systems, vol 75. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-7150-9_42
  6. S. Ezhilarasi, P. UmaMaheswari, S. Raghavi, Recognition of Characters using PCE based Convolutional LSTM Networks from Palaeographic Writings. 2023 4th International Conference on Innovative Trends in Information Technology (ICITIIT), pp. 1-6 (2023).
