Effective ICT Tools for Tamil Research

Mrs. D. Suthamaheswari,

Assistant Professor of Computer Science,

The Standard Fireworks Rajaratnam College for Women,

Sivakasi.

Abstract

Tamil is one of the longest-surviving classical languages in the world, with a rich and ancient literature. Tamil research plays a key role in the development of the language. In the digital era, various technologies are available to support Tamil research. This paper, entitled “Effective ICT Tools for Tamil Research”, aims to inform Tamil researchers about the facilities and tools available for their work. It suggests a list of Information and Communication Technology (ICT) tools that facilitate research activities such as data collection, data processing, and data analysis and visualization, and it gives a clear view of the ICT tools that support Tamil research.

Keywords: ICT Tools, Digital technology, Research activities

Introduction

Tamil is an ancient language, and computerizing and digitizing Tamil texts and scripts is challenging. With the development of Tamil computing, several technologies have emerged for Tamil computerization and digitization, and various tools are now available to support Tamil research. This paper surveys and lists Information and Communication Technology tools that provide facilities for research. Nowadays many tools aid Tamil researchers efficiently. The following sections present the tools and technologies available for data collection, data processing, data analysis and data visualization.

ICT Tools for Data Collection

Data collection plays a vital role in research. In earlier days, data were collected by reading books in libraries, conducting surveys and interviews, making observations, holding focus groups, and studying manuscripts and inscriptions. Some barriers in data collection are

  • Unavailability of books
  • Looking up references is time-consuming
  • Researchers have to travel to various places to conduct surveys and interviews
  • Handling and storing manuscripts is difficult
  • Reading and understanding inscriptions is hard

Today’s technological development has dramatically reduced these barriers. Data collection has become a routine practice: it does not require researchers to travel, it is less time-consuming, and it can be carried out in parallel with other tasks. Some of the data collection tools are

  • Tools for Surveys and Questionnaires: Google Forms is a familiar tool for conducting surveys through questionnaires. It is an online form and survey creator with multiple question types and templates. Google Forms can be accessed with a personal Google account or a Google Workspace account. It provides analytical features for the captured responses and is entirely free to use.
  • Applications for meetings and interviews: Applications such as Skype, Google Meet, BlueJeans, Jitsi, Teams, Zoom and Slack reduce the need for researchers to travel by providing video conferencing.
    • Skype: Skype offers a number of free and paid features built around calling, including instant messaging, voice and text messaging, video chat, and file and screen sharing. Chat histories are stored on Skype’s servers.
    • Google Meet: Google Meet is a web-based platform that users can also access from mobile applications. The platform is free for up to 100 participants. Participants can share their screen during a call.
    • BlueJeans: BlueJeans is a cloud-based mobile video conferencing app that lets anyone attend and host an interactive conference call from an iPhone, iPad, or Android mobile device.
    • Jitsi: Jitsi is a collection of free and open-source multiplatform voice, video conferencing and instant messaging applications for the Web platform, Windows, Linux, macOS, iOS and Android. It also supports features such as attended and blind call transfer, automatic reconnection, call recording, call encryption with SRTP and ZRTP, and message waiting indication.
    • Zoom: Zoom is a video conferencing platform that can be used through a desktop or mobile app and allows users to connect online for video conference meetings, webinars and live chat.
  • Platforms for observations:
    • Blogs: Tamil blogs such as YourTechBuzz, Only4Tamil, Trending Tamil News Today and Story of Paradise provide knowledge of current trends through regular reading. The reviews, comments and discussions of followers and subscribers give a clear view of any particular topic.
    • Video Channels: YouTube videos such as discussions, reviews of Tamil literature, interviews with researchers and online visits to historical places help researchers view their ideas from different perspectives. WhatsApp channels, a current trend, give regular updates on a required topic.
  • Image to Text Conversion Tools:

Manuscripts and inscriptions are one medium of data collection. Their scanned images can easily be transformed into text using OCR (Optical Character Recognition) technology.

  • OCR Bear is a free image reader that can convert an image into searchable and editable text.
  • i2OCR is a free online Optical Character Recognition (OCR) tool that extracts Tamil text from images and scanned documents so that it can be edited, formatted, indexed, searched, or translated.
  • Google Lens is another tool that captures not only the physical attributes of an object but also collects information about it. It is available on Android and iOS devices.

Increasing sophistication in data collection techniques will lead to ongoing improvements in research.

ICT Tools for Data Processing

Years ago, Tamil could not be typed directly on a computer; it could be written only in Roman characters. The development of transliteration for the Tamil alphabet led to direct typing of Tamil letters. Collected data have to be entered, processed and analysed to conclude the research. These activities can be done using either online or offline tools. Many tools are available for data entry and processing. Some of them are

  • Data Entry Tools:
    • Word processors: Data are entered using word processors such as Microsoft Word. Azhagi, Tamil OpenOffice Writer, Tamil Libre, Kamban 3.0 and Mentamizh are familiar word processors exclusively for the Tamil language. These processors support font encodings such as SaiIndira, TamilBible, Unicode, STMZH, Vanavil, Shreelipi, LT-TM, TSCII, TAB, TAM and Bamini. Tamil text has been converted to Unicode with the help of NLP (Natural Language Processing) applications based on machine learning. NLP applications offer features such as language processing, smart assistants, language translation, online search and predictive text.
    • Translators: Online translators let the user type in both languages and convert entered Tamil text into English and vice versa. Examples include Online Tamil Converters, TypingBaba, tamiltyping.in and Google Translate.
    • Converters: Online converters convert reference documents from one format to another, such as PDF to Word. Examples include iLovePdf, smallpdf, Nitro, Xodo and Adobe.
    • Speech to Text: Speech-to-text conversion tools save the researcher from typing and are faster than typing a document. Tools available online include Tamil Voice Typing App, Typing Guru and Speech Typing.
  • Data Processing Tools:
    • Morphological Tools: ThamizhiMorph is an open-source Tamil morphological analyser and generator that handles the inflectional morphology of Tamil verbs, nouns and other word types.
    • Spell Checkers: Spell checkers such as Tamilspellchecker and LanguageTool mark misspelled words with a red underline, which can then be corrected using the suggested words.
    • Grammar and Language Checkers: The Modern Tamil Sandhi-based Grammar Checker (MTSGC) is the foundation for a prose-based grammar checker based on machine learning. Tamilspellchecker and LanguageTool also support grammar and language checking in Tamil.
    • Word Autocorrect: The Mentamizh word processor provides a word autocorrect facility for the Tamil language.
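Because the word processors listed above work with both Unicode and older font encodings, it can be useful to check whether a collected text is really stored as Unicode Tamil. The following is a minimal Python sketch, not the method of any tool named above. It relies on the fact that the Unicode Tamil block spans U+0B80 to U+0BFF and on the observation that some legacy font encodings, such as Bamini, store Tamil glyphs at Latin code points. The function names `is_tamil_char`, `tamil_ratio` and `to_nfc` are hypothetical names chosen for illustration.

```python
import unicodedata

# The Unicode Tamil block covers code points U+0B80 to U+0BFF.
TAMIL_START, TAMIL_END = 0x0B80, 0x0BFF


def is_tamil_char(ch: str) -> bool:
    """Return True if the character lies in the Unicode Tamil block."""
    return TAMIL_START <= ord(ch) <= TAMIL_END


def tamil_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are Unicode Tamil.

    A value near 0 for text that should be Tamil suggests the text is
    stored in a legacy font encoding rather than Unicode.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_tamil_char(c) for c in chars) / len(chars)


def to_nfc(text: str) -> str:
    """Normalize text to NFC so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(tamil_ratio("தமிழ்"))        # Unicode Tamil text
    print(tamil_ratio("jkpo;"))        # Latin code points, as a legacy font might store Tamil
    print(to_nfc("\u0B95\u0BC6\u0BBE") == "\u0B95\u0BCA")
```

Normalizing to NFC before searching or counting keeps two-part vowel signs, which can be typed as one or two code points, from being treated as different words.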

ICT Tools for Data Analysis

Data analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data. Various tools are available to analyse and represent data, both online and offline.

  • Text corpora are large and structured sets of texts, which have been systematically collected. Text corpora are used by corpus linguists and within other branches of linguistics for statistical analysis, hypothesis testing, finding patterns of language use, investigating language change and variation, and teaching language proficiency.

Tamil text corpus resources provide tools such as Word Count, Word Analyzer, Character Identifier, Word Concordance, File Renaming and Domain Identifier.

  • A speech corpus is a large database containing audio recordings of spoken language. With a ready-made speech corpus, researchers do not need to collect and process recordings themselves. It handles large amounts of data.
  • Parallel corpora contain a collection of original texts in language L1 and their translations into a set of languages L2 … Ln. In most cases, parallel corpora contain data from only two languages.
  • Microsoft Excel, Google Sheets, Apache OpenOffice Calc and WPS Office Spreadsheets provide analysis and visualization facilities. Analysis can be done with built-in formulas such as sum, count, comparison and correlation. Data can be visualized through charts, graphs and tables.
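The spreadsheet formulas mentioned above (sum, count, correlation) can also be reproduced in a few lines of Python, which may help when survey responses are exported as a CSV file. The sketch below is only illustrative: the data, the column names `books_read` and `tools_used`, and the helper names `load_columns` and `pearson` are all invented for this example and do not come from any real survey.

```python
import csv
import io
import math

# Hypothetical survey export; the columns and values are invented for illustration.
CSV_DATA = """respondent,books_read,tools_used
R1,2,1
R2,4,3
R3,6,2
R4,8,4
"""


def load_columns(csv_text):
    """Read the two numeric columns of the hypothetical export as lists of floats."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    books = [float(row["books_read"]) for row in rows]
    tools = [float(row["tools_used"]) for row in rows]
    return books, tools


def pearson(xs, ys):
    """Pearson correlation coefficient, the quantity a spreadsheet CORREL formula reports."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


books, tools = load_columns(CSV_DATA)
print("COUNT:", len(books))                       # COUNT: 4
print("SUM:", sum(books))                         # SUM: 20.0
print("CORREL:", round(pearson(books, tools), 3)) # CORREL: 0.8
```

The correlation is computed by hand here so that the sketch needs only the standard library; a spreadsheet gives the same figure with its built-in formula.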

Tamil researchers can benefit from these data analysis and visualization tools.
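To make the word count and word concordance tools mentioned for text corpora more concrete, the following Python sketch shows one way such tools could work on a small Tamil sample. It is not the implementation of any corpus tool named in this paper. The sample sentence and the function names `tokenize`, `word_count` and `concordance` are assumptions made for illustration. Words are matched by the Unicode Tamil block (U+0B80 to U+0BFF) instead of the generic `\w` pattern, because Tamil vowel signs are combining marks that `\w` in Python's `re` module does not match.

```python
import re
import unicodedata
from collections import Counter

# A run of characters from the Unicode Tamil block is treated as one word.
TAMIL_WORD = re.compile(r"[\u0B80-\u0BFF]+")


def tokenize(text):
    """Split text into Tamil words after NFC normalization."""
    return TAMIL_WORD.findall(unicodedata.normalize("NFC", text))


def word_count(text):
    """Count how often each Tamil word occurs."""
    return Counter(tokenize(text))


def concordance(text, keyword, width=2):
    """List (left context, keyword, right context) for every occurrence.

    Context is `width` words on each side; sentence boundaries are ignored.
    """
    keyword = unicodedata.normalize("NFC", keyword)
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Illustrative sample text, not taken from any corpus.
sample = "தமிழ் மொழி இனிய மொழி. தமிழ் இலக்கியம் பழமையானது."

if __name__ == "__main__":
    for word, freq in word_count(sample).most_common(3):
        print(word, freq)
    for left, key, right in concordance(sample, "மொழி", width=1):
        print(f"{left} [{key}] {right}")
```

A real corpus tool would add sentence segmentation and morphological analysis on top of this, since Tamil words carry many inflected forms.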

Conclusion

In the digital environment, technology plays an integral role in communication and information exchange. Only a few tools are available for the Tamil language compared with other languages, but researchers and users can use translators and converters in their work. Even though translators and converters are in use, research would be more efficient if more tools with features specific to the Tamil language were developed. User-friendly tools would be especially effective. Tools like ChatGPT, which draw on information from websites such as Wikipedia, would also be beneficial. Rapid development of technology and research will help meet the needs of Tamil researchers.

References

  1. https://www.researchgate.net/publication/329962423_TECHNOLOGICAL_DEVELOPMENT_OF_TAMIL
  2. https://github.com/narVidhai/tamil-nlp-catalog
  3. https://www.youtube.com/watch?v=qCc12zpzysY
  4. https://www.researchgate.net/publication/280713475_Contextual_spell_checking_for_Tamil_Language
  5. https://www.tamilvu.org/en/content/corpus-analysis-tools