Artificial Intelligence Based Text to Speech Generators for Tamil Language

Mrs. V. Vanthana

Department of Computer Applications, The Standard Fireworks Rajaratnam College for Women, Sivakasi

Abstract

The rapid growth of artificial intelligence (AI) technology has brought advances to many industries. The AI voice generator is one of its most important applications. It relies on deep learning algorithms trained on datasets of recorded speech to learn natural language patterns. An AI voice generator is a text-to-speech (TTS) tool that produces realistic, synthesized human voices from text entered by the user. Such a tool can be customized for different speaking styles, ages, and genders, and can support up to 120 languages, among them our semmozhi (classical language) Tamil. These generators serve many purposes, such as assisting people with reading disabilities, e-learning, and learning pronunciation. Numerous AI voice generators exist for Tamil text-to-speech conversion. This paper presents an overview of those applications, specifically for the Tamil language.

Keywords: artificial intelligence, TTS, voice generators, Tamil text to speech, AI

I. Introduction

In the current digital landscape, artificial intelligence (AI) has become essential and is integrated into the day-to-day activities of consumers, fundamentally transforming customer interactions with stakeholders [1], [2]. AI is the technology that enables a machine to think, act, and react like a human. Once trained, AI systems can match, and on some tasks exceed, human performance. Demand for AI keeps increasing because it can solve complex problems in limited time with limited human resources.

There are many modes of interaction between humans and computers. Among them, the easiest is text. To make this interaction feel as natural as interaction with another person, speech synthesis is used. It is the process of converting text into audio with a synthesized human voice of a chosen style, age, and gender [4].

The quality of the voice determines the effectiveness of the speech synthesis process, and multiple technologies have evolved to optimize sound quality. As a result, speech synthesis is useful in tasks such as assisting people with visual disabilities, spell checking, and teaching and learning languages with accurate intonation. Speech synthesis is available as a voice assistant on almost every smartphone today, for example Alexa, Siri, and Google Assistant [5]. Voice assistants have gained preference among consumers because they are easy to use and save time [6]. With this preference in mind, researchers are also contributing many advancements in speech synthesis to satisfy the different needs of consumers [7].

AI-based TTS can support up to 120 languages, including Tamil. Software that performs TTS using AI technology is called an AI voice over generator. Many AI voice applications exist, and this paper provides an overview of several of them.

II. AI voice over generators

A tool that converts textual data to speech is called an AI voice generator. Almost all smart gadgets available today contain this feature. The speech produced in the conversion process resembles the human voice, and its style, pitch, age, gender, and so on can be customized.
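One common way to express this kind of customization is the W3C Speech Synthesis Markup Language (SSML), in which pitch and speaking rate are attributes of a prosody element. The following is a minimal sketch only: the helper name build_ssml and its defaults are illustrative assumptions, and whether any of the tools surveyed in this paper accepts SSML input is not established here.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, lang="ta-IN", pitch="medium", rate="medium"):
    """Wrap text in a minimal SSML document with prosody settings.

    build_ssml is a hypothetical helper for illustration. The pitch and
    rate values follow the SSML keywords (e.g. "low", "high", "slow",
    "fast"); "ta-IN" is the language tag for Tamil as used in India.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)}>"
        f"{escape(text)}"  # escape &, < and > so the markup stays valid
        "</prosody></speak>"
    )


if __name__ == "__main__":
    # A Tamil greeting, spoken slowly at a higher pitch.
    print(build_ssml("வணக்கம்", pitch="high", rate="slow"))
```

The same text can be rendered in several styles by changing only the pitch and rate arguments, which is the kind of adjustment the voice generators expose through their user interfaces.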

The technology is all the more necessary because it helps individuals with reading disabilities and can assist in creating e-learning content. It can also help readers who are too tired to read a story but are willing to listen to it. AI voice generators serve many similar purposes for consumers.

III. Beneficiaries of AI voice over generators

Text-to-speech technology, or AI voice generator tools, offers numerous benefits. The beneficiaries include students, individuals with disabilities, novice language learners, commuters, elderly people, professionals, and non-native speakers.

A. Students

Multimedia content tends to reach a student's mind faster than text. An online Tamil text-to-voice tool can convert study materials, textbooks, and the like into an engaging voice, so students can study even while they are engaged in other activities, much like listening to songs. This can help students grasp the content effectively and retain the concepts longer.

B. Assisting individuals with disabilities

TTS is a real boon to individuals with disabilities such as dyslexia, visual impairment, and learning difficulties. Through text-to-speech technology, they too can acquire literacy and learn like everyone else.

C. Novice Language learners

If a learner is new to the idioms of an unfamiliar language, TTS can help by teaching pronunciation and meaning. With this help, learners can become familiar with the language and attain fluency in a short time.

D. Commuters

Commuters here means travellers. They usually love to explore the world by driving their own cars, bikes, and so on. TTS plays its role by reading out route maps, news, songs, or any other content, so that they can enjoy the drive without losing concentration.

E. Elderly people

The main problem of old age is deteriorating vision. Day by day, elderly people can lose visual clarity and become unable to read text in small fonts. TTS helps them by reading out books, posts, medicine names, shop boards, newspapers, account statements, and so on. It can feel like a friend and well-wisher to elderly people.

F. Professionals

Professionals such as teachers, doctors, celebrities, and lawyers usually cannot spare time from their busy schedules to keep up with current affairs. Through TTS, they can listen to world news and learn about innovations in their domain without interrupting their usual activities.

G. Non-native speakers

Non-native speakers are people who have moved from their native place to a new one for study, work, or other commitments. There they can face many language difficulties in their day-to-day activities, especially while shopping. TTS can help them learn the language with the correct accent and practise it quickly.

IV. Available AI voice over generators for Tamil language

A. Speechify

Speechify is one of the best Tamil text-to-speech tools, available as a mobile application as well as a Chrome extension. Its AI voice can read nearly 900 words per minute. Listening makes comprehension faster, and even people with dyslexia can follow the content. There is also a provision to input an image instead of text, from which the voice output is produced; Optical Character Recognition (OCR) software is used for this. The main use cases of Speechify are listening to story books, PDFs, and other documents.
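To put the figure of about 900 words per minute in perspective, the listening time for a document can be estimated from its word count. The sketch below is illustrative only: the function names are hypothetical, words are counted by splitting on whitespace, and the 150 words-per-minute comparison rate is an assumed typical speaking pace, not a figure from any of the tools.

```python
def count_words(text):
    """Count whitespace-separated words (a rough measure for Tamil text)."""
    return len(text.split())


def listening_time_minutes(word_count, words_per_minute=900):
    """Estimate listening time in minutes at a given reading speed.

    The default of 900 words per minute is the upper speed quoted for
    Speechify above; real playback speed depends on user settings.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute


if __name__ == "__main__":
    words = 4500  # hypothetical length of a textbook chapter
    fast = listening_time_minutes(words)        # at 900 words per minute
    slow = listening_time_minutes(words, 150)   # assumed normal speaking pace
    print(f"{words} words: {fast:.1f} min at 900 wpm, {slow:.1f} min at 150 wpm")
```

For a 4,500-word chapter this gives 5 minutes at 900 words per minute, against 30 minutes at the assumed 150 words per minute.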

Speechify is available as a desktop as well as a mobile application. It offers a three-day trial, after which the user can cancel or subscribe to a premium plan. The free version is available with limited TTS features.

The user types the content into the blank text area and presses the play button; the content is then read aloud by the AI voice. To upload a document, click the Add files option in the left menu. To upload an image, select the Library option. It is one of the best options for teachers to teach and students to learn.

B. Dubverse

Dubverse is an application that produces high-quality sound through advanced AI technology. The user inputs text and gets an audio file as output. More than 30 languages, both Indian and global, are supported. It is also possible to customize the voice range, accent, and so on to suit the user.

It is also available as a desktop as well as a mobile application. The free version offers only limited features; the full version requires a Pro subscription.

The user clicks the Create text to speech button and then writes the text; the voice over file is produced automatically.

C. Murf

Murf is one of the best Tamil TTS software tools for creating audio and video. Its distinctive feature is two natural Tamil voices: a male voice named Senthil and a female voice named Chitra. Users can add natural-sounding voice overs to their stories, videos, and so on. Because Tamil is the local language of the target audience, Tamil voice overs can help a business reach more customers.

Instead of typing, it is also possible to copy and paste Tamil content from anywhere and have it read in either the male or the female AI voice.

Whatever functionality the user wants can be accessed from the Products menu. The Create option can be used to generate e-learning materials. Murf is available as both a desktop and a mobile application, and it is mainly used for narrating e-learning materials. Regarding pricing, the first 10 minutes after login count as a free trial; afterwards the user has to subscribe for further usage. During the free period, 120+ voices are available.

D. Resemble AI

Resemble AI is another strong TTS tool for the Tamil language. Its distinctive feature is that the user can either use a default voice with customization or clone his or her own voice, which draws more attention to this application than to the others. Its pricing is also notable: the user pays according to the duration for which the application is used.

The application can be tried for free with limited voices, but cloning one's own voice requires a subscription. The Pro plan provides the full version with all features.

E. Fliki

Fliki is a flexible application for both business users and content creators. It is user friendly, and audio or video can be produced with minimal effort. Anyone can access the application.

Another point to note is that this app can reproduce human speech patterns with emotion to produce unique content. Its main use cases are education, product demonstrations in commercial applications, self-explanatory videos, and so on.

Fliki is available free as a trial. Whatever functionality is needed can be accessed from the Features menu. It can convert text to speech, text to video, and a blog or tweet to video. Both desktop and mobile applications are available here too.

F. VEED.io

VEED is a powerful TTS app that supports the voices of actors. It also provides options to add sound effects and background music to the content. The voices are clear and realistic and follow an accurate Tamil accent.

In VEED TTS, the text is entered in the top-left text box. To listen to the speech, click the play button. It is free to use; no subscription is needed.

G. Natural Reader software for text-to-speech conversion

Text-to-speech (TTS) converts ordinary written text into speech. It is one of the best assistive aids for people with reading or visual impairments [8]. This technology lets them hear the text that appears on the screen. It improves on traditional Braille because, once the software is installed, it can read everything on the screen, whatever the content format (e.g. a .pdf file or website content). This allows students to take part in online activities, access online course materials, check email or social network messages, and so on. One well-known TTS software product is Natural Reader.

Natural Reader is an affordable and efficient assistive technology tool with diverse applications. It can help improve the learning capability and quality of life of students with disabilities, putting them on a par with other students. Its ease of use, compatibility with Microsoft Office programs, and high-quality, natural-sounding speech make it a tool of choice for learners with learning disabilities or visual impairments. It can read out print-based material in a pleasant, good-quality voice. The NeoSpeech TTS engine, with its high-quality, natural-sounding voice, is used for reading the content [9].

However, this TTS tool does not support the Tamil language. For other global languages, it is a suitable choice.

V. Comparison of AI voice generators for Tamil language

S.No | Application Name | Features | Supported Platform & Device
1 | Speechify | TTS and audio books | Windows, Android, Apple, Linux (desktop and mobile)
2 | Dubverse | Dubbing, Subtitles, TTS | Windows (desktop and mobile)
3 | Murf | TTS, Transcription, Voice Cloning, Voice over video, Voice over Google Slides, Voice changer | Windows (desktop and mobile)
4 | Resemble | TTS, Speech to Text, Voice Cloning | Windows, Android (desktop and mobile)
5 | Fliki | TTS, Text to video, PPT to video, Photo to video, Blog to video, Tweet to video, Voice Cloning | Windows, Android (desktop and mobile)
6 | VEED (free to use) | TTS, Video Editing, Screen Recorder, Subtitles, Transcriptions | Windows, Android (desktop and mobile)

The table gives an overview of the features provided by the AI voice generators discussed above. All the applications listed in the table support the Tamil language.
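The comparison can also be held as a small data structure so that a reader can look up which tools offer a given feature on a given platform. The following is a minimal sketch whose entries simply transcribe the table; the names Tool, TOOLS, and tools_with are illustrative and do not belong to any of the applications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """One row of the comparison table."""
    name: str
    features: tuple
    platforms: tuple


# Entries transcribed from the comparison table in this section.
TOOLS = (
    Tool("Speechify", ("TTS", "Audio books"),
         ("Windows", "Android", "Apple", "Linux")),
    Tool("Dubverse", ("Dubbing", "Subtitles", "TTS"), ("Windows",)),
    Tool("Murf", ("TTS", "Transcription", "Voice Cloning", "Voice over video",
                  "Voice over Google Slides", "Voice changer"), ("Windows",)),
    Tool("Resemble", ("TTS", "Speech to Text", "Voice Cloning"),
         ("Windows", "Android")),
    Tool("Fliki", ("TTS", "Text to video", "PPT to video", "Photo to video",
                   "Blog to video", "Tweet to video", "Voice Cloning"),
         ("Windows", "Android")),
    Tool("Veed", ("TTS", "Video Editing", "Screen Recorder", "Subtitles",
                  "Transcriptions"), ("Windows", "Android")),
)


def tools_with(feature, platform=None):
    """Return names of tools offering a feature, optionally on one platform.

    Matching is case-insensitive and exact on the feature name.
    """
    wanted = feature.casefold()
    names = []
    for tool in TOOLS:
        has_feature = any(f.casefold() == wanted for f in tool.features)
        on_platform = platform is None or any(
            p.casefold() == platform.casefold() for p in tool.platforms
        )
        if has_feature and on_platform:
            names.append(tool.name)
    return names


if __name__ == "__main__":
    print(tools_with("Voice Cloning"))
    print(tools_with("Voice Cloning", platform="Android"))
```

Queried this way, the table shows that voice cloning is listed for Murf, Resemble, and Fliki, and that of these only Resemble and Fliki are listed for Android.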

VI. Conclusion

The AI voice generator, a tool that converts textual data to voice, is a real boon that assists in reading, learning, pronunciation, and more. Many applications provide this capability, and they are user friendly, easy to access, affordable, and time saving. They can help students, educators, people with reading disabilities, professionals, and others. It is encouraging that such applications support our Tamil language too. Apart from the apps discussed above, numerous other tools are available for TTS in Tamil. They serve as a means of carrying our thaai mozhi (mother tongue) Tamil to every nook and corner of the world.

References

[1] Moriuchi, E. (2019). Okay, Google! An empirical study on voice assistants on consumer engagement and loyalty. Psychology & Marketing, 36(5), 489–501.

[2] Karnouskos, S. (2018). Self-driving car acceptance and the role of ethics. IEEE Transactions on Engineering Management, 67(2), 252–265.

[3] Tzafestas, Spyros, and Henk Verbruggen. “Artificial intelligence in industrial decision making, control, and automation: an introduction.” Artificial Intelligence in Industrial Decision Making, Control and Automation. Springer, Dordrecht, 1995. 1-39.

[4] H. Sak, T. Güngör, and Y. Safkan, “A corpus-based concatenative speech synthesis system for Turkish,” Turkish Journal of Electrical Engineering & Computer Sciences, vol. 14, no. 2, pp. 209-223, 2006.

[5] G. Lopez, L. Quesada, and L. A. Guerrero, “Alexa vs. Siri vs. Cortana vs. Google Assistant: A comparison of speech-based natural user interfaces,” in Proc. International Conference on Applied Human Factors and Ergonomics, May 2018, pp. 241-250.

[6] Feng, H., Fawaz, K. & Shin, K. G. (2017, October). Continuous authentication for voice assistants. In Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking (pp. 343–355).

[7] McLean, G. & Osei-Frimpong, K. (2019). Hey Alexa… examine the variables influencing the use of artificial intelligent in-home voice assistants. Computers in Human Behavior, 99, 28–37.

[8] Anusha Joshi, Deepa Chabbi, Suman M, and Suprita Kulkarni, “Text To Speech System For Kannada Language,” IEEE ICCSP 2015 conference, pp. 1901-1904.
