Heritage Conservation & Preservation through AI & NLP

B.Maaithili M.S.(IT & M)., M.Phil(CS).,

Assistant Professor of Information Technology

Saiva Bhanu Kshatriya College, Aruppukottai

Summary

AI and NLP connect people with technology. They make it possible to digitize heritage material, and they can process that material at a scale beyond what the human mind alone can manage.

Key words: Computerized, Heritage, Conservation

Introduction:

Today's technology plays a crucial role in preserving the historical knowledge of cultures around the world, and preserving it for the next generation is a must. Our heritage is what we have inherited from the past, to value and enjoy in the present, and to preserve and pass on to future generations [1]. Preservation is the protection of historical and cultural sites from human encroachment and misuse altogether, while conservation is the protection and safeguarding of historical and cultural sites by regulating human activity rather than removing people from the sites entirely.

Heritage conservation is about managing change. Its main benefits are the preservation of cultural identity and the ability to pass on historical structures to future generations. However, many cultural heritage objects are at risk of deterioration, damage or loss due to factors such as time, environment and human behavior.

Today, AI and NLP are used to preserve cultural heritage items in digital form. NLP is a subset of AI that aims to make computers understand human languages. Research and development in human language technology has given rise to the field of computational linguistics. In computational linguistics, research is done on how language works, and various models are created for machine learning and deep learning purposes for different languages (“Natural Language Processing (NLP) - Overview - GeeksforGeeks”). NLP helps computers understand, interpret and manipulate human languages. This technology can extract relevant information, uncover hidden patterns and provide valuable insights into historical events and cultural practices. It can also translate ancient texts, making them accessible all over the world and facilitating cross-cultural understanding.

NLP Stages:

  1. Morphological & Lexical Analysis:

The morphological analysis stage deals with text at the individual word level. It looks for ‘morphemes’, the smallest meaningful units of a word. Lexical analysis finds the relation between these morphemes and converts the word into its root form. A lexical analyzer also assigns the possible part-of-speech (POS) tags to the word, taking the dictionary of the language into consideration. (“Stages of Natural Language Processing (NLP) - byteiota”)

  1. Syntax Analysis:

It ensures that a given piece of text has the correct structure. It parses the sentence to check for correct grammar at the sentence level.

  1. Semantic Analysis:

It looks for meaning in the given sentence. It also deals with combining words into phrases.

  1. Discourse Integration:

It deals with the effect of a preceding sentence on the sentence under consideration.

  1. Pragmatic Analysis:

It interprets the given text using information from the preceding steps.

NLP Modeling Techniques:

  1. Stemming & Lemmatization:

Stemming and lemmatization are text preprocessing techniques. They reduce words to a base form so that raw text can be converted into a structured format for machine processing. Stemmers strip word suffixes; lemmatization ensures the output word is an existing normalized form of the word that can be found in the dictionary.
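The difference between the two techniques can be illustrated with a minimal sketch. The suffix list and the small dictionary below are hypothetical and chosen only for this example; practical stemmers and lemmatizers use much larger rule sets and lexicons.

```python
# Illustrative only: a toy suffix-stripping stemmer and a toy dictionary lemmatizer.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical, deliberately tiny rule set

# Hypothetical mini-dictionary mapping inflected forms to dictionary forms.
LEMMA_DICT = {
    "temples": "temple",
    "carvings": "carving",
    "built": "build",
    "studies": "study",
}


def stem(word: str) -> str:
    """Strip the first matching suffix; the result need not be a real word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so that short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look the word up in the dictionary; fall back to the word itself."""
    word = word.lower()
    return LEMMA_DICT.get(word, word)


words = ["temples", "carvings", "built", "studies"]
stems = [stem(w) for w in words]        # ['templ', 'carving', 'built', 'studi']
lemmas = [lemmatize(w) for w in words]  # ['temple', 'carving', 'build', 'study']
```

The stemmer produces forms such as "templ" that are not dictionary words, while the lemmatizer always returns an entry that exists in its dictionary.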

  1. Tokenization:

The process of breaking text down into smaller units called tokens, such as sentences or words. It helps computers understand and process human language by splitting it into manageable units. (“What is Tokenization | Tokenization In NLP - Analytics Vidhya”)
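A minimal sketch of tokenization using only Python's standard `re` module is given here. The splitting rules are a simplifying assumption (sentences end at ".", "!" or "?" followed by a space); the function names and the sample text are made up for illustration.

```python
import re


def sentence_tokenize(text: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace (a simplification)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_tokenize(sentence: str) -> list[str]:
    """Return runs of word characters and single punctuation marks as tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "The temple was built in 1010. It is a heritage site!"
sentences = sentence_tokenize(text)
tokens = [word_tokenize(s) for s in sentences]
# tokens[0] is ['The', 'temple', 'was', 'built', 'in', '1010', '.']
```

Rules like these fail on abbreviations and on scripts that do not separate words with spaces, which is one reason trained tokenizers are normally preferred for historical texts.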

  1. Stop-words Removal:

Removing the words that occur commonly across all the documents in the corpus, such as "the", "is" and "of", because they carry little distinguishing meaning.

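  1. TF-IDF (Term Frequency – Inverse Document Frequency):

Estimates the importance of a word to a document within a corpus. The weight rises the more often the word occurs in the document and falls the more documents in the corpus contain it.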
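Stop-word removal and TF-IDF weighting can be sketched together as follows. This is a minimal illustration that assumes the basic unsmoothed formula tf × log(N / df); libraries use several variants of it. The stop-word list and the three-document corpus are hypothetical.

```python
import math

STOP_WORDS = {"the", "is", "of", "a", "in", "and"}  # hypothetical short list

# Hypothetical three-document corpus.
corpus = [
    "the temple is a monument of stone",
    "the palace is a monument of brick",
    "the manuscript is written in tamil",
]


def preprocess(doc: str) -> list[str]:
    """Lowercase, split on whitespace and drop stop words."""
    return [w for w in doc.lower().split() if w not in STOP_WORDS]


docs = [preprocess(d) for d in corpus]


def tf_idf(term: str, doc: list[str], docs: list[list[str]]) -> float:
    """Basic TF-IDF: relative term frequency times log(N / document frequency)."""
    df = sum(1 for d in docs if term in d)
    if df == 0 or not doc:
        return 0.0  # the term does not occur anywhere in the corpus
    tf = doc.count(term) / len(doc)
    return tf * math.log(len(docs) / df)


score_temple = tf_idf("temple", docs[0], docs)      # occurs in one document
score_monument = tf_idf("monument", docs[0], docs)  # occurs in two documents
```

In this sketch "temple" receives a higher weight than "monument" for the first document, because "monument" also appears in the second document and is therefore less distinctive.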

  1. Keyword Extraction:

Identifying and extracting the most significant words from a section of text. (“Leveraging Language Models for Keyword Extraction”) The words can be used to summarize the content of the text, or to facilitate information retrieval. (“Keywords extraction with Python & NLP | John Snow Labs”) 



  1. Word Embedding:

Words are represented as numeric vectors so that words with similar meanings or contexts are grouped close together. (“Improving e-commerce product recommendation using semantic context and ...”)
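The idea of grouping words by meaning can be sketched with cosine similarity between vectors. The three-dimensional vectors below are invented for illustration only; real embeddings are learned from large text collections and have hundreds of dimensions.

```python
import math

# Hypothetical hand-made vectors; real embeddings are learned, not written by hand.
EMBEDDINGS = {
    "temple": [0.90, 0.80, 0.10],
    "shrine": [0.85, 0.75, 0.20],
    "river": [0.10, 0.20, 0.90],
}


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1 mean similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_related = cosine_similarity(EMBEDDINGS["temple"], EMBEDDINGS["shrine"])
sim_unrelated = cosine_similarity(EMBEDDINGS["temple"], EMBEDDINGS["river"])
```

With these assumed vectors, "temple" is far closer to "shrine" than to "river", which is the behavior a trained embedding model is expected to show for related words.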

  1. Sentiment Analysis:

The activity of examining digital content to establish whether the emotional tone of the message is positive, negative or neutral.
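A very simple lexicon-based approach gives a feel for how this works. The word lists below are hypothetical and far too small for real use; modern systems usually rely on trained models rather than fixed lists.

```python
import re

# Hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"beautiful", "magnificent", "restored"}
NEGATIVE = {"damaged", "neglected", "ruined"}


def sentiment(text: str) -> str:
    """Count positive and negative words and compare the totals."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label_1 = sentiment("The magnificent temple is beautiful.")
label_2 = sentiment("The mural is damaged and neglected.")
label_3 = sentiment("The museum opens at nine.")
```

A word-counting approach like this ignores negation and context ("not damaged" would still count as negative), so it should be read only as a sketch of the idea.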

  1. Topic Modeling:

The process of discovering the topics present in a collection of documents from the words that occur in them.

  1. Text Summarization:

The method of condensing lengthy text into a few understandable sentences, which decreases the time and effort required to read and analyze complex and lengthy texts.

  1. Named Entity Recognition:

It is the procedure of extracting useful, structured information, such as the names of people, places and dates, from huge and unstructured data.
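A rule-based sketch shows the kind of structured output that entity extraction produces. The place list (gazetteer), the date pattern and the sample sentence are assumptions made for this example; practical systems use trained models instead of fixed lists.

```python
import re

PLACES = {"Thanjavur", "Madurai"}  # hypothetical gazetteer of place names


def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity, label) pairs found by a date pattern and a place lookup."""
    entities = []
    # Dates written as a three- or four-digit year followed by CE or BCE.
    for match in re.finditer(r"\b\d{3,4}\s?(?:CE|BCE)\b", text):
        entities.append((match.group(), "DATE"))
    # Place names that appear in the gazetteer.
    for token in re.findall(r"[A-Za-z]+", text):
        if token in PLACES:
            entities.append((token, "PLACE"))
    return entities


text = "The temple at Thanjavur was completed in 1010 CE."
entities = extract_entities(text)
# entities is [('1010 CE', 'DATE'), ('Thanjavur', 'PLACE')]
```

The unstructured sentence is turned into labeled pairs that can be stored in a database and searched, which is what makes this step useful for cataloguing heritage documents.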

Applications of NLP in the real world include chatbots, predictive text, smart assistants, sentiment analysis, document analysis, online search, automatic summarization and language translation.

AI-powered NLP technologies can digitize and categorize vast archives of cultural artifacts, photographs and documents. This makes it easier to organize and search for specific pieces of knowledge, preserving the cultural heritage of indigenous communities. By analyzing vast amounts of data, AI can also create virtual restorations, predict preservation needs and even help revive lost languages and traditions.

AI & NLP Benefits for Heritage Preservation:

  •  Digitization of cultural heritage: fast and complete examination of an object's digital representation, without physical contact.

  •  Strengthened protection efforts: experts can discover and monitor the factors that affect the protection of artifacts.

  •  Digital image restoration: faded and damaged images of past artworks and documentary evidence can be revived.

  •  NLP for translating historical scripts: NLP helps to understand and decode historical scripts written in ancient or less commonly used languages. It also facilitates cross-cultural research and provides insights into the cultural exchange between civilizations.




Virtual Reality for cultural heritage

Virtual reality technologies provide immersive experiences that transport users to historical sites and ancient landscapes. They enable accurate and realistic reconstruction of historical cities and monuments, enhancing educational and cultural tourism experiences.

Virtual reality and augmented reality are used in preserving cultural heritage. They combine what technologists call the "inside virtual" and the "outside real". These two related applications provide a computerized experience of heritage for users without their ever leaving home or school, so there is less need for learners to visit the museum. Both offer a virtual and interactive experience, and they increase users' potential to discover cultural content during a visit to the museum. Sharing these immersive experiences across social networks spurs discussions that increase awareness of lost heritage.

Antique Reconstruction:

Technological tools help analyze the components of fragmented artifacts and reconstruct a damaged piece of history by predicting what the original looked like. AI can be used “to reconstruct damaged or fragmented artifacts by analyzing their components and predicting their original form.” (“The Lessons We Learn: AI for Cultural Heritage: Digitizing and ...”)

  1. Information Accessibility:

Access to large & diverse datasets of historical artifacts and documents. (“The Lessons We Learn: AI for Cultural Heritage: Digitizing and ...”)

  1. Language Preservation:

By analyzing linguistic data, AI aids in the preservation of endangered languages, ensuring that valuable aspects of culture are not lost. Tools such as machine translation services enable wider communication within the communities that hold this cultural heritage. (“AI, Ethics, and the Preservation of Cultural Heritage”)

  1. Collaboration & Global Efforts

Preserving cultural heritage is a global endeavor that requires collaboration among researchers, cultural institutions, governments and technology developers.

AI & NLP Obstacles for Heritage Preservation:

Misrepresentation of Artifact:

Computers are not human: they work only with the datasets given to them and are not yet trained to seek out different ones. Because the available data does not cover every culture, this can lead to misrepresentation, with "cultural symbols or stories being misunderstood due to a dataset that lacks diversity, leading to a distorted preservation of history" (“AI, Ethics, and the Preservation of Cultural Heritage”). Efforts need to be made to collect datasets from different cultural backgrounds.

Exclusion of Artifact:

Communities with limited technological skills may be unable to create a reliable dataset about their heritage and culture. This creates a problem when computerized language models are used: the historical accounts of marginalized communities are further marginalized if AI systems are not trained on inclusive datasets. (“AI, Ethics, and the Preservation of Cultural Heritage”) A lack of data from less influential cultures can result in the exclusion of their artifacts as a whole.

  1. Identifying Fake Artifacts:

AI supports the authentication of cultural artifacts, helping experts distinguish genuine items from forgeries.

  1. Detection of Offensive Artifacts:

Offensive content in the dataset may lead to faulty preservation.

Conclusion:

Nowadays, AI and NLP have revolutionized the preservation of cultural tradition by facilitating successful conversion to digital format, supporting restoration efforts and improving accessibility with the help of datasets. It is necessary to follow ethical principles when using AI and NLP to preserve cultural heritage. Educational platforms can be used to foster community engagement and interaction so that the available technology is used ethically.

References:

  1. https://www.heritagecouncil.ie/about/what-is-heritage

  2. https://www.geeksforgeeks.org/natural-language-processing-overview/

  3. https://byteiota.com/stages-of-nlp/

  4. https://www.analyticsvidhya.com/blog/2020/05/what-is-tokenization-nlp/

  5. https://medium.com/@jamesgondola/ai-ethics-and-the-preservation-of-cultural-heritage-e872f5f34a6c

  6. https://www.degruyter.com/document/doi/10.1515/9783839467107-011/html?lang=en

  7. https://www.deeplearning.ai/resources/natural-language-processing/

  8. https://www.researchgate.net/publication/362253484_Role_of_Artificial_Intelligence_in_Preservation_of_Culture_and_Heritage

  9. https://www.europarl.europa.eu/RegData/etudes/BRIE/2023/747120/EPRS_BRI(2023)747120_EN.pdf
