International Journal of Advanced Innovative Technology in Engineering (IJAITE)

Word Sense Disambiguation approach in Cross Language Information Retrieval

Vivek A. Manwar, A. B. Manwar

Abstract : Cross-Language Information Retrieval (CLIR) enables users to retrieve documents written in a language different from the query language, thereby overcoming linguistic barriers in information access. However, CLIR for Indian regional languages remains challenging due to lexical ambiguity, morphological richness, and limited linguistic resources. Marathi in particular exhibits a high degree of polysemy and homonymy, which often leads to semantic drift during query translation and degrades retrieval performance. This paper investigates the role of Word Sense Disambiguation (WSD) in improving Marathi–English CLIR and proposes a novel hybrid WSD-based CLIR framework tailored for low-resource languages. The proposed approach integrates Marathi-specific morphological analysis, a knowledge-based sense inventory from Marathi WordNet, contextual semantic similarity modeling, and sense-aware query reformulation to ensure semantically faithful translation. Experimental evaluation is conducted on a Marathi query set and an English document corpus using standard information retrieval metrics. Comparative results demonstrate that the proposed hybrid WSD-based CLIR framework significantly outperforms dictionary-based, first-sense, and knowledge-based baselines, achieving superior early precision and ranking effectiveness.

Keywords : Information Retrieval, Cross-Language Informatio

DOI : 10.65809/IJAITE/26/v11i01/001
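
The abstract describes a pipeline of morphological analysis, sense selection against a sense inventory, and sense-aware query reformulation. The following is a minimal sketch of that idea, not the authors' implementation: it substitutes a simple context-overlap score (in the spirit of simplified Lesk) for the paper's contextual semantic similarity model, and a lookup table for the morphological analyzer. All names (`Sense`, `LEMMAS`, `SENSES`, `DICTIONARY`, `translate_query`) and the romanized Marathi entries are hypothetical, invented for illustration, and are not taken from Marathi WordNet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    context: frozenset    # lemmas that tend to co-occur with this sense
    translations: tuple   # English terms to emit when this sense is chosen


# Toy data in romanized Marathi, invented for illustration only. A real system
# would take lemmas from a morphological analyzer and senses from Marathi WordNet.
LEMMAS = {"pustakatil": "pustak", "jhadache": "jhad"}
SENSES = {
    "paan": [
        Sense(frozenset({"jhad", "hirva"}), ("leaf",)),
        Sense(frozenset({"pustak", "vachan"}), ("page",)),
    ],
}
# Fallback dictionary for words treated as unambiguous (also hypothetical).
DICTIONARY = {"pustak": "book", "jhad": "tree"}


def lemmatize(token):
    """Stand-in for morphological analysis: table lookup, else the token itself."""
    return LEMMAS.get(token, token)


def choose_sense(lemma, context):
    """Pick the sense whose context set overlaps most with the query context.

    Ties, including zero overlap for every sense, resolve to the first listed
    sense, which mirrors a first-sense baseline.
    """
    return max(SENSES[lemma], key=lambda s: len(s.context & context))


def translate_query(query):
    """Return English query terms, choosing one sense per ambiguous word."""
    lemmas = [lemmatize(t) for t in query.lower().split()]
    english = []
    for i, lemma in enumerate(lemmas):
        # Context is every other lemma in the query.
        context = frozenset(lemmas[:i] + lemmas[i + 1:])
        if lemma in SENSES:
            english.extend(choose_sense(lemma, context).translations)
        elif lemma in DICTIONARY:
            english.append(DICTIONARY[lemma])
        else:
            english.append(lemma)  # keep out-of-vocabulary terms untranslated
    return english
```

In this toy setup, the same ambiguous word is translated differently depending on its neighbours in the query, which is the behaviour a dictionary-only translation cannot provide. With no usable context the sketch degrades to the first listed sense.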