Multilingual Meta-Distillation Alignment for Semantic Retrieval
PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024(2024)
摘要
Multilingual semantic retrieval involves retrieving semantically relevant content to a query irrespective of the language. Compared to monolingual and bilingual semantic retrieval, multilingual semantic retrieval requires a stronger alignment approach to pull the contents to be retrieved close to the representation of their corresponding queries, no matter their language combinations. Traditionally, this is achieved through more supervision in the form of multilingual parallel resources, which are expensive to obtain, especially for low-resource languages. In this work, on top of an optimization-based Model-Agnostic Meta-Learner (MAML), we propose a data-efficient meta-distillation approach: MAML-Align,1 specifically for low-resource multilingual semantic retrieval. Our approach simulates a gradual feedback loop from monolingual to bilingual and from bilingual to multilingual semantic retrieval. We systematically compare multilingual meta-distillation learning to different baselines and conduct ablation studies on the role of different sampling approaches in the meta-task construction. We show that MAML-Align's gradual feedback loop boosts the generalization to different languages, including zero-shot ones, better than naive fine-tuning and vanilla MAML.
更多查看译文
关键词
Semantic Retrieval,Meta-learning,MAML,Multilingual Representations,Knowledge Distillation,Meta-Distillation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要