Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Lu, Samuel Holt, Claudio Fanconi, Alex James Chan, Jakob Nicolaus Foerster, Mihaela van der Schaar, Robert Tjarko Lange
NeurIPS 2024 (2024)
Cited by 14 | Views 29
Keywords: Preference optimization, RLHF, Large Language Models