An Empirical Study of Mamba-based Language Models
Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro. CoRR (2024)
