On-Device Learning for Model Personalization with Large-Scale Cloud-Coordinated Domain Adaption
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)
Abstract
Cloud-based learning is currently the mainstream in both academia and industry. However, the global data distribution, as a mixture of all users' data distributions, may deviate from each individual user's local distribution at inference time, making a globally trained model suboptimal for individual users. To mitigate this distribution discrepancy, on-device training over local data for model personalization is a potential solution, but it suffers from serious overfitting. In this work, we propose a new device-cloud collaborative learning framework under the paradigm of domain adaptation, called MPDA, to break the dilemmas of purely cloud-based learning and purely on-device training. From the perspective of a particular user, the general idea of MPDA is to retrieve some similar data from the cloud's global pool, which functions as a large-scale source domain, to augment the user's local data as the target domain. The key criterion for selecting outside data is whether a model trained over those data generalizes well on the local data. We theoretically show that MPDA can reduce both the distribution discrepancy and the risk of overfitting. We also evaluate MPDA extensively on the public MovieLens 20M and Amazon Electronics datasets, as well as on an industrial dataset collected from Mobile Taobao over a period of 30 days. Finally, we build a device-tunnel-cloud system pipeline, deploy MPDA in the icon area of Mobile Taobao for click-through rate prediction, and conduct online A/B testing. Both offline and online results demonstrate that MPDA outperforms the baselines of cloud-based learning and on-device training over local data only, across multiple offline and online metrics.
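The selection principle described above can be illustrated with a minimal sketch. All names and the toy threshold model below are hypothetical, not from the paper: each candidate source dataset from the cloud pool is kept only if a model trained on it reaches a minimum accuracy on the user's local (target) data, and the retained sources then augment local training.

```python
# Hypothetical sketch of MPDA's source-selection principle.
# Data points are (feature, label) pairs with a single 1-D feature.

def train_threshold(data):
    """Toy 'model': midpoint between per-class means, plus direction."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    pos_mean = sum(pos) / len(pos)
    neg_mean = sum(neg) / len(neg)
    # (threshold, whether the positive class lies above the threshold)
    return ((pos_mean + neg_mean) / 2.0, pos_mean > neg_mean)

def accuracy(model, data):
    """Fraction of examples the threshold model classifies correctly."""
    threshold, positive_above = model
    correct = 0
    for x, y in data:
        pred = 1 if (x > threshold) == positive_above else 0
        correct += (pred == y)
    return correct / len(data)

def select_sources(local_data, source_pool, min_acc=0.7):
    """Keep only source datasets whose model generalizes to local data."""
    selected = []
    for source in source_pool:
        model = train_threshold(source)
        if accuracy(model, local_data) >= min_acc:
            selected.append(source)
    return selected

# Local (target) data: labels separable around feature value 0.5.
local = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
# Cloud pool: one similar source, one with inverted label structure.
similar = [(0.15, 0), (0.25, 0), (0.75, 1), (0.85, 1)]
dissimilar = [(0.1, 1), (0.2, 1), (0.8, 0), (0.9, 0)]

kept = select_sources(local, [similar, dissimilar])
# Only the similar source survives; it augments the local target data.
augmented = local + [ex for src in kept for ex in src]
```

Here `similar` is retained because its model also separates the local data, while `dissimilar` is rejected; the real framework applies the same generalization test with production-scale models and data.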
Keywords
model personalization, device-cloud collaborative learning, domain adaptation, recommender systems