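Tencent Real-Time Recommendation in Practice
Reading notes on TencentRec: Real-time Stream Recommendation in Practice
Real-time recommendation in a big-data environment has to overcome three major challenges: big data, real-time processing, and accuracy.
Big data here means user data and business data; real-time processing is built on Storm; the algorithms are mainly item-based, content-based, and demographic, with
further innovations that combine real-time features with the needs of each business.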
Highlights
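1 Traditional recommender systems that analyze data and update models at regular time intervals, e.g., hours or days, cannot meet the real-time demands.
A user's real-time intent often reflects what the user actually needs, whereas offline computation is mostly prediction, and much of it is inaccurate. Traditional recommender systems cannot make fast responses to users' preference changes and capture the users' real-time interests, thus resulting in bad recommendation results. This matches my own experience.
2 Problems of a real-time recommender system: system performance, data sparsity and implicit feedback, and algorithm issues
3 Main work of the Tencent real-time recommender system:
Implementing the traditional item-based, content-based, and demographic algorithms in a big-data environment and applying them across Tencent's businesses;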
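4 System architecture
(1) Platform selection
Storm was chosen because it supports real-time computation, scales well, and has excellent fault tolerance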

(2) Data access interface

(3) Data storage

5 Algorithm design
For industrial use, the considerations are ease of use, accuracy, and ROI
(1) Item-based CF

Handle implicit feedback, update incrementally, and use pruning to reduce computation cost
There are various types of user behaviors in our scenario, including click, browse, purchase, share, comment, etc.
These implicit behaviors are converted into explicit ratings.
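As a minimal sketch of turning implicit behaviors into ratings, assume each action type carries a weight and a user's rating for an item is the highest weight among the actions observed. The weight values and the names `ACTION_WEIGHTS`, `implicit_rating`, and `co_rating` are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical action weights; the notes do not give the real values.
ACTION_WEIGHTS = {
    "browse": 1.0,
    "click": 2.0,
    "comment": 3.0,
    "share": 4.0,
    "purchase": 5.0,
}


def implicit_rating(actions):
    """Return the highest weight among a user's actions on one item.

    Unknown action types count as 0.0, and so does an empty action list.
    """
    return max((ACTION_WEIGHTS.get(a, 0.0) for a in actions), default=0.0)


def co_rating(rating_p, rating_q):
    """Co-rating of one user for an item pair: the smaller of the two ratings."""
    return min(rating_p, rating_q)
```

Taking the maximum keeps the rating monotone: a later, weaker action (a browse after a purchase) never lowers it, which is convenient for streaming updates.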

Incremental update

Update workflow
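The incremental idea can be sketched as follows, under the assumption that similarity is the sum of per-user co-ratings divided by the product of the square roots of each item's rating sum, and that a user's rating only ever increases. The class `IncrementalItemCF` and its field names are hypothetical; a production system would keep these counters in a distributed store rather than in memory.

```python
import math
from collections import defaultdict


class IncrementalItemCF:
    """Keeps running sums so similarities update per event, not per batch."""

    def __init__(self):
        self.user_ratings = defaultdict(dict)  # user -> {item: rating}
        self.item_count = defaultdict(float)   # item -> sum of ratings
        self.pair_count = defaultdict(float)   # (item_a, item_b) -> sum of co-ratings

    @staticmethod
    def _key(a, b):
        # Store each unordered pair under one canonical key.
        return (a, b) if a <= b else (b, a)

    def update(self, user, item, new_rating):
        """Apply one rating event; only the deltas are added to the sums."""
        ratings = self.user_ratings[user]
        old = ratings.get(item, 0.0)
        if new_rating <= old:
            return  # ratings are assumed to be non-decreasing
        self.item_count[item] += new_rating - old
        for other, r_other in ratings.items():
            if other == item:
                continue
            delta = min(new_rating, r_other) - min(old, r_other)
            if delta:
                self.pair_count[self._key(item, other)] += delta
        ratings[item] = new_rating

    def similarity(self, a, b):
        denom = math.sqrt(self.item_count.get(a, 0.0)) * math.sqrt(self.item_count.get(b, 0.0))
        if denom == 0.0:
            return 0.0
        return self.pair_count.get(self._key(a, b), 0.0) / denom
```

Each event touches only the items the same user has already rated, so the cost of an update is bounded by that user's history instead of the full item catalogue.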

We utilize the Hoeffding bound theory and develop a real-time pruning technique.
(2) Handling data sparsity
We develop two mechanisms to solve the data sparsity problem, including the demographic clustering and the demographic based complement.
(3) Real-time filtering mechanism
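A generic illustration of pruning with the Hoeffding bound: after n observations of a quantity with range R, the true mean lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the observed mean with probability at least 1 - delta. The sketch below assumes similarities lie in [0, 1] and prunes an item pair once its observed similarity plus epsilon is still below a threshold (for example, the lowest similarity in an item's current top-k list). The function names and the exact pruning rule are assumptions; the notes do not quote the paper's criterion.

```python
import math


def hoeffding_epsilon(n, delta, value_range=1.0):
    """Hoeffding bound: deviation of the true mean after n observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_prune(observed_sim, n, threshold, delta=0.05):
    """True if, with confidence 1 - delta, the pair cannot reach the threshold."""
    return observed_sim + hoeffding_epsilon(n, delta) < threshold
```

With few observations epsilon is large and nothing is pruned; as n grows, clearly dissimilar pairs fall below the threshold and can stop being updated, which is where the cost saving comes from.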
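One way to picture a demographic-based complement, as a rough sketch only: users are bucketed by coarse attributes, per-group item popularity is tracked, and a short collaborative-filtering result is padded with the group's popular items. The attributes (`gender`, `age`), the ten-year age buckets, and the class `DemographicComplement` are all hypothetical; the notes do not describe how the paper forms its groups.

```python
from collections import Counter, defaultdict


def demographic_group(profile):
    """Map a user profile to a coarse group key (hypothetical attributes)."""
    return (profile.get("gender", "unknown"), profile.get("age", 0) // 10)


class DemographicComplement:
    def __init__(self):
        # group key -> Counter of item -> number of interactions
        self.group_popularity = defaultdict(Counter)

    def record(self, profile, item):
        """Count one interaction of a user from this group with an item."""
        self.group_popularity[demographic_group(profile)][item] += 1

    def complement(self, profile, cf_items, k):
        """Pad the CF result up to k items with the group's popular items."""
        result = list(cf_items)[:k]
        seen = set(result)
        popular = self.group_popularity.get(demographic_group(profile), Counter())
        for item, _count in popular.most_common():
            if len(result) >= k:
                break
            if item not in seen:
                result.append(item)
                seen.add(item)
        return result
```

For a new or inactive user the CF list is empty, so the whole recommendation falls back to what similar users in the same group prefer.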
Method 1: use a time window and filter the data by session;
Method 2: use the most recent behaviors as recommendation seeds. Besides the sliding window mechanism, we propose a real-time personalized filtering technique to serve the individual users' real-time demands. For each user, we record the recent k items that he is interested in.
6 System architecture
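The two filtering methods can be combined in a small per-user structure, sketched here under the assumption that events arrive in timestamp order: a bounded deque keeps the recent k items, and a time window drops anything too old when seeds are requested. The class `RecentInterest` and the parameter `window_seconds` are illustrative names.

```python
from collections import deque


class RecentInterest:
    """Per-user record of the most recent k items, filtered by a time window."""

    def __init__(self, k, window_seconds):
        self.window_seconds = window_seconds
        # deque(maxlen=k) silently discards the oldest entry when full.
        self.events = deque(maxlen=k)

    def record(self, timestamp, item):
        """Store one interaction; timestamps are assumed to be non-decreasing."""
        self.events.append((timestamp, item))

    def seeds(self, now):
        """Items inside the window, oldest first, to seed recommendations."""
        return [item for ts, item in self.events if now - ts <= self.window_seconds]
```

For example, with `k=2` and a 60-second window, recording items at t=0, 30, and 50 leaves only the last two in memory, and a request at t=100 returns only the item seen at t=50.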

7 Applications
Tencent Video, Yixun, Tencent Literature, WeChat, Dianping, Tencent News, Qzone, etc.
References:
TencentRec: Real-time Stream Recommendation in Practice
Takeaways:
(1) Incrementally updated item-based CF, demographic-based methods, and pruning
(2) System performance