The DeepFM algorithm:
Paper: DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, 2017
https://arxiv.org/abs/1703.04247
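- FM (Factorization Machine, covered in an earlier post) can model feature interactions, but higher-order interactions are computationally expensive, so in practice only 2nd-order interactions are used
- How can a model capture both low-order (1st- and 2nd-order) and high-order feature interactions? => DeepFM = FM + DNN
- The model is trained end-to-end => no manual feature engineering is needed beyond the raw features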
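To make the FM + DNN combination concrete, here is a minimal NumPy sketch of the FM 2nd-order term and of how the two components are merged. It assumes the standard FM formulation, in which each feature i has a latent vector V[i], and the DeepFM output sigmoid(y_FM + y_DNN) from the paper. The function names are illustrative only and are not part of any library.

```python
import numpy as np

def fm_second_order(x, V):
    """Sum over i<j of <V[i], V[j]> * x[i] * x[j], computed in O(n*k)."""
    xv = x @ V                    # shape (k,): sum_i x_i * V[i]
    x2v2 = (x ** 2) @ (V ** 2)    # shape (k,): sum_i x_i^2 * V[i]^2
    return 0.5 * float(np.sum(xv ** 2 - x2v2))

def fm_second_order_naive(x, V):
    """Same quantity with an explicit double loop, O(n^2 * k)."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            total += float(V[i] @ V[j]) * x[i] * x[j]
    return total

def deepfm_output(y_fm, y_dnn):
    """Sigmoid of the summed FM and DNN logits."""
    return 1.0 / (1.0 + np.exp(-(y_fm + y_dnn)))

# Synthetic input: 6 features, latent dimension 4 (both sizes are arbitrary).
rng = np.random.default_rng(0)
x = rng.random(6)
V = rng.normal(size=(6, 4))

fast = fm_second_order(x, V)
slow = fm_second_order_naive(x, V)
print(abs(fast - slow) < 1e-9)
```

The rewritten sum is what keeps 2nd-order FM cheap. No comparable shortcut is used for higher orders, which is why DeepFM leaves high-order interactions to the DNN.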
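It performs well both on benchmarks and in production:
Criteo click-through rate prediction: 45 million user click records; 90% of the samples are used for training and 10% for testing
Company* game center: 1 billion records; the click records of 7 consecutive days are used for training and the following day for testing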
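The second dataset is split by time rather than at random. The sketch below shows one way such a split could be written. It assumes each record carries an integer day index, and both the helper `temporal_split` and the toy data are hypothetical. It is not the authors' preprocessing code.

```python
import numpy as np

def temporal_split(days, train_days=7, test_days=1):
    """Boolean masks: the first `train_days` days train, the next `test_days` test."""
    days = np.asarray(days)
    first = days.min()
    train_end = first + train_days
    train_mask = (days >= first) & (days < train_end)
    test_mask = (days >= train_end) & (days < train_end + test_days)
    return train_mask, test_mask

# Synthetic day index for 12 click records (made up for illustration).
days = np.array([0, 0, 1, 2, 3, 4, 5, 6, 6, 7, 7, 8])
train_mask, test_mask = temporal_split(days)
print(int(train_mask.sum()), int(test_mask.sum()))
```

Splitting by time keeps every test record later than every training record, which is closer to how a CTR model is used after deployment.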
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from deepctr.models import DeepFM
from deepctr.feature_column import SparseFeat, get_feature_names

# Load the data
data = pd.read_csv("movielens_sample.txt")
sparse_features = ["movie_id", "user_id", "gender", "age", "occupation", "zip"]
target = ['rating']

# Label-encode each sparse feature into integer ids
for feature in sparse_features:
    lbe = LabelEncoder()
    data[feature] = lbe.fit_transform(data[feature])

# Use the number of distinct values of each feature as its vocabulary size
fixlen_feature_columns = [SparseFeat(feature, data[feature].nunique()) for feature in sparse_features]
linear_feature_columns = fixlen_feature_columns
dnn_feature_columns = fixlen_feature_columns
feature_names = get_feature_names(linear_feature_columns + dnn_feature_columns)

# Split the dataset into training and test sets
train, test = train_test_split(data, test_size=0.2)
train_model_input = {name: train[name].values for name in feature_names}
test_model_input = {name: test[name].values for name in feature_names}

# Train DeepFM (regression, because the target is a rating)
model = DeepFM(linear_feature_columns, dnn_feature_columns, task='regression')
model.compile("adam", "mse", metrics=['mse'])
history = model.fit(train_model_input, train[target].values,
                    batch_size=256, epochs=1, verbose=True, validation_split=0.2)

# Predict on the test set
pred_ans = model.predict(test_model_input, batch_size=256)

# Report the test RMSE (take the square root before rounding)
mse = mean_squared_error(test[target].values, pred_ans)
rmse = mse ** 0.5
print("test RMSE", round(rmse, 4))
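The script label-encodes every sparse feature and passes the number of distinct values as the vocabulary size. The NumPy sketch below illustrates what those integer ids are used for in the DeepFM paper: each id selects a row of an embedding table, and the same embeddings feed both the FM part and the DNN part. The helper `label_encode`, the toy values and the embedding size are assumptions for illustration. This is not DeepCTR's internal implementation.

```python
import numpy as np

def label_encode(values):
    """Map each distinct value to an integer id in [0, vocabulary_size)."""
    vocab = {v: i for i, v in enumerate(sorted(set(values)))}
    return np.array([vocab[v] for v in values]), len(vocab)

rng = np.random.default_rng(0)
embedding_dim = 4  # assumed size, chosen only for this example

# Toy values for two fields (made up for illustration).
genders = ["F", "M", "M", "F"]
ages = [18, 25, 25, 35]

gender_ids, gender_vocab = label_encode(genders)
age_ids, age_vocab = label_encode(ages)

# One randomly initialised embedding table per field, one row per distinct value.
gender_table = rng.normal(size=(gender_vocab, embedding_dim))
age_table = rng.normal(size=(age_vocab, embedding_dim))

e_gender = gender_table[gender_ids]  # shape (4, embedding_dim)
e_age = age_table[age_ids]           # shape (4, embedding_dim)

# FM part: 2nd-order interaction as the inner product of the field embeddings.
fm_interaction = np.sum(e_gender * e_age, axis=1)

# DNN part: the same embeddings, concatenated, form the input of the hidden layers.
dnn_input = np.concatenate([e_gender, e_age], axis=1)

print(gender_ids.tolist(), age_ids.tolist(), dnn_input.shape)
```

Because the ids index directly into the table, the vocabulary size must be at least the largest id plus one. That is why the script counts distinct values only after label encoding.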