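Overview
I had long been guessing at how RFM is actually implemented, and today I finally understand a little of it.
Here is the R code, found via Google: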
##Creating Random Sales Data of the format CustomerId (unique to each customer), Sales.Date,Purchase.Value
sales=data.frame(sample(1000:1999,replace=T,size=10000),abs(round(rnorm(10000,28,13))))
names(sales)=c("CustomerId","Sales Value")
sales.dates <- as.Date("2012/1/1") + 700*sort(stats::runif(10000))
#generating random dates
sales=cbind(sales,sales.dates)
str(sales)
sales$recency=round(as.numeric(difftime(Sys.Date(),sales[,3],units="days")) )
##if you have existing sales data you need to just shape it in this format
##Renaming Variable Names with base R, so no extra package (such as gdata, which provides rename.vars) is needed
names(sales)[names(sales)=="Sales Value"]="Purchase.Value"
## Creating Total Sales(Monetization),Frequency, Last Purchase date for each customer
salesM=aggregate(sales[,2],list(sales$CustomerId),sum)
names(salesM)=c("CustomerId","Monetization")
salesF=aggregate(sales[,2],list(sales$CustomerId),length)
names(salesF)=c("CustomerId","Frequency")
salesR=aggregate(sales[,4],list(sales$CustomerId),min)
names(salesR)=c("CustomerId","Recency")
##Merging R,F,M
test1=merge(salesF,salesR,"CustomerId")
salesRFM=merge(salesM,test1,"CustomerId")
##Creating R,F,M levels
salesRFM$rankR=cut(salesRFM$Recency, 5,labels=FALSE) #rankR 1 is very recent while rankR 5 is least recent
salesRFM$rankF=cut(salesRFM$Frequency, 5,labels=FALSE)#rankF 1 is least frequent while rankF 5 is most frequent
salesRFM$rankM=cut(salesRFM$Monetization, 5,labels=FALSE)#rankM 1 is lowest sales while rankM 5 is highest sales
##Looking at RFM tables
table(salesRFM[,5:6])
table(salesRFM[,6:7])
table(salesRFM[,5:7])
Note: you can also use the quantile function to compute the breaks passed to cut. The bins then hold roughly equal numbers of customers instead of covering equal-width intervals. There are also other methods for finding category breaks.
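To make the difference between the two binning approaches concrete, here is a minimal sketch. It does not use the sales data above; `monetization` is a hypothetical, randomly generated vector of per-customer totals, and the names `rank_width` and `rank_quantile` are made up for this illustration.

```r
# Hypothetical per-customer totals, right-skewed like typical spend data.
set.seed(42)
monetization <- rexp(1000, rate = 1 / 300)

# Equal-width bins: cut() splits the observed range into 5 intervals of equal length.
rank_width <- cut(monetization, 5, labels = FALSE)

# Equal-frequency bins: use the quintile boundaries as breaks.
# include.lowest = TRUE keeps the minimum value inside the first bin.
breaks <- quantile(monetization, probs = seq(0, 1, by = 0.2))
rank_quantile <- cut(monetization, breaks = breaks, labels = FALSE, include.lowest = TRUE)

# Compare how many customers land in each rank under the two schemes.
table(rank_width)
table(rank_quantile)
```

With skewed data, the equal-width version puts most customers into the lowest ranks, while the quantile version spreads them evenly. One caveat: for values with many ties (Frequency is a likely case), some quantile breaks can coincide and cut will stop with an error saying the breaks are not unique; wrapping the breaks in unique() is one way around that, at the cost of fewer bins.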