I am 饱满音响, a blogger at 靠谱客. I came across this article, [Spark] Training with Implicit Preference (Recommendation), during recent development work and am sharing it here as a reference.

Overview

Training with Implicit Preference (Recommendation)

There are two types of user preferences:

  • explicit preference (also referred to as "explicit feedback"), such as a "rating" given to an item by a user.
  • implicit preference (also referred to as "implicit feedback"), such as "view" and "buy" history.

MLlib ALS provides the setImplicitPrefs() function to set whether to use implicit preference. The ALS algorithm takes RDD[Rating] as training data input. The Rating class is defined in the Spark MLlib library as:

    case class Rating(user: Int, product: Int, rating: Double)

By default, the recommendation template sets setImplicitPrefs() to false, which expects explicit rating values that the user has given to the item.

To handle implicit preference, you can set setImplicitPrefs() to true. In this case, the "rating" value input to ALS is used to calculate the confidence level that the user likes the item. A higher "rating" is a stronger indication that the user likes the item.
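To make the rating-to-confidence mapping concrete, here is a minimal sketch of the implicit-feedback formulation (Hu, Koren, and Volinsky) that Spark MLlib's implicit ALS is based on. It is written in Python purely for illustration; `to_implicit` is a hypothetical helper, not a Spark API:

```python
# Illustrative only: to_implicit is a hypothetical helper, not a Spark API.
def to_implicit(rating, alpha=1.0):
    """Map an implicit 'rating' (event strength, e.g. a view count)
    to the (preference, confidence) pair used by implicit ALS."""
    preference = 1.0 if rating > 0 else 0.0  # did the user interact at all?
    confidence = 1.0 + alpha * rating        # higher rating -> more confidence
    return preference, confidence

# Five views and one view imply the same binary preference,
# but five views carry a higher confidence.
print(to_implicit(5.0))  # (1.0, 6.0)
print(to_implicit(1.0))  # (1.0, 2.0)
```

In other words, the "rating" is not treated as a score the model must reproduce, but as a weight on how strongly the observed interaction is trusted.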

The following provides an example of using implicit preference. You can find the complete modified source code here.

Training with view events

For example, the more times a user has viewed an item, the higher the confidence that the user likes the item. We can aggregate the number of views and use it as the "rating" value.
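The aggregation boils down to emitting a value of 1 per view event keyed by (user, item), then summing per key. A minimal Python sketch with hypothetical event tuples (the actual template code is the Scala that follows):

```python
from collections import defaultdict

# Hypothetical view events: (user_id, item_id) pairs, one per "view".
events = [("u1", "i1"), ("u1", "i1"), ("u1", "i2"), ("u2", "i1"),
          ("u1", "i1")]

# Sum a rating value of 1.0 per view, keyed by (user, item) --
# the same shape as the reduceByKey step in DataSource.scala.
ratings = defaultdict(float)
for user, item in events:
    ratings[(user, item)] += 1.0

print(dict(ratings))
# {('u1', 'i1'): 3.0, ('u1', 'i2'): 1.0, ('u2', 'i1'): 1.0}
```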

First, we can modify DataSource.scala to aggregate the number of views of the user on the same item:


  // This method takes a SparkContext and returns the ratings.
  def getRatings(sc: SparkContext): RDD[Rating] = {

    val eventsRDD: RDD[Event] = PEventStore.find(
      appName = dsp.appName,
      entityType = Some("user"),
      eventNames = Some(List("view")), // MODIFIED
      // targetEntityType is optional field of an event.
      targetEntityType = Some(Some("item")))(sc)

    val ratingsRDD: RDD[Rating] = eventsRDD.map { event =>
      try {
        val ratingValue: Double = event.event match {
          case "view" => 1.0 // MODIFIED
          case _ => throw new Exception(s"Unexpected event ${event} is read.")
        }
        // MODIFIED
        // key is (user id, item id)
        // value is the rating value, which is 1.
        ((event.entityId, event.targetEntityId.get), ratingValue)
      } catch {
        case e: Exception => {
          logger.error(s"Cannot convert ${event} to Rating. Exception: ${e}.")
          throw e
        }
      }
    }
    // MODIFIED
    // sum all values for the same user id and item id key
    .reduceByKey { case (a, b) => a + b }
    .map { case ((uid, iid), r) =>
      Rating(uid, iid, r)
    }.cache()

    ratingsRDD
  }

  override
  def readTraining(sc: SparkContext): TrainingData = {
    new TrainingData(getRatings(sc))
  }

You may put the view count aggregation logic in ALSAlgorithm's train() instead, depending on your needs.

Then, we can modify ALSAlgorithm.scala to set setImplicitPrefs to true:


class ALSAlgorithm(val ap: ALSAlgorithmParams)
  extends PAlgorithm[PreparedData, ALSModel, Query, PredictedResult] {

  ...

  def train(sc: SparkContext, data: PreparedData): ALSModel = {

    ...

    // If you only have one type of implicit event (Eg. "view" event only),
    // set implicitPrefs to true
    // MODIFIED
    val implicitPrefs = true
    val als = new ALS()
    als.setUserBlocks(-1)
    als.setProductBlocks(-1)
    als.setRank(ap.rank)
    als.setIterations(ap.numIterations)
    als.setLambda(ap.lambda)
    als.setImplicitPrefs(implicitPrefs)
    als.setAlpha(1.0)
    als.setSeed(seed)
    als.setCheckpointInterval(10)
    val m = als.run(mllibRatings)

    new ALSModel(
      rank = m.rank,
      userFeatures = m.userFeatures,
      productFeatures = m.productFeatures,
      userStringIntMap = userStringIntMap,
      itemStringIntMap = itemStringIntMap)
  }

  ...

}

 

Now the recommendation engine can train a model with implicit preference events.

Next: Filter Recommended Items by Blacklist in Query

https://predictionio.apache.org/templates/recommendation/training-with-implicit-preference/

 

If the rating matrix is derived from another source of information (e.g., it is inferred from other signals), you can use the trainImplicit method to get better results:

val alpha = 0.01
val lambda = 0.01
val model = ALS.trainImplicit(ratings, rank, numIterations, lambda, alpha)

alpha - a constant used for computing confidence in implicit ALS (default 1.0)
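As a rough illustration of what alpha controls, using the 1 + alpha * r confidence form of implicit ALS: a small alpha keeps confidence nearly flat regardless of activity, while a large alpha makes repeated interactions count heavily. A Python sketch with illustrative alpha values (not a Spark API):

```python
# Confidence in implicit ALS grows as 1 + alpha * r, where r is the
# input "rating" (e.g. an aggregated view count).
def confidence(r, alpha):
    return 1.0 + alpha * r

# Illustrative alpha values only; Spark's default is 1.0.
for alpha in (0.01, 1.0, 40.0):
    print(alpha, [confidence(r, alpha) for r in (1, 5, 20)])
```

Tuning alpha (along with rank and lambda) against a held-out set is the usual way to pick a value for a given event source.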

https://spark.apache.org/docs/1.6.1/mllib-collaborative-filtering.html
