DPMF

Title: Differentially Private Matrix Factorization

Authors: Jingyu Hua, Chang Xia, Sheng Zhong

Institution: State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, China

Conference: International Joint Conference on Artificial Intelligence (IJCAI)

Year: 2015

Topic

Differential privacy; Matrix factorization

Motivation

Recommendation systems (RSes) provide users with personalized recommendations of content and services. To offer useful recommendations, an RS usually requires users to supply their personal preferences for various items, which raises serious privacy concerns.

Is it possible to build a recommendation system without the recommender learning the users’ ratings of items?

Approach

A differentially private MF mechanism guarantees that the execution of the learning algorithm exposes only the item profile matrix $V$ to the untrusted recommender, but never any information about users' ratings $r_{ij}$ or even their user profiles $u_i$.

Scenario 1:

The recommender is trusted.

Hypothesis: The recommender has collected the ratings of all users and wants to learn and publish the final item profile matrix $V$ satisfying $\epsilon$-differential privacy.

Idea: Achieve differential privacy by randomly perturbing the objective function instead of perturbing the output of the learning algorithm.

  • The recommender first performs the original RLSM (regularized least squares minimization) on the raw ratings to obtain the user profile matrix $U$.

  • The recommender then uses the obtained $U$ as a constant profile matrix while minimizing the noise-perturbed objective over the item profiles $V$ (a sketch follows this list).
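
A minimal numpy sketch of this two-step idea, assuming the standard MF objective $\sum_{(i,j)\in\Omega}(r_{ij}-u_i^\top v_j)^2+\lambda(\|U\|_F^2+\|V\|_F^2)$. The function name, the Laplace noise, and its scale are placeholders of mine; the paper derives the exact noise distribution required for $\epsilon$-differential privacy.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_dpmf_trusted(R, mask, d=10, lam=0.1, noise_scale=0.1,
                     lr=0.01, iters=200):
    """Scenario 1 sketch: learn U on the raw ratings, then fix U and
    minimize a perturbed objective over V only (objective perturbation)."""
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, d))
    V = 0.1 * rng.standard_normal((n_items, d))

    # Step 1: the original RLSM via alternating gradient steps
    # (constant factors are folded into the learning rate).
    for _ in range(iters):
        E = mask * (U @ V.T - R)          # errors on observed ratings only
        U -= lr * (E @ V + lam * U)
        V -= lr * (E.T @ U + lam * V)

    # Step 2: fix U, add a linear noise term eta_j^T v_j per item j to the
    # objective, and re-minimize over V.  Laplace is a placeholder here.
    eta = rng.laplace(scale=noise_scale, size=(n_items, d))
    V = 0.1 * rng.standard_normal((n_items, d))
    for _ in range(iters):
        E = mask * (U @ V.T - R)
        V -= lr * (E.T @ U + lam * V + eta)  # gradient of perturbed objective
    return V                                  # only V is published

# Toy usage: 50 users, 40 items, ~30% of the 5-star ratings observed.
R = rng.integers(1, 6, size=(50, 40)).astype(float)
mask = (rng.random(R.shape) < 0.3).astype(float)
V_public = fit_dpmf_trusted(R, mask)
```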

Scenario 2:

The recommender is untrusted but the online users are static.

Challenge:

  • How can the users independently select their local noise components such that the resulting sum follows the required distribution? (See the sketch after this list.)

  • How can difference attacks by the recommender be resisted?
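
One standard way to answer the first challenge is the infinite divisibility of the Laplace distribution: an $\mathrm{Exp}(b)$ variable is the sum of $n$ i.i.d. $\mathrm{Gamma}(1/n, b)$ variables, and a Laplace variable is the difference of two exponentials, so $n$ users can each draw a share locally and the shares sum to exact Laplace noise. Whether the paper's target distribution is Laplace is an assumption here; the sketch only illustrates the decomposition trick.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_noise_share(n_users, b, size):
    """One user's locally drawn share.  Summed over n_users users, the
    shares follow Laplace(0, b): Exp(b) is the sum of n_users i.i.d.
    Gamma(1/n_users, b) variables, and Laplace = Exp - Exp."""
    return (rng.gamma(1.0 / n_users, b, size)
            - rng.gamma(1.0 / n_users, b, size))

# Empirical check: the aggregated noise has Laplace variance 2*b^2.
n, b = 20, 0.5
total = sum(local_noise_share(n, b, 100_000) for _ in range(n))
print(np.var(total), 2 * b**2)  # both approximately 0.5
```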

Idea:

  • The recommender picks a random number vector, of which each element is drawn i.i.d., and sends it to every user in the online group. Each user $i$ independently selects its own random vector, of which each element is also drawn i.i.d., and computes its local noise share from the two vectors.

  • When user $i$ requests the current item profiles at the beginning of the $t$-th iteration, the server generates a random noise vector with i.i.d. elements and returns it together with the item profiles to user $i$.

  • User $i$ computes its local gradient contribution from its own ratings and profile, masks it with its noise share, and forwards the result to the third party.

  • The third party aggregates the results from all users in the group and computes their sum. The result is forwarded to the recommender.

  • The recommender removes the randomness it knows to recover the noisy aggregated gradient, and uses it to update the item profiles (a toy sketch of this masked-aggregation round follows this list).
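
A toy, non-cryptographic numpy illustration of this round, assuming the simplest possible masking (the paper's actual vectors and distributions are not reproduced): the third party sees only masked uploads, and the recommender sees only the noisy aggregate.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, d = 5, 8

# Each user holds a private gradient contribution g_i and a local noise
# share eta_i (e.g., produced by the decomposition sketched earlier).
g = rng.standard_normal((n_users, d))
eta = rng.laplace(scale=0.1, size=(n_users, d))

# Recommender: one secret random mask per user (unknown to the third party).
masks = rng.standard_normal((n_users, d))

# Users: upload masked, noised contributions, so the third party never
# sees any g_i in the clear.
uploads = g + eta + masks

# Third party: forwards only the aggregate.
aggregate = uploads.sum(axis=0)

# Recommender: removes the masks; it learns only the noisy gradient sum,
# which it then uses to update the item profiles.
noisy_grad_sum = aggregate - masks.sum(axis=0)
print(np.allclose(noisy_grad_sum, (g + eta).sum(axis=0)))  # True
```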

Scenario 3:

The recommender is untrusted and the online users are dynamic.

Challenge: How can different groups of online users produce random noise variables with the same sum across iterations?

Idea:

  • The recommender randomly picks a group of online users and sends each of them a random vector whose elements are i.i.d.

  • Each picked user draws its local noise vector the same way as before, masks its gradient contribution with the received vector and the noise, and forwards the result to the third party.

  • The third party aggregates the results from the users and adds another random vector of its own, whose elements are also i.i.d. The result is sent to the recommender.

  • The recommender derives the aggregate noise by removing each random vector it generated, and keeps this aggregate secret.

  • In a later iteration, the recommender randomly divides the aggregate noise into as many pieces as there are users in the current group, and sends every user in this group one piece.

  • Each user uses the received piece in place of a freshly drawn local noise vector.

  • The third party subtracts its own random vector from the aggregate before uploading (a toy illustration of this noise reuse follows this list).
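
A toy numpy illustration of the reuse idea, assuming the goal is simply that the aggregate noise injected in every iteration equals the same secret vector $H$ (my notation) no matter which users are online:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8

# Iteration 1: the first online group's local noise shares sum to H,
# which the recommender derives and keeps secret.
group1_noise = rng.laplace(scale=0.1, size=(6, d))
H = group1_noise.sum(axis=0)

# Iteration 2: a different group of k users is online.  The recommender
# splits H into k random pieces that sum exactly to H and sends one
# piece to each user, who uses it instead of fresh noise.
k = 4
pieces = rng.standard_normal((k - 1, d))
pieces = np.vstack([pieces, H - pieces.sum(axis=0)])

# The aggregate noise in iteration 2 is therefore again exactly H,
# so differencing two iterations cannot cancel it out.
print(np.allclose(pieces.sum(axis=0), H))  # True
```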

Contribution

  • Figure out the required distribution of the noise component for objective perturbation in MF;

  • Decompose the noise component for objective perturbation into small pieces that can be determined locally and independently by users;

  • A third-party-based mechanism to reduce the noise added in each iteration.

Performance

Dataset:

  • Netflix dataset (ratings from 191,668 users on 100 movies, 5-star scale)

  • MovieLens 100k (ratings from 943 users on 1,682 movies, 5-star scale)

Baseline:

PMF (probabilistic matrix factorization)

Metric:

Accuracy

  • The cumulative distribution function (CDF) of prediction errors

  • The mean increase in prediction errors compared with the baseline

Communication overhead
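
A short helper for the first accuracy metric, assuming `R_hat` and `R_true` hold predicted and held-out ratings (names are mine):

```python
import numpy as np

def prediction_error_cdf(R_hat, R_true):
    """Empirical CDF of absolute prediction errors on held-out ratings:
    returns sorted errors e and cdf with P(|error| <= e[k]) = cdf[k]."""
    errors = np.sort(np.abs(R_hat - R_true).ravel())
    cdf = np.arange(1, errors.size + 1) / errors.size
    return errors, cdf
```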