DPMF

Title: Differentially Private Matrix Factorization

Authors: Jingyu Hua, Chang Xia, Sheng Zhong

Institution: State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, China

Conference: International Joint Conference on Artificial Intelligence (IJCAI)

Year: 2015

Topic

Differential privacy; Matrix factorization

Motivation

Recommendation systems (RSes) provide users with personalized recommendations of content and services. To offer useful recommendations, an RS usually requires users to supply their personal preferences for various items, which raises serious privacy concerns.

Is it possible to build a recommendation system without the recommender learning the users’ ratings of items?

Approach

A differentially private MF mechanism guarantees that the execution of the learning algorithm exposes only the item profile matrix $V$ to the untrusted recommender, but never any information about users' ratings $r_{ij}$ or even their user profiles $u_i$.

Scenario 1:

The recommender is trusted.

Hypothesis: The recommender has collected the ratings of all users and wants to learn and publish the final item profile matrix $V$ satisfying $\epsilon$-differential privacy.

Idea: Achieve differential privacy by randomly perturbing the objective function instead of perturbing the output of the learning algorithm.

  • The recommender first performs the original RLSM (regularized least squares minimization) on the raw ratings to obtain the user profile matrix $U$.

  • The recommender then uses the obtained $U$ as a constant profile matrix while minimizing the noise-perturbed objective over the item profiles $V$ (a sketch follows this list).
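
A minimal numpy sketch of this two-step idea, assuming the standard MF objective $\sum_{(i,j)\in\Omega}(r_{ij}-u_i^\top v_j)^2+\lambda(\|U\|_F^2+\|V\|_F^2)$. The function name, the Laplace noise, and its scale are placeholders of mine; the paper derives the exact noise distribution required for $\epsilon$-differential privacy.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_dpmf_trusted(R, mask, d=10, lam=0.1, noise_scale=0.1,
                     lr=0.01, iters=200):
    """Scenario 1 sketch: learn U on the raw ratings, then fix U and
    minimize a perturbed objective over V only (objective perturbation)."""
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, d))
    V = 0.1 * rng.standard_normal((n_items, d))

    # Step 1: the original RLSM via alternating gradient steps
    # (constant factors are folded into the learning rate).
    for _ in range(iters):
        E = mask * (U @ V.T - R)          # errors on observed ratings only
        U -= lr * (E @ V + lam * U)
        V -= lr * (E.T @ U + lam * V)

    # Step 2: fix U, add a linear noise term eta_j^T v_j per item j to the
    # objective, and re-minimize over V.  Laplace is a placeholder here.
    eta = rng.laplace(scale=noise_scale, size=(n_items, d))
    V = 0.1 * rng.standard_normal((n_items, d))
    for _ in range(iters):
        E = mask * (U @ V.T - R)
        V -= lr * (E.T @ U + lam * V + eta)  # gradient of perturbed objective
    return V                                  # only V is published

# Toy usage: 50 users, 40 items, ~30% of the 5-star ratings observed.
R = rng.integers(1, 6, size=(50, 40)).astype(float)
mask = (rng.random(R.shape) < 0.3).astype(float)
V_public = fit_dpmf_trusted(R, mask)
```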

Scenario 2:

The recommender is untrusted but the online users are static.

Challenge:

  • How can the users independently select their local noise components such that the resulting sum follows the required distribution? (See the sketch after this list.)

  • How can difference attacks by the recommender be resisted?
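
One standard way to answer the first challenge is the infinite divisibility of the Laplace distribution: an $\mathrm{Exp}(b)$ variable is the sum of $n$ i.i.d. $\mathrm{Gamma}(1/n, b)$ variables, and a Laplace variable is the difference of two exponentials, so $n$ users can each draw a share locally and the shares sum to exact Laplace noise. Whether the paper's target distribution is Laplace is an assumption here; the sketch only illustrates the decomposition trick.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_noise_share(n_users, b, size):
    """One user's locally drawn share.  Summed over n_users users, the
    shares follow Laplace(0, b): Exp(b) is the sum of n_users i.i.d.
    Gamma(1/n_users, b) variables, and Laplace = Exp - Exp."""
    return (rng.gamma(1.0 / n_users, b, size)
            - rng.gamma(1.0 / n_users, b, size))

# Empirical check: the aggregated noise has Laplace variance 2*b^2.
n, b = 20, 0.5
total = sum(local_noise_share(n, b, 100_000) for _ in range(n))
print(np.var(total), 2 * b**2)  # both approximately 0.5
```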

Idea:

  • The recommender picks a random number vector, of which each element is drawn i.i.d., and sends it to every user in the online group. Each user $i$ independently selects its own random vector, of which each element is also drawn i.i.d., and computes its local noise share from the two vectors.

  • When user $i$ requests the current item profiles at the beginning of the $t$-th iteration, the server generates a random noise vector with i.i.d. elements and returns it together with the item profiles to user $i$.

  • User $i$ computes its local gradient contribution from its own ratings and profile, masks it with its noise share, and forwards the result to the third party.

  • The third party aggregates the results from all users in the group and computes their sum. The result is forwarded to the recommender.

  • The recommender removes the randomness it knows to recover the noisy aggregated gradient, and uses it to update the item profiles (a toy sketch of this masked-aggregation round follows this list).
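
A toy, non-cryptographic numpy illustration of this round, assuming the simplest possible masking (the paper's actual vectors and distributions are not reproduced): the third party sees only masked uploads, and the recommender sees only the noisy aggregate.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, d = 5, 8

# Each user holds a private gradient contribution g_i and a local noise
# share eta_i (e.g., produced by the decomposition sketched earlier).
g = rng.standard_normal((n_users, d))
eta = rng.laplace(scale=0.1, size=(n_users, d))

# Recommender: one secret random mask per user (unknown to the third party).
masks = rng.standard_normal((n_users, d))

# Users: upload masked, noised contributions, so the third party never
# sees any g_i in the clear.
uploads = g + eta + masks

# Third party: forwards only the aggregate.
aggregate = uploads.sum(axis=0)

# Recommender: removes the masks; it learns only the noisy gradient sum,
# which it then uses to update the item profiles.
noisy_grad_sum = aggregate - masks.sum(axis=0)
print(np.allclose(noisy_grad_sum, (g + eta).sum(axis=0)))  # True
```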

Scenario 3:

The recommender is untrusted and the online users are dynamic.

Challenge: How can different groups of online users produce random noise variables with the same sum across iterations?

Idea:

  • The recommender randomly picks a group of online users and sends each of them a random vector whose elements are i.i.d.

  • Each picked user draws its local noise vector the same way as before, masks its gradient contribution with the received vector and the noise, and forwards the result to the third party.

  • The third party aggregates the results from the users and adds another random vector of its own, whose elements are also i.i.d. The result is sent to the recommender.

  • The recommender derives the aggregate noise by removing each random vector it generated, and keeps this aggregate secret.

  • In a later iteration, the recommender randomly divides the aggregate noise into as many pieces as there are users in the current group, and sends every user in this group one piece.

  • Each user uses the received piece in place of a freshly drawn local noise vector.

  • The third party subtracts its own random vector from the aggregate before uploading (a toy illustration of this noise reuse follows this list).
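
A toy numpy illustration of the reuse idea, assuming the goal is simply that the aggregate noise injected in every iteration equals the same secret vector $H$ (my notation) no matter which users are online:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8

# Iteration 1: the first online group's local noise shares sum to H,
# which the recommender derives and keeps secret.
group1_noise = rng.laplace(scale=0.1, size=(6, d))
H = group1_noise.sum(axis=0)

# Iteration 2: a different group of k users is online.  The recommender
# splits H into k random pieces that sum exactly to H and sends one
# piece to each user, who uses it instead of fresh noise.
k = 4
pieces = rng.standard_normal((k - 1, d))
pieces = np.vstack([pieces, H - pieces.sum(axis=0)])

# The aggregate noise in iteration 2 is therefore again exactly H,
# so differencing two iterations cannot cancel it out.
print(np.allclose(pieces.sum(axis=0), H))  # True
```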

Contribution

  • Figure out the required distribution of the noise component for objective perturbation in MF;

  • Decompose the noise component for objective perturbation into small pieces that can be determined locally and independently by users;

  • A third-party-based mechanism to reduce the noise added in each iteration.

Performance

Dataset:

  • Netflix dataset (ratings from 191,668 users on 100 movies, 5-star scale)

  • MovieLens 100k (ratings from 943 users on 1,682 movies, 5-star scale)

Baseline:

PMF (probabilistic matrix factorization)

Metric:

Accuracy

  • The cumulative distribution function (CDF) of prediction errors

  • The mean increase in prediction errors compared with the baseline

Communication overhead
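
A short helper for the first accuracy metric, assuming `R_hat` and `R_true` hold predicted and held-out ratings (names are mine):

```python
import numpy as np

def prediction_error_cdf(R_hat, R_true):
    """Empirical CDF of absolute prediction errors on held-out ratings:
    returns sorted errors e and cdf with P(|error| <= e[k]) = cdf[k]."""
    errors = np.sort(np.abs(R_hat - R_true).ravel())
    cdf = np.arange(1, errors.size + 1) / errors.size
    return errors, cdf
```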