Anime-RecSys


The goal of this repo is to document the development of user-item recommendation systems and to compare different methods. The dataset used is the Anime Recommendation Database 2020 from Kaggle. All the models can be downloaded from the Drive.

There are two common ways of making recommendations: content-based filtering and collaborative filtering.

Right now we will focus only on collaborative filtering.

Predicting User-Anime Ratings


One way to recommend items is to predict what rating a user would give each item, then show the user the items they haven't seen that have the highest predicted ratings.

Matrix Factorization

Matrix factorization algorithms work by decomposing the user-item interaction matrix into the product of two lower-dimensional rectangular matrices. For example, we can try to learn a user matrix and an anime matrix such that their product approximates the anime rating matrix, where each row represents a user and each column an anime.

We are never actually going to materialize this rating matrix, but it is a helpful way to picture the problem.
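To make the decomposition concrete, here is a toy NumPy sketch (illustrative only, not from the repo) of approximating a rating matrix as the product of two low-rank matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, num_anime, embedding_dim = 4, 5, 2

U = rng.normal(size=(num_users, embedding_dim))  # one row per user
V = rng.normal(size=(num_anime, embedding_dim))  # one row per anime

R_hat = U @ V.T  # R_hat[u, i] is the predicted rating of user u for anime i
print(R_hat.shape)  # (4, 5)
```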

Making the `tf.keras.Model`

Instead of user and item matrices, we will use embeddings, which map each user and each anime to a vector. In addition, we'll add a bias to each user and anime.

```python
class MatrixFactorizationModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        super(MatrixFactorizationModel, self).__init__()
        self.embedding_dim = embedding_dim

        self.user_embeddings = tf.keras.layers.Embedding(num_users, embedding_dim)
        self.item_embeddings = tf.keras.layers.Embedding(num_items, embedding_dim)

        self.user_biases = tf.keras.layers.Embedding(num_users, 1)
        self.item_biases = tf.keras.layers.Embedding(num_items, 1)
        self.bias = tf.Variable(tf.zeros([1]))

        self.dropout = tf.keras.layers.Dropout(.5)

    def call(self, inputs, training=False):
        ...
```

It is good practice to place a dropout layer over the embedding layers to prevent overfitting and, in turn, make the learned features more robust.

To compute the predicted rating I use the formula `prediction = (user_embedding + user_bias) * (item_embedding + item_bias) + bias`.

```python
class MatrixFactorizationModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        ...

    def call(self, inputs, training=False):
        user_ids = inputs[:, 0]
        item_ids = inputs[:, 1]

        user_embedding = self.user_embeddings(user_ids) + self.user_biases(user_ids)
        item_embedding = self.item_embeddings(item_ids) + self.item_biases(item_ids)

        if training:
            user_embedding = self.dropout(user_embedding, training=training)
            item_embedding = self.dropout(item_embedding, training=training)

        user_embedding = tf.reshape(user_embedding, [-1, self.embedding_dim])
        item_embedding = tf.reshape(item_embedding, [-1, self.embedding_dim])

        dot = tf.keras.layers.Dot(axes=1)([user_embedding, item_embedding]) + self.bias
        return dot
```

To compile the model I used the Adam optimizer, MSE as the loss, and RMSE as a metric to keep track of. The model is run [eagerly](https://www.tensorflow.org/guide/eager) to allow this style of model definition.

```python
mf_model = MatrixFactorizationModel(num_users=num_users, num_items=num_anime, embedding_dim=64)

mf_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError("RMSE")],
    run_eagerly=True
)
```
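Training then only needs `(user_id, anime_id)` pairs and their ratings, matching `inputs[:, 0]` and `inputs[:, 1]` in `call`. A minimal sketch with toy arrays (names and values are illustrative, not from the repo):

```python
import numpy as np

X_train = np.array([[0, 10], [0, 42], [1, 10]], dtype=np.int64)  # (user_id, anime_id) pairs
y_train = np.array([8.0, 6.0, 9.0], dtype=np.float32)            # the corresponding ratings

mf_model.fit(X_train, y_train, batch_size=2, epochs=1)
```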

Neural Network

The neural network will try to learn user and item embeddings that model user-item interactions.

The model structure is quite similar to that of MatrixFactorization, however now we will concatenate the user and anime embeddings and pass them through two dense layers. The last layer has only one node, and its output represents the predicted rating.

Making the `tf.keras.Model`

Let's define the user and item embeddings and the two dense layers. I used the `relu` activation on the output layer because predictions are non-negative numbers.

```python
class NeuralNetworkModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        super(NeuralNetworkModel, self).__init__()
        self.embedding_dim = embedding_dim

        self.user_embeddings = tf.keras.layers.Embedding(num_users, embedding_dim)
        self.item_embeddings = tf.keras.layers.Embedding(num_items, embedding_dim)

        self.dense1 = tf.keras.layers.Dense(64, activation='relu')
        self.dense2 = tf.keras.layers.Dense(1, activation='relu')

        self.concat = tf.keras.layers.Concatenate()
        self.dropout = tf.keras.layers.Dropout(.5)

    def call(self, inputs, training=False):
        ...
```

We'll take the user and item embeddings, apply dropout to each, and concatenate them. Then we'll pass the result through the two dense layers.

```python
class NeuralNetworkModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        ...

    def call(self, inputs, training=False):
        user_ids = inputs[:, 0]
        item_ids = inputs[:, 1]

        user_embedding = self.user_embeddings(user_ids)
        item_embedding = self.item_embeddings(item_ids)

        if training:
            user_embedding = self.dropout(user_embedding, training=training)
            item_embedding = self.dropout(item_embedding, training=training)

        user_embedding = tf.reshape(user_embedding, [-1, self.embedding_dim])
        item_embedding = tf.reshape(item_embedding, [-1, self.embedding_dim])

        x = self.concat([user_embedding, item_embedding])
        x = self.dense1(x)
        x = self.dense2(x)
        return x
```

Similarly to `MatrixFactorization`, to compile the model I used the Adam optimizer, MSE as the loss, and RMSE as a metric to keep track of. The model is run [eagerly](https://www.tensorflow.org/guide/eager) to allow this style of model definition.

```python
nn_model = NeuralNetworkModel(num_users=num_users, num_items=num_anime, embedding_dim=64)

nn_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError("RMSE")],
    run_eagerly=True
)
```
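Once a model is trained, recommending works as described earlier: score the anime a user hasn't seen and surface the highest predictions. A minimal sketch (variable names are assumptions, and filtering out already-seen titles is omitted for brevity):

```python
import numpy as np

user_id = 0
candidate_ids = np.arange(num_anime)  # score every anime for this user

# Build (user_id, anime_id) pairs, the input format both models expect.
pairs = np.stack([np.full_like(candidate_ids, user_id), candidate_ids], axis=1)

scores = nn_model.predict(pairs).reshape(-1)            # predicted ratings
top10 = candidate_ids[np.argsort(scores)[::-1][:10]]    # anime ids with the highest predictions
```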

Predicting User-Anime Interactions


Being able to predict what rating a user will give an anime is nice. However, our goal is to recommend anime. So we can rephrase our question: will a user have a positive interaction with an anime?

Let's define positive and negative interactions as follows: an interaction is positive when the user completed the anime and rated it above 5, and negative when the user dropped the anime or rated it 5 or lower.

Now, we can use exactly the same model structures, except with a sigmoid function applied to the output.
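For example, a minimal sketch of the change (the subclass and its name are hypothetical, not from the repo):

```python
# Hypothetical subclass: reuse NeuralNetworkModel, but swap the output head
# for a sigmoid so predictions land in [0, 1].
class InteractionNeuralNetworkModel(NeuralNetworkModel):
    def __init__(self, num_users, num_items, embedding_dim):
        super().__init__(num_users, num_items, embedding_dim)
        self.dense2 = tf.keras.layers.Dense(1, activation='sigmoid')

in_model = InteractionNeuralNetworkModel(num_users=num_users, num_items=num_anime, embedding_dim=64)

# Binary cross-entropy replaces MSE, since the target is now 0 or 1.
in_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy("Acc")],
    run_eagerly=True
)
```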

Dataset

In our anime dataset we don’t have a positive-negative interaction column, so we will make it ourselves.

We'll keep only the Completed and Dropped watching statuses. Any anime that was rated 5 or lower, or that was dropped, will be considered a negative interaction; the rest are positive.

watching_status values:

1: Currently Watching
2: Completed
3: On Hold
4: Dropped
6: Plan to Watch
```python
interactions = interactions[interactions["watching_status"] <= 4]  # keep only rows whose
interactions = interactions[interactions["watching_status"] >= 2]  # watching_status is
interactions = interactions[interactions["watching_status"] != 3]  # 2 (Completed) or 4 (Dropped)

def transform(df):
    # Negative interaction: rated 5 or lower, or dropped. Positive: the rest.
    df.loc[df["rating"] <= 5, ["watching_status"]] = 0.0
    df.loc[df["watching_status"] == 2, ["watching_status"]] = 1.0
    df.loc[df["watching_status"] == 4, ["watching_status"]] = 0.0

    df.rename(columns={'watching_status': 'interaction'}, inplace=True)

transform(interactions)

# Keep a user only if they have more than 4 occurrences.
good = interactions['user_id'].value_counts().loc[lambda x: x > 4].index
interactions = interactions[interactions["user_id"].isin(good)]
```
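From the resulting dataframe, building the model inputs and targets could look like this minimal sketch (column names assumed from the dataset; the notebook's exact preprocessing may differ):

```python
# (user_id, anime_id) pairs as inputs, the derived interaction flag as the target.
X = interactions[["user_id", "anime_id"]].to_numpy()
y = interactions["interaction"].to_numpy(dtype="float32")  # 1.0 = positive, 0.0 = negative
```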

Hybrid (Predicting User-Anime Ratings and Interactions)


(IMHO) A model with multiple tasks can perform better because it learns from more diverse signals. We can build a single predictor that achieves both tasks: predicting user-item ratings and interactions.

As it has two tasks, the model's loss function will be the sum of the two per-task loss functions, each multiplied by an importance weight:

`loss = λ1 * MSE(ratings) + λ2 * BCE(interactions)`

Model

Making the `tf.keras.Model`

The model structure is similar to the neural networks used for rating and interaction prediction, however now it has both outputs (the `dense2` and `dense3` layers).

```python
class HybridNeuralNetworkModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        super(HybridNeuralNetworkModel, self).__init__()
        self.embedding_dim = embedding_dim

        self.user_embeddings = tf.keras.layers.Embedding(num_users, embedding_dim)
        self.item_embeddings = tf.keras.layers.Embedding(num_items, embedding_dim)

        self.dense1 = tf.keras.layers.Dense(64, activation='relu')
        self.dense2 = tf.keras.layers.Dense(1, activation='relu')
        self.dense3 = tf.keras.layers.Dense(1, activation='sigmoid')

        self.concat = tf.keras.layers.Concatenate()
        self.dropout = tf.keras.layers.Dropout(.5)

    def call(self, inputs, training=False):
        ...
```

The outputs are computed similarly as well, except now the model returns two values.

```python
class HybridNeuralNetworkModel(tf.keras.Model):
    def __init__(self, num_users, num_items, embedding_dim):
        ...

    def call(self, inputs, training=False):
        user_ids = inputs[:, 0]
        item_ids = inputs[:, 1]

        user_embedding = self.user_embeddings(user_ids)
        item_embedding = self.item_embeddings(item_ids)

        if training:
            user_embedding = self.dropout(user_embedding, training=training)
            item_embedding = self.dropout(item_embedding, training=training)

        user_embedding = tf.reshape(user_embedding, [-1, self.embedding_dim])
        item_embedding = tf.reshape(item_embedding, [-1, self.embedding_dim])

        conc = self.concat([user_embedding, item_embedding])
        x = self.dense1(conc)

        x1 = self.dense2(x)   # predicted rating
        x2 = self.dense3(x)   # predicted interaction probability
        return x1, x2
```

Each output has its own loss function, as well as its own evaluation metrics. The λ values in the formula above are configured via `loss_weights`.

```python
hnn_model = HybridNeuralNetworkModel(num_users=num_users, num_items=num_anime, embedding_dim=64)

hnn_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss={
        "output_1": tf.keras.losses.MeanSquaredError(),
        "output_2": tf.keras.losses.BinaryCrossentropy()
    },
    loss_weights={
        "output_1": 1.0,
        "output_2": 5.0
    },
    metrics={
        "output_1": [tf.keras.metrics.RootMeanSquaredError("RMSE")],
        "output_2": [
            tf.keras.metrics.BinaryAccuracy("Acc"),
            tf.keras.metrics.Precision(name="P"),
            tf.keras.metrics.Recall(name="R")
        ]
    },
    run_eagerly=True
)
```
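A minimal training sketch (the array names are assumptions, not from the repo); Keras maps the two values returned by `call` to `output_1` and `output_2`, matching the dictionaries above:

```python
import numpy as np

# Toy values for illustration only; the real models train on the Kaggle dataset.
X_train = np.array([[0, 10], [0, 42], [1, 10]], dtype=np.int64)    # (user_id, anime_id) pairs
ratings_train = np.array([8.0, 3.0, 9.0], dtype=np.float32)        # targets for output_1
interactions_train = np.array([1.0, 0.0, 1.0], dtype=np.float32)   # targets for output_2

hnn_model.fit(X_train, (ratings_train, interactions_train), batch_size=2, epochs=1)
```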

Comparing Model Performances

Predicting Ratings

| Model | val_loss | RMSE |
| --- | --- | --- |
| Hybrid NN | 1.9694 | 1.4034 |
| Neural Network | 1.9897 | 1.4105 |
| Matrix Factorization | 2.8429 | 1.6861 |

Predicting Interactions

| Model | val_loss | Acc | Precision | Recall |
| --- | --- | --- | --- | --- |
| Hybrid NN | 0.3079 | 0.8790 | 0.8931 | 0.9759 |
| Neural Network | 0.2502 | 0.8050 | 0.8907 | 0.8282 |
| Matrix Factorization | 0.4691 | 0.7678 | 0.8969 | 0.7621 |