Selleri Development committed
Commit · 54d5e17
Parent(s): d820001

Add Learning Pytorch and RS Collaborative Filtering
1. Latihan Pytorch.ipynb
ADDED
## Introduction to Tensors
Tensors are the core data structure in TensorFlow and PyTorch. Similar to arrays or matrices, tensors can also have higher dimensions, for example 3D, 4D, and so on. Below are the basic tensor operations.

## TensorFlow vs. PyTorch

* TensorFlow: developed by Google, TensorFlow is often used for large-scale production models and has more built-in features for deployment.
* PyTorch: developed by Facebook, PyTorch is popular among researchers for its more intuitive, dynamic approach.

## Installing PyTorch
`!pip install torch torchvision`

```python
import torch
```

```python
# Create a 2x3 tensor
tensor_pt = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(tensor_pt)
# tensor([[1, 2, 3],
#         [4, 5, 6]])
```

## Basic Tensor Operations
Once we have a tensor, we can perform basic operations such as addition, subtraction, multiplication, and so on.

```python
# Addition (broadcasts the scalar to every element)
add_result = tensor_pt + 2
print(add_result)
# tensor([[3, 4, 5],
#         [6, 7, 8]])

# Element-wise multiplication
mul_result = tensor_pt * 3
print(mul_result)
# tensor([[ 3,  6,  9],
#         [12, 15, 18]])
```

## Using the GPU
Both TensorFlow and PyTorch support GPUs to accelerate computation:

```python
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tensor_pt = tensor_pt.to(device)
result = tensor_pt * 2
print(result)
# tensor([[ 2,  4,  6],
#         [ 8, 10, 12]])
```
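The introduction above notes that tensors can be 3D, 4D, and beyond, but the notebook only shows 2D examples. A minimal sketch (illustrative, not from the notebook) of creating, reshaping, and multiplying higher-dimensional tensors:

```python
import torch

# A 3D tensor: a stack of 2 matrices, each 3x4, filled with zeros
t3 = torch.zeros(2, 3, 4)
print(t3.shape)   # torch.Size([2, 3, 4])
print(t3.ndim)    # 3

# Reshape into 2D; the 24 elements are reinterpreted, not copied
t2 = t3.reshape(6, 4)
print(t2.shape)   # torch.Size([6, 4])

# Batched matrix multiplication over the leading (batch) dimension
a = torch.randn(2, 3, 4)
b = torch.randn(2, 4, 5)
c = torch.matmul(a, b)
print(c.shape)    # torch.Size([2, 3, 5])
```

The same `to(device)` call shown above works unchanged on these higher-dimensional tensors.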
2. Latihan RS MF PyTorch.ipynb
ADDED
## Types of Recommender System Models
Several approaches are commonly used in recommender systems:

* Collaborative Filtering (CF): relies on user-item interactions to produce recommendations.
* Content-Based Filtering (CBF): uses features of the items or users.
* Hybrid Models: combine CF and CBF.

To start, we will focus on Collaborative Filtering, specifically Matrix Factorization.

## Implementing Matrix Factorization
Matrix Factorization is a technique in which we factor a large matrix of user ratings into two smaller matrices: one for users and one for items.

Basic formulation:

Given a rating matrix \( R \) of size \( m \times n \) (where \( m \) is the number of users and \( n \) is the number of items), we factor it into two matrices:
- \( P \): the user matrix, of size \( m \times k \).
- \( Q \): the item matrix, of size \( k \times n \).

The model tries to minimize the difference between \( R \) and the product \( P \times Q \).

## Building the Model with PyTorch

```python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
```

```python
class MFModel(nn.Module):
    def __init__(self, num_users, num_items, embedding_size):
        super(MFModel, self).__init__()
        self.user_embedding = nn.Embedding(num_users, embedding_size)
        self.item_embedding = nn.Embedding(num_items, embedding_size)

    def forward(self, user_id, item_id):
        user_vec = self.user_embedding(user_id)
        item_vec = self.item_embedding(item_id)
        dot_product = torch.sum(user_vec * item_vec, dim=1)
        return dot_product
```

```python
# Dummy data: user_ids, item_ids, and ratings
user_ids = torch.tensor([0, 1, 2, 0, 1, 2], dtype=torch.long)
item_ids = torch.tensor([0, 1, 2, 1, 2, 0], dtype=torch.long)
ratings = torch.tensor([5, 4, 3, 4, 5, 2], dtype=torch.float32)
```

```python
# Number of users and items
num_users = 3
num_items = 3
embedding_size = 2
```

## Training
Now, let's look at how to train this model using real rating data.
You will need:
- **Loss Function**: typically Mean Squared Error (MSE).
- **Optimizer**: Adam or SGD.

```python
# Create the model, loss function, and optimizer
model = MFModel(num_users, num_items, embedding_size)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    predictions = model(user_ids, item_ids)
    loss = criterion(predictions, ratings)
    loss.backward()
    optimizer.step()
```

The recorded cell output prints the predictions and the target ratings every epoch (condensed here): the predictions start near `tensor([0.1248, 3.2503, 1.1823, 1.2675, 2.3347, 0.5206])` and move toward the targets `tensor([5., 4., 3., 4., 5., 2.])`, reaching roughly `tensor([3.2114, 4.2419, 3.0392, 3.9256, 4.7252, 2.3093])` by the last printed epoch.

## Evaluation and Testing
Once the model is trained, we can evaluate its performance with metrics such as RMSE (Root Mean Squared Error) or MAE (Mean Absolute Error).

```python
model.eval()
with torch.no_grad():
    predicted_rating = model(torch.tensor([0]), torch.tensor([2]))
    print(f"Predicted rating for user 0 on item 2: {predicted_rating.item():.2f}")
# Predicted rating for user 0 on item 2: 4.58
```

## Improving the Model with Bias Terms
So far we have only used the dot product of the user and item vectors. Now we add bias terms for users and items; these are commonly used to account for a user's general preference or an item's overall popularity.

```python
class MFModel(nn.Module):
    def __init__(self, num_users, num_items, embedding_size):
        super(MFModel, self).__init__()
        self.user_embedding = nn.Embedding(num_users, embedding_size)
        self.item_embedding = nn.Embedding(num_items, embedding_size)
        self.user_bias = nn.Embedding(num_users, 1)
        self.item_bias = nn.Embedding(num_items, 1)

    def forward(self, user_id, item_id):
        user_vec = self.user_embedding(user_id)
        item_vec = self.item_embedding(item_id)
        user_bias = self.user_bias(user_id).squeeze()
        item_bias = self.item_bias(item_id).squeeze()
        dot_product = torch.sum(user_vec * item_vec, dim=1)
        return dot_product + user_bias + item_bias
```

## Regularization to Prevent Overfitting
To keep the model from overfitting the data, we add L2 regularization on the embeddings.

```python
class MFModel(nn.Module):
    def __init__(self, num_users, num_items, embedding_size, reg_factor):
        super(MFModel, self).__init__()
        self.user_embedding = nn.Embedding(num_users, embedding_size)
        self.item_embedding = nn.Embedding(num_items, embedding_size)
        self.user_bias = nn.Embedding(num_users, 1)
        self.item_bias = nn.Embedding(num_items, 1)
        self.reg_factor = reg_factor

    def forward(self, user_id, item_id):
        user_vec = self.user_embedding(user_id)
        item_vec = self.item_embedding(item_id)
        user_bias = self.user_bias(user_id).squeeze()
        item_bias = self.item_bias(item_id).squeeze()
        dot_product = torch.sum(user_vec * item_vec, dim=1)
        return dot_product + user_bias + item_bias

    def regularization_loss(self):
        return self.reg_factor * (torch.norm(self.user_embedding.weight) + torch.norm(self.item_embedding.weight))

# Train the model with regularization
model = MFModel(num_users, num_items, embedding_size, reg_factor=0.01)
optimizer = optim.Adam(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    predictions = model(user_ids, item_ids)
    loss = criterion(predictions, ratings)
    loss += model.regularization_loss()  # Add the L2 penalty
    loss.backward()
    optimizer.step()
```

```python
from sklearn.metrics import mean_squared_error
import numpy as np

# Predictions for all data
model.eval()
with torch.no_grad():
    predictions = model(user_ids, item_ids).numpy()
    rmse = np.sqrt(mean_squared_error(ratings.numpy(), predictions))
    print(f"RMSE: {rmse:.4f}")
# RMSE: 2.9928
```

```python
# Predict the rating of user 0 for item 2
model.eval()
with torch.no_grad():
    predicted_rating = model(torch.tensor([0]), torch.tensor([2]))
    print(f"Predicted rating for user 0 on item 2: {predicted_rating.item():.2f}")
# Predicted rating for user 0 on item 2: 4.07
```
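The evaluation sections above mention MAE alongside RMSE but only ever compute RMSE. A minimal self-contained sketch of both metrics on plain arrays (the sample values here are illustrative, not the notebook's):

```python
import numpy as np

def rmse(targets, predictions):
    # Root Mean Squared Error: penalizes large errors more heavily
    targets, predictions = np.asarray(targets), np.asarray(predictions)
    return float(np.sqrt(np.mean((targets - predictions) ** 2)))

def mae(targets, predictions):
    # Mean Absolute Error: average absolute deviation, in rating units
    targets, predictions = np.asarray(targets), np.asarray(predictions)
    return float(np.mean(np.abs(targets - predictions)))

targets = [5.0, 4.0, 3.0, 4.0, 5.0, 2.0]
predictions = [4.5, 4.0, 3.5, 3.0, 5.0, 2.5]
print(f"RMSE: {rmse(targets, predictions):.4f}")  # RMSE: 0.5401
print(f"MAE:  {mae(targets, predictions):.4f}")   # MAE:  0.4167
```

Because of the squaring, RMSE is always at least as large as MAE; a big gap between them indicates a few predictions with large errors.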
3. Latihan RS Pakai Movielens 100k Pandas + PyTorch.ipynb
ADDED
## Downloading and Preparing the MovieLens 100k Dataset
First, we download the dataset from MovieLens (alternatively, the `surprise` library can load it easily).

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
```

```python
# Download the MovieLens 100k dataset
!wget -q https://files.grouplens.org/datasets/movielens/ml-100k.zip
!unzip -q ml-100k.zip

# Load the data
data = pd.read_csv('ml-100k/u.data', sep='\t', names=['user_id', 'item_id', 'rating', 'timestamp'])
data = data[['user_id', 'item_id', 'rating']]  # We only need user_id, item_id, and rating

# Normalize user and item IDs (the raw IDs do not start at 0)
data['user_id'] = data['user_id'] - 1
data['item_id'] = data['item_id'] - 1

# Dataset statistics
num_users = data['user_id'].nunique()
num_items = data['item_id'].nunique()
print(f"Number of users: {num_users}, Number of items: {num_items}")
# Number of users: 943, Number of items: 1682

# Split the dataset into train and test sets
train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
```

## Building the Matrix Factorization Model
Now we build a matrix factorization model in PyTorch, similar to the one we implemented earlier.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
```

```python
class MovieLensDataset(Dataset):
    def __init__(self, data):
        self.user_ids = torch.tensor(data['user_id'].values, dtype=torch.long)
        self.item_ids = torch.tensor(data['item_id'].values, dtype=torch.long)
        self.ratings = torch.tensor(data['rating'].values, dtype=torch.float32)

    def __len__(self):
        return len(self.ratings)

    def __getitem__(self, idx):
        return self.user_ids[idx], self.item_ids[idx], self.ratings[idx]
```

```python
class MFModel(nn.Module):
    def __init__(self, num_users, num_items, embedding_size, reg_factor):
        super(MFModel, self).__init__()
        self.user_embedding = nn.Embedding(num_users, embedding_size)
        self.item_embedding = nn.Embedding(num_items, embedding_size)
        self.user_bias = nn.Embedding(num_users, 1)
        self.item_bias = nn.Embedding(num_items, 1)
        self.reg_factor = reg_factor

    def forward(self, user_id, item_id):
        user_vec = self.user_embedding(user_id)
        item_vec = self.item_embedding(item_id)
        user_bias = self.user_bias(user_id).squeeze()
        item_bias = self.item_bias(item_id).squeeze()
        dot_product = torch.sum(user_vec * item_vec, dim=1)
        return dot_product + user_bias + item_bias

    def regularization_loss(self):
        return self.reg_factor * (torch.norm(self.user_embedding.weight) + torch.norm(self.item_embedding.weight))
```

```python
# DataLoader for training
train_dataset = MovieLensDataset(train_data)
train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True)

# Hyperparameters
embedding_size = 30
reg_factor = 0.01
model = MFModel(num_users, num_items, embedding_size, reg_factor)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
```

```python
# Training loop
for epoch in range(10):
    model.train()
    total_loss = 0
    for user_ids, item_ids, ratings in train_loader:
        optimizer.zero_grad()
        predictions = model(user_ids, item_ids)
        loss = criterion(predictions, ratings)
        loss += model.regularization_loss()  # Add the L2 penalty
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {total_loss/len(train_loader):.4f}")
# Epoch 1, Loss: 30.0071
# Epoch 2, Loss: 10.9298
# ...
# Epoch 9, Loss: 2.0194
# Epoch 10, Loss: 1.8150
```

## Evaluating Model Performance
After training, we evaluate the model on the test set using RMSE.

```python
from sklearn.metrics import mean_squared_error
import numpy as np

model.eval()
test_dataset = MovieLensDataset(test_data)
test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False)

predictions, targets = [], []
with torch.no_grad():
    for user_ids, item_ids, ratings in test_loader:
        output = model(user_ids, item_ids)
        predictions.extend(output.numpy())
        targets.extend(ratings.numpy())

rmse = np.sqrt(mean_squared_error(targets, predictions))
print(f"Test RMSE: {rmse:.4f}")
# Test RMSE: 1.0543
```

## Predicting for a Specific User and Movie
To predict the rating a specific user would give a specific movie:

```python
model.eval()
with torch.no_grad():
    predicted_rating = model(torch.tensor([0]), torch.tensor([50]))
    print(f"Predicted rating for user 0 on item 50: {predicted_rating.item():.2f}")
# Predicted rating for user 0 on item 50: 2.89
```

## Top-N Recommendations for a Specific User
To produce the Top-N predictions for a given user, you can take all the items that user has not yet rated, compute a prediction for each, and then sort them by predicted rating from highest to lowest.

Explanation:
- Filtering unrated items: for each user, we take every item they have never rated.
Ini penting karena rekomendasi seharusnya diberikan untuk item yang belum pernah dilihat atau dinilai.\n","- Prediksi Semua Rating: Untuk setiap item yang belum dirating, kita prediksi rating menggunakan model yang telah dilatih.\n","- Urutkan Prediksi: Setelah semua prediksi dihitung, kita urutkan dari yang tertinggi ke terendah dan ambil N prediksi teratas.\n","- Mengembalikan Item yang Direkomendasikan: Output adalah daftar item (misalnya film) yang direkomendasikan untuk pengguna tersebut."],"metadata":{"id":"Kq8QZMnIqhAo"}},{"cell_type":"code","source":["def get_top_n_recommendations_pytorch(model, user_id, N=10):\n"," # Dapatkan semua item yang tersedia\n"," all_items = np.array(range(num_items))\n","\n"," # Cek item yang sudah dirating oleh user\n"," rated_items = train_data[train_data['user_id'] == user_id]['item_id'].values\n","\n"," # Ambil item yang belum dirating oleh user\n"," items_to_predict = np.setdiff1d(all_items, rated_items)\n","\n"," # Prediksi rating untuk item-item tersebut\n"," model.eval()\n"," with torch.no_grad():\n"," user_ids = torch.tensor([user_id] * len(items_to_predict))\n"," item_ids = torch.tensor(items_to_predict)\n"," predicted_ratings = model(user_ids, item_ids).numpy()\n","\n"," # Urutkan item berdasarkan rating tertinggi\n"," top_n_items = items_to_predict[np.argsort(predicted_ratings)[-N:][::-1]]\n","\n"," return top_n_items\n","\n","# Contoh penggunaan\n","user_id = 0\n","top_n_recommendations = get_top_n_recommendations_pytorch(model, user_id, N=10)\n","print(f\"Top 10 recommended items for user {user_id}: {top_n_recommendations}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"-1vF04GIqQTY","executionInfo":{"status":"ok","timestamp":1724037958415,"user_tz":-420,"elapsed":505,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}},"outputId":"8032ff43-9c16-4632-d720-4cfeb8fffa33"},"execution_count":16,"outputs":[{"output_type":"stream","name":"stdout","text":["Top 10 recommended 
items for user 0: [588 511 514 482 479 495 284 526 316 512]\n"]}]},{"cell_type":"markdown","source":["- Metadata dalam Evaluasi: Saat mengevaluasi model, kita sertakan metadata dari item yang ada di test set.\n","- Prediksi Top-N: Metadata juga disertakan dalam proses prediksi untuk semua item yang belum dirating oleh pengguna, lalu kita ambil N item dengan prediksi tertinggi."],"metadata":{"id":"SuzgT1YzcKjI"}}]}
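The heart of `get_top_n_recommendations_pytorch` above is the `np.argsort(predicted_ratings)[-N:][::-1]` idiom: `argsort` sorts ascending, so the last `N` indices are the highest-rated candidates, and reversing puts the best first. A minimal standalone illustration with hypothetical predicted ratings (no model needed):

```python
import numpy as np

# Six candidate item IDs and their (hypothetical) predicted ratings.
items_to_predict = np.array([10, 11, 12, 13, 14, 15])
predicted_ratings = np.array([3.2, 4.8, 1.5, 4.1, 2.9, 4.5])

N = 3
# argsort is ascending, so take the last N indices and reverse them
# to list the highest-rated items first.
top_n_items = items_to_predict[np.argsort(predicted_ratings)[-N:][::-1]]
print(top_n_items)  # -> [11 15 13]
```

This is the same selection step the notebook applies to the model's predictions for all not-yet-rated items.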
|
4. Latihan RS With Meta Data + PyTorch.ipynb
ADDED
|
@@ -0,0 +1 @@
| 1 |
+
{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyMAFA2EYfyZHWxBXYVidXcz"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["## Mengunduh dan Menyiapkan Dataset MovieLens 100k\n","Pertama, kita unduh dataset dari MovieLens atau gunakan library surprise untuk memuatnya dengan mudah."],"metadata":{"id":"XuAFN0HOVwz5"}},{"cell_type":"code","execution_count":2,"metadata":{"id":"a5vVoaGrOWqT","executionInfo":{"status":"ok","timestamp":1724049334056,"user_tz":-420,"elapsed":2310,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}}},"outputs":[],"source":["import pandas as pd\n","import numpy as np\n","from sklearn.model_selection import train_test_split"]},{"cell_type":"code","source":["# Mengunduh dataset MovieLens 100k\n","!wget -q https://files.grouplens.org/datasets/movielens/ml-100k.zip\n","!unzip -q ml-100k.zip\n","\n","# Memuat data\n","data = pd.read_csv('ml-100k/u.data', sep='\\t', names=['user_id', 'item_id', 'rating', 'timestamp'])\n","data = data[['user_id', 'item_id', 'rating']] # Kita hanya butuh user_id, item_id, dan rating\n","\n","# Membaca file item yang berisi metadata\n","genre_list = [\"unknown\", \"Action\", \"Adventure\", \"Animation\", \"Children's\", \"Comedy\", \"Crime\", \"Documentary\", \"Drama\", \"Fantasy\", \"Film-Noir\", \"Horror\", \"Musical\", \"Mystery\", \"Romance\", \"Sci-Fi\", \"Thriller\", \"War\", \"Western\"]\n","columns = [\"item_id\", \"title\", \"release_date\", \"video_release_date\", \"IMDb_URL\"] + genre_list\n","item_metadata = pd.read_csv('ml-100k/u.item', sep='|', encoding='ISO-8859-1', header=None, names=columns, usecols=range(24))\n","\n","# # Menggabungkan metadata dengan dataset utama\n","data = pd.merge(data, item_metadata, on='item_id')\n","\n","# Normalisasi ID pengguna dan item (karena ID asli mungkin tidak dimulai dari 0)\n","data['user_id'] = data['user_id'] - 
1\n","data['item_id'] = data['item_id'] - 1\n","\n","# Melihat statistik dataset\n","num_users = data['user_id'].nunique()\n","num_items = data['item_id'].nunique()\n","print(f\"Number of users: {num_users}, Number of items: {num_items}\")\n","\n","# Split dataset menjadi train dan test\n","train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)\n","\n","item_metadata.head()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":411},"id":"c1A17vhhVubf","executionInfo":{"status":"ok","timestamp":1724049863540,"user_tz":-420,"elapsed":4952,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}},"outputId":"762dffb5-22fe-4cfa-971c-fb62eeec95c8"},"execution_count":6,"outputs":[{"output_type":"stream","name":"stdout","text":["replace ml-100k/allbut.pl? [y]es, [n]o, [A]ll, [N]one, [r]ename: A\n","Number of users: 943, Number of items: 1682\n"]},{"output_type":"execute_result","data":{"text/plain":[" item_id title release_date video_release_date \\\n","0 1 Toy Story (1995) 01-Jan-1995 NaN \n","1 2 GoldenEye (1995) 01-Jan-1995 NaN \n","2 3 Four Rooms (1995) 01-Jan-1995 NaN \n","3 4 Get Shorty (1995) 01-Jan-1995 NaN \n","4 5 Copycat (1995) 01-Jan-1995 NaN \n","\n"," IMDb_URL unknown Action \\\n","0 http://us.imdb.com/M/title-exact?Toy%20Story%2... 0 0 \n","1 http://us.imdb.com/M/title-exact?GoldenEye%20(... 0 1 \n","2 http://us.imdb.com/M/title-exact?Four%20Rooms%... 0 0 \n","3 http://us.imdb.com/M/title-exact?Get%20Shorty%... 0 1 \n","4 http://us.imdb.com/M/title-exact?Copycat%20(1995) 0 0 \n","\n"," Adventure Animation Children's ... Fantasy Film-Noir Horror Musical \\\n","0 0 1 1 ... 0 0 0 0 \n","1 1 0 0 ... 0 0 0 0 \n","2 0 0 0 ... 0 0 0 0 \n","3 0 0 0 ... 0 0 0 0 \n","4 0 0 0 ... 
0 0 0 0 \n","\n"," Mystery Romance Sci-Fi Thriller War Western \n","0 0 0 0 0 0 0 \n","1 0 0 0 1 0 0 \n","2 0 0 0 1 0 0 \n","3 0 0 0 0 0 0 \n","4 0 0 0 1 0 0 \n","\n","[5 rows x 24 columns]"],"text/html":["\n"," <div id=\"df-7019fae2-ff1a-48e5-8edf-31baa245c445\" class=\"colab-df-container\">\n"," <div>\n","<style scoped>\n"," .dataframe tbody tr th:only-of-type {\n"," vertical-align: middle;\n"," }\n","\n"," .dataframe tbody tr th {\n"," vertical-align: top;\n"," }\n","\n"," .dataframe thead th {\n"," text-align: right;\n"," }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n"," <thead>\n"," <tr style=\"text-align: right;\">\n"," <th></th>\n"," <th>item_id</th>\n"," <th>title</th>\n"," <th>release_date</th>\n"," <th>video_release_date</th>\n"," <th>IMDb_URL</th>\n"," <th>unknown</th>\n"," <th>Action</th>\n"," <th>Adventure</th>\n"," <th>Animation</th>\n"," <th>Children's</th>\n"," <th>...</th>\n"," <th>Fantasy</th>\n"," <th>Film-Noir</th>\n"," <th>Horror</th>\n"," <th>Musical</th>\n"," <th>Mystery</th>\n"," <th>Romance</th>\n"," <th>Sci-Fi</th>\n"," <th>Thriller</th>\n"," <th>War</th>\n"," <th>Western</th>\n"," </tr>\n"," </thead>\n"," <tbody>\n"," <tr>\n"," <th>0</th>\n"," <td>1</td>\n"," <td>Toy Story (1995)</td>\n"," <td>01-Jan-1995</td>\n"," <td>NaN</td>\n"," <td>http://us.imdb.com/M/title-exact?Toy%20Story%2...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>1</td>\n"," <td>...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," </tr>\n"," <tr>\n"," <th>1</th>\n"," <td>2</td>\n"," <td>GoldenEye (1995)</td>\n"," <td>01-Jan-1995</td>\n"," <td>NaN</td>\n"," <td>http://us.imdb.com/M/title-exact?GoldenEye%20(...</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>1</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," 
<td>0</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>0</td>\n"," <td>0</td>\n"," </tr>\n"," <tr>\n"," <th>2</th>\n"," <td>3</td>\n"," <td>Four Rooms (1995)</td>\n"," <td>01-Jan-1995</td>\n"," <td>NaN</td>\n"," <td>http://us.imdb.com/M/title-exact?Four%20Rooms%...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>0</td>\n"," <td>0</td>\n"," </tr>\n"," <tr>\n"," <th>3</th>\n"," <td>4</td>\n"," <td>Get Shorty (1995)</td>\n"," <td>01-Jan-1995</td>\n"," <td>NaN</td>\n"," <td>http://us.imdb.com/M/title-exact?Get%20Shorty%...</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," </tr>\n"," <tr>\n"," <th>4</th>\n"," <td>5</td>\n"," <td>Copycat (1995)</td>\n"," <td>01-Jan-1995</td>\n"," <td>NaN</td>\n"," <td>http://us.imdb.com/M/title-exact?Copycat%20(1995)</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>...</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>0</td>\n"," <td>1</td>\n"," <td>0</td>\n"," <td>0</td>\n"," </tr>\n"," </tbody>\n","</table>\n","<p>5 rows × 24 columns</p>\n","</div>\n"," <div class=\"colab-df-buttons\">\n","\n"," <div class=\"colab-df-container\">\n"," <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7019fae2-ff1a-48e5-8edf-31baa245c445')\"\n"," title=\"Convert this dataframe to an interactive table.\"\n"," style=\"display:none;\">\n","\n"," <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n"," <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 
220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n"," </svg>\n"," </button>\n","\n"," <style>\n"," .colab-df-container {\n"," display:flex;\n"," gap: 12px;\n"," }\n","\n"," .colab-df-convert {\n"," background-color: #E8F0FE;\n"," border: none;\n"," border-radius: 50%;\n"," cursor: pointer;\n"," display: none;\n"," fill: #1967D2;\n"," height: 32px;\n"," padding: 0 0 0 0;\n"," width: 32px;\n"," }\n","\n"," .colab-df-convert:hover {\n"," background-color: #E2EBFA;\n"," box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n"," fill: #174EA6;\n"," }\n","\n"," .colab-df-buttons div {\n"," margin-bottom: 4px;\n"," }\n","\n"," [theme=dark] .colab-df-convert {\n"," background-color: #3B4455;\n"," fill: #D2E3FC;\n"," }\n","\n"," [theme=dark] .colab-df-convert:hover {\n"," background-color: #434B5C;\n"," box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n"," filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n"," fill: #FFFFFF;\n"," }\n"," </style>\n","\n"," <script>\n"," const buttonEl =\n"," document.querySelector('#df-7019fae2-ff1a-48e5-8edf-31baa245c445 button.colab-df-convert');\n"," buttonEl.style.display =\n"," google.colab.kernel.accessAllowed ? 'block' : 'none';\n","\n"," async function convertToInteractive(key) {\n"," const element = document.querySelector('#df-7019fae2-ff1a-48e5-8edf-31baa245c445');\n"," const dataTable =\n"," await google.colab.kernel.invokeFunction('convertToInteractive',\n"," [key], {});\n"," if (!dataTable) return;\n","\n"," const docLinkHtml = 'Like what you see? 
Visit the ' +\n"," '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n"," + ' to learn more about interactive tables.';\n"," element.innerHTML = '';\n"," dataTable['output_type'] = 'display_data';\n"," await google.colab.output.renderOutput(dataTable, element);\n"," const docLink = document.createElement('div');\n"," docLink.innerHTML = docLinkHtml;\n"," element.appendChild(docLink);\n"," }\n"," </script>\n"," </div>\n","\n","\n","<div id=\"df-be701d3b-14d5-4244-b58b-5be23168ded1\">\n"," <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-be701d3b-14d5-4244-b58b-5be23168ded1')\"\n"," title=\"Suggest charts\"\n"," style=\"display:none;\">\n","\n","<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n"," width=\"24px\">\n"," <g>\n"," <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n"," </g>\n","</svg>\n"," </button>\n","\n","<style>\n"," .colab-df-quickchart {\n"," --bg-color: #E8F0FE;\n"," --fill-color: #1967D2;\n"," --hover-bg-color: #E2EBFA;\n"," --hover-fill-color: #174EA6;\n"," --disabled-fill-color: #AAA;\n"," --disabled-bg-color: #DDD;\n"," }\n","\n"," [theme=dark] .colab-df-quickchart {\n"," --bg-color: #3B4455;\n"," --fill-color: #D2E3FC;\n"," --hover-bg-color: #434B5C;\n"," --hover-fill-color: #FFFFFF;\n"," --disabled-bg-color: #3B4455;\n"," --disabled-fill-color: #666;\n"," }\n","\n"," .colab-df-quickchart {\n"," background-color: var(--bg-color);\n"," border: none;\n"," border-radius: 50%;\n"," cursor: pointer;\n"," display: none;\n"," fill: var(--fill-color);\n"," height: 32px;\n"," padding: 0;\n"," width: 32px;\n"," }\n","\n"," .colab-df-quickchart:hover {\n"," background-color: var(--hover-bg-color);\n"," box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n"," fill: var(--button-hover-fill-color);\n"," }\n","\n"," 
.colab-df-quickchart-complete:disabled,\n"," .colab-df-quickchart-complete:disabled:hover {\n"," background-color: var(--disabled-bg-color);\n"," fill: var(--disabled-fill-color);\n"," box-shadow: none;\n"," }\n","\n"," .colab-df-spinner {\n"," border: 2px solid var(--fill-color);\n"," border-color: transparent;\n"," border-bottom-color: var(--fill-color);\n"," animation:\n"," spin 1s steps(1) infinite;\n"," }\n","\n"," @keyframes spin {\n"," 0% {\n"," border-color: transparent;\n"," border-bottom-color: var(--fill-color);\n"," border-left-color: var(--fill-color);\n"," }\n"," 20% {\n"," border-color: transparent;\n"," border-left-color: var(--fill-color);\n"," border-top-color: var(--fill-color);\n"," }\n"," 30% {\n"," border-color: transparent;\n"," border-left-color: var(--fill-color);\n"," border-top-color: var(--fill-color);\n"," border-right-color: var(--fill-color);\n"," }\n"," 40% {\n"," border-color: transparent;\n"," border-right-color: var(--fill-color);\n"," border-top-color: var(--fill-color);\n"," }\n"," 60% {\n"," border-color: transparent;\n"," border-right-color: var(--fill-color);\n"," }\n"," 80% {\n"," border-color: transparent;\n"," border-right-color: var(--fill-color);\n"," border-bottom-color: var(--fill-color);\n"," }\n"," 90% {\n"," border-color: transparent;\n"," border-bottom-color: var(--fill-color);\n"," }\n"," }\n","</style>\n","\n"," <script>\n"," async function quickchart(key) {\n"," const quickchartButtonEl =\n"," document.querySelector('#' + key + ' button');\n"," quickchartButtonEl.disabled = true; // To prevent multiple clicks.\n"," quickchartButtonEl.classList.add('colab-df-spinner');\n"," try {\n"," const charts = await google.colab.kernel.invokeFunction(\n"," 'suggestCharts', [key], {});\n"," } catch (error) {\n"," console.error('Error during call to suggestCharts:', error);\n"," }\n"," quickchartButtonEl.classList.remove('colab-df-spinner');\n"," quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n"," }\n"," 
(() => {\n"," let quickchartButtonEl =\n"," document.querySelector('#df-be701d3b-14d5-4244-b58b-5be23168ded1 button');\n"," quickchartButtonEl.style.display =\n"," google.colab.kernel.accessAllowed ? 'block' : 'none';\n"," })();\n"," </script>\n","</div>\n","\n"," </div>\n"," </div>\n"],"application/vnd.google.colaboratory.intrinsic+json":{"type":"dataframe","variable_name":"item_metadata"}},"metadata":{},"execution_count":6}]},{"cell_type":"markdown","source":["## Membuat Model Matrix Factorization\n","Sekarang kita akan membuat model matrix factorization dengan TensorFlow atau PyTorch yang mirip dengan yang sudah kita implementasikan sebelumnya."],"metadata":{"id":"D0pwH9lFY9oc"}},{"cell_type":"code","source":["import torch\n","import torch.nn as nn\n","import torch.optim as optim\n","from torch.utils.data import DataLoader, Dataset"],"metadata":{"id":"QFeYKahGZF_I","executionInfo":{"status":"ok","timestamp":1724050080430,"user_tz":-420,"elapsed":6584,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}}},"execution_count":7,"outputs":[]},{"cell_type":"code","source":["class MovieLensDatasetWithMetadata(Dataset):\n"," def __init__(self, ratings, metadata):\n"," \"\"\"\n"," Args:\n"," ratings (DataFrame): DataFrame yang berisi kolom user_id, item_id, dan rating.\n"," metadata (ndarray): Numpy array atau tensor berisi metadata item (misalnya genre).\n"," \"\"\"\n"," self.user_ids = torch.tensor(ratings['user_id'].values, dtype=torch.long)\n"," self.item_ids = torch.tensor(ratings['item_id'].values, dtype=torch.long)\n"," self.ratings = torch.tensor(ratings['rating'].values, dtype=torch.float32)\n"," self.metadata = torch.tensor(metadata, dtype=torch.float32)\n","\n"," def __len__(self):\n"," return len(self.ratings)\n","\n"," def __getitem__(self, idx):\n"," user_id = self.user_ids[idx]\n"," item_id = self.item_ids[idx]\n"," rating = self.ratings[idx]\n"," metadata = self.metadata[item_id] # Metadata diambil berdasarkan item_id\n"," return 
user_id, item_id, rating, metadata"],"metadata":{"id":"7FLSZaW-a-Ro","executionInfo":{"status":"ok","timestamp":1724050569156,"user_tz":-420,"elapsed":402,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}}},"execution_count":12,"outputs":[]},{"cell_type":"code","source":["class MFModelWithMetadata(nn.Module):\n"," def __init__(self, num_users, num_items, embedding_size, metadata_size, reg_factor):\n"," super(MFModelWithMetadata, self).__init__()\n"," self.user_embedding = nn.Embedding(num_users, embedding_size)\n"," self.item_embedding = nn.Embedding(num_items, embedding_size)\n"," self.user_bias = nn.Embedding(num_users, 1)\n"," self.item_bias = nn.Embedding(num_items, 1)\n"," self.metadata_dense = nn.Linear(metadata_size, embedding_size)\n"," self.reg_factor = reg_factor\n","\n"," def forward(self, user_id, item_id, item_metadata):\n"," user_vec = self.user_embedding(user_id)\n"," item_vec = self.item_embedding(item_id)\n"," metadata_vec = torch.relu(self.metadata_dense(item_metadata))\n"," user_bias = self.user_bias(user_id).squeeze()\n"," item_bias = self.item_bias(item_id).squeeze()\n","\n"," # Menggabungkan embedding item dengan vektor metadata\n"," combined_item_vec = item_vec + metadata_vec\n","\n"," dot_product = torch.sum(user_vec * combined_item_vec, dim=1)\n"," return dot_product + user_bias + item_bias\n","\n"," def regularization_loss(self):\n"," return self.reg_factor * (torch.norm(self.user_embedding.weight) + torch.norm(self.item_embedding.weight))"],"metadata":{"id":"Vobp8x1zYeuw","executionInfo":{"status":"ok","timestamp":1724050573193,"user_tz":-420,"elapsed":5,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}}},"execution_count":13,"outputs":[]},{"cell_type":"code","source":["# Melatih model\n","metadata_input = torch.tensor(item_metadata[genre_list].values, dtype=torch.float32)\n","train_metadata = metadata_input[train_data['item_id'].values]\n","\n","train_dataset = 
MovieLensDatasetWithMetadata(train_data, train_metadata)\n","train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True)\n","\n","model = MFModelWithMetadata(num_users, num_items, embedding_size=32, metadata_size=len(genre_list), reg_factor=0.01)\n","criterion = nn.MSELoss()\n","optimizer = optim.Adam(model.parameters(), lr=0.01)\n","\n","# Training loop\n","for epoch in range(20):\n"," model.train()\n"," total_loss = 0\n"," for user_ids, item_ids, ratings, metadata in train_loader:\n"," optimizer.zero_grad()\n"," predictions = model(user_ids, item_ids, metadata)\n"," loss = criterion(predictions, ratings)\n"," loss += model.regularization_loss() # Menambahkan regularisasi\n"," loss.backward()\n"," optimizer.step()\n"," total_loss += loss.item()\n"," print(f\"Epoch {epoch+1}, Loss: {total_loss/len(train_loader):.4f}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"_GlxO1NTacre","executionInfo":{"status":"ok","timestamp":1724050727524,"user_tz":-420,"elapsed":51674,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}},"outputId":"bb7020d3-678d-410f-beff-6b13ef24f69e"},"execution_count":15,"outputs":[{"output_type":"stream","name":"stderr","text":["<ipython-input-12-23d1ac0771db>:11: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n"," self.metadata = torch.tensor(metadata, dtype=torch.float32)\n"]},{"output_type":"stream","name":"stdout","text":["Epoch 1, Loss: 30.5156\n","Epoch 2, Loss: 9.8794\n","Epoch 3, Loss: 5.5845\n","Epoch 4, Loss: 4.0761\n","Epoch 5, Loss: 3.3534\n","Epoch 6, Loss: 2.9154\n","Epoch 7, Loss: 2.6030\n","Epoch 8, Loss: 2.3503\n","Epoch 9, Loss: 2.1314\n","Epoch 10, Loss: 1.9354\n","Epoch 11, Loss: 1.7634\n","Epoch 12, Loss: 1.6138\n","Epoch 13, Loss: 1.4952\n","Epoch 14, Loss: 1.4069\n","Epoch 15, Loss: 1.3445\n","Epoch 16, Loss: 1.2983\n","Epoch 
17, Loss: 1.2678\n","Epoch 18, Loss: 1.2397\n","Epoch 19, Loss: 1.2157\n","Epoch 20, Loss: 1.1865\n"]}]},{"cell_type":"markdown","source":["## Mengevaluasi Kinerja Model\n","Setelah training, kita evaluasi model pada dataset test menggunakan RMSE."],"metadata":{"id":"Wf2yPJlPbxWO"}},{"cell_type":"code","source":["from sklearn.metrics import mean_squared_error\n","import numpy as np\n","\n","model.eval()\n","test_metadata = metadata_input[test_data['item_id'].values]\n","\n","test_dataset = MovieLensDatasetWithMetadata(test_data, test_metadata)\n","test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False)\n","\n","predictions, targets = [], []\n","with torch.no_grad():\n"," for user_ids, item_ids, ratings, metadata in test_loader:\n"," output = model(user_ids, item_ids, metadata)\n"," predictions.extend(output.numpy())\n"," targets.extend(ratings.numpy())\n","\n","rmse = np.sqrt(mean_squared_error(targets, predictions))\n","print(f\"Test RMSE: {rmse:.4f}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"nTUvdTFIalb5","executionInfo":{"status":"ok","timestamp":1724050795644,"user_tz":-420,"elapsed":989,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}},"outputId":"279cf213-0a0c-4b22-8d5a-8017b8fafda2"},"execution_count":17,"outputs":[{"output_type":"stream","name":"stderr","text":["<ipython-input-12-23d1ac0771db>:11: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n"," self.metadata = torch.tensor(metadata, dtype=torch.float32)\n"]},{"output_type":"stream","name":"stdout","text":["Test RMSE: 0.9595\n"]}]},{"cell_type":"markdown","source":["## Melakukan Top-N Recommendation Untuk Pengguna Tertentu\n","Untuk menampilkan Top-N prediksi untuk pengguna tertentu, kamu bisa mengambil semua item yang belum dirating oleh pengguna tersebut dan menghitung prediksinya. 
Setelah itu, kamu bisa mengurutkannya berdasarkan nilai prediksi dari yang tertinggi.\n","\n","Penjelasan:\n","- Memfilter Item yang Belum Dirating: Untuk setiap pengguna, kita ambil semua item yang belum pernah mereka beri rating. Ini penting karena rekomendasi seharusnya diberikan untuk item yang belum pernah dilihat atau dinilai.\n","- Prediksi Semua Rating: Untuk setiap item yang belum dirating, kita prediksi rating menggunakan model yang telah dilatih.\n","- Urutkan Prediksi: Setelah semua prediksi dihitung, kita urutkan dari yang tertinggi ke terendah dan ambil N prediksi teratas.\n","- Mengembalikan Item yang Direkomendasikan: Output adalah daftar item (misalnya film) yang direkomendasikan untuk pengguna tersebut."],"metadata":{"id":"ONQGz8OKb_aM"}},{"cell_type":"code","source":["def get_top_n_recommendations_pytorch(model, user_id, N=10):\n"," # Dapatkan semua item yang tersedia\n"," all_items = np.array(range(num_items))\n","\n"," # Cek item yang sudah dirating oleh user\n"," rated_items = train_data[train_data['user_id'] == user_id]['item_id'].values\n","\n"," # Ambil item yang belum dirating oleh user\n"," items_to_predict = np.setdiff1d(all_items, rated_items)\n","\n"," # Dapatkan metadata dari item yang akan diprediksi\n"," items_metadata = metadata_input[items_to_predict]\n","\n"," model.eval()\n"," with torch.no_grad():\n"," user_ids = torch.tensor([user_id] * len(items_to_predict))\n"," item_ids = torch.tensor(items_to_predict)\n"," metadata = torch.tensor(items_metadata, dtype=torch.float32)\n"," predicted_ratings = model(user_ids, item_ids, metadata).numpy()\n","\n"," # Urutkan item berdasarkan rating tertinggi\n"," top_n_items = items_to_predict[np.argsort(predicted_ratings)[-N:][::-1]]\n","\n"," return top_n_items\n","\n","# Contoh penggunaan\n","user_id = 0\n","top_n_recommendations = get_top_n_recommendations_pytorch(model, user_id, N=10)\n","print(f\"Top 10 recommended items for user {user_id}: 
{top_n_recommendations}\")"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"O6UMD30pb8KI","executionInfo":{"status":"ok","timestamp":1724050857455,"user_tz":-420,"elapsed":539,"user":{"displayName":"Andys Collection","userId":"04951959771200949138"}},"outputId":"a71e90ab-0fc0-466d-8b41-9637809f963a"},"execution_count":18,"outputs":[{"output_type":"stream","name":"stdout","text":["Top 10 recommended items for user 0: [ 813 1448 856 1466 1499 849 1650 407 171 312]\n"]},{"output_type":"stream","name":"stderr","text":["<ipython-input-18-2d43211060e5>:18: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n"," metadata = torch.tensor(items_metadata, dtype=torch.float32)\n"]}]},{"cell_type":"markdown","source":["- Metadata dalam Evaluasi: Saat mengevaluasi model, kita sertakan metadata dari item yang ada di test set.\n","- Prediksi Top-N: Metadata juga disertakan dalam proses prediksi untuk semua item yang belum dirating oleh pengguna, lalu kita ambil N item dengan prediksi tertinggi."],"metadata":{"id":"7LeYwz1QcxWH"}}]}
|
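One detail worth double-checking in the training cell above: `MovieLensDatasetWithMetadata.__getitem__` looks up `self.metadata[item_id]`, which assumes the full per-item matrix (`metadata_input`, shape `(num_items, num_genres)`), yet the cell constructs the dataset with `train_metadata`, which is already sliced per rating row. Passing `metadata_input` directly keeps the `item_id` lookup consistent (and using `torch.as_tensor` instead of `torch.tensor` on an existing tensor also avoids the repeated `UserWarning` seen in the outputs). A small numpy sketch of the shape difference, with toy values:

```python
import numpy as np

# Toy per-item genre matrix: one row per movie (hypothetical sizes).
item_metadata = np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1],
                          [1, 1, 0],
                          [0, 1, 1]])          # shape (num_items=5, num_genres=3)

# Each rating row references an item by id, items may repeat.
rating_item_ids = np.array([4, 0, 4, 2])

# What the notebook builds as train_metadata: already aligned per rating row.
per_row = item_metadata[rating_item_ids]       # shape (4, 3), NOT indexable by item_id

# __getitem__ does self.metadata[item_id]; that lookup is only correct
# against the per-item matrix:
lookup = item_metadata[rating_item_ids[0]]     # genre row for item 4
print(per_row.shape, lookup)
```

With `metadata_input` passed to the dataset, the same `self.metadata[item_id]` lookup returns the right genre vector for every batch.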