Author

Francis Ndungu

How to Create a Product Recommendation System with Python, Pandas, NumPy, and Scikit-learn

20 Mar, 2026

Introduction

Recommendation systems help businesses suggest products to customers based on their preferences and behavior. With Python libraries like Pandas, NumPy, and Scikit-learn, you can build a simple recommendation engine that analyzes user ratings and predicts what products they might like.

This guide shows you how to create a basic product recommendation system step by step.

Prerequisites

Before you start:

Purchase an Ubuntu 24.04 VPS server. If you don't have a VPS server, sign up with Vultr and get up to $300 worth of free credit to test the Vultr platform.
Connect to your VPS server over SSH. On Windows, use PuTTY. On Linux or macOS, run the following command.
console
```
$ ssh username@vps_server_public_ip_address
```
Create a non-root user with sudo privileges. Read our guide on How to Create a Non-Root Sudo User on Ubuntu 24.04. You'll use this user's account to run the commands in this guide.
Install Python 3.10 or later by following our How to Install Python on Ubuntu 24.04 guide.

Set Up a Virtual Environment

It’s best practice to use a virtual environment so your project dependencies don’t interfere with other Python projects.

Create and switch to a new directory for your project:
console
```
$ mkdir product-recommendation && cd product-recommendation
```

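Create a virtual environment. On Ubuntu, this step needs the python3-venv package. If the command below fails because ensurepip is not available, install the package with sudo apt install python3-venv and run the command again: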
console
```
$ python3 -m venv venv
```
Activate the virtual environment:
console
```
$ source venv/bin/activate
```
Install the required libraries inside this environment:
console
```
(venv) $ pip install pandas numpy scikit-learn
```
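
To confirm that the libraries installed correctly, a quick check along the following lines can be run with the virtual environment active. This is only a sanity check, and the version numbers it prints depend on what pip resolved at install time.

```python
import numpy as np
import pandas as pd
import sklearn

# Collect the installed version of each library used in this guide
versions = {
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

If any import fails, check that the virtual environment is still active (the prompt should start with (venv)) before reinstalling.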

Create a Sample Dataset

Use Pandas to build a simple dataset of users rating products.

Python
```
import pandas as pd

# Sample data: user, product, rating
data = {
    'user_id': [1, 1, 1, 2, 2, 3, 3, 4],
    'product_id': [101, 102, 103, 101, 104, 102, 104, 103],
    'rating': [5, 3, 4, 4, 5, 2, 4, 5]
}

df = pd.DataFrame(data)
print(df)
```

You'll use this dataset in the next step.

Create a User-Product Matrix

Pivot the data so that each row represents a user and each column represents a product.

Python
```
user_product_matrix = df.pivot_table(
    index='user_id',
    columns='product_id',
    values='rating'
)

print(user_product_matrix)
```

Each cell in this matrix holds the rating a user gave a product. Missing values (shown as NaN) mean the user hasn’t rated that product yet.
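
One detail worth knowing about pivot_table is that it averages duplicate (user, product) pairs by default. The sample dataset has no duplicates, so the sketch below uses a small hypothetical DataFrame (dup_df) to show that behavior, plus one possible alternative that keeps only the last rating, assuming rows are in chronological order.

```python
import pandas as pd

# Hypothetical data: user 1 rated product 101 twice (5, then 3)
dup_df = pd.DataFrame({
    'user_id': [1, 1, 2],
    'product_id': [101, 101, 101],
    'rating': [5, 3, 4],
})

# Default behavior: duplicate pairs are averaged (aggfunc='mean')
averaged = dup_df.pivot_table(
    index='user_id', columns='product_id', values='rating'
)
print(averaged)

# Alternative: keep only the last rating per (user, product) pair,
# assuming the rows are ordered from oldest to newest
latest = dup_df.drop_duplicates(subset=['user_id', 'product_id'], keep='last')
latest_matrix = latest.pivot_table(
    index='user_id', columns='product_id', values='rating'
)
print(latest_matrix)
```

Which option fits depends on whether a repeated rating should count as a correction or as additional evidence.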

Handle Missing Values

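Replace missing values with 0 for simplicity. Keep in mind that the similarity calculation then treats an unrated product as a rating of 0. That simplification is fine for a small demo but can skew results on real data.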

Python
```
import numpy as np

matrix_filled = user_product_matrix.fillna(0)
print(matrix_filled)
```
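
A common alternative to plain zero-filling is to subtract each user's average rating first, so that 0 means "no different from this user's average" instead of "rated 0". The sketch below is not part of the main walkthrough. It rebuilds the sample matrix so it can run on its own, and the names user_means and matrix_centered are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 3, 3, 4],
    'product_id': [101, 102, 103, 101, 104, 102, 104, 103],
    'rating': [5, 3, 4, 4, 5, 2, 4, 5],
})
user_product_matrix = df.pivot_table(
    index='user_id', columns='product_id', values='rating'
)

# Average rating per user; NaN cells are skipped by mean()
user_means = user_product_matrix.mean(axis=1)

# Subtract each user's average from that user's row
centered = user_product_matrix.sub(user_means, axis=0)

# Unrated products become 0, i.e. "neutral" for that user
matrix_centered = centered.fillna(0)
print(matrix_centered)
```

One limitation shows up immediately: User 4 has a single rating, so their centered row is all zeros and carries no information for similarity. The rest of this guide keeps the simpler zero-filled matrix.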

Compute Similarity Between Users

Use cosine similarity from Scikit-learn to measure how similar users are based on their ratings.

Python
```
from sklearn.metrics.pairwise import cosine_similarity

# Compute similarity
user_similarity = cosine_similarity(matrix_filled)

print(user_similarity)
```

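The result is a 4-by-4 matrix in which entry (i, j) is the similarity between the i-th and j-th users. Because ratings are non-negative, the values range from 0 to 1: closer to 1 means more similar, and each user's similarity with themselves (the diagonal) is 1.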
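
To see what cosine similarity computes, the same value can be worked out by hand with NumPy: the dot product of two rating vectors divided by the product of their lengths. The sketch below hard-codes the zero-filled rows for Users 1 and 4 from the sample data, and cosine is an illustrative helper, not a library function.

```python
import numpy as np

# Zero-filled rating vectors (columns: products 101, 102, 103, 104)
u1 = np.array([5.0, 3.0, 4.0, 0.0])  # User 1
u4 = np.array([0.0, 0.0, 5.0, 0.0])  # User 4

def cosine(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_1_4 = cosine(u1, u4)
print(round(sim_1_4, 4))  # 0.5657
```

The two users only overlap on product 103, so the dot product is 4 × 5 = 20, and dividing by the vector lengths (about 7.07 and 5) gives roughly 0.57.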

Make Recommendations

Recommend products to a user based on what similar users liked.

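Python
```
def recommend_products(user_id, matrix, similarity, top_n=2):
    # Look up the user's row position instead of assuming IDs run 1..N
    user_pos = matrix.index.get_loc(user_id)
    sim_scores = similarity[user_pos].copy()
    sim_scores[user_pos] = -1  # exclude the user from their own neighbors
    # Find the most similar other user
    similar_user = matrix.index[np.argmax(sim_scores)]
    print(f"User {user_id} is most similar to User {similar_user}")

    # Keep products the similar user rated that this user has not rated yet
    neighbor_ratings = matrix.loc[similar_user]
    unseen = neighbor_ratings[(neighbor_ratings > 0) & (matrix.loc[user_id] == 0)]
    # Highest-rated products first
    recommended = unseen.sort_values(ascending=False).index.tolist()

    return recommended[:top_n]

print(recommend_products(1, matrix_filled, user_similarity))
```

This function finds the most similar user and recommends the products that user rated and the target user has not, highest rating first. In the sample data, User 1 is most similar to User 4, who has only rated product 103. User 1 has already rated that product, so the list for User 1 is empty.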
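
Relying on a single most similar user can leave some users with few or no useful suggestions. One common variation, sketched below under the same sample data, scores every product the user has not rated by a similarity-weighted average of the other users' ratings. The helper name recommend_weighted is hypothetical, and the block rebuilds the matrix and similarity scores so it can run on its own.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 3, 3, 4],
    'product_id': [101, 102, 103, 101, 104, 102, 104, 103],
    'rating': [5, 3, 4, 4, 5, 2, 4, 5],
})
matrix_filled = df.pivot_table(
    index='user_id', columns='product_id', values='rating'
).fillna(0)
user_similarity = cosine_similarity(matrix_filled)

def recommend_weighted(user_id, matrix, similarity, top_n=2):
    """Rank unrated products by similarity-weighted average rating."""
    user_pos = matrix.index.get_loc(user_id)
    weights = similarity[user_pos].copy()
    weights[user_pos] = 0.0  # ignore the user's own ratings

    ratings = matrix.to_numpy()
    # Weighted sum of the other users' ratings, per product
    weighted_sum = weights @ ratings
    # Total similarity of the users who actually rated each product
    weight_total = weights @ (ratings > 0).astype(float)
    # Score stays 0 where no similar user rated the product
    scores = np.divide(weighted_sum, weight_total,
                       out=np.zeros_like(weighted_sum),
                       where=weight_total > 0)

    scores = pd.Series(scores, index=matrix.columns)
    unseen = scores[(matrix.loc[user_id] == 0) & (scores > 0)]
    return unseen.sort_values(ascending=False).index.tolist()[:top_n]

print(recommend_weighted(1, matrix_filled, user_similarity))  # [104]
```

For User 1, the only unrated product is 104, which Users 2 and 3 both rated, so it is returned even though neither is User 1's closest match.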

Test the Recommendation System

Try recommending products for different users:

Python
```
print("Recommendations for User 1:", recommend_products(1, matrix_filled, user_similarity))
print("Recommendations for User 2:", recommend_products(2, matrix_filled, user_similarity))
```
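
When checking results, it helps to see which neighbor each user is matched with. The sketch below labels the similarity matrix with user IDs and lists the closest other user for everyone in the sample data. The names sim_df and nearest are illustrative, and the block rebuilds the matrix so it can run on its own.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 3, 3, 4],
    'product_id': [101, 102, 103, 101, 104, 102, 104, 103],
    'rating': [5, 3, 4, 4, 5, 2, 4, 5],
})
matrix_filled = df.pivot_table(
    index='user_id', columns='product_id', values='rating'
).fillna(0)

# Label the similarity matrix with user IDs so it is easier to read
sim_df = pd.DataFrame(
    cosine_similarity(matrix_filled),
    index=matrix_filled.index,
    columns=matrix_filled.index,
)

# Hide each user's similarity with themselves, then take the best match per row
off_diagonal = sim_df.mask(np.eye(len(sim_df), dtype=bool))
nearest = off_diagonal.idxmax(axis=1)
print(nearest)
```

With the sample ratings, Users 1 and 4 are each other's closest match, as are Users 2 and 3.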

Conclusion

In this guide, you built a simple product recommendation system using Pandas, NumPy, and Scikit-learn. You learned how to create a user-product matrix, compute similarity between users, and recommend products based on similar users’ ratings. This approach is a basic form of user-based collaborative filtering. You can improve it by using larger datasets, combining ratings from several similar users instead of one, trying other similarity measures, or training machine learning models on the rating data.
