October 12, 2022

Improving Query Performance with Indexes using Prisma: Hash Indexes

One strategy for improving performance for your database queries is using indexes. This article will dive into hash indexes: taking a look at the data structure used and improve the performance of an existing query with an index using Prisma.

Improving Query Performance with Indexes using Prisma: Introduction

Improving Query Performance with Indexes using Prisma: B-Tree Index

(Currently reading)

Improving Query Performance with Indexes using Prisma: Hash Indexes

Overview

Introduction
Hash tables: the data structure that powers hash indexes
When to use a hash index
Working with hash indexes using Prisma
Summary and next steps

Introduction

In this part of the series, you will learn what hash indexes are, how they work, and when to use them, and then dive into a concrete example of how you can improve the performance of a query with a hash index using Prisma.

If you want to learn more about the fundamentals of database indexes, check out the first part.

Hash tables: the data structure that powers hash indexes

Hash indexes use the hash table data structure. Hash tables (also known as hash maps) are great data structures that allow fast data retrieval in almost constant time (O(1)). This means the retrieval time of a record won't be affected by the size of the data being searched.

If you're unfamiliar with the concept of Big O notation, take a look at What is Big O notation.

PostgreSQL's hash index is composed of "buckets" or "slots" into which tuples are placed.

PostgreSQL uses a hash function to compute a hash key or hash code when storing a value to the index:

Hash key: maps the value to a 32-bit integer.
Hash code: maps to a bucket number in which the value will be stored.

Hash function: a function that maps data of arbitrary size to fixed-size values.

Hash code/ key: the output of a hash function.

When retrieving a record using a hash index, the database applies the hash function to the value to determine the bucket that might contain the value. After determining the bucket, the database will search through the tuple to find the records that match your query.

If you're interested in reading about PostgreSQL's implementation of the hash index, you can read further here.

When to use a hash index

Hash indexes would be a solid choice if you only intend to use the equality operator (=) to query your data. For example, in the query in the example below, a hash index would be suitable.

If you intend to use range operators (<, <=,>, >=) when filtering your data, you can use a B-Tree index. You can refer to part 2 to learn more about B-tree indexes.

Hash indexes only work with the equality (=) operator. This means that a hash index would be a solid choice if you're using the = operator when querying your data.

While a hash index might be a good choice for speeding up your queries, it comes with some caveats. Some of the limitations are that they:

Cannot be used to index multiple columns
Cannot be used to create sorted indexes
Cannot be used to enforce unique constraints

Working with hash indexes using Prisma

Assumed knowledge

To follow along, the following knowledge will be assumed:

Some familiarity with JavaScript/TypeScript
Some experience working with REST APIs
A basic understanding of working with Git

Development environment

You will also be expected to have the following tools set up in your development environment:

Node.js
Git
Docker or PostgresQL
Prisma VS Code extension (optional): intellisense and syntax highlighting for Prisma
REST Client VS Code extension (optional): sending HTTP requests on VS Code

Note: If you don't have Docker or PostgreSQL installed, you can set up a free database on Railway.

Clone the repository and install dependencies

Navigate to your directory of choice and clone the repository:

Change the directory to the cloned repository and install dependencies:

Next, rename the .env.example file to .env.

Project walkthrough

The sample project is a minimal REST API built with TypeScript and Fastify.

The project contains the following file structure:

The notable files and directories for this project are:

The prisma folder contains:
- The schema.prisma file that defines the database schema
- The migrations directory that contains the database migrations history
- The seed.ts file that contains a script to seed your development database
The src directory:
- The index.ts file defines a REST API using Fastify. It contains one endpoint called /users and accepts one optional query parameter — firstName
The docker-compose.yml file defining the PostgreSQL database docker image
The .env file containing your database connection string

The application contains a single model in the Prisma schema called User with the following fields:

The src/index.ts file contains primitive logging middleware to measure the time taken by a Prisma query:

You can use the logged data to determine which Prisma queries are slow. You can use the logs to gauge queries that could require some performance improvements.

src/index.ts also logs Prisma query events and parameters to the terminal. The query event and parameters contains the SQL query and parameters that Prisma executes against your database.

The SQL queries (with filled-in parameters) can be copied and prefixed with EXPLAIN to view the query plan the database will provide.

Create and seed the database

Start up the PostgreSQL database with docker:

Next, apply the existing database migration in prisma/migrations:

The above command will:

Create a new database called users-db (inferred from the connection string defined in the .env file)
Create a User table as defined by the model in prisma/schema.prisma.
Trigger the seeding script defined in package.json. The seeding step is triggered because it's run against a new database.

The seed file in prisma/seed.ts will populate the database with a million user records.

Start up the application server:

Make an API request

The cloned repository contains a requests.http file that contains sample requests to http://localhost:3000/users that can be used by the installed REST Client VS Code extension. The requests contain different firstName query parameters.

Note: Ensure you've installed the REST Client VS Code extension for this step. You can also use other API testing tools such as Postman, Insomnia, or your preferred tool of choice.```

Click the Send Request button right above the request to make the request.

VS Code will open an editor tab on the right side of the window with the responses. You should also see some information logged on the terminal.

In the screenshot above, the query took 55 ms.

Improve query performance with a hash index

You can define a hash index in your Prisma schema file using the @@index() attribute function and providing the following arguments:

fields: a list of fields to be indexed
type: the name of the index created in the database

The @@index attribute supports more arguments you can learn more about in the Prisma Schema API Reference.

Next, specify the firstName in the fields argument and Hash as the value of the type argument.

After making the change, create and run another migration to create the index on the firstName field in the User model:

Generated SQL migration

Next, navigate to the requests.http file again and resend the requests to the /users route.

You will notice an improvement in response times. In my case, in the screenshot below, the response times have been down to about 9 to 11ms.

Congratulations! 🎉

You've learned how to reduce your database queries' response times using a hash index.

Summary and next steps

In this part, you learned what hash indexes are, their internal structure and limitations, and how to define and use a hash index using Prisma.

If you would like learn about the fundamentals of database indexes and B-Tree indexes, refer to part 1 and part 2.

In the following article, you will learn about GIN indexes: what it is, their structure, how it works, and how you can utilize a GIN index in your application using Prisma.

Education