October 12, 2022

Improving Query Performance with Indexes using Prisma: Hash Indexes

One strategy for improving performance for your database queries is using indexes. This article will dive into hash indexes: taking a look at the data structure used and improve the performance of an existing query with an index using Prisma.



In this part of the series, you will learn what hash indexes are, how they work, and when to use them, and then dive into a concrete example of how you can improve the performance of a query with a hash index using Prisma.

If you want to learn more about the fundamentals of database indexes, check out the first part.

Hash tables: the data structure that powers hash indexes

Hash indexes use the hash table data structure. Hash tables (also known as hash maps) are great data structures that allow fast data retrieval in almost constant time (O(1)). This means the retrieval time of a record won't be affected by the size of the data being searched.

If you're unfamiliar with the concept of Big O notation, take a look at What is Big O notation.

PostgreSQL's hash index is composed of "buckets" or "slots" into which tuples are placed.

PostgreSQL uses a hash function to compute a hash key or hash code when storing a value to the index:

  • Hash key: maps the value to a 32-bit integer.
  • Hash code: maps to a bucket number in which the value will be stored.

Hash function: a function that maps data of arbitrary size to fixed-size values.

Hash code/ key: the output of a hash function.

When retrieving a record using a hash index, the database applies the hash function to the value to determine the bucket that might contain the value. After determining the bucket, the database will search through the tuple to find the records that match your query.

If you're interested in reading about PostgreSQL's implementation of the hash index, you can read further here.

When to use a hash index

Hash indexes would be a solid choice if you only intend to use the equality operator (=) to query your data. For example, in the query in the example below, a hash index would be suitable.

SELECT firstName from 'User' where lastName = 'Wick';

If you intend to use range operators (<, <=,>, >=) when filtering your data, you can use a B-Tree index. You can refer to part 2 to learn more about B-tree indexes.

Hash indexes only work with the equality (=) operator. This means that a hash index would be a solid choice if you're using the = operator when querying your data.

While a hash index might be a good choice for speeding up your queries, it comes with some caveats. Some of the limitations are that they:

  • Cannot be used to index multiple columns
  • Cannot be used to create sorted indexes
  • Cannot be used to enforce unique constraints

Working with hash indexes using Prisma

Assumed knowledge

To follow along, the following knowledge will be assumed:

  • Some familiarity with JavaScript/TypeScript
  • Some experience working with REST APIs
  • A basic understanding of working with Git

Development environment

You will also be expected to have the following tools set up in your development environment:

Note: If you don't have Docker or PostgreSQL installed, you can set up a free database on Railway.

Clone the repository and install dependencies

Navigate to your directory of choice and clone the repository:

git clone -b hash-indexes git@github.com:ruheni/prisma-indexes.git

Change the directory to the cloned repository and install dependencies:

cd prisma-indexes
npm install

Next, rename the .env.example file to .env.

mv .env.example .env

Project walkthrough

The sample project is a minimal REST API built with TypeScript and Fastify.

The project contains the following file structure:

├── .github/workflows
│ │ └── test.yaml
│ └── renovate.json
├── node_modules
├── prisma
│ ├── migrations/
│ ├── schema.prisma
│ └── seed.ts
├── src
│ └── index.ts
├── README.md
├── .env
├── .gitignore
├── docker-compose.yml
├── package-lock.json
├── package.json
├── requests.http
└── tsconfig.json

The notable files and directories for this project are:

  • The prisma folder contains:
    • The schema.prisma file that defines the database schema
    • The migrations directory that contains the database migrations history
    • The seed.ts file that contains a script to seed your development database
  • The src directory:
    • The index.ts file defines a REST API using Fastify. It contains one endpoint called /users and accepts one optional query parameter — firstName
  • The docker-compose.yml file defining the PostgreSQL database docker image
  • The .env file containing your database connection string

The application contains a single model in the Prisma schema called User with the following fields:

// prisma/schema.prisma
model User {
id Int @id @default(autoincrement())
firstName String
lastName String
email String

The src/index.ts file contains primitive logging middleware to measure the time taken by a Prisma query:

// src/index.ts
prisma.$use(async (params, next) => {
const before = Date.now()
const result = await next(params)
const after = Date.now()
logger.info(`Query took ${after - before}ms`)
return result

You can use the logged data to determine which Prisma queries are slow. You can use the logs to gauge queries that could require some performance improvements.

src/index.ts also logs Prisma query events and parameters to the terminal. The query event and parameters contains the SQL query and parameters that Prisma executes against your database.

const prisma = new PrismaClient({
log: [{ emit: "event", level: "query", },],
prisma.$on("query", async (e) => {
logger.info(`Query: ${e.query}`)
logger.info(`Params: ${e.params}`)

The SQL queries (with filled-in parameters) can be copied and prefixed with EXPLAIN to view the query plan the database will provide.

Create and seed the database

Start up the PostgreSQL database with docker:

docker-compose up -d

Next, apply the existing database migration in prisma/migrations:

npx prisma migrate dev

The above command will:

  1. Create a new database called users-db (inferred from the connection string defined in the .env file)
  2. Create a User table as defined by the model in prisma/schema.prisma.
  3. Trigger the seeding script defined in package.json. The seeding step is triggered because it's run against a new database.

The seed file in prisma/seed.ts will populate the database with a million user records.

Start up the application server:

npm run dev

Make an API request

The cloned repository contains a requests.http file that contains sample requests to http://localhost:3000/users that can be used by the installed REST Client VS Code extension. The requests contain different firstName query parameters.

Note: Ensure you've installed the REST Client VS Code extension for this step. You can also use other API testing tools such as Postman, Insomnia, or your preferred tool of choice.`

Click the Send Request button right above the request to make the request.

VS Code will open an editor tab on the right side of the window with the responses. You should also see some information logged on the terminal.

In the screenshot above, the query took 55 ms.

Improve query performance with a hash index

You can define a hash index in your Prisma schema file using the @@index() attribute function and providing the following arguments:

  • fields: a list of fields to be indexed
  • type: the name of the index created in the database

The @@index attribute supports more arguments you can learn more about in the Prisma Schema API Reference.

Next, specify the firstName in the fields argument and Hash as the value of the type argument.

model User {
id Int @id @default(autoincrement())
firstName String
lastName String
email String
+ @@index(fields: [firstName], type: Hash)

After making the change, create and run another migration to create the index on the firstName field in the User model:

npx prisma migrate dev --name add-firstName-index
--- prisma/migrations/[timestamp]_add_firstName_index/migration.sql
-- CreateIndex
CREATE INDEX "User_firstName_idx" ON "User" USING HASH ("firstName");

Next, navigate to the requests.http file again and resend the requests to the /users route.

You will notice an improvement in response times. In my case, in the screenshot below, the response times have been down to about 9 to 11ms.

Congratulations! 🎉

You've learned how to reduce your database queries' response times using a hash index.

Summary and next steps

In this part, you learned what hash indexes are, their internal structure and limitations, and how to define and use a hash index using Prisma.

If you would like learn about the fundamentals of database indexes and B-Tree indexes, refer to part 1 and part 2.

In the following article, you will learn about GIN indexes: what it is, their structure, how it works, and how you can utilize a GIN index in your application using Prisma.

Don’t miss the next post!

Sign up for the Prisma Newsletter