
Improving query speed in PostgreSQL for large datasets with multiple filters


I’m relatively new to PostgreSQL and host a PostgreSQL database on a remote AWS server, which I use to analyze derivatives data (futures and options) with Python.

I’ve created two separate tables to store options data (one for call options and one for put options), and a table for futures data. Below are the table structures:

Options table (for call/put):

  • datetime (timestamp)
  • open (float4)
  • high (float4)
  • low (float4)
  • close (float4)
  • volume (int4)
  • open_interest (int4)
  • strike (float4)
  • expiry (date)
  • delta (float4)

Futures table:

  • datetime (timestamp)
  • open (float4)
  • high (float4)
  • low (float4)
  • close (float4)
  • volume (int4)
  • open_interest (int4)

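For reference, this is roughly the DDL behind those tables (the put table abc_put_iv is the one used in the example below; the futures table name here is only a placeholder):

-- Put options table; the call options table has the same structure.
CREATE TABLE abc_put_iv (
    datetime      timestamp,
    open          float4,
    high          float4,
    low           float4,
    close         float4,
    volume        int4,
    open_interest int4,
    strike        float4,
    expiry        date,
    delta         float4
);

-- Futures table (abc_fut is an illustrative name).
CREATE TABLE abc_fut (
    datetime      timestamp,
    open          float4,
    high          float4,
    low           float4,
    close         float4,
    volume        int4,
    open_interest int4
);
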
For example, I have a put options table named abc_put_iv. When I run the following query:

SELECT *
FROM abc_put_iv
WHERE strike = 1200
  AND expiry = '2019-02-14'
ORDER BY datetime ASC;

The query takes a significant amount of time to execute. Since I plan to fetch data based on different parameters such as strike, expiry, and potentially a specific trading date (there is no separate date column, only the datetime column), I’m looking for ways to optimize this query and make it faster.
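
For a single trading date I would presumably have to filter on the datetime column directly, roughly like this (a sketch of what I have in mind, not something I have benchmarked):

SELECT *
FROM abc_put_iv
WHERE strike = 1200
  AND expiry = '2019-02-14'
  AND datetime >= '2019-02-01'
  AND datetime <  '2019-02-02'
ORDER BY datetime ASC;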

I usually connect to the database using Python and psycopg2 to fetch data for various experiments.

My question: What are the best practices and detailed suggestions for optimizing this query to speed up data retrieval? Any tips on indexing, query restructuring, or other performance improvements would be highly appreciated.
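
For instance, would a composite index along these lines be the right direction (the index name is just illustrative)?

CREATE INDEX idx_put_strike_expiry_datetime
    ON abc_put_iv (strike, expiry, datetime);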


