Heroku Database Optimization


05-19-2014


Filed under: heroku, database, performance, featured

Most of us web devs rely on NewRelic to identify slow database queries. Heroku offers a useful, free and little-known command-line tool, heroku-pg-extra that provides great insight as well.

Once installed, run heroku pg:outliers, which shows the top 10 slow queries.


                   qry                   |    exec_time    | prop_exec_time |  ncalls   | sync_io_time 
-----------------------------------------+-----------------+----------------+-----------+--------------
 SELECT  "teacher_tweets".* FROM "teac.. | 00:08:11.793401 | 30.5%          | 3,485,358 | 00:00:00
 SELECT "lessons".* FROM "lessons"  WH.. | 00:07:06.119699 | 26.4%          | 64,543    | 00:00:00
 SELECT "teachers".* FROM "teachers" I.. | 00:02:31.94909  | 9.4%           | 129,286   | 00:00:00
 SELECT COUNT(*) FROM "workouts"  WHER.. | 00:01:11.026701 | 4.4%           | 5,292,598 | 00:00:00
 SELECT  "teachers".* FROM "teachers" .. | 00:00:55.749109 | 3.5%           | 1,735,350 | 00:00:00
 SELECT  "studios".* FROM "studios"  W.. | 00:00:49.19646  | 3.0%           | 1,293,179 | 00:00:00
 SELECT "lessons".* FROM "lessons"  WH.. | 00:00:45.412566 | 2.8%           | 14,166    | 00:00:00
 SELECT "lessons".* FROM "lessons"  WH.. | 00:00:37.907575 | 2.3%           | 14,166    | 00:00:00
 SELECT "retreats".* FROM "retreats"  .. | 00:00:30.425801 | 1.9%           | 64,556    | 00:00:00
 SELECT  "teacher_images".* FROM "teac.. | 00:00:23.229008 | 1.4%           | 1,116,421 | 00:00:00
(10 rows)

If you want to see the entire query without truncation, run this instead:

heroku pg:outliers notruncate


                   qry                                                                                                                        |    exec_time    | prop_exec_time |  ncalls   | sync_io_time 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+-----------+--------------
 SELECT  "teacher_tweets".* FROM "teacher_tweets"  WHERE "teacher_tweets"."teacher_id" = $1  ORDER BY created_at DESC LIMIT ?                                                                                                                     | 00:08:11.809985 | 30.5%          | 3,485,466 | 00:00:00
 SELECT "lessons".* FROM "lessons"  WHERE "lessons"."retired" = ? AND "lessons"."teacher_id" IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)  ORDER BY "lessons".starts_at ASC | 00:07:06.132324 | 26.4%          | 64,545    | 00:00:00
 SELECT "teachers".* FROM "teachers" INNER JOIN "teachers_users" ON "teachers"."id" = "teachers_users"."teacher_id" WHERE "teachers_users"."user_id" = $1                                                                                         | 00:02:31.954223 | 9.4%           | 129,290   | 00:00:00
 SELECT COUNT(*) FROM "workouts"  WHERE "workouts"."lesson_id" = ? AND "workouts"."user_id" = ?                                                                                                                                                   | 00:01:11.029121 | 4.4%           | 5,292,762 | 00:00:00
 SELECT  "teachers".* FROM "teachers"  WHERE "teachers"."id" = $1  ORDER BY "teachers"."id" ASC LIMIT ?                                                                                                                                           | 00:00:55.750938 | 3.5%           | 1,735,404 | 00:00:00
 SELECT  "studios".* FROM "studios"  WHERE "studios"."id" = $1  ORDER BY "studios"."id" ASC LIMIT ?                                                                                                                                               | 00:00:49.198205 | 3.0%           | 1,293,221 | 00:00:00
 SELECT "lessons".* FROM "lessons"  WHERE "lessons"."teacher_id" = $1 AND "lessons"."retired" = ?  ORDER BY "lessons".starts_at ASC                                                                                                               | 00:00:45.412566 | 2.8%           | 14,166    | 00:00:00
 SELECT "lessons".* FROM "lessons"  WHERE "lessons"."teacher_id" = $1 AND "lessons"."retired" = ?                                                                                                                                                 | 00:00:37.907575 | 2.3%           | 14,166    | 00:00:00
 SELECT "retreats".* FROM "retreats"  WHERE "retreats"."featured" = ?                                                                                                                                                                             | 00:00:30.426673 | 1.9%           | 64,558    | 00:00:00
 SELECT  "teacher_images".* FROM "teacher_images"  WHERE "teacher_images"."teacher_id" = $1 AND "teacher_images"."featured" = ?  ORDER BY "teacher_images"."id" ASC LIMIT ?                                                                       | 00:00:23.230178 | 1.4%           | 1,116,455 | 00:00:00
(10 rows)

Pick the top query you would like to optimize and remember measure, optimize, and measure again.

zlu@zlu-mba:~/projects/foo (master)$ heroku pg:psql
psql (9.3.2, server 9.2.6)
SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Type "help" for help.

foodb=> EXPLAIN ANALYZE SELECT  "teacher_tweets".* FROM "teacher_tweets"  WHERE "teacher_tweets"."teacher_id" = 1  ORDER BY created_at DESC LIMIT 5;
                                                                            QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=30.04..30.05 rows=5 width=248) (actual time=0.141..0.141 rows=0 loops=1)
   ->  Sort  (cost=30.04..30.05 rows=17 width=248) (actual time=0.135..0.135 rows=0 loops=1)
         Sort Key: created_at
         Sort Method: quicksort  Memory: 25kB
         ->  Index Scan using index_teacher_tweets_on_teacher_id on teacher_tweets  (cost=0.00..29.99 rows=17 width=248) (actual time=0.057..0.057 rows=0 loops=1)
               Index Cond: (teacher_id = 1)
 Total runtime: 0.258 ms
(7 rows)

Total runtime is what you want to cut down.

Add index of teacher__id to teacher__tweets table and measure again with “EXPLAIN ANALYZE”.

Once the optimization is deployed to the Heroku, run pg:outliners again to make sure optimization has happened and repeat the optimization process.