Most of the rails developers write queries on daily basis. Some on the other day by mistake we follow rails way instead of SQL way to do tasks like sorting etc. But you might ask what’s the problem regarding which approach I take as long as I am able to get correct results. Yes, you may get correct results but you will end up with performance problems in future. Following a set of scenarios I am observed that SQL querying out powered rails:
1) Apply sorting on a combination of different tables:
I have three tables namely deposits
, purchases
and withdraws
. Now, I want to display latest three records from a combination of three tables.
Let’s check rails way
Benchmark.ms {(Deposit.all + Purchase.all + Withdraw.all).sort{|x, y| x.updated_at <=> y.updated_at}} Deposit Load (0.6ms) SELECT "deposits".* FROM "deposits" Purchase Load (0.4ms) SELECT "purchases".* FROM "purchases" Withdraw Load (0.7ms) SELECT "withdraws".* FROM "withdraws" => 121.91500002518296
Let’s check SQL way
(SELECT 'deposit', authorization_code, amount, status, updated_at FROM DEPOSITS ORDER BY updated_at DESC LIMIT 3) UNION (SELECT 'purchase', ref, amount, status, updated_at FROM PURCHASES ORDER BY updated_at DESC LIMIT 3) UNION (SELECT 'withdraw', ref, amount, status, updated_at FROM WITHDRAWS ORDER BY updated_at DESC LIMIT 3) ORDER BY updated_at DESC LIMIT 3 => SQL (2.0ms)
2) Apply sorting on different types of data:
Let’s say a table stores gifs, videos of different duration. Now I want sort videos, gifs in ascending order of content duration separately.
Let’s check rails way
Benchmark.ms do videos = SpotBooking.where(content_type: 'video').order('content_duration DESC') gifs = SpotBooking.where(content_type: 'gif').order('content_duration DESC') videos + gifs end SpotBooking Load (0.4ms) SELECT "spot_bookings".* FROM "spot_bookings" WHERE "spot_bookings"."content_type" = $1 ORDER BY content_duration DESC [["content_type", 3]] SpotBooking Load (0.4ms) SELECT "spot_bookings".* FROM "spot_bookings" WHERE "spot_bookings"."content_type" = $1 ORDER BY content_duration DESC [["content_type", 2]] => 12.477000011131167
Let’s check SQL way
SELECT id, content_duration, row_number() OVER (PARTITION BY content_type ORDER BY content_duration DESC) FROM spot_bookings => SQL (0.8ms)
3) Formatting the data while querying:
Let’s check rails way:
Benchmark.ms do h = {} pricings = SpotPricing.order('starts_at ASC') pricings.each do |b| h[b.starts_at.strftime("%H %M %P")] = b.price.to_f end end SpotPricing Load (0.5ms) SELECT "spot_pricings".* FROM "spot_pricings" ORDER BY starts_at ASC => 35.459000151604414 # output => {"02:19 AM"=>"1.00", "03:19 AM"=>"2.00", "04:19 AM"=>"3.00", "05:19 AM"=>"4.00", "06:19 AM"=>"5.00", "07:19 AM"=>"6.00", "08:19 AM"=>"7.00", "09:19 AM"=>"8.00", "10:19 AM"=>"9.00", "11:19 AM"=>"10.00", "12:19 PM"=>"11.00", "13:19 PM"=>"12.00", "14:19 PM"=>"13.00", "15:19 PM"=>"14.00", "16:19 PM"=>"15.00", "17:19 PM"=>"16.00", "18:19 PM"=>"17.00", "19:19 PM"=>"18.00", "20:19 PM"=>"19.00", "21:19 PM"=>"20.00", "23:19 PM"=>"22.00", "00:19 AM"=>"23.00"}
Let’s check SQL way:
SpotPricing.order('starts_at ASC').pluck("to_char(starts_at, 'HH24:MI AM')", "to_char(price, 'FM999999999.00')") SELECT to_char(starts_at, 'HH24:MI AM'), to_char(price, 'FM999999999.00') FROM "spot_pricings" ORDER BY ASC => SQL (0.5ms) => {"02:19 AM"=>"1.00", "03:19 AM"=>"2.00", "04:19 AM"=>"3.00", "05:19 AM"=>"4.00", "06:19 AM"=>"5.00", "07:19 AM"=>"6.00", "08:19 AM"=>"7.00", "09:19 AM"=>"8.00", "10:19 AM"=>"9.00", "11:19 AM"=>"10.00", "12:19 PM"=>"11.00", "13:19 PM"=>"12.00", "14:19 PM"=>"13.00", "15:19 PM"=>"14.00", "16:19 PM"=>"15.00", "17:19 PM"=>"16.00", "18:19 PM"=>"17.00", "19:19 PM"=>"18.00", "20:19 PM"=>"19.00", "21:19 PM"=>"20.00", "23:19 PM"=>"22.00", "00:19 AM"=>"23.00"}
In all above scenarios, SQL plays best compare to conventional rails way. Sometimes I would feel it’s not a big deal if you not follow design patterns as long as your ability to provide the best outcome.
Any suggestions and feedback would be welcome!!!
Happy querying!!!