Rails (via ActiveRecord) makes it super easy to work with your data, and for most simple tasks insulates you from having to think too hard about SQL. But eventually, everyone hits the ORM wall, and needs to start casting SQL spells around to get the data that they want. And writing SQL can really feel like that sometimes, like an incantation that you speak over your database — and magically the data rearranges itself into the form that you need.
Take, for example, the problem of calculating a duration in business hours. Let’s say we have a
Tickets
table with opened_at
and closed_at
timestamps. Our challenge is to calculate the total time the Ticket was open, based not on calendar time but on business hours, like 8am-5pm PST Mon-Fri. Now, you could come up with some Ruby code to calculate that for each row, but we want to do it in the database for all records. So, we are going to build an AR scope that will add an additional column called duration
, calculated at query time.
(NOTE: This post assumes you are one of the cool kids and are using Postgres as your DB.)
To start out, let’s get a duration of simple calendar time.
Now, you can say
which will add an additional column called duration that you can access ruby as usual. Note that there is nothing you need to do in your Rails model file (no need for an attr_accessor etc) for this to happen. ActiveRecord will simply add any extra columns selected to the model automagically. (However, since Rails has no type information for the column, it thinks the Postgres interval type is a String).
So, that was easy. Now for the hard (fun) part. We need to calculate the time difference between
opened_at
and closed_at
but taking into account only M-F and 8a-5p PDT. We are going to basically construct a raw SQL query, and take advantage of Common Table Expressions (CTEs) in Postgres, which are underused but full of awesome.
The SQL (the explanation of which I will defer until another blog post) we need is wrapped up in an AR scope:
So, now we can say
But, this ‘scope’ is not really an AR scope, as it does not return a chainable AR Relation. So you have to always use it at the end of the chain. But the way it currently stands, the query doesn’t know anything about any existing relations, so this won’t work
It would be cool if we could capture the records in the existing scope, and only use those in our query. We can achieve this with another CTE
So we basically converted the current scope to a SQL statement, and used that as a CTE to run the query against, thus limiting the rows we are operating against. We can use other scopes or where clauses as long as we call our
with_biz_duration
scope at the end of the chain.
Now to take it to the bitter end, let’s add the ability to pass in the business hours we want, as well as the timezone.
One issue with our
tickets
table is that the opened_at
and closed_at
fields were created as timestamp
fields, which in Postgres do not have any timezone information. If we assume our DB server was configured to use UTC as the default timezone, then we need to cast the fields into fields with a UTC time zone, which we then cast again into the timezone we want.
So, now we can say
Tune in next time as we delve into the mysteries of
generate_series
and CROSS JOIN
which are at the heart of this powerful incantation. Happy SQLing!
Comments
Post a Comment