back to Adaptly.com

ARel: Nifty & Vexingly Undocumented

By Philip Champon Developer Team Lead

tl;dr Reference (examples and output, sans commentary)

Ambiguous column reference, what ever do you mean?

Ryan Bates was good enough to sneak in some hints on better ActiveRecord encapsulation in his excellent “Advanced Queries in Rails 3”. I began using #merge and #& immediately. Unfortunately, not all of our code was ready to live in an encapsulated world. Quite often, I found myself with SQL errors, because columns were referenced  as symbols and ActiveRecord failed to qualify the name space… ActiveRecord only converts symbols in finder methods (find* and #where) to canonically named columns. Any #group, #having, #select, or #order symbols will burn you sooner or later, particularly if they’re part of a scope.

I began solving this on my own, by using the #arel_table class method within my model scopes. As far as piecemeal solutions went, this kept me happy… But I wanted to share the joy with my co-workers and I couldn’t justifiably ask them to read ARel code and tests (since ARel mostly lacks anything nearing useful documentation). I was forced to finally make a presentation…

Ambiguous column references in scopes can be merged into other queries.

The following snippet is a valid scope. It exemplifies the shortsighted tendencies a lot of Rails models assume. Once you attempt to merge this with another model, you’ll find your SQL server returning an ambiguous column reference.

class Product < ActiveRecord::Base
  scope :with_id, select(:id)
end
  => SELECT id FROM "<T>"
class Category < ActiveRecord::Base
  scope :with_product, joins(:products).merge Product.with_id
end
  => SELECT id FROM "categories" INNER JOIN "products" ON "products"."category_id" = "categories"."id"

Better encapsulation yields more reusable and saner code. This can be rectified by hardcoding the table name, using the #table_name class method, or accessing the symbol via the #arel_table class method.

I’m partial to the arel_table method, because it gives you an Arel::Table object, which is very powerful. Aside from canonically naming the field for you, ARel will also correctly quote the name for each database engine you use. 

select("<T>.id")
  => SELECT <T>.id FROM "<T>"
select("#{table_name}.id")
  => SELECT <T>.id FROM "<T>"
select(arel_table[:id])
  => SELECT "<T>"."id" FROM "<T>"

Ambiguous and ugly

def self.cheaper_than(price)
  where("price < ?", price)
end
  => SELECT * FROM "<T>" WHERE price < N

This method is ambiguous and it can be more easily written using ARel’s relational algebra. ARel supports many predicates, including eq, not_eq, in, not_in, matches (LIKE or ILIKE in PgSQL), does_not_match (NOT LIKE or NOT ILIKE in PgSQL), gt, gteq, lt, and lteq. Each of the aforementioned predicates also supports _any and _all suffixes, adding AND and OR support for arrays.

def self.cheaper_than(price)
  where(arel_table[:price].lt(price))
end
  => SELECT * FROM "<T>" WHERE "<T>"."price" < N
ARel to the rescue, almost

ARel is fun and fascinating, the helpers it fails to document make writing SQL even easier than ActiveRecord on its own.
scope :order_id_asc, order(arel_table[:id].asc)
  => ORDER BY "<T>"."id" ASC
scope :order_id_desc, order(arel_table[:id].desc)
  => ORDER BY "<T>"."id" DESC
scope :order_name_ci, order(arel_table[:name].lower)
  => ORDER BY LOWER("<T>"."name")
ARel is also a bit young and quirky. For example, when you alias fields using #as, you must use strings, not symbols. ARel also provides you with convenient wrappers for aggregate functions (AVG, MAX, MIN, COUNT, SUM) and infix operators (+, -, *, /). Of course, they don’t all behave quite the same.

When using aggregate functions the results, in the SELECT, is aliased as function_id. This is true for all of the aggregates except count, which will remain unaliased, by default.

scope :avg_cost, arel_table[:cost].average
  => SELECT AVG("<T>"."cost") as avg_id FROM "<T>"
scope :avg_cost, arel_table[:cost].average.as(:avg)
  => SELECT AVG("<T>"."cost") as avg FROM "<T>"
scope :count_cost, arel_table[:cost].count 
  => SELECT COUNT("<T>"."cost") FROM "<T>"
scope :count_cost, arel_table[:cost].count.as(:count) 
  => SELECT COUNT("<T>"."cost") AS count FROM "<T>"

Also, when using multiplication or division infix operators you can alias the results, but you cannot alias addition or subtraction results. Furthermore, there are no infix operators for SQL bitwise operations.

scope :add_n1_n2, arel_table[:n1] + arel_table[:n2]
scope :mul_n1_n2, arel_table[:n1] * arel_table[:n2].as "foo"
ARel also supports grouping (parenthesizing statements, controlling order of execution).
scope :order_id_asc, arel_table.grouping(arel_table[:id].gt(1).and(arel_table[:n1].gt(1))
  => SELECT * FROM "<T>" WHERE ("<T>"."id" > 1 AND "<T>"."n1" > 1)
The tl;dr Reference contains guides on more advanced topics, like polymorphic joins, sub selects, and unions.
21 notescomments

Adaptly Hackathon

By Sean Shillo, VP of Engineering 

The Adaptly office was almost completely empty on December 23, 2011 with the exception of the developer team, a strong aroma of highland scotch in the air, and the excitement surrounding our first ever hackfest. The purpose of this friendly hackathon was to let our developer’s skills and creativity run loose with the guidelines of building something related to both social media and advertising. 

Our goal was to evaluate the number of public fans of a current page on Facebook and then rank them by their level of influence. A fan’s influence is measured by his ability to affect other users. Thus, not all fans of a page are created equal, and given equal prices for fans, certain demographics are preferred over others for long term engagement with a page, brand, etc.

Due to time constraints the Klout score was chosen as the best way to measure a person’s social influence. The biggest challenge with using Klout is the API is searchable by Twitter handle and not Facebook user.

Our first approach to linking the Twitter users to Facebook users was to assess the information of all of the active people on a Facebook page. Based on each person’s name, username and location, we were able to use the Twitter search API to match up these fields and make the connection. This method seemed efficient until the rate limiting started. Undeterred, we resorted to Google and issued a search of a person’s name constrained to the Twitter domain. A little regex magic produced a list of possible Twitter usernames, providing us with the relevant information needed to connect the Twitter and Facebook accounts.

There are certainly many concerns and questions with the approaches used and assumptions made here. The main deal breaker assumption is that the fan subset found will have a valid representation of the whole set. There is also the reliance of the Klout metric.

In the end it was felt that with assumptions holding, the project would give the ability to chose a certain demographic of fans against another all others things being equal. It should be noted that all information gathered was public. Next up on the hackfest list:

Backend language shootout - Scala, Clojure, Ruby, Go, Python, there can only be one
Big Data Madness  (AKA how do I crash postgres) - HBase, Mongo, Cassandra, or Riak
Hadoop Shadoop - To insert or to log and parse

Tools Used:
Heroku
Postgres
Ruby on Rails 3.1
Relevant Gems: koala, klout-rb, resque, geocoder, twitter, mechanize

16 notescomments