### Building an activity feed #### Common solutions using a relational design # When faced with this application requirement, the most common approach by # far have been to create an *activities* table, and rows in this table would # reference a *user*. Activities would typically be generated for each # follower (or friend) when a certain user performs an action, like posting a # new status update. #### Problems # The biggest issue with this design, is that the *activities* table will # quickly get very huge, at which point you would need to shard it on # *user_id*. Also, inserting thousands of entries per second would quickly # bring your database to its knees. #### Ohm Solution # As always we need to require `Ohm`. require "ohm" # We create a `User` class, with a `set` for all the other users he # would be `following`, and another `set` for all his `followers`. class User < Ohm::Model set :followers, User set :following, User # Because a `User` literally has a `list` of activities, using a Redis # `list` to model the activities would be a good choice. We default to # getting the first 100 activities, and use # [lrange](http://code.google.com/p/redis/wiki/LrangeCommand) directly. def activities(start = 0, limit = 100) key[:activities].lrange(start, start + limit) end # Broadcasting a message to all the `followers` of a user would simply # be prepending the message for each if his `followers`. We also use # the Redis command # [lpush](http://code.google.com/p/redis/wiki/RpushCommand) directly. def broadcast(str) followers.each do |user| user.key[:activities].lpush(str) end end # Given that *Jane* wants to follow *John*, we simply do the following # steps: # # 1. *John* is added to *Jane*'s `following` list. # 2. *Jane* is added to *John*'s `followers` list. def follow(other) following << other other.followers << self end end #### Testing # We'll use cutest for our testing framework. require "cutest" # The database is flushed before each test. prepare { Ohm.flush } # We define two users, `john` and `jane`, and yield them so all # other tests are given access to these 2 users. setup do john = User.create jane = User.create [john, jane] end # Let's verify our model for `follow`. When `jane` follows `john`, # the following conditions should hold: # # 1. The followers list of `john` is comprised *only* of `jane`. # 2. The list of users `jane` is following is comprised *only* of `john`. test "jane following john" do |john, jane| jane.follow(john) assert [john] == jane.following.to_a assert [jane] == john.followers.to_a end # Broadcasting a message should simply notify all the followers of the # `broadcaster`. test "john broadcasting a message" do |john, jane| jane.follow(john) john.broadcast("Learning about Redis and Ohm") assert jane.activities.include?("Learning about Redis and Ohm") end #### Total Denormalization: Adding HTML # This may be a real edge case design decision, but for some scenarios this # may work. The beauty of this solution is that you only have to generate the # output once, and successive refreshes of the end user will help you save # some CPU cycles. # # This example of course assumes that the code that generates this does all # the conditional checks (possibly changing the point of view like *Me:* # instead of *John says:*). test "broadcasting the html directly" do |john, jane| jane.follow(john) snippet = 'John says: How\'s it going ' + 'jane?' john.broadcast(snippet) assert jane.activities.include?(snippet) end #### Saving Space # In most cases, users don't really care about keeping their entire activity # history. This application requirement would be fairly trivial to implement. # Let's reopen our `User` class and define a new broadcast method. class User # We define a constant where we set the maximum number of activity entries. MAX = 10 # Using `MAX` as the reference, we check if the number of activities exceeds # `MAX`, and use # [ltrim](http://code.google.com/p/redis/wiki/LtrimCommand) to truncate # the activities. def broadcast(str) followers.each do |user| user.key[:activities].lpush(str) if user.key[:activities].llen > MAX user.key[:activities].ltrim(0, MAX - 1) end end end end # Now let's verify that this new behavior is enforced. test "pushing 11 activities maintains the list to 10" do |john, jane| jane.follow(john) 11.times { john.broadcast("Flooding your feed!") } assert 10 == jane.activities.size end #### Conclusion # As you can see, choosing a more straightforward approach (in this case, # actually having a list per user, instead of maintaining a separate # `activities` table) will greatly simplify the design of your system. # # As a final note, keep in mind that the Ohm solution would still need # sharding for large datasets, but that would be again trivial to implement # using [redis-rb](http://github.com/ezmobius/redis-rb)'s distributed support # and sharding it against the *user_id*.