Do you read more about politics or economics? What are the top websites you recently visited to read articles? How do you plan your casual reading?

My team (Katie Zhu, Basil Huang, and Amelia Kaufman) and I wondered about these questions and more at the start of Knight Lab’s “Collaborative Innovation and Journalism in Technology” course in the spring of 2013. The initial idea we riffed on was managing personal news consumption. We got to pick our approach and tools and were told to go wild… and we built an app! (It is graciously maintained by Knight Lab.)

Slimformation is a tool designed to help you track your reading habits, set goals, and see how well you meet them. There’s a short video on Vimeo where I give a demo. Also, Katie (@ktzhu) wrote about this project for the Knight Lab blog and described the motivations and details very well.

I thought I’d speak to the engineering challenges we faced.

Engineering Slimformation

Our weapons of choice: Chaplin, Brunch, Chrome’s extension API, some cowboy CoffeeScript, a dash of Clojure (news article categorization! we rolled our own!), and not a small amount of grit.

Chrome extensions are basically just HTML, CSS, and JavaScript, with a special manifest.json file. We have two HTML files of concern: background.html and popup.html. They aren’t visible in the repository; they have to be compiled. Also, we have a little JavaScript that is injected into every tab to help us track time.
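To give a feel for what an injected time-tracking script can look like, here is a minimal sketch. It is not the actual Slimformation code: the `ActiveTimer` class and its method names are hypothetical, and wiring it to the Page Visibility API is an assumption about one reasonable way to detect when a tab is in the foreground.

```javascript
// Hypothetical sketch: accumulates the time a page spends in the foreground.
// The clock is injected so the logic can be exercised without a browser.
class ActiveTimer {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.startedAt = null; // timestamp of the current active stretch, or null
    this.totalMs = 0;      // time accumulated over finished stretches
  }

  // Begin an active stretch; calling start twice in a row has no effect.
  start() {
    if (this.startedAt === null) this.startedAt = this.now();
  }

  // End the current active stretch and add it to the total.
  stop() {
    if (this.startedAt !== null) {
      this.totalMs += this.now() - this.startedAt;
      this.startedAt = null;
    }
  }

  // Total active time so far, including any stretch still running.
  elapsedMs() {
    const running = this.startedAt === null ? 0 : this.now() - this.startedAt;
    return this.totalMs + running;
  }
}

// Assumed wiring for a content script: pause while the tab is hidden.
if (typeof document !== "undefined") {
  const pageTimer = new ActiveTimer();
  if (!document.hidden) pageTimer.start();
  document.addEventListener("visibilitychange", () => {
    if (document.hidden) pageTimer.stop();
    else pageTimer.start();
  });
}
```

Injecting the clock keeps the bookkeeping separate from the browser events, so the same class can be tested outside a tab.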

The extension communicates with Chrome’s extension API and newscat (our categorizer) in a process that runs in the context of the background page. Here’s the background view and template, and the popup views and templates, for reference.

We have the familiar concepts of controllers and models and routes at play. As always, the routes help uncover the overall structure of the app. Just follow the routes as you would in a Rails-like app.

A custom concept I introduced is services, building on existing patterns in Chaplin. Here are all the services we have. Look at that news categorization service, for instance. It declares the function to call whenever add:PageVisit is published. The other services work like this too. Anywhere in the code you can publish and subscribe to events and then execute code. This was absolutely essential to keep track of all of our distributed resources and keep things clean.
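The service pattern described above can be sketched in a few lines. This is a stand-in written for illustration, not Chaplin's mediator or the project's real service: the `mediator` object, `categorizationService`, and the shape of the visit payload are all assumptions. Only the event name add:PageVisit comes from the project.

```javascript
// Minimal stand-in for a publish/subscribe mediator (not Chaplin's own code).
const mediator = {
  handlers: new Map(),

  // Register a handler to run whenever `event` is published.
  subscribe(event, handler) {
    if (!this.handlers.has(event)) this.handlers.set(event, []);
    this.handlers.get(event).push(handler);
  },

  // Call every handler registered for `event` with the given arguments.
  publish(event, ...args) {
    for (const handler of this.handlers.get(event) || []) handler(...args);
  },
};

// Hypothetical service: reacts to every recorded page visit.
const categorized = [];
const categorizationService = {
  init() {
    mediator.subscribe("add:PageVisit", (visit) => this.categorize(visit));
  },
  categorize(visit) {
    // A real service would call out to the categorizer here;
    // this sketch only records which URL it was asked about.
    categorized.push(visit.url);
  },
};

categorizationService.init();

// Any other part of the app can announce a visit without knowing who listens.
mediator.publish("add:PageVisit", { url: "https://example.com/story" });
```

The publisher never references the service directly, which is what makes it easy to add or remove services without touching the code that records visits.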

  • JavaScript’s ecosystem is bewildering for a transient programmer. We tried many small-scale approaches but there was too much drudgery… so we picked one tool that looked about right (it used Backbone, and we had passing familiarity with it): Chaplin, with Brunch. This turned out well, overall. It gave us some reasonable conventions for the code and allowed us to have some soft tissue to wrap up Backbone. Building on others’ open source code is nice!
  • The standard choice for doing categorization of news articles was to use Alchemy API. But you have to jump through hoops and sacrifice a lot for the free plan. Forget that noise! I built a news article categorizer for us in Clojure, using the OpenNLP Java library (same one I used for Sentimental). It’s right here. I used Reddit to train the categorizer, grabbing the top articles from /r/Politics, /r/Business and so on for ["Politics", "Business", "Science", "Technology", "Entertainment", "Sports"]. And that’s about it… a few easy to write functions and I was done, and could host the whole thing on Heroku and call it a freaking day.
  • is dope
  • The publisher/subscriber model in Chaplin is very nice.
  • Sometimes, you gotta hack it in JavaScript! We were at first put off by having to do Flesch-Kincaid readability tests on yet another external service… but I found a great little bit of code online and we use it right in the client-side code! I told you, Cowboy CoffeeScript :)
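For readers curious what a client-side Flesch-Kincaid check involves, here is a minimal sketch. It is not the snippet the project uses. The grade-level formula is the standard one, but the function names are hypothetical and the syllable counter is a crude vowel-group heuristic assumed for brevity, so scores are approximate.

```javascript
// Rough syllable estimate: count groups of consecutive vowels,
// with a minimum of one syllable per word. This is a heuristic only.
function countSyllables(word) {
  const groups = word.toLowerCase().match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

// Flesch-Kincaid grade level:
//   0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
function fleschKincaidGrade(text) {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.match(/[A-Za-z']+/g) || [];
  if (sentences.length === 0 || words.length === 0) return 0;

  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return (
    0.39 * (words.length / sentences.length) +
    11.8 * (syllables / words.length) -
    15.59
  );
}
```

Short sentences built from one-syllable words score low (even below zero), while long sentences full of polysyllabic words score high, which is enough to compare articles against each other.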

You can visit the full open source project here and Knight Lab’s version of one of the repos here.