Apr 10 2012
 

Ok, I can’t tell you how to do that, but here is how Instagram did it. I am not going to talk about how they understood their customers and how they created something customers loved. That is all marketese – I can’t tell you how to replicate it.

What I can tell you is that they have only three engineers supporting all their webops, taking care of billions of photos, terabytes of data, and millions of users. The numbers are mind-boggling. If you threw these numbers at any enterprise CIO, they would come back with a budget for 100 people and a 3-year plan to implement a program to manage the data.

Here are a few simple things they did right:

  • They only focused on essentials – they did not focus on keeping anything in-house that did not belong there. Yes, they are entirely cloud based. They are heavy users of Amazon EC2.
  • They used open source extensively and hacked it when needed.
    • They use Ubuntu 11.04 on EC2
    • Django for the app server (a stateless web tier, which allows horizontal scaling).
    • A stripped-down web server. The usual choice for Python is Apache + mod_wsgi, but they needed a web server with low CPU usage, so they used ‘Green Unicorn’ (Gunicorn, a Python WSGI HTTP server).
    • PostgreSQL for database (sharded cluster with 12 replicas in different zones)
    • Amazon S3 for photo storage
    • Amazon CloudFront for CDN
    • Redis as in-memory storage for feeds
    • Memcached for caching in support of the web services (not sure why they did not use Redis here as well; most likely the software already worked with Memcached).
    • Apache Solr for searching (with JSON interface)
    • Twisted for pushing billions of notifications
  • Good focus on DevOps
    • They used nginx for load balancing (see my earlier proposal).
    • They used Amazon Elastic Load Balancer (though they could do without it).
    • Munin for monitoring
    • Outsourced services for monitoring and incident notifications (Pingdom for monitoring and PagerDuty for incidents)
    • Sentry for App server reporting in real-time
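
To make the "sharded cluster" item above a little more concrete, here is a minimal sketch of how an app server might decide which database server holds a given user's data. This is not Instagram's actual scheme: the shard count, the server names, and the functions `logical_shard` and `server_for` are all assumptions made up for illustration. The idea is only that many small logical shards are mapped onto a few physical servers, so shards can later be moved without rehashing every user.

```python
# Hypothetical values for illustration; not Instagram's real configuration.
LOGICAL_SHARDS = 1024
PHYSICAL_SERVERS = ["db0", "db1", "db2", "db3"]


def logical_shard(user_id: int) -> int:
    """Map a user ID to one of the logical shards."""
    return user_id % LOGICAL_SHARDS


def server_for(user_id: int) -> str:
    """Map a user's logical shard to the physical server that hosts it.

    Each server owns a contiguous range of logical shards.
    """
    per_server = LOGICAL_SHARDS // len(PHYSICAL_SERVERS)
    # min() keeps the index in range if the shard count is not evenly divisible.
    index = min(logical_shard(user_id) // per_server, len(PHYSICAL_SERVERS) - 1)
    return PHYSICAL_SERVERS[index]


if __name__ == "__main__":
    for uid in (0, 256, 1023, 1025):
        print(uid, logical_shard(uid), server_for(uid))
```

Because the app servers are stateless, any of them can run this lookup for any request, which is what lets the web tier scale horizontally.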

     

[Image: Slide21]

The picture is a rough approximation (most of the information is taken from the wonderful site: http://instagram-engineering.tumblr.com)

What lessons are there for us poor enterprise developers, who are stuck using Java and forced to use in-house resources that are neither flexible nor scalable? Unfortunately, we will have to wait until the IT people take their cold, dead fingers off the inflexible IT.

Nevertheless, here is what an architect could do:

  1. Architect the systems in such a way that part of the resources (data, especially) lies outside the enterprise.
  2. Use open standards like REST and JSON to quickly pull together different systems.
  3. Focus on DevOps from the beginning. Assume that your application needs to be maintained.
  4. Keep a consistent set of tools (most of the tools used at Instagram are popular in the Python community).
  5. Most importantly, focus on getting the job done!
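
As a rough illustration of points 2 and 3, here is a minimal sketch of a stateless JSON-over-HTTP service written as a plain WSGI application, using only the Python standard library. The `/health` endpoint, the `app` and `call` names, and the response payloads are assumptions for the example, not anything taken from Instagram. The point is that each response depends only on the request, so any number of identical copies can sit behind a load balancer, and a health endpoint gives monitoring tools something to poll from day one.

```python
import json


def app(environ, start_response):
    """Stateless WSGI app: the response is derived from the request alone."""
    path = environ.get("PATH_INFO", "/")
    if path == "/health":
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found", "path": path}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call(path):
    """Invoke the app in-process, roughly the way a WSGI server would."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


if __name__ == "__main__":
    print(call("/health"))
    print(call("/missing"))
```

Since this is a standard WSGI callable, a server such as Green Unicorn (Gunicorn) could serve it with something like `gunicorn mymodule:app`, where `mymodule` is a placeholder for whatever file holds the code.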
 Posted by at 11:20 am

  7 Responses to “How to make a billion dollars in a couple of years”

  1. Great Post

  2. Dear Rama,
    Just a small note maybe…
    For push Notifications they use pyapns (Apple Push Notification service) besides using Gearman
    For pyapns – Twisted is a dependency
    Amazon’s Route53 as the DNS
    They also smartly use vmtouch for cache diagnostics, and have Python scripts that parse the vmtouch output from one server to dynamically generate vmtouch commands for other servers.

  3. nice posting

  4. I liked the way you explained the architecture and the driving forces. As an aside: for those “stuck using Java”, I recommend taking a look at Dropwizard. It follows several principles you explained, such as stateless services, REST, a stripped-down web server, simple architecture and light-weight database access, and also application metrics, logging and operational tools that would integrate well with some of those third-party monitoring services. I think you can leverage the power and speed of the Java platform without the weight of a heavy architecture.

  5. […] The final architecture from Rama (How to make a billion dollars in a couple of years) […]
