Startups @ Scale: Log Everything, then you can Manage Anything.

Posted: June 22nd, 2011 | Author: owocki | Filed under: startups | Tags: , , , | 1 Comment »

One thing that hasn’t changed during my time at Ignighter is the importance of our in-house analytics.  Ever since our first lecture at Techstars 2008, when we were prodded to “obsess over core metrics”, we’ve been obsessed with our usage data.  Having the right information on demand is essential to being nimble in your decision making as a management team.

“If you aren’t measuring it, you can’t manage it” – Greg Tisch

As CTO, the responsibility of maintaining our business intelligence infrastructure has fallen to me. So I’ve been logging anything that’s remotely significant to our decision making process.  We’ve been doing this for years and it’s not my first rodeo, but shit, so many things have changed as we’ve scaled.

Many of these things may evolve for your project too.

  1. The business model
  2. The volume of data
  3. The usage patterns
  4. The systems
  5. The technical architecture
  6. The reporting system
  7. The market

When we first started the company, I was logging all of our usage data to our MySQL database.  A DB write (maybe several) on every pageload!? Boy, was that dumb!  It didn’t take long before the site was buckling under the load of our usage. One of the first rules of disk-bound databases is that writes are the most expensive operation you can perform.  Even when you set up master-slave replication in MySQL, you cannot scale writes by much, since every write query must also be replayed on the slave in order to keep it up to date!

Let me give you a snapshot of how things look nowadays, when we’re on the order of many many millions (maybe more – I’m not at liberty to say) of loggable operations daily.

  1. We’ve built a Logger class which lets us send error, debug, user usage, user statistics, and general info messages to either the filesystem or the database.
  2. For flat filesystem logs, we use the open source syslog-ng to log operations as they happen.  Since syslog-ng can support up to 150k loggable operations per second, it’s an ideal tool.
  3. For database logs, we use memcached as a buffer.  Basically, what you want is an ‘Aggregated Stat’ class, which has an interface that updates a counter in memcached every time an action happens, then periodically flushes the results to the database.
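To make the third item concrete, here is a minimal sketch of what an ‘Aggregated Stat’ class could look like. It is not our production code. To keep it runnable anywhere, it assumes a plain Python dict as a stand-in for memcached and SQLite as a stand-in for MySQL; the method names `increment` and `flush` are made up for illustration. The table and column names are borrowed from the query shown later in this post.

```python
import sqlite3
import time


class AggregatedStat:
    """Buffers counters in a cache and flushes the totals to a database."""

    def __init__(self, db):
        self.db = db
        # Assumption: this dict stands in for memcached. A real deployment
        # would use the memcached client's atomic increment instead.
        self.cache = {}
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS AggregatedStats ("
            "Segment1 TEXT, Segment2 TEXT, Value INTEGER, AddDate INTEGER)"
        )

    def increment(self, segment1, segment2, amount=1):
        """Record an action in the buffer without touching the database."""
        key = (segment1, segment2)
        self.cache[key] = self.cache.get(key, 0) + amount

    def flush(self, now=None):
        """Write one row per buffered counter, then empty the buffer."""
        now = int(time.time()) if now is None else int(now)
        rows = [(s1, s2, value, now) for (s1, s2), value in self.cache.items()]
        self.db.executemany(
            "INSERT INTO AggregatedStats (Segment1, Segment2, Value, AddDate) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
        self.db.commit()
        self.cache.clear()
        return len(rows)
```

Calling `increment("LoginsByUser", uid)` on every login costs no database write at all; the only writes happen when something like a cron job calls `flush()`, so a thousand logins by the same user in one interval collapse into a single row.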

After this, you just need to decide what’s relevant and loggable, and whether to put it in the filesystem or the database.  Filesystem logging is more scalable, but there are advantages to having data in our database too: there’s no easy way to query your flat logs from your application.  For example, in the application, I like being able to know how many times user x has logged in over the past month.  That’s as easy as:

mysql> SELECT SUM(`Value`) as `NumLogins` FROM `AggregatedStats` WHERE `Segment1` = 'LoginsByUser' and `Segment2` = '[uid]' AND `AddDate` > (UNIX_TIMESTAMP() - 60 * 60 * 24 * 30) LIMIT 1

Whereas, with the flat logs, it looks much more like:

$ cat LOG_STATS.log | awk -F'\t' '{print $3$4$5}' | grep LoginsByUser | grep [uid] | wc -l
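If you would rather do that count from a script than a shell pipeline, a rough Python equivalent might look like the sketch below. The log layout is an assumption on my part for illustration: tab-separated lines with the stat name in the third field and the user id in the fourth. The function name `count_logins` is hypothetical too.

```python
def count_logins(lines, uid, stat="LoginsByUser"):
    """Count log lines recording the given stat for the given user id."""
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: field 3 is the stat name, field 4 is the user id.
        if len(fields) >= 4 and fields[2] == stat and fields[3] == str(uid):
            total += 1
    return total
```

You would feed it an open file, e.g. `count_logins(open("LOG_STATS.log"), uid)`. Unlike the grep version, this compares whole fields, so user 42 does not also match user 142. Either way it scans the entire log on every call, which is exactly why it isn’t something you want to do from inside the application.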

Now I can access data about anything at any time.  This system scales, and it’s nimble enough to handle queries you did not foresee.  Of course, having the ability to view this information does not mean anyone’s actually going to look at it.

As a matter of practicality, I’ve found it useful to provide the following tools (and make sure they are blazing fast).  All of them are plain-vanilla open source and 100% FREE too!

  • An email script that rolls up the ERROR and STAT logs and sends the most interesting tidbits to the team every night.
  • Make the data available in our open source graphing system, Graphite.  I’m a huge, huge fan of Graphite.  Importing data into it is as easy as writing a script that periodically scrapes the flat logs and passes the results into an included Python script.  Big ups to Etsy for letting me know about Graphite. Check out these sample graphs from their implementation of Graphite:
     

    Did I mention that I’m a super-fan of graphite yet?  It’s super nimble, fast, and it scales.  If you choose one tool from this post to implement, choose graphite.

  • Plug the data into your team’s private twitter-bot.
    Here’s a sample tweet.  Note this data is not actual usage data.
  • Make the data available in the admin section of your application.  I’ve found it useful to take the queries that I frequently run, give them a name that even the business monkeys can understand, and make them available to everyone via a ‘reports’ section.  (Just kidding Adam and Dan, you’re not monkeys)
  • Make the data available via a board-level reporting system that ONLY includes key metrics.   The exclusivity of this reporting system is what makes it special.  Only the KEY metrics make it into here!


  • I have a data porn (get it, cause data is fun to look at? ;) ) box with several monitors in the office to show me how everything’s been going over the past 24 hours.  I especially like chartbeat for this.
  • Nagios is a great tool for informing your team of Systems issues. Munin allows you to see system-level information (CPU usage, load average, network transfer, swap i/o) over time.
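On the Graphite item above: if you want a feel for how little is involved in getting a number into Graphite, here is a minimal sketch. It doesn’t use the bundled script I mentioned; instead it writes Carbon’s plaintext protocol directly, which is one line per data point in the form `metric.path value timestamp`, sent to Carbon’s plaintext listener (TCP port 2003 by default). The host, the metric name, and both function names are placeholders I made up for illustration.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol: 'path value timestamp'."""
    timestamp = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {timestamp}\n"


def send_metrics(lines, host="localhost", port=2003):
    """Send pre-formatted metric lines to a Carbon plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
```

A log scraper would then boil down to something like `send_metrics([format_metric("stats.logins", count)])` on a schedule, where `count` is whatever you tallied from the flat logs since the last run.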
     

 

Informed decision making made easy!  Watch out, Zoltar: now even we mere mortals can tell you anything about anything.

 

The usual warnings apply.  These are all just ideas and your mileage may vary based upon your technical ability, execution, and your gumption. I’d love to hear what your team uses and how it compares to what I’ve outlined in this post!  Leave me a tweet or a comment below.

Did you know? Ignighter is hiring.  We’re based in NYC, work hard, have a lot of fun, build cool shit, and we’re backed by some of the best investors in the business. Check out our open development positions.

Note: Any information of proprietary value to my employer has been removed or approved, and this post has been approved by my employer.


Startups @ Scale: Building an early warning system

Posted: June 18th, 2011 | Author: owocki | Filed under: Uncategorized | Tags: | 2 Comments »

I’ve been thinking about how much things have changed lately at Ignighter.  We’re starting to get some press, a bunch of daily registrations, and a bunch more messages piping through our once rinky-dink dating website.  It’s beginning to feel like we’re not such a small startup anymore.

When milestones like those pass, it really changes the way your team builds software.  These days, I’m obsessed with building at scale.  When we founded Ignighter back in 2008, we built a system that was as efficient and scalable as an old rusty tricycle.  For the past 3 years, we’ve re-engineered (and re-engineered, and re-engineered) the system, and these days it feels like we’re building a jetliner just as it’s taking off.

It’s a lot of fun.

Anyway, I wanted to share a new project I’m working on to help me keep an eye on everything as we fire up the afterburners.  This week, we launched the Ignighter Early Warning System.  It’s a private twitter account that keeps an eye on our logs and lets me (and my team) know when there is significant movement, up or down, in them.  I’ve configured it to tweet out changes in our error, stats, info, or debug logs.  And since it’s entirely homebrew, it’s completely customizable.

These metrics are for illustration purposes only; they do not represent actual Ignighter.com usage.

Since our developers are all already on twitter, it makes sense to share this information there.  This system supplements, rather than replaces, traditional regression and unit testing, and it allows us to act on production issues before they snowball into a full-blown catastrophe.

Time from hallucination to first iteration: 3 hours.  If you’re interested in building one for your project, here are the tools I used:

  • Syslog-NG to aggregate our logs
  • Plain vanilla bash scripting, scheduled via a crontab
  • TTYtter to tweet updates
  • And, of course, twitter
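The heart of the thing is just comparing how many lines landed in a log this period against the period before, and speaking up when the swing is big enough. My version is bash; purely as an illustration, here is that same idea sketched in Python. The function name `describe_swing`, the 50% default threshold, and the message wording are all assumptions for the example, not what we actually run.

```python
def describe_swing(name, previous, current, threshold=0.5):
    """Return an alert string if a log's volume moved sharply, else None."""
    if previous == 0:
        # No baseline to compare against: only alert if something appeared.
        if current == 0:
            return None
        return f"{name}: {current} events this period, up from 0"
    change = (current - previous) / previous
    if abs(change) < threshold:
        return None
    direction = "up" if change > 0 else "down"
    return f"{name} {direction} {abs(change):.0%} ({previous} -> {current})"
```

A cron job would count lines per log for the last interval, call this for each log, and hand any non-empty result to whatever posts the tweet. For example, `describe_swing("ERROR", 100, 180)` yields `ERROR up 80% (100 -> 180)`, while a move from 100 to 110 stays quiet.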

Next up, I’m looking to extend the project to text us when a particularly egregious swing in our statistics occurs.

If you build your own, I’d love to hear about it!  Leave me a tweet or a comment below.

Did you know? Ignighter is hiring.  We’re based in NYC, work hard, have a lot of fun, build cool shit, and we’re backed by some of the best investors in the business. Check out our open development positions.

 

Note: Any information of proprietary value to my employer has been removed or approved, and this post has been approved by my employer.


Know a kick-ass PHP developer in NYC? Ignighter is hiring!

Posted: February 17th, 2011 | Author: owocki | Filed under: startups, Uncategorized | Tags: | No Comments »

Ignighter is fresh off our Series A and is hiring part-time PHP developers in NYC.

We’re a venture-funded team of 6. We’re young, fun, and we’ve got some rapid growth in our target market, but we’ve got a chip on our shoulders and a lot of work to do. We’ve been in the game for a few years, and we earned our wings during Techstars Boulder 2008.

Do I fit the profile? If you’re young, hungry, do fantastic work, are looking to make an impact and make some new friends, then you just might be the hacker-ninja-badass we’re looking for. For a list of technical skills, check the deets here: http://newyork.craigslist.org/mnh/eng/2218416800.html

What’s in it for me? Great learning experience and the chance to get hooked into the NYC startup scene. A chance to sharpen your skillz, make a few new friends, and be a part of something big. Oh, and we’ll pay you.

Sounds sweet, what’s next? Send us a blurb about you, your resume, and a link to some of your work. If we like you, we’ll reach out and invite you to our sweet Union Square Office for coffee.

Know someone who might qualify? Bonus points / drinks on me if you pass this post around.

Note: Any information of proprietary value to my employer has been removed or approved, and this post has been approved by my employer.