Meetings and Events

Aggregating Metrics & Events, Alexis Lê-Quôc
2011-07-06 @ 19:45 EDT (23:45 UTC) - Suspenders Bar and Restaurant

Aggregating metrics & events, a necessity to grok systems & apps

Take any off-the-shelf web application, scale it a bit, put it on the cloud. It's faster, cheaper and easier to assemble & deploy than before. But easier to operate it is not. Whereas 2-4 boxes with 40 metrics each would suffice for the entire app 10 years ago, we're looking at 10s or 100s of nodes acting semi-autonomously and an avalanche of system metrics, system events and alerts to weed through.

The only way out is through aggregation, filtering and visualization, which is the topic of this talk. Starting from where we should be, we will look at some libraries and applications that you can use to get there and discuss where they currently fall short.

Alexis co-founded Datadog to help fellow developers and webops track, in real time, the events, changes and metrics that can affect their applications. He currently splits his time between caring for Datadog's data stack and thinking about how to improve the product.

Prior to Datadog, Alexis built infrastructure software and led a team of IT operations staff as Director of Operations for Wireless Generation, supporting several million teachers in the U.S. In practice that has meant everything from racking servers to obsessing over SQL queries to writing embedded code deployed in teachers' hands nationwide. In an earlier life he spent time optimizing the performance of web applications for Orange's 25 million mobile subscribers in France.