Analytics / Statistics refresher for developers.

Back in my early days of building corporate applications with the waterfall approach, our main form of feedback was asking users what they thought of the feature we had just implemented. Then we assumed (and hoped) they would actually use it. If the functionality ran slowly, we would try to write timings to a log file and hope we could find the log later to figure out what was going on (and sometimes, if we were lucky, we could profile the code). Obviously this kind of methodology and feedback loop has many, many issues. The focus of this short blog post is to make sure you don’t do what I did back then.

The first thing I’ve learned over the years is that you should attempt to measure every part of your application. Not only should you measure every click your users make, but you should also know how long the different functions of your application take to execute and how many times certain critical functions are executed. So as you add a new feature to your application, you should have a line item somewhere in your design document that says “How am I going to prove to the business that this feature was worth the money we spent on it (maybe clicks, swipes, time on page, etc.), and how long do the functions in that feature take to execute?” The first part of the question is analytics and the second part is statistics. Of course you could say both are statistics, but it’s easier to differentiate between the two types of data this way (at least in my view of things).
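
To make the "how long and how many times" part concrete, here is a minimal sketch in Python of a decorator that counts calls and records timings. The names measured, call_counts, call_durations_ms, and build_report are all made up for illustration, and the in-memory dictionaries are only a stand-in for wherever you would really ship the numbers.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory stores, for illustration only. A real application
# would send these numbers to a metrics system instead of keeping them here.
call_counts = defaultdict(int)
call_durations_ms = defaultdict(list)


def measured(func):
    """Count every call to func and record how long each call took."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are measured too.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            call_counts[func.__name__] += 1
            call_durations_ms[func.__name__].append(elapsed_ms)

    return wrapper


@measured
def build_report(rows):
    # Hypothetical "critical function" standing in for real work.
    return sum(rows)
```

After a few calls to build_report, call_counts["build_report"] holds the number of executions and call_durations_ms["build_report"] holds one timing per call, which is enough to answer both halves of the design document question for that function.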

Let’s discuss analytics. These are things like user location, logins, clicks, swipes, session times, and anything else that might provide more insight into what your users are doing in your application. This type of data can be streamed or batched up, and there are several tools that handle this. In AWS, for streaming data you can use Kinesis Firehose (https://aws.amazon.com/kinesis/firehose/) and for batch you can use Data Pipeline (https://aws.amazon.com/datapipeline/) to write the data to S3, Redshift, and Elasticsearch. In GCP, for both streaming and batch you can use Cloud Dataflow (https://cloud.google.com/dataflow/) to write the data to Cloud Storage, Cloud Pub/Sub, Cloud Datastore, Cloud Bigtable, and BigQuery. Once the data is loaded into one of these tools, the business can make a better determination about where new functionality should be added and whether a piece of functionality is even being used at all.
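
On the application side, capturing these events can be as simple as the sketch below. This is only an illustration under my own assumptions: EventBuffer, track, flush, and the event field names are hypothetical, and the sink is just a callable you pass in. In a real setup that callable would hand the batch to whichever streaming or batch tool you picked; I have left out the cloud SDK calls on purpose.

```python
import json
import time
from typing import Callable, Dict, List


class EventBuffer:
    """Collects analytics events and hands them to a sink in batches."""

    def __init__(self, sink: Callable[[str], None], batch_size: int = 3):
        # sink is a hypothetical callable that receives newline-delimited JSON.
        self._sink = sink
        self._batch_size = batch_size
        self._pending: List[Dict] = []

    def track(self, user_id: str, action: str, **properties) -> None:
        """Record one event, such as a login, click, or swipe."""
        self._pending.append(
            {
                "user_id": user_id,
                "action": action,
                "timestamp": time.time(),
                "properties": properties,
            }
        )
        # Flush automatically once a full batch has accumulated.
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send all pending events to the sink as one payload."""
        if not self._pending:
            return
        payload = "\n".join(json.dumps(e, sort_keys=True) for e in self._pending)
        self._sink(payload)
        self._pending = []
```

For example, EventBuffer(print, batch_size=2) would print a two-line JSON payload after every second track call. Calling flush at shutdown keeps a partial batch from being lost.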

Now on to statistics. Statistical data is to the developer kind of what analytical data is to the data scientist. It’s how you determine whether the function you wrote is being used and performing consistently. You could write your own stats to a log file or maybe a database, but there’s a nice tool written by Etsy called StatsD that makes it very easy to keep track of this kind of information. StatsD supports counters, timers, gauges, and sets. You can either set up your own Graphite server to receive the stats or, if you are paying for Datadog, you can take advantage of the Datadog agent to send stats directly to a Datadog dashboard. Of course you can also create custom stats in AWS CloudWatch Metrics and Stackdriver. Once you have these stats in place, it should be much easier to diagnose slow-running functions.
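
To show how little is involved, here is a rough sketch of the four StatsD metric types using nothing but the Python standard library. StatsD accepts plain-text lines of the form name:value|type over UDP, on port 8125 by default. The helper names (counter, timer, gauge, unique, send) and the metric names are mine, and I’m assuming a StatsD-compatible daemon is listening on localhost. In practice you would more likely reach for an existing client library than hand-roll this.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one StatsD line in the form name:value|type."""
    return f"{name}:{value}|{metric_type}"


def counter(name, value=1):
    # "c" marks a counter: add value to the running count.
    return format_metric(name, value, "c")


def timer(name, milliseconds):
    # "ms" marks a timing, expressed in milliseconds.
    return format_metric(name, milliseconds, "ms")


def gauge(name, value):
    # "g" marks a gauge: an arbitrary value that holds until it is changed.
    return format_metric(name, value, "g")


def unique(name, member):
    # "s" marks a set: the daemon counts distinct members per flush interval.
    return format_metric(name, member, "s")


def send(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one metric line over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

A call like send(timer("checkout.render", 320)) would report that a hypothetical checkout.render function took 320 ms. Because it is UDP, the application does not wait for a reply, so your stats code stays cheap even when the stats server is down.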

I know these two items seem like pretty basic concepts, but I see so many developers cut them out when they have to hit a deadline. Don’t cut them out!

My next post will delve a little more deeply into the analytics side of things, since that’s the part I’m learning more about at the moment.