Validated Learning

Having recently read Eric Ries’s book The Lean Startup, I was keen to put some of its lessons to the test.

This post is based on a presentation I gave at the Brisbane Agile Meetup. I wanted to dig into the concept of Validated Learning, so I did a case study using one of my apps, “‘Roid Rage”.

Eric Ries defines a startup as an organization dedicated to creating something new under conditions of extreme uncertainty. He further elaborates on the need to Build ideas into products. While many businesses build an idea and then hope that it is successful, Ries advocates that every step be Measured along the way. These measurements are the key to Learn whether you should persevere with the idea or pivot your business.

I had previously created the mobile game “Meteor Madness”, which I had published in the Windows Store. This was performing about average for a mobile game, which is to say: terribly. However, I did have one thing for this app that I hadn’t had for previous apps: stats.

I had a very simple sales funnel that showed me how people were engaging. I would treat as a Lead anyone who saw my ‘ad’ in the store. Prospects would then be the people who, after seeing the ad, decided to download a free trial. Finally, a Customer was anyone who ended up purchasing the game.
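The funnel above boils down to two ratios: Leads who become Prospects, and Prospects who become Customers. A minimal sketch follows, assuming made-up counts (the numbers and the `conversion_rate` helper are illustrative, not the app's real data or tooling).

```python
def conversion_rate(converted: int, total: int) -> float:
    """Fraction of `total` that moved on to the next funnel stage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return converted / total


# Hypothetical counts for illustration only, not real store data.
leads = 1000      # people who saw the 'ad' in the store
prospects = 280   # people who downloaded the free trial
customers = 14    # people who purchased the game

lead_to_prospect = conversion_rate(prospects, leads)
prospect_to_customer = conversion_rate(customers, prospects)

print(f"Lead -> Prospect:     {lead_to_prospect:.1%}")
print(f"Prospect -> Customer: {prospect_to_customer:.1%}")
```

The rest of this post only looks at the first ratio, the share of people who saw the ad and went on to download.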

My original game used some graphics that I had collected back in the 1990s along with a black starfield background. A screenshot captured from the Store follows. With this ad, I saw a 28% conversion rate of viewers to downloaders.

Even though this download rate is pretty good for a game in the Windows Store, I had been thinking that this number could be improved. The worst part of the ad, in my subjective opinion, was the drab background. Rather than being black, what if it was a cartoon purple? I mocked up the following screenshot just to get some early feedback from friends and family.

Everyone I showed it to seemed to think it was a great improvement, so I spent a few weeks updating the actual game and re-publishing to the store. In essence, this was my hypothesis:

The data started flowing in over the next couple of days. But how do I actually measure ‘success’? The download numbers were different, which made me ‘feel’ like the hypothesis was valid, but how could I know for sure?

While there’s an interesting history about the mathematicians behind the statistics, I’ll skip the color here. However, with the work of Jacob Bernoulli, William Sealy Gosset and Bernard Lewis Welch, we have the formulas we require.

Visually, the problem is one of whether these two distributions are the same. The distribution below in blue is the original distribution of 28% +/- 2pp. When the experiment was first run, the lower, flatter orange distribution appeared. While its mean was higher than the blue one’s, the flatness of the curve indicated a very large variance. The key is that initially, even though the average was higher, we could not say with confidence that it was better. Mathematically, we use Welch’s t-test to determine whether the two sets are statistically different.
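For readers who prefer code to curves, here is a minimal sketch of Welch's t-test using only the Python standard library. The sample values are hypothetical daily conversion rates, and `welch_t_test` is an illustrative helper rather than the calculation from my spreadsheet. It returns the t statistic and the Welch-Satterthwaite degrees of freedom; turning those into a p-value needs a t-distribution table or a statistics library.

```python
import math
import statistics


def welch_t_test(a, b):
    """Return (t, df) for Welch's unequal-variance t-test of b against a."""
    n_a, n_b = len(a), len(b)
    # Sample variance (n - 1 denominator) divided by sample size,
    # i.e. the squared standard error of each sample mean.
    se2_a = statistics.variance(a) / n_a
    se2_b = statistics.variance(b) / n_b

    t = (statistics.mean(b) - statistics.mean(a)) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical daily conversion rates, for illustration only.
before = [0.26, 0.28, 0.30, 0.27, 0.29]
after = [0.33, 0.36, 0.31, 0.35, 0.34]

t, df = welch_t_test(before, after)
print(f"t = {t:.3f}, degrees of freedom = {df:.2f}")
```

If SciPy is available, `scipy.stats.ttest_ind(before, after, equal_var=False)` performs the same test and also reports a p-value.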

Over time, the higher-peaked red curve developed out of the data and we could finally say with confidence that the improvement was statistically significant.

While I haven’t gone deep into the math, here’s a screenshot of the Excel spreadsheet developed to make these analyses. In the blue column, the set of trials from before the change are recorded. In the orange column, the trials after the change are then recorded. The mean and standard deviation for both are recorded and graphed. Finally, below the graph is a calculation as to whether the two sets can be compared at all and, if so, whether they are statistically distinct. [Download Worksheet](Lean Lead Conversion.xlsx)
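The summary rows of the worksheet can be approximated in a few lines. This is only a sketch: it assumes the graphed curves are normal curves built from each column's mean and standard deviation, and both the `summarize` helper and the trial values are hypothetical. It shows why a noisy early sample produces a low, flat curve while a tighter sample produces a tall, narrow one.

```python
import statistics


def summarize(trials):
    """Return (mean, standard deviation, peak height of the fitted normal curve)."""
    mean = statistics.mean(trials)
    stdev = statistics.stdev(trials)  # sample standard deviation
    # The normal curve is tallest at its mean; a larger stdev lowers the peak.
    peak = statistics.NormalDist(mean, stdev).pdf(mean)
    return mean, stdev, peak


# Hypothetical trial data for illustration only.
early_noisy = [0.20, 0.45, 0.30, 0.40]
later_steady = [0.33, 0.36, 0.31, 0.35, 0.34]

for label, trials in [("early", early_noisy), ("later", later_steady)]:
    mean, stdev, peak = summarize(trials)
    print(f"{label}: mean={mean:.3f} stdev={stdev:.3f} peak={peak:.1f}")
```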

So, in keeping with Eric Ries’s suggestion, we have measured the outcome and validated our learning.