So, you need a statistically significant sample? (stitchfix.com)
49 points by astrobiased on May 26, 2015 | 10 comments


The "default" alpha and beta are not the correct ones for a website A/B test.

If you're designing a drug, you'd better be very careful not to accidentally approve something that is useless. It would cost a ton of money and lives in the long run if it was no better than placebo. False rejection of the null is very bad. False acceptance of the null is not so bad.

By contrast, if you're doing an A/B test on a website, you're actually not in bad shape if you accidentally think that a red button is a bit better than a blue button, assuming that they're pretty close. False rejection of the null is okay.

However, you are screwed if you miss out on the chance that a red button gives you 50% more conversions. With websites, false acceptance of the null is very bad. It's okay to mistakenly think your button is effective, but it's very bad to mistakenly think that the button is ineffective.

Websites have the opposite cost-benefit calculation from science generally, so they shouldn't use the same parameters.
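To make the trade-off concrete, here is a rough sketch of how the choice of alpha and power drives the traffic an A/B test needs. It uses the standard normal-approximation formula for a two-sided two-proportion z-test. The 10% and 12% conversion rates and the "website" setting of alpha = 0.20 with power = 0.95 are made-up illustrations, not values anyone in this thread proposed.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p_base, p_variant, alpha, power):
    """Approximate visitors needed per arm for a two-sided two-proportion z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the type 1 error rate
    z_beta = z.inv_cdf(power)           # power = 1 - beta (the type 2 error rate)
    p_bar = (p_base + p_variant) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_base) ** 2)


# Hypothetical conversion rates: 10% for the blue button, 12% for the red one.
conventional = sample_size_per_arm(0.10, 0.12, alpha=0.05, power=0.80)
# Hypothetical "website" setting: tolerate more false positives, miss fewer real wins.
website = sample_size_per_arm(0.10, 0.12, alpha=0.20, power=0.95)

print(f"alpha=0.05, power=0.80: {conventional} visitors per arm")
print(f"alpha=0.20, power=0.95: {website} visitors per arm")
```

Loosening alpha lowers the required sample and raising power increases it, so the two knobs can be traded against each other for a given traffic budget.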


Perhaps a simple conceptual tool is to consider risk as the product of cost and likelihood, and choose an acceptable level of overall risk for type 1 and type 2 errors. Thus the potential cost of each error has to be part of the decision-making process.
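A minimal sketch of that idea, with invented numbers: the dollar costs below are purely hypothetical, and using alpha and beta directly as the likelihoods is a simplification, since it ignores how often the null is actually true.

```python
def expected_risk(cost, likelihood):
    """Risk as the product of an error's cost and its likelihood."""
    return cost * likelihood


alpha = 0.05  # likelihood of a type 1 error (false rejection of the null)
beta = 0.20   # likelihood of a type 2 error (false acceptance of the null)

# Hypothetical costs: shipping a useless button is cheap,
# missing a button that really converts better is expensive.
cost_false_positive = 1_000
cost_false_negative = 50_000

risk_type_1 = expected_risk(cost_false_positive, alpha)
risk_type_2 = expected_risk(cost_false_negative, beta)

print(f"type 1 risk: {risk_type_1:.0f}")
print(f"type 2 risk: {risk_type_2:.0f}")
```

If one risk dwarfs the other, that is a hint that alpha and beta could be rebalanced until both fall under whatever level you consider acceptable.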


> [F]or any study that requires sampling ... making sure we have enough data to ensure confidence in results is absolutely critical.

Is this necessarily true if you can sample from the population in a fair and unbiased way?


> making sure we have enough data to ensure confidence in results is absolutely critical.

Yes, it's necessarily true. If your sample is small you are necessarily subject to large sampling error.

In essence: the individuals you happened to pick (even fairly) are overrepresented, and the rest are underrepresented.


Yes, of course, I can see that if you have an extremely small sample, then the _resolution_ of your results will suffer. However, I think it's much more important to ensure unbiased sampling than it is to ensure a large sample size.

For example, if you sample 1% of the population in a fair and unbiased way, that would tell you something with a much higher degree of confidence than if you sampled even 10% of the population in a biased way (or in a way such that you don't know whether you are biased or not).
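One way to put rough numbers on this comparison is root-mean-square error, which combines systematic bias with sampling variance. The sketch below assumes a hypothetical population of one million, a true proportion of 65%, and a 5-point bias in the larger sample. All three figures are invented for illustration.

```python
from math import sqrt


def rmse(p_true, n, bias=0.0):
    """Root-mean-square error of a sample proportion with a fixed systematic bias."""
    p_observed = p_true + bias                    # what the biased sample estimates on average
    variance = p_observed * (1 - p_observed) / n  # sampling variance (no finite-population correction)
    return sqrt(bias ** 2 + variance)


population = 1_000_000  # hypothetical population size

small_unbiased = rmse(0.65, n=population // 100)          # 1% of the population, no bias
large_biased = rmse(0.65, n=population // 10, bias=0.05)  # 10% of the population, 5-point bias

print(f"1% unbiased sample: typical error {small_unbiased:.2%}")
print(f"10% biased sample:  typical error {large_biased:.2%}")
```

The bias term does not shrink as n grows, which is why the larger sample cannot make up for it.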


If you have a sample of 10 people, 1 person represents 10% of the sample. The opinion of one person can swing your results by 10%.

For a sample of 100, it's 1%. For 1000, it's 0.1%. The more opinions you can collect, the less they individually mean.
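The arithmetic can be sketched in a few lines. The 95% margin of error is an addition here, using the usual normal approximation with the worst-case proportion of 0.5; the function names are made up for this example.

```python
from math import sqrt


def one_person_share(n):
    """Fraction of the sample that a single respondent represents."""
    return 1 / n


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion (normal approximation)."""
    return z * sqrt(p * (1 - p) / n)


for n in (10, 100, 1000):
    print(f"n={n:>4}: one person = {one_person_share(n):.1%}, "
          f"95% margin of error ~ {margin_of_error(n):.1%}")
```

Note that one person's share falls as 1/n, while the margin of error falls only as 1/sqrt(n).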


Yes, but it's the resolution that would suffer, not necessarily the result. For example, if 65% of the population would vote for candidate 1, an unbiased sample of size 10 would indicate that either 60% or 70% of the population would vote for candidate 1. A sample that is biased could literally tell you anything, regardless of how large it is (in absolute numbers).


Not every time, surely? A sample is randomly taken, so there will be variation. Even if it's unbiased, your samples will swing between all possible extremes. So you need to take a very large number of small samples.


The result of your sample would typically be anything between 30% and 100%.

Whereas, if you instead take a sample of 1000, you typically get results between 61% and 69%.

Source: http://www.wolframalpha.com/input/?i=binomial%2810%2C0.65%29...
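If you'd rather check those ranges locally, here is a small sketch that computes a central range from the exact binomial distribution. Reading "typically" as 95% coverage is my assumption, so the endpoints may differ slightly from the figures quoted above.

```python
from math import exp, lgamma, log


def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def central_range(n, p, coverage=0.95):
    """Sample proportions left after trimming (1 - coverage) / 2 from each tail."""
    tail = (1 - coverage) / 2
    pmf = [binomial_pmf(n, k, p) for k in range(n + 1)]

    cumulative, low = 0.0, 0
    for k in range(n + 1):          # walk up from 0 until the lower tail is used up
        cumulative += pmf[k]
        if cumulative > tail:
            low = k
            break

    cumulative, high = 0.0, n
    for k in range(n, -1, -1):      # walk down from n until the upper tail is used up
        cumulative += pmf[k]
        if cumulative > tail:
            high = k
            break

    return low / n, high / n


range_10 = central_range(10, 0.65)
range_1000 = central_range(1000, 0.65)

print(f"n=10:   {range_10[0]:.0%} to {range_10[1]:.0%}")
print(f"n=1000: {range_1000[0]:.1%} to {range_1000[1]:.1%}")
```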

You can't take an "unbiased" sample in the sense that you mean here.

There's only one way you can take a completely unbiased sample: if you know exactly how everyone will vote in advance and select them carefully on that basis. But if you already know how everyone will vote in advance, then sampling is a fruitless exercise.


I feel like my entire Stats course in college could be summed up with this one article. Bookmarking this for later reference, thanks!



