Hey HN, here’s the result of gathering as many tech talks as I could find and then trying a bunch of ranking heuristics to find one that produced reasonable results. I’m currently using the lower bound of each talk’s Wilson score confidence interval based on likes and dislikes.
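For anyone curious, here's a minimal sketch of that lower bound in Python (function name and the 95% z-value are my own choices, not from the site's actual code):

```python
import math

def wilson_lower_bound(ups: int, downs: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score confidence interval for the
    true fraction of positive ratings (z=1.96 ~ a 95% interval)."""
    n = ups + downs
    if n == 0:
        return 0.0
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)
```

The nice property for ranking is that a talk with the same like/dislike ratio but more total votes gets a higher score, so sparsely rated videos don't dominate the top of the list.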
I find good tech talks to be a combo of entertainment and broadening my toolbox of programming concepts. When I hit a good talk I generally do a double take: “whoa, I haven’t thought that way before.” There are definitely some good talks in this list.
There are a few great “Awesome Talks” lists that I’ve enjoyed perusing, but I’ve found that talk title intrigue does not seem to correlate with talk quality. So in lists of 50+ talks I have a hard time finding the “next best talk”.
I’m keen to get feedback on the site as is but also if there is interest in a “top tech talks” in the last month (or X unit of time) style of digest.
Hopefully there is a talk in here that gives you a double take.
Maybe these are mostly too new, or you have a different (more practical, hands-on?) definition of tech talk - but from a quick look the only speakers I expected - and found - were Sandi Metz and Rob Pike.
If it's practical, I'm surprised not to see the js "wat" lightning talk (which I now can't seem to find...).
If it's more general "best of", I'd expect something like Guy Steele's "Growing a Language":
https://youtu.be/_ahvzDzKdB0
Alan Kay "doing with images makes symbols":
https://youtu.be/p2LZLYcu_JY
Or, if that's too long, the much more condensed TED talk, "A Powerful Idea About Teaching Ideas":
https://youtu.be/Eg_ToU7m1MI
(Maybe that's not a "tech talk"?)
You're right those are all great talks (and fit in my definition of a tech talk). I just checked, and none of them are in my dataset, which I'll admit I'm surprised about. But they (and related ones) will make it into the next round.
The issue seems to be that they are not typically watched on YouTube. For example, the "simple made easy" linked above is a low-quality pirate YouTube copy; the proper place to watch it is here:
Is your dataset limited to only videos on youtube? The Fronteers conference has been publishing its videos on vimeo, and that includes some really "awesome" ones: https://vimeo.com/fronteers/videos/sort:plays
It is currently, but I'd like it not to be. One issue is that my current ranking algorithm uses both likes and dislikes, but Vimeo only has likes, so I can't cross-compare between YouTube and Vimeo without switching up my ranking algorithm. I'm curious about trying a version of the current algorithm that just uses likes and views, though, which would be more portable.
You could also use "net promoter" scoring: fraction of likes minus fraction of dislikes. I don't think it has any theoretical basis but the NPS system [0] is fairly popular.
Yep! Views come from promoters, detractors, and passives. Assuming one view per person, you can get the number of passives as: passives = views - promoters - detractors. Once you have passives, you can compute the NPS.
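As a sketch, assuming one view per person and treating likes/dislikes as promoters/detractors (the function name and counts are illustrative):

```python
def nps(views: int, likes: int, dislikes: int) -> float:
    """Net Promoter-style score from view/like/dislike counts.
    Passives are viewers who left no rating at all."""
    passives = views - likes - dislikes   # non-voting viewers
    total = likes + dislikes + passives   # == views
    return (likes - dislikes) / total
```

So a video with 100 views, 30 likes, and 10 dislikes scores (30 - 10) / 100 = 0.2. Note that counting passives in the denominator penalizes videos where most viewers never vote.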
Do you know of some generalization that instead of just positive and negative ratings would work with real numbers? E.g. rating could be anything between 0 and 1.
Interesting, I like the simplicity of that. Do you have any info how to determine good initial values for the prior? In this example good values for pretend_up and pretend_down? Would it make sense to use average_upvotes and average_downvotes or values that have that ratio?
Values that have that ratio might be good, but I'm not sure about the magnitude because maybe the average number of votes is too high so that the prior overwhelms the data. The scores get pulled towards that ratio as you increase the magnitude. If the ratio is close to 0 it has the effect of downranking videos with few votes, and if the ratio is close to 1 it has the effect of upranking videos with few votes. The effect might be too strong if you use the average magnitude. It might also be good to set the ratio a bit lower than the average ratio if you want to rank conservatively.
Parametrising it like you suggest might make it easier to experiment:
    ratio = 0.5
    number = 100
    pretend_upvotes = ratio*number
    pretend_downvotes = (1-ratio)*number
You could even set ratio to 0, but I actually think it makes sense to rank 1 up / 2 down above 101 up / 200 down, because the latter is definitely bad whereas the former might be good.
You can either estimate the prior as part of a hierarchical model, or use empirical Bayesian estimation. I spoke last year about an example of EBE applied to music trends:
Enjoy, ~yaj