Co-founder of Kyso here. We'll be opening up the platform before the end of the year, so going forward it will be free for teams of all sizes. If anyone is interested, feel free to reach out to me at kyle@kyso.io.
Have you ever considered Kyso (https://kyso.io) (disclaimer - I'm a co-founder)? I could be way off, and it would be a corner case for us right now, but it seems we'd fit all your requirements:
* We sit on top of git, which remains the single source of truth for your code - versioning, reproducibility, etc.
* Python & R notebooks or simple markdown (also on the web app itself).
* Show or hide code.
* Simple links for sharing reports.
* Tagging.
* Comprehensive search - by tag, keyword and content (soon).
Would be really interesting to hear your thoughts on the fit.
Kyso is a central knowledge hub for sharing and collaborating on technical reports posted by a company's data scientists, engineers & analysts, so that everyone can read and learn from the generated insights.
We are in the process of building out our library of templates - these are ready-made boilerplates for various data science tasks that are designed to allow new teams to get their reports deployed as quickly as possible. An example of such templates can be found below:
Awesome article! For the "Publishing the Notebook" part of the workflow, have you ever seen Kyso (https://kyso.io)? (Disclaimer: I'm a founder.) We started Kyso to make it easier to communicate insights from analysis to non-technical people by converting data science tools (e.g. Jupyter Notebooks) into conversational tools in the form of blog posts. You can make public posts or run an internal "data blog" for your team: you push your work to GitHub and it is reflected on Kyso. Would love to hear your thoughts on how it could fit into existing workflows.
They definitely are, but I'm replying to the idea that the drop in life expectancy is because of inefficiencies. The US system has always been pretty inefficient, and nothing there has changed in a way that would trigger a decrease in life expectancy, as far as I know. The current decrease seems to be linked to the ongoing opioid crisis, not to a sudden drop in the efficiency of health care delivery.
Life expectancy started to drop in 2015, right as the uninsured rate reached its all-time low. The numbers don't correlate. It might track with the increased cost of coverage and care, but not with the uninsured rate.
It's scary seeing the evolution of the effects of climate change over time. How anybody can continue to deny it is unbelievable. This [1] plots out global average temperatures since the Industrial Revolution, month by month.
Hey - not OP but I wrote the article. There was a lot of discussion about recency bias in that thread, including the graph you shared and I decided to expand on the analysis. Note there are a few other graphs in my post. I did just draw this up quickly yesterday. But you're right, sorry - I should have cited your comment in my post. Adding it now.
Hi, there are what seem to me (not a statistician) to be a number of serious errors in the article. The most egregious is the first conclusion, which threatens to undermine the whole thing.
> There does not seem to be any strong connection between number of votes and a movie's IMDB rating.
Well, this just isn't true. I happen to know this because I've looked at this data before myself. The issue is that you're cutting the graph off at a maximum of 60k votes without any explanation, or even pointing out that you're doing it! I'm sure this is just an honest mistake, but it cuts out basically every movie that is actually popular!
This slowly ate at me until it was enough to get me out of bed to redo this graph myself. I've uploaded my quick and dirty result in R here: https://i.imgur.com/TTuCFEL.png
As you can see from the graph, there's a direct and obvious link between the number of votes and the average rating. The reason I say this threatens to undermine the whole thing is that I also know from previous experience that more recent films tend to have a lot more ratings on average. This (naively) suggests to me that we should see some recency bias if only because more recent blockbusters get a much larger number of votes than others and the sort of people who vote on blockbusters have different standards for what makes a great movie. (Avengers: Endgame was briefly the #1 movie of all time on the top 250 list.)
I don't have time to plot all the different graphs you did, but perhaps you can recheck your results after fixing at least this issue, and get back to us?
Follow up: I redid one of the graphs to satisfy my own interest in the question. Instead of averaging over all the movies ever released like the article did, I averaged over all the votes. This was an attempt to answer the question of whether the average viewing experience of a film from year x is better or worse than that of a film from year y.
I found that there has been a noticeable decline in the average rating of over half a point (out of 10) since ~1930 or so. https://i.imgur.com/bY9vPvk.png
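For anyone who wants to reproduce the per-vote average, here's a minimal sketch in Python with pandas. The toy DataFrame and its column names (`year`, `rating`, `votes`) are my own placeholders for the real IMDB dump, not the article's actual schema:

```python
import pandas as pd

# Toy stand-in for the IMDB data: one row per movie, with its release year,
# mean user rating, and vote count (the column names are an assumption).
movies = pd.DataFrame({
    "year":   [1930, 1930, 2019, 2019],
    "rating": [7.8, 6.9, 8.4, 5.1],
    "votes":  [1200, 300, 900_000, 50_000],
})

# Per-movie average by year (what the article plots): every film counts equally.
by_movie = movies.groupby("year")["rating"].mean()

# Per-vote average by year: weight each film by its vote count, approximating
# the rating of the "average viewing experience" rather than the average film.
year = movies["year"]
by_vote = (movies["rating"] * movies["votes"]).groupby(year).sum() \
          / movies["votes"].groupby(year).sum()

print(by_movie)
print(by_vote)
```

Note how the two measures can disagree: a year dominated by one heavily-voted blockbuster looks very different per-vote than per-movie.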
I think this should be explained in a combination of two ways:
* History acts as a filter. If I choose to watch a movie from 1947 it's probably because a lot of people over the years have said it's good.
* To some extent, older movies may really be better than more recent ones.
I still suspect that the blockbuster effect means the movies people are most likely to watch will receive higher ratings than the average film overall, mostly because people rating blockbusters are less critical than the film-viewing community as a whole. So while there might be no "recency bias" of the kind this article was looking for, there might be a blockbuster bias, where over the last several decades studios have figured out how to capitalize on a less critical / cynical audience. (That hockey-stick graph of 8-10 ratings in the article is certainly suggestive.)
Actually -- one more follow up for the zero people who will read this -- there's a simpler explanation, which other people who do these kinds of statistics should probably note.
The average rating between 1920 and 1980 fluctuated around 7.5. Then the number of movies exploded, and the variability in their quality exploded even more. Movies better than the 1980 average had their ratings compressed (you can only do a little better than 7.5 on IMDB), while movies that were worse were much less compressed (you can easily do much worse than 7.5). So the average gets dragged down, since there are now both more movies getting 0-4s and more movies getting 8-10s.
I realized this after plotting the median, not the mean, and observing that it stays remarkably constant, and possibly even shows a little recency bias for the last 2 years.
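Here's a toy illustration of that compression effect. The numbers are made up, chosen only so that both eras share the same median:

```python
import statistics

# Made-up ratings for two eras, constructed so both have a median of 7.5.
# Early era: few movies, quality tightly clustered around 7.5.
early = [7.3, 7.4, 7.5, 7.5, 7.6, 7.7]
# Later era: far more spread. The scale tops out near 10, so good movies can
# only beat 7.5 by a little, while bad ones can fall far below it.
late = [2.0, 4.0, 7.4, 7.6, 7.8, 8.8]

print(statistics.median(early), statistics.median(late))  # medians match
print(statistics.mean(early), statistics.mean(late))      # mean is dragged down
```

The typical movie (the median) hasn't gotten worse at all; the long lower tail is what pulls the mean down.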
Part of the reason I make data vizzes is so people can expand on the ideas, so there's no issue there! (as long as proper credit is given to the original work)