Hacker News | datawander's comments

That book you read was Russell and Norvig. It's one of the best textbooks ever written on any subject:

http://aima.cs.berkeley.edu/


Yes, I read the article, and I applied an idea I took from one of the first articles on Nate Silver's 538 'news' site [0]: using Bayesian statistics to decode news as you read it. In my opinion, the data presented is very underwhelming and they aren't standing on any firm ground here.

Here's my 'Bayesian logic' applied to it. My 'prior belief' about the claim being made is that Uber doesn't have broad penetration across all demographics. Yes, people who are in the tech world and under 30 know of them, but not many outside of that. I was the first of all my friends to use Uber (I would know because I have gotten over $60 in credits for signing them up). So outside of SF, my initial personal belief is that Uber doesn't have wide usage. So either the largest demographic of DUI violators is nerds who are younger than 30, or Uber is magically being used by people who would prefer a more expensive Uber ride, rather than driving, over a cheap taxi ride.
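
To make the 'prior belief' step concrete, here is a minimal sketch of a single Bayes update in Python. Every number in it is made up for illustration (none comes from the article or from 538), and the function name `posterior` is just a label chosen here.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a yes/no claim.

    prior: belief in the claim before seeing the evidence (0..1)
    p_evidence_if_true: chance of seeing this evidence if the claim is true
    p_evidence_if_false: chance of seeing this evidence if the claim is false
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical inputs: a skeptical prior of 20%, and evidence that is only
# slightly more likely to show up if the claim is true than if it is false.
updated = posterior(prior=0.2, p_evidence_if_true=0.6, p_evidence_if_false=0.5)
print(f"belief after the evidence: {updated:.2f}")  # roughly 0.23
```

Weak evidence barely moves a skeptical prior, and evidence that is equally likely either way does not move it at all.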

So let's look at the evidence they provide to see if it changes my prior significantly: data was explicitly given for only one city, without any hint as to what time frame they are talking about. The other cities were literally the equivalent of hearsay arguments, so because I'm the judge of this court, that evidence doesn't count. When you're talking about month-over-month and even year-over-year trends, a drop of 10% is just as easily explained by a random process, modeled as a Poisson process [1], as by the entrance of Uber. I would know, having dealt with lots of time series data.
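
The Poisson point can be checked with a few lines of Python. This is only a sketch under stated assumptions: the two periods are treated as independent Poisson counts with the same underlying rate (so nothing real has changed), and the rates passed in below are hypothetical, not figures from the article.

```python
import math
from itertools import accumulate


def poisson_pmf(k, lam):
    """P(count == k) for a Poisson distribution with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def prob_drop_by_chance(lam, drop_percent=10):
    """P(second count <= (100 - drop_percent)% of first count) when both
    counts are independent Poisson draws with the same mean lam."""
    # Truncate the sum far out in the tail, where the remaining mass is negligible.
    upper = int(lam + 10 * math.sqrt(lam)) + 10
    pmf = [poisson_pmf(k, lam) for k in range(upper + 1)]
    cdf = list(accumulate(pmf))
    total = 0.0
    for first in range(upper + 1):
        # Largest second-period count that still counts as a drop of this size.
        threshold = first * (100 - drop_percent) // 100
        total += pmf[first] * cdf[threshold]
    return total


# Hypothetical average counts per period (assumptions, not real DUI data).
for lam in (50, 200, 5000):
    print(lam, round(prob_drop_by_chance(lam), 4))
```

With a few dozen incidents per period, a 10% dip shows up by chance fairly often; with thousands per period it becomes rare. So whether the drop means anything depends on the counts and the time frame, which the article does not give.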

One final comment: my apologies to the author of the bar chart, but it is just terrible. It's hard to read and doesn't make much sense at first glance. How many bar charts are out there that look like that? It could just be me speaking aesthetically because I'm weirded out by the random floating bars in the middle, but I feel Edward Tufte would have something to say about that.

jusitizin, I know you weren't speaking to me directly, but I read the article. Hope this comment proves it :)

[0] http://fivethirtyeight.com/features/a-formula-for-decoding-h...

[1] http://www.wired.com/2012/12/what-does-randomness-look-like/


It actually doesn't even provide data for two cities, just one. It talks about anecdotal data that is the equivalent of hearsay: "we consistently hear from our riders in Chicago and elsewhere."


Did you read the article? It gave the sources for the data for both cities.


I don't really buy it. What if DUI rates fell across all cities nationwide, regardless of Uber being on the market?


Doesn't really matter for winning the narrative.

The US enacted a drinking age of 21 in 1984. It has been an absolute train wreck. It's one of the most backwards and out-of-touch-with-reality national policies that we have. The only reason it's still around is the idea that it somehow "saves lives". The stat most often cited to "prove" this is the number of alcohol-related traffic fatalities that have occurred in the US since 1984. They have declined, significantly. Does it matter that this has nothing to do with the drinking age? Does it matter that the number of highway deaths has declined across the entire world, and at a much faster rate than in the United States? Does it matter that highway deaths declined at the same time as a rapid increase in seat belt usage? No. None of it matters. In terms of the narrative, MADD won, and half the country believes the drinking age saves lives, even though it's utter baloney.

If Uber can play up the idea that having Uber around saves lives, it's a trump card. It doesn't matter how sound the statistics are; they'll win the narrative in every city. As a politician, do you really want to be against saving lives?


It does do some good.

Countries with lower drinking ages have much higher rates of binge drinking.

The WHO Europe region has a 70% higher incidence of teen (15-19) binge drinking than the Americas region: http://i.imgur.com/oAbeg7x.png?1

Canada, with its drinking ages of 18 and 19, has a far higher rate of teen binge drinking than the US, as do the European countries: http://i.imgur.com/iaJbDLd.png?1

Finally, we find that the US has an overall rate of binge drinking in the middle of the European range, lower than France and the UK but higher than Germany and Spain: http://i.imgur.com/h0BkKGe.png?1


You also have to give the rate of 20-something binge drinking on both continents for the picture to be complete. The US could well be a case of postponed binge drinking.


This is possible, although I don't have that data. It would still be preferable, as the amount of brain damage would be significantly less than during the teen years, and they would hopefully be at least somewhat more responsible.


The lack of error bars when tracking illegal activity is very suspect. However, assuming their numbers are accurate, heavy drinking in and of itself is a very minor issue compared to, say, DUI-related deaths, which are lower in Europe. Sure, it's complex, but only looking at negatives tends to create bad policy, as you end up with stuff like "Taboo till 21 may actually promote drinking in the over 21 crowd."


Lower DUI rates in Europe might have more to do with the availability of public transport than with anything else. A group of drinkers doesn't need a non-drinking member, nor does it need to pay for a taxi to get home. They just take a relatively cheap bus.


Completely anecdotal, but I was a big-time binge drinker in my teens in two different EU countries, and I never really felt the urge to drink once I reached my 20s. I've gotten that out of my system; it's no longer as cool as you'd think. I don't even drink when I go out to a bar or club (you wouldn't know that, though; it looks like I am) and I go wilder than most drunkards.


I wholly agree with this article. The exact point the author is getting at is something that I have been trying to say, but rather inarticulately (probably because I didn't actually go out and survey people and define "what is programming and what is wrong with it").

I really can't wait for programming to be more than just if statements, thinking about code as a grouping of ASCII files, and gluing libraries together. Things like Akka are nice steps in that direction.


I find this sentence very weirdly written from a stat viewpoint and not helpful at all to the "average" reader.

>>>>About one in 14 tech workers is black or Latino both in the Silicon Valley and nationally. Blacks and Hispanics make up 13.1 and 16.9 percent of the U.S. population, respectively, according to the most recent Census data.

Why didn't they say about 7% rather than 1 in 14?

1/14 ≈ 7.14%

I accidentally scrolled into the comment section and my head imploded. Thank goodness for HN :)


Because the average reader may be inclined to think the numbers are reasonable, and then they'd be less likely to spark an outrage, and then this newspaper wouldn't get as many ad impressions.


I'm confused. Blacks and Hispanics make up a combined 30% of the U.S. population but 7.14% of tech workers, and that seems reasonable?

What am I missing here?


Yes, it does. The number that matters is the number of available black and Hispanic tech workers versus the number of employed black and Hispanic tech workers. If there were a large discrepancy between unemployment rates for technology workers along racial lines, then we could claim discrimination by companies.

Instead, what you and this article are talking about is that perhaps not enough black and Hispanic people are being exposed to technology as an industry. In which case, the failure would come down to a lot of variables: community, family, schools, socioeconomic status, etc.

Blaming a company for not hiring non-existent people is sure great for Jesse Jackson's business model, though.


It's also possible that there is a legitimate reason (from a business perspective) they'd preferentially hire white tech workers, such as socioeconomic class leading to better training. (We see this effect in other places.)

I highly doubt that many of the confounding variables (age; exposure over time; socioeconomic status; etc) have been ruled out to come up with that statistic.

Not that I don't think there's a problem, just saying, the businesses might not be racist so much as minorities underperform (relative to their potential) because of socioeconomic factors. If that's truly what's happening, I'd argue that the place to combat that is the economics of the situation, and not attacking businesses for making prudent decisions.


I never, ever blamed the companies. I agree that "not enough black and hispanic people are being exposed to technology as an industry".


It could just be the particular writer's style. I doubt it is meant to misrepresent the facts, since there's no need to do so in this case; the facts by themselves are damning enough.

7%, when there should be about 30%, means we are only about 23% of the way towards fair representation, and that is still pretty bad. Of course, there's the issue of the education pipeline.
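
For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted from the article (1 in 14 tech workers; 13.1% and 16.9% of the population). Adding the two population shares assumes the groups don't overlap, which is a simplification, since the Census treats Hispanic origin as an ethnicity that can overlap with race. The function name is made up for this example.

```python
def representation_ratio(tech_share_pct, population_share_pct):
    """Share of tech workers divided by share of the population.

    1.0 would mean proportional representation.
    """
    return tech_share_pct / population_share_pct


tech_share_pct = 100 / 14            # "about one in 14" from the article
population_share_pct = 13.1 + 16.9   # assumes no overlap between the groups

ratio = representation_ratio(tech_share_pct, population_share_pct)
print(f"tech share: {tech_share_pct:.2f}%")
print(f"population share: {population_share_pct:.1f}%")
print(f"ratio: {ratio:.1%}")  # about 24% with 1/14, about 23% if you round to 7%
```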

You should don a hazmat suit before descending into internet comment sections, especially when it is about race.


Does it necessarily have to be equal representation?


People will look back upon this date with utter lament because the world of lambdas has been unleashed upon a much more dangerous and wider world of programmers.


You realize the same exact thing could be said about video game designers, right?


Video game designers aren't forcing me to play their game in order to submit necessary forms.


There is also a complete lack of tests. It'd be nice if there were a library of tests for all sorting algorithms. I'm sure there is one, but I mean something more widely accepted and well known.
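
As a rough idea of what such a shared test library could look like, here is a small sketch in Python. It is not an existing library: `check_sort` and `insertion_sort` are names invented for this example, and the check simply compares a sorting function against Python's built-in `sorted` on edge cases and seeded random inputs.

```python
import random


def check_sort(sort_fn, trials=200, seed=0):
    """Run sort_fn on edge cases and random lists.

    Returns None if every result matches sorted(), otherwise the first
    input list on which sort_fn gave a wrong answer.
    """
    rng = random.Random(seed)  # seeded so failures are reproducible
    cases = [[], [1], [2, 1], [1, 1, 1], list(range(10)), list(range(10, 0, -1))]
    for _ in range(trials):
        length = rng.randint(0, 30)
        cases.append([rng.randint(-50, 50) for _ in range(length)])
    for case in cases:
        result = sort_fn(list(case))  # pass a copy so the case stays intact
        if list(result) != sorted(case):
            return case
    return None


def insertion_sort(items):
    """Plain insertion sort, used here only as something to test."""
    out = []
    for item in items:
        i = len(out)
        while i > 0 and out[i - 1] > item:
            i -= 1
        out.insert(i, item)
    return out


print(check_sort(insertion_sort))      # None means no failing case was found
print(check_sort(lambda items: items))  # a "sort" that does nothing fails
```

A real shared suite would also want checks for stability and for very large inputs, which this sketch leaves out.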


There is already a pair of data-driven journalists at the WSJ whose day-to-day sounds more like a data scientist's than a typical journalist's. They are Tom McGinty and Rob Barry.

They gave a great talk recently, which I wrote about (shameless plug below), summarizing how they investigated the Asiana Airlines crash in SF and what their day looks like. They're brilliant:

http://datawonder.co/blog/2014/01/15/data-skeptics-wsj-meetu...

