The actual operation of the site is no longer like that. There has been a total inversion of the premise.
Wikipedia should have an anti-deletion bias.
There. I said it.
It should take a massive number of people to delete a page. Hundreds. Thousands maybe. Some particular amount of registered users (5% say, or double the square root. Whatever).
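To put rough numbers on the two thresholds floated above, here is a minimal sketch. The user counts are hypothetical round figures chosen only for illustration, and `deletion_quorum` is a made-up helper name, not anything Wikipedia implements.

```python
import math


def deletion_quorum(registered_users: int) -> dict:
    """Return both candidate thresholds for a hypothetical user count."""
    return {
        # 5% of registered users, rounded up (integer arithmetic only).
        "five_percent": (registered_users * 5 + 99) // 100,
        # Double the (integer) square root of the user count.
        "double_sqrt": 2 * math.isqrt(registered_users),
    }


# Hypothetical site sizes, not real Wikipedia statistics.
for users in (10_000, 1_000_000):
    print(users, deletion_quorum(users))
```

The two rules diverge quickly as the site grows: the percentage rule scales linearly with the user base, while the square-root rule grows far more slowly.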
This is a series of radical assertions each presented without evidence.
Anyone who spends serious time on Wikipedia could tell you immediately what the site would be like if it required "hundreds" of people to delete a page: completely overrun with spam and vanity pages --- not about BBS door games or individual BBS's or open source authors or even bloggers, but about individual local little league teams, bands that have never played gigs or released tracks, marketing professionals, individual episodes of fake TV shows, attempts at defaming individual high school girls, 6th place finishers in city council elections, political parties with one member, "microbrew" beer pages for each of some guy's homebrew projects, made up cities in made up countries as part of some guy's counterfactual world history, repeat all previous entries in this list for each of Japan, England, Belgium, and South Africa, amateur open-mic-night standup comedians, individual bootlegged videos of concerts, "one of the 100 most influential Chinese-Canadians in British Columbia", individual unpublished short stories, as-yet-unreleased new energy drinks from as-yet-unstarted startups, 9000 pages about individual aspects of the Ron Paul campaign, film festivals from towns in Nebraska with fewer than 5000 people, repeat all previous entries in this list 10 times over the course of a year with slightly permuted names, fake professors at real universities, real professors occupying fake posts at fake universities, 46 different theories of cold fusion, fake cancer cure, fake cancer cure, fake cancer cure, fake cancer cure, fake cancer cure, world championship players from nonexistent Starcraft leagues, aspirational pages about made-up video games, individual authors of self-published ebooks, and of course, 1-5 pages about every single company incorporated in every US state.
In other words: exactly what you'd expect from an "encyclopedia that anyone in the world can edit". The site would be every naysayer's 1999 prediction for what Wikipedia would become.
I agree with your point that it is a good thing it doesn't take hundreds of moderators to delete a Wikipedia article.
However, while the actual state of Wikipedia is much better than your alternate future, what we have right now is also extremely dangerous and misleading. Today it is composed of accurate, well-written sentences and inaccurate, self-aggrandizing, or just schizophrenic sentences. But they're all intermingled.
Pick any article, say block cipher modes, CTR in particular:
It starts off ok, but then heads off into the weeds with "Nevertheless, there are specialized attacks like a Hardware Fault Attack that is based on the usage of a simple counter function as input.[14]" This is a reference to a minor paper in a conference I've never heard of, likely added by the authors themselves.
"Fix it yourself", comes the cry. But without some sort of educated review board, you end up having to explain the background of the field in order to get around to why a given reference is not noteworthy. Often, the choice comes down to who spends more time on Wikipedia. Guess who doesn't?
Meanwhile, a generation is growing up with this as their only example of an "encyclopedia".
How does Britannica's coverage of block cipher concepts compare to Wikipedia's? My guess is, Wikipedia "roflstomps" it, to use Patrick McKenzie's favored term.
I think my kids are far better off with Wikipedia than they would be with "real" encyclopedias.
The fact that Wikipedia does better than Britannica on that subject isn't exactly relevant to your argument... especially so as it mostly goes to help NateLawson's counter-argument: NateLawson demonstrated how the existing article contains something that in your vision of reality should be deleted, and you seem to be responding "and yet it is still better for it!"... if you want to even pretend to be consistent here you need to demonstrate how and why that information should be deleted, as it is non-notable in the same way as everything else on your long list.
The content Nate points to isn't wrong, it's just given inappropriate weight. But Wikipedia's coverage of block cipher modes is, compared to any other encyclopedia, successful. My kids are better off with Wikipedia's block cipher coverage than Britannica's.
I suppose my friend Nate could be arguing the following: it's unfortunate that there is marginal content that survives wikilawyering while at the same time there's good content that doesn't. But he hasn't supported that argument with evidence. What's the "good" content that belongs under CTR mode that he'd have a hard time placing there? The CTR coverage in WP isn't bad because of deletionism; it's bad because there isn't enough crypto expertise in the WP editor community.
If Nate sat down and wrote a well-sourced article on CTR mode, it would survive.
It actually is wrong, in that its summary on Microsoft Research's site shows it advocates for a permutation of a counter (e.g. LFSR) as a solution to glitch attacks against CTR mode.
That's completely ineffective. The paper by Jaffe shows how to perform DPA (passive, not active attack) against various counter modes of AES, despite the permutation in use for the counter.
However, in order for me to remove the first reference and add the second one, I'd have to undertake a war of participation on Wikipedia. The outcome would not be determined by anyone I respect in cryptography, but by people who happen to be dedicated to a particular message board.
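For readers who have not seen what "a permutation of a counter (e.g. LFSR)" means structurally, here is a minimal sketch of a CTR-style construction where the counter sequence is pluggable. Everything in it is an illustrative assumption: the block function is HMAC-SHA256 standing in for AES (the standard library has no AES), the counter is only 16 bits wide, and the function names are made up. It does not model glitch attacks or DPA, and it says nothing about whether the permutation helps against either.

```python
import hashlib
import hmac
from typing import Callable, Iterator

BLOCK = 16  # bytes of keystream consumed per counter value


def incrementing_counter(start: int = 1) -> Iterator[int]:
    """Plain counter: start, start+1, ... wrapping at 16 bits."""
    value = start
    while True:
        yield value
        value = (value + 1) & 0xFFFF


def lfsr_counter(start: int = 0xACE1) -> Iterator[int]:
    """16-bit Galois LFSR (feedback mask 0xB400): nonzero states in a permuted order."""
    state = start
    while True:
        yield state
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in pseudorandom function (truncated HMAC-SHA256), NOT AES.
    msg = nonce + counter.to_bytes(2, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, nonce: bytes, data: bytes,
            counters: Callable[[], Iterator[int]]) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    seq = counters()
    for offset in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, next(seq))
        chunk = data[offset:offset + BLOCK]
        out.extend(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)


def lfsr_period(start: int = 0xACE1) -> int:
    """Count steps until the LFSR returns to its starting state."""
    seq = lfsr_counter(start)
    next(seq)  # skip the initial state itself
    steps = 1
    while next(seq) != start:
        steps += 1
    return steps
```

Calling `ctr_xor(key, nonce, ciphertext, lfsr_counter)` on the output of the same call recovers the plaintext, since both sides only need to agree on the counter sequence. The only difference between the two generators is the order in which counter values are fed to the block function.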
In the spirit of scientific and philosophical inquiry, I simply fixed the WP article we're debating about. It took 4 minutes. I did it under an anonymous IP account. Let's see how much fighting I have to do to defend the fix.
Incidentally, while I strongly advise not reading the "StankDawg Debates", I'll note that they're a case of me, a deletionist, losing an argument to anti-deletionists.
None of the content in question is "wrong": we are only discussing things with "inappropriate weight". Are you saying that the game Space Empire Elite (the article that Jason Scott is angry about today) did not exist? Was there something else on the article that you believe was "wrong"? We are explicitly discussing "notability" here.
(Sadly, I can't even look at the article, for the reason I already complained about: when Wikipedia decides to delete something, it is totally removed from the website in a way such that you cannot use edit history to get it back, making arguments about it impossible post-facto. I wish I could make a real judgement for myself.)
I do now see, however, why you made your Britannica comment: you are responding only to the final sentence of Nate's comment ("Meanwhile, a generation is growing up with this as their only example of an "encyclopedia"."), which I mostly ignored as it was just rhetoric ;P. (I am sorry that I did that, as it caused me to misunderstand yours.)
Regardless, looking at the actual argument thread, Britannica is entirely irrelevant unless it can be shown that either 1) they were of higher quality due to their more limited focus, or 2) that their eventual downfall was related to them being more likely to cover a niche topic than Wikipedia ;P.
Regardless--to now respond to your new comment--at some point you need to make a choice: either that content is notable, or it is not. If it is, you have to decide who is allowed to determine that fact. Wikipedia, FWIW, does not appreciate content that requires "expertise" much, as they want everything verifiable by the layman.
(Again, leading to what I consider to be a ludicrous situation with respect to the various articles surrounding RSA: there is much more coverage of the recent non-attack on embedded systems with bad random number generators than there is of actual modern mathematical attacks, seemingly because the former could be verified by newspaper articles and the latter requires citing academic papers. Wikipedia's "primary sources must be verified by secondary sources" rule calls for a review paper, and the last time one was published in this field was 1999.)
"...similarly, a scientific paper documenting a new experiment is a primary source on the outcome of that experiment."
"A primary source may only be used on Wikipedia to make straightforward, descriptive statements of facts that can be verified by any educated person with access to the source but without further, specialized knowledge."
"For example, a review article that analyzes research papers in a field is a secondary source for the research"
"All Wikipedia articles should be based on reliable, published secondary sources. Reliable primary sources may occasionally be used with care as an adjunct to the secondary literature, but there remains potential for misuse." (emphasis copied from original)
It is simply not true that WP forbids citations to the academic crypto literature. Anyone can go read the WP policy on primary sources (WP:PSTS) and WP:V and see that what you're saying is false.
What you're doing here is cherry picking a very reasonable policy that Wikipedia has and generalizing it far past its actual application.
What you're saying is that WP frowns on journal articles because they are "primary sources".
What WP actually says is that you can't go take the results of journal articles and use them to synthesize conclusions that the author didn't actually make inside a Wikipedia article. In other words, you cannot use WP articles to ratify new science.
I feel like if there's one place on the Internet that should appreciate a policy forbidding unfounded extrapolation from journal articles, it should be HN, which routinely hosts debates that pick apart the media's broken analysis of scientific articles.
(edit:) Why do you keep ignoring the arguments about the actual topic? Please respond to my first two paragraphs. This entire thing about verifiable sources is an off-topic argument that I'm only going down as an aside because you forced the issue by ignoring the notability problem.
You do realize that I quoted Wikipedia's guidelines, right? That those links I added were not some kind of lie: they were direct references to validate my argument.
Wikipedia seriously goes so far as to state that "mainstream newspapers" are an example of "the most reliable sources", in addition to all of the stuff I already pasted from their guidelines about journal papers.
(OK, I have 2% battery, and I can't find the reference for this, so I'm going to have to temporarily cede this argument. Grrr. I had that bit on newspapers as direct quotes in an argument from a couple months ago on this same topic, though. :( It certainly explains common practice, which "you are allowed to cite primary journal articles without backing them up from secondary sources" does not, in addition to being a contradiction of the various guidelines.)
I cited downthread chapter and verse from WP:V explicitly stating that journal articles were good sources. In the comment you're replying to, I cited the specific policies I was referring to; I didn't provide hyperlinks because you had already hyperlinked to them. Please don't pretend that I'm making a point that's more controversial than it actually is.
It is simply not true that WP disallows citations to the academic cryptologic literature. The opposite is true.
Again, you've confused the actual policy, which says that you can't take a journal article, infer from it conclusions the author didn't make, and then draw those conclusions without any other support in the body of a WP article. The policy does not say that journal articles aren't sources.
(edit on previous comment. I am on my iPhone right now) Why do you keep ignoring the arguments about the actual topic? Please respond to my first two paragraphs (of the grandparent comment). This entire thing about verifiable sources is an off-topic argument that I'm only going down as an aside because you forced the issue by ignoring the notability problem.
"""None of the content in question is "wrong": we are only discussing things with "inappropriate weight". Are you saying that the game Space Empire Elite (the article that Jason Scott is angry about today) did not exist? Was there something else on the article that you believe was "wrong"? We are explicitly discussing "notability" here.
(Sadly, I can't even look at the article, for the reason I already complained about: when Wikipedia decides to delete something, it is totally removed from the website in a way such that you cannot use edit history to get it back, making arguments about it impossible post-facto. I wish I could make a real judgement for myself.)"""
Everything after those two paragraphs was "ok, if you insist on making a confusing non sequitur about Britannica based on that one rhetorical sentence from Nate's argument, I guess I can play along and argue", but is sadly the only thing you insist on arguing about, ignoring the deletionism issue. :(
You've quietly edited your previous comments to avoid having them rebutted by my comments, so I'm done. I'm mystified by why you'd feel the need to do that, by the way.
Actually, I explicitly "ceded" them, as I can't appropriately source them on an iPhone (and we are now arguing on that path entirely based on who has better quotes), and I considered them a waste of time anyway, as my first contention on that argument thread was that you were off-topic.
That means you won, congratulations: it does not mean I don't want them rebutted, as if so I wouldn't explicitly cede; the implication there would be I don't want to lose, and yet here I am, saying you must be right, as I can't back my arguments up anymore. You won: don't be angry you won.
However, you simply ignored all of the main deletionist-related arguments in your quest to defeat the one unrelated thing you felt you could argue against (the verifiability of crypto papers on Wikipedia). "I'm mystified by why you feel the need to do that, by the way."
You first ignored Nate's argument, concentrating on the rhetorical ending note; you side-stepped my calling you out on that (so you could argue more about verifiability), and then you ignored my attempt to reconnect to the mainline argument... you are pretty much just trolling, and I fell for it :(.
(added:) Wait, define "quietly edited"? I didn't remove anything from them (I normally edit only for typos and grammar). I often redraft what I say during the couple minutes after I post it, but I don't alter the argument. The only edits I made to that post were 1) to remove a sentence "please argue your point better" which wasn't even in my original submission (and when I looked at it, I decided it was needlessly insulting; it was there maybe 30 seconds; I shouldn't have said it, and I apologize), 2) to add the paragraph at the top, which I explicitly said was an "edit", as I felt it not worth a second reply; and 3) to remove a hyperlink I had originally used as the reference for the thing on newspapers in a panic at 3% battery life based on a Google hit that was wrong, replacing it with me explicitly ceding the argument when I couldn't find the right hyperlink, which only puts you at an advantage.
False alternative. Two things better than a Wikipedia article on cipher modes:
1. Britannica choosing not to have an article on it since they can't do it justice.
2. Papers or books (you know, things that cost money and real effort to develop) on cryptography that give much more thorough, accurate, and unbiased information.
Not sure I follow. Britannica has in fact chosen not to cover these topics. There are in fact books and papers on block cipher modes. And there's a Wikipedia article that attempts to serve as a survey to the existing literature on block ciphers.
Aside from spam and pages that exist merely to advertise products/companies/etc, it seems like most of those examples aren't necessarily bad, and if no one is using the articles, it's not like a huge number of static text/image pages are going to be a massive drain on resources.
Perhaps the answer to this is to allow for "personal" wiki pages that can be approved for inclusion into the main body of Wikipedia.
From my perspective, I would rather have lots of extra fluff in Wikipedia that I will never see rather than whole subjects being deleted from the site on the whims of a few jerk admins.
Edit: And you seem to be defending the deletion of BBS history under the guise that "Else spam would be everywhere!" which isn't an argument I agree with.
The problem with the unused articles is slightly subtler than "it will use up our hard disk space!"
A friend of mine recently created a Wikipedia article for one of her own economic theories. She is not an economist, and as far as I know has never studied economics. The theory is loosely based on the work of one economist, but the sources she listed included her own blog, that economist's work, and newspaper articles--not articles about "the XYZ Theory," but articles about events she believes are relevant to her theory. But the end result was a reasonably well-written article, with plenty of linked citations (which would appear relevant to anyone not paying close attention), about a theory that does not exist. And because the article got so little traffic, she was the only one editing it.
The thing is, when she linked to that article from her blog and social media profiles, and from other tangentially related Wikipedia articles, suddenly that theory seemed to have some validity that it hadn't had before. And googling "XYZ Theory," which would have gotten 0 results the day before, now got a handful of results. All of them were related to that article or her blog, of course, but someone casually searching for information might just click through to the Wiki article without noticing.
The point is, articles about non-notable topics aren't just irrelevant. They're of much lower average quality than the rest of Wikipedia because they have much less traffic and many fewer potential editors knowledgeable about the topic. They're more likely to have factual inaccuracies, more likely to be biased, more likely to just suck. And it'd dramatically reduce the confidence one could have in any Wikipedia article about anything one hadn't heard of before, because you might be its first viewer.
Wouldn't it be a good thing as that would open our eyes to the fact that Wikipedia articles are really not reliable at all? I don't see how a fast deletion mechanism makes the articles more reliable.
The proper solution seems to be to let the articles stay, and learn that they don't have much weight.
And if deletionism is a reaction to SEO optimization, as the other comment states, it is even more of a tragedy. Really, we have to delete articles for fear of somebody getting an unfair SEO boost? Again the solution seems to be to just lower the impact of links from Wikipedia.
This is an excellent anecdote about how Wikipedia's existence gives unsubstantiated weight to those who choose to spend the time writing stuff there.
We all say "sure, we know not to trust wikipedia" but when it's the first google hit for many topics, can you really pretend you always read it critically?
First, I'm not "defending the deletion of BBS history". It's telling that merely trying to put what's happening on Wikipedia in any kind of balanced perspective counts as taking a side here; it shows how artificially polarized this issue is.
Second, there are clear practical reasons why it would matter if the examples I provided survived as Wikipedia articles:
* The overall perception of Wikipedia's reliability --- which, in reality, is comparable to Britannica's for most subjects, and particularly for subjects that are important in the context of traditional encyclopedias --- would be sharply diminished.
* The encyclopedia would be full of overtly false information, since no contributor to the project would be able to verify any of the information in those articles.
* Wikipedia would be ruthlessly gamed to get pages to the top of Google SERPs (it already is, and a lot of deletionism over startups and individual people is a reaction to that).
* The project would lose huge amounts of time to mediating squabbles over article name real estate as different garage bands or local volleyball teams argued over which should be the first listed name for their project.
> The site would be every naysayer's 1999 prediction for what Wikipedia would become.
The naysayers in 1999 said Wikipedia would be hopelessly inaccurate, not that it would contain lots of information. I don't think "This site will become a vast repository of information about all subjects great and small" would even have counted as naysaying.
I'd agree that we don't want all of that stuff, but there's something very wrong with the filtering process when niche historical material gets purged while every Buffy the Vampire Slayer character and episode has a detailed article.
Why? The content of those articles is verifiable, any given Buffy episode is clearly notable compared to, say, the "Fourlokotini" article that was submitted to Wikipedia in November 2010, and nobody is gaming Wikipedia to get Buffy content to the top of Google SERPs. What difference does it make if writeups of Buffy episodes occupy one article or dozens?
Because it reflects the fact that Wikipedia is currently written by the kind of Comic Book Guy who posts endlessly on Hacker News and edits Wikipedia. We are all Comic Book Guys to some extent when posting on the internet, but there are quite a few important deleted articles that are interesting only to non-Comic Book Guys. Inclusionism is about protecting their contributions from the depredations of rules lawyers who care more about "Notability" than whether a topic is included at all.
Why is any random Buffy episode "clearly" more notable than a cocktail fad? More to the point, why is a Buffy episode more notable than an extant webcomic or a niche programming language or any number of perfectly worthwhile subjects that enthusiasts on Wikipedia merrily purge?
Subjects are notable by dint of being written about in reliable sources, which is something you can say about any Buffy episode and can't say about a cocktail that some guy in Camden made up one weekend and wrote about on Wikipedia as a joke.
'The "notability" argument from Wikipedia Deletionists is fascinating because it basically outsources judgements of significance to major media conglomerates.' -- opendna
You're answering a criticism of Wikipedia's standards by appealing to Wikipedia's standards. Further, you're certainly aware of your painfully obvious cherry-picking.
Let's start, then, with the fact that when a page is deleted, not only does the information get removed from the active site, but the history is removed in a way that only an administrator can see it, making it impossible to even judge if the information was relevant. Personally, I consider that inexcusable for a website that claims to be an open system in which everyone plays a part, and I feel it undermines the argument that deletionism is a valid premise.
You know what, though? For all of the breadth of your list of horrors that would suddenly be covered on Wikipedia: I'd love to see every single one of those on the site. I want to see them represented, in fact, so strongly that I am having a difficult time convincing myself that you aren't actually just being really sarcastic about this whole issue (and then I'm being too daft to realize "oh, he's being absurdist, listing things which we actually would want, to make a point").
The concern should only be whether or not the information, if found on the site, could possibly be accurate or trusted, and the realization to have there is that a better solution to that problem is to more effectively expose the edit state of the article (a la IBM's "History Flow" visualization), as the problem of accuracy and notability is already a serious problem in individual paragraphs of articles about topics whose conclusion is non-controversial even to deletionists.
I didn't present a "list of horrors". A parade of horribles is a debate tactic where an opponent of an idea extrapolates it as far as possible and presents a set of luridly bad outcomes that will result from it. It's considered a cheap argument and given the name ("parade of horribles") because of the subtext that the horribles are actually unlikely; that anyone can take any argument, extrapolate it to an absurd extreme, and come up with a series of bad things that won't in fact happen.
Unfortunately for that rebuttal to my comment, AfDs on Wikipedia are archived:
Wikipedia:Articles_for_deletion/Log/#{ YEAR }_#{ MONTH }_#{ DAY }
You will, perusing these pages at random, discover a couple things:
* There are a LOT of terrible articles submitted to Wikipedia every day
* The examples I provided upthread are representative of what Wikipedia actually deals with.
* An astounding amount of effort goes into diligently handling those articles on a case-by-case basis.
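The log-page pattern quoted above can be expanded into concrete page titles with a few lines of code. This is a sketch under two assumptions that the template itself does not spell out: that the month is written as an English month name and that the day is not zero-padded. The helper name `afd_log_title` is made up, and the date used is arbitrary.

```python
from datetime import date

# English month names are hard-coded so the result does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def afd_log_title(day: date) -> str:
    """Build an AfD daily-log page title (month-name format is an assumption)."""
    return (
        "Wikipedia:Articles_for_deletion/Log/"
        f"{day.year}_{MONTHS[day.month - 1]}_{day.day}"
    )


# Arbitrary example date; it is not tied to any particular debate.
print(afd_log_title(date(2010, 11, 1)))
```

Picking dates at random and reading the resulting pages is the "perusing at random" exercise described above.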
A few months ago, another "deletionism" freakout on HN happened when a famous SEO expert's friend found her article deleted, and the SEO expert wrote an angry blog post. In discussing the incident on HN, two things emerged: first, the erstwhile WP subject was not in fact notable, and second, in handling the AfD for the article, before any media attention had landed on that particular debate --- that is, while that AfD was just another routine debate --- a Wikipedia volunteer actually took the time to look up the library circulation numbers for a book the subject had ostensibly published.
I think you are claiming that I am referring to the notion of "parade of horribles" with my comment "list of horrors"; I have no formal debate training, and have never heard of that specific "cheap" tactic you are now defending not having committed. If your defense is relevant, it is sufficiently pedantic as to go over my head. :(
My "rebuttal" was actually that I liked that entire list of things, and I can only see Wikipedia being better for including information on all of those topics (along with appropriate visualization to demonstrate the inability to as easily trust the content that has had fewer editors, something you need anyway to solve the "subparagraph of article 4" problem).
Honestly, I thereby cannot see one thing in your most recent comment that actually responds to either of those contentions; again, to be highly repetitive on purpose: 1) that those things are not inherently bad, and 2) that there are better ways of solving verifiability for both articles and paragraphs (the latter of which being important).
(Also, on the off-chance this wasn't clear to anyone: the actual articles are deleted permanently. What is archived publicly is only the argument regarding the deletion, which is often just "I don't think it has enough sources", "I think it counts", "no, it didn't", as in the case with the article Jason Scott is complaining about today: without the actual text of the article or what the references even were, you cannot judge for yourself. I maintain this is inexcusable for a project with Wikipedia's charter.)
I provided WP's reasoning for keeping articles about "Fourlokotinis" and non-notable local amateur sports teams off the site. They are:
* Those articles are extremely likely to be inaccurate, because nobody outside the very small number of people with firsthand knowledge about them can verify them.
* It's manifestly obvious that those articles would quickly overwhelm the encyclopedia with obviously bullshit content; again, just look to the absolutely enormous quantity of totally ridiculous articles logged, complete with deletion debates, on the AfD link I provided.
* The articles themselves would spark huge time-wasting debates about placement and weighting, which is something that already happens with verifiable articles.
* A huge incentive exists to push vanity content onto WP because of its prominence on the Internet.
If Wikipedia were infinitely large and largely full of garbage, I would not notice. The Internet is "overwhelmed" with spam and vanity websites, and yet I don't notice. The issue is only, I will restate as you again seem to have ignored it, whether I might accidentally overly trust content I find on Wikipedia because it is on Wikipedia, and again: that is already a problem with paragraphs of larger articles, and there are better solutions that solve both at the same time (such as visualizations of edit controversy, as a particular example: IBM History Flow).
I already made these arguments: I do not see you responding to them; I, and other people on this thread, do not agree with your personal assertion regarding how the site will be "overwhelmed", so that is not an argument unless you can provide actual evidence that an infinite number of mostly-pointless articles will cause Wikipedia harm. The closest you come is just asserting that people will overly trust it, not why or whether deletionism is a better fix than my proposal.
(edit:) You also should tie the response back to the game in question: even if many of these spam articles should be culled, maybe the barrier to culling should simply be higher, in order to decrease the false positive rate. That was the argument made at the very top of this thread and, you know what?... you seemed to ignore it as well, as you only seem to care about the one issue: whether crap could exist and whether the site could get a lot of it if it had no filters at all... that isn't even controversial (and to the extent that it is, it seems to mainly be surrounding whether the specific things in your list were crap, not whether one could imagine something that was truly worthy of being deleted, even "speedily").
The world wide web is filled with crap. That is why we have search. Search is the solution, not deletionism.
Indeed, your tacit assumption is that deletion improves quality, on average. But (a) you don't consider the many false negatives and (b) you don't consider the effect of deterring knowledgeable contributors, without whom quality will decline. We know many knowledgeable people have been discouraged because of the hostility of Wikipedia editors towards newcomers. You've seen the stats on how Wikipedia edits are topping out, how a tiny core of editors rules the site.
It is not hard to have a good search algorithm determine which pages are crap and which aren't. Some obvious features: the number of editors, Flesch-Kincaid level, number of edits, size of the page, number of references, etc.
Automatic search will always outperform manual deletion at scale, especially for an audience as diverse as Wikipedia's.
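As a rough illustration of the features listed above, here is a minimal sketch that extracts them from a page. The `Page` record, the field names, and the vowel-group syllable counter are all assumptions made for the example. Counting `<ref` tags is a crude proxy for references. No weights or ranking are proposed, since the comment does not give any.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical page record; not a real MediaWiki data structure."""
    text: str
    editors: list = field(default_factory=list)  # one username per edit
    edit_count: int = 0


def count_syllables(word: str) -> int:
    # Crude heuristic: one syllable per run of vowels, at least one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level using the standard published coefficients."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def page_features(page: Page) -> dict:
    """Collect the signals named in the comment into one feature dict."""
    return {
        "num_editors": len(set(page.editors)),
        "num_edits": page.edit_count,
        "size_bytes": len(page.text.encode("utf-8")),
        "num_references": page.text.count("<ref"),
        "fk_grade": flesch_kincaid_grade(page.text),
    }
```

A ranking function would still have to decide how to combine these signals, which is where most of the disagreement in this thread would resurface.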
This argument doesn't make any sense. The internet is full of places to host content. The only virtue to hosting it on Wikipedia is to obtain the label "Wikipedia article". You're complaining that the label is selective, but selectivity is the only virtue of the label to begin with.
If you think "search" is the answer, you don't need to do anything. Let the deletionists do their thing, and just put your content somewhere else. Search will find it.
That any small gang of weirdos can get pages deleted is bazonkers. And given the rise of deletionism it's easy to round up a deletion posse for damn near anything.
It needs to be harder to delete. The Library-of-Babel examples[1] you give would still be deleted even if the bar were higher.
How many of those examples do you think are unrealistic?
Also, do words mean specific things to you, or are you just trying to communicate how article deletion makes you feel? Because when you call Wikipedia a "lie" that has "inverted the premise" and "ratched the door shut" on contributors, you should know that's not only false, but trivially falsifiable.
It does take more than one; it's a public process that usually takes about a week to complete, in which anyone can chime in and lobby for the article or, for that matter, fix the problems with the article.
It seems likely to me that a lot of people who are up in arms about "deletionism" really do think individual admins randomly zorch articles for no reason. That's not generally how it works.
I have no idea where on Wikipedia this is a problem. Since this is Hacker News, I humbly request that you expand your stomping grounds to include articles on algorithms. Coverage and quality are woefully inadequate, and I've yet to see the tracks of a deletion brigade running through any of them.
Also, I've found learning then explaining algorithms is a wonderful way to retain them long-term. Benefits all around.
This is an article about an individual non-famous executive at a tech company.
I've never once had to defend it (it's been more than a year since I even looked at it). Why hasn't it been deleted by roving bands of "deletionists" trying to score points?
Because it cites sources and makes a clear statement of notability, as the Wikipedia project asks.
Does the Wikipedia project make mistakes and delete articles it shouldn't? Sure! All the time. But, for the most part, if you do what Wikipedia asks you to do, the system works fine. If you write an article about an algorithm that cites the academic literature, it'll most likely survive.
On the other hand, if you write an article about a well-known algorithm but fail to cite sources or include a single-sentence lede about why the algorithm is important, it is somewhat likely that some Wikipedian patrolling new articles will nominate it for deletion. Why? Because it'll be a member of a cohort of similar-looking articles most of which will be the CS equivalent of cold fusion research, and without citations, Wikipedians will have nothing to judge it by.
To me, it's a small miracle that Wikipedia works as well as it does, and that Wikipedia has more or less replaced "real" encyclopedias. That they'll occasionally jump the gun on deleting articles that don't look legitimate seems like a very, very small price to pay for that.
> To me, it's a small miracle that Wikipedia works as well as it does, and that Wikipedia has more or less replaced "real" encyclopedias. That they'll occasionally jump the gun on deleting articles that don't look legitimate seems like a very, very small price to pay for that.
A price is something you must give up in exchange for something else. I am not convinced that the wikilawyers are the price of Wikipedia any more than a crazy man punching people outside of McDonald's is the price of a hamburger. In reality, you could probably get rid of one and still have the other.
The value of Wikipedia to most people is in the massive amount of work that is put into improving the articles and the breadth of information it contains despite all the deletion — I know not a single normal person who looks at Wikipedia and says, "Thank God I cannot find anything non-notable on here. I was worried about that."
You haven't actually engaged my argument. My argument is that judgement calls about which articles not to host on Wikipedia are one of a small number of driving forces that make the project actually work. You've responded to that by equating judgement calls about not hosting articles with a guy punching people outside a McDonald's. You'll be upvoted for that, because it's good, funny writing, but there's nothing intellectually honest about the point you've made.
I'm not sure if we're just talking past each other or what, but I feel that you're ignoring my point rather than vice-versa. Deleting an article on a computer scientist because he was written up on LWN instead of ComputorEdge does not make the article on Intel any better; it doesn't make the article on the American Revolution any better; it does not have any external impact besides pissing off the guy who wrote the article. Deleting articles does nothing but get rid of those articles. I believe that the site would survive just fine if one day the admins decided to reinstate every good-faith article that was ever deleted.
Where's a link to a debate where a computer scientist's points were deleted from WP because "ComputorEdge" (or any trade rag) trumped LWN?
That's a point you didn't make anywhere upthread, so it's disingenuous to say I'm avoiding it the way you avoided my argument, but I'm happy to stay on track. Point me to the pervasive class of mistakes WP is making by trusting some sources and not others?
I'm sympathetic to this argument, because when I was actually volunteering for WP back in 2007, I spent a lot of time beating back vanity pages that were anchored in one line mentions in trade press articles that were merely regurgitating press releases. So I'm with you about the low value of ComputorEdge. But when you say that computer scientists are systematically disadvantaged because of WP:V rules that prioritize ComputorEdge over LWN, you lose me, because I don't see that happening.
My apologies — I communicated that poorly. I had meant that as a facetious way of saying "niggling issues," rather than a specific indictment. My point was primarily that deleting a questionably notable article does not contribute significantly to the value that people get out of Wikipedia. They are largely orthogonal concerns. I don't think Wikipedia would lose one iota of value in the common person's eye if (without loss of generality) an article on a band in Wisconsin were allowed to remain. Wikipedia was not richer during the period Nemerle's article was deleted.
Incidentally, I just looked over Wikipedia's notability rules and they seem to be a bit more reasonable than they were when I used to edit things there, so props to them for making progress on that issue.
No apology necessary. So, I mostly agree with you: the value of deleting a barely-non-notable article is marginal. But as I've shown I think pretty effectively upthread, with the link to the URL pattern for AfD debates, that's not the problem that confronts Wikipedia; instead, editors on WP are dealing with a torrent of extremely non-notable articles, into which valid articles are, due most often to poor editing, occasionally getting caught up.
It has sat unmolested because you wrote a good bio about a computer security professional. It's difficult for a random Wikipedia denizen to quickly reach the conclusion that Ms. Davidson isn't notable enough. She works for a powerful, well-known company, and is notable enough that she was asked to testify before Congress on a topic.
Wikipedia gets fuzzy when you step outside the basics. Is a comprehensive list of "Two and a Half Men" episodes from 2003 notable? Are the results and player profiles of the 1959 NBA draft worthy? A stub article about a village in rural Poland?
In those cases, the answer is "yes", because there is a constituency for NBA fans and TV fans. When you step outside these types of topics, you are stepping off of a cliff, and wikipedians will capriciously and relentlessly enforce whatever rules they deem important.
From experience on HN: articles about specific living people are the hardest to support. The site has a specific policy (WP:BLP) that raises the sourcing standards for articles about living people.
But I didn't have to do anything to keep my article on the site. All I did was (a) write a clear statement of why the topic was notable, and (b) cite sources. That is not a difficult pair of rules to remember.
But if you believe the prevailing sentiment on HN about how WP and "deletionism" works, it should have been extremely difficult for me to keep Mary Ann Davidson on WP. I should have been in multiple AfD debates defending the article. Instead, I wrote it, walked away, and 5 years later there it stands.
More often than not, what's actually happening in specific deletion freakouts is, the article in question cites no sources, and makes no claim about why the subject is notable.
> If you write an article about an algorithm that cites the academic literature, it'll most likely survive.
Most of the time you actually aren't allowed to do that; there seem to be exceptions in a few fields, such as Medicine, but the overall policy of Wikipedia is that you cannot cite primary sources, preferring, very specifically, newspapers. Of course, there are secondary sources in academia (summary papers), but they are few and far between, making it difficult to defend some newer topics. The article on "Coppersmith%27s_Attack" against RSA, for example, is seemingly forced to cite a summary paper. (The article on RSA itself has a couple citations to an original paper, but only if it can be backed with a secondary source.)
This is what Wikipedia actually says about citing journal articles:
Where available, academic and peer-reviewed publications are usually
the most reliable sources, such as in history, medicine, and
science. But they are not the only reliable sources in such areas. You
may also use material from reliable non-academic sources, particularly
if it appears in respected mainstream publications. Other reliable
sources include university-level textbooks, books published by
respected publishing houses, magazines, journals, and mainstream
newspapers. You may also use electronic media, subject to the same
criteria. See details in Wikipedia:Identifying reliable sources and
Wikipedia:Search engine test.
Their policy appears to be the opposite of the one you suggested they had.
If coverage and quality are woefully inadequate, that means people haven't been doing much with them, so of course nobody is going around harassing these nonexistent people. Only pages that people are actually interested in get the jerk brigade's attention.
What editors are interested in and what's important isn't necessarily the same, though. I try to focus my efforts on areas with the lowest ratio of editors to value of the content. I find that provides a much better effort-to-results ratio than trying to throw in my $0.02 on Israel-Palestine with 100 other people. It also leads to more pleasant colleagues, because when I'm writing about ancient Greek archaeological sites or mathematical theorems, my co-editors are typically other idealistic people who genuinely want to improve the Wikipedia articles on those subjects. If you're writing on something political, then a lot of your co-editors are going to be people with political agendas.
On contentious areas where a lot of people are interested and strongly disagree, I'm not really sure how to do it better. Wikipedia is often suboptimal, but so is the opposite, "expert-based" model. I'm an academic, and if you get a bunch of us in a room, from different strongly opposed viewpoints, and ask us to try to come up with a consensus survey article, that experience is usually going to end up being painful. I think I'd actually rather wade into a Wikipedia edit war than serve on those kinds of document-writing committees.
It's not just political "hot button" topics. No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people — who are not the normal contributors to the articles — over stupid legalistic crap. This article with five sections and 20 references should really be a short subsection of that article because, well, he likes shuffling things around; another one should be deleted because the guy has never heard of it and most of the publications that covered it are online-only, etc.
If your particular niche has avoided this, bully for you. But I personally wouldn't want to join it, because if more people come on board, that means more attention and thus (in my mind, at least) greater likelihood of attracting the wikilawyers.
So, where did this happen to you, or anyone you know? It should be very easy to cite a source to "the jerk brigade" glomming onto some innocuous bit of good content. Most of what happens on the project is logged. Deleted articles "vanish", because the point of deleting content from the project is not to host it at all, but the discussions and talk page articles and AfD debates are still there.
You might agree or disagree with some particular cases, and sometimes things go the right way, but either way it's still a hassle that you have to fight. Personally, I don't remember what the specific pages were I used to help maintain. I was doing it because I wanted to help out and I saw those pages could use it rather than because I cared a lot about the topics. But I do remember it was unpleasant and I wouldn't want to deal with such people again.
Oh come on. Clojure has a Wikipedia page. _why has a Wikipedia page. Y Combinator has a Wikipedia page, as does Paul Graham.
Cite the Wikipedia debates you say are happening. Citing HN freakouts isn't a valid argument, because anyone in the world can spark one at any time, because anybody can nominate any article for deletion.
What are you even talking about? I don't know about these "Wikipedia debates" I apparently brought up, unless you mean the AfD discussions that go along with the HN articles I just linked. Yes, Clojure and Why and all those things have pages, but that's because people fought for them. Nemerle wouldn't have a page if people hadn't fought for it — nor would Why or any number of other things. You yourself said in that thread you found the state of the Y Combinator discussion offputting.
The thing is, even when you succeed in fending off all that, it doesn't really feel good to have spent time on Wikipedia politics.
The fact that anybody can nominate any article for deletion doesn't really run counter to my point, which is, to reiterate:
> No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people … it was unpleasant and I wouldn't want to deal with such people again.
It is trivially easy to create a bullshit freakout on HN about deletionism. Anyone in the world can go, right now, and nominate Don Knuth for deletion. That's how WP works. And, in at least two of the HN freakouts you cited, that's exactly what happened: articles that had no chance of actually being deleted were marked by someone for deletion, and, predictably, weren't deleted.
In fact, when these freakouts happen, people who believe in the articles and know how WP works have to take pains to tell people not to jump into the AfD debate and start "voting" for the articles, because that actually makes the process work worse. Most of the time, when a self-evidently valuable article is proposed for deletion, WP's editors do a just-fine job of making sure they aren't actually deleted.
What you did here was move the goalposts. You claimed that computer scientists are getting shot down in content debates on WP. I asked you to point us to one of those debates happening, where the "jerk brigade" of editors and admins on WP were shouting contributors off of topics. All those debates and shoutings-down are logged.
You responded by highlighting discussions on HN of people freaking out about deletionism. We already know people on HN are freaking out about deletionism. Stipulated! The point is: those people freaking out on HN are mostly wrong.
All those programming languages actually were deleted. According to the result summary of Why's case, he probably would have been deleted if more "serious" sources hadn't been added in the interim. It happens.
But no, I am not moving the goalposts. I'm pretty convinced at this point that you've projected other people's goalposts onto my playing field. I'll quote myself again:
> No matter how innocuous the topic, if somebody takes notice of it, you'll often find yourself constantly having to fight random people
> Just Googling around Hacker News, I find a number of innocuous pages whose maintainers have had to defend themselves against deletion.
> Personally, I don't remember what the specific pages were I used to help maintain. I was doing it because I wanted to help out and I saw those pages could use it rather than because I cared a lot about the topics. But I do remember it was unpleasant and I wouldn't want to deal with such people again.
Those have been the goalposts all along. I didn't say anything about getting "shouted off topics." I just said it involves more headaches than I feel it ought to. (Clearly some people like mjn and yourself haven't had that experience, and I'm glad, but that doesn't take the bad taste out of the mouths of people who have.)
There were no sources on the original _why article. Sources were added because they had to be. It doesn't mean anything to say "the article would have been deleted if sources hadn't been added"; you can say that about any article in the whole encyclopedia.
If the "fight" we're talking about here is (a) one simple sentence stating why the subject is notable and (b) a couple of links, I'm not sure "fight" is the right term.
I guess I haven't actually found that to be true, despite editing in many niches. Sure, sometimes stuff gets shuffled around, but that's okay too: I want people to improve on my work, which sometimes means moving it elsewhere, splitting it up, whatever. I don't feel the need to "own" the articles I write. If anything, I want more people to do so! I've written articles that are 100% untouched years later even though I know they are not really "done", and someone could improve them. It can be quite nice when someone comes by and tidies up a rough draft: fixes some spelling, adds an infobox and geographical coordinates, rearranges my text-dump into some nice sections, formats my citations nicely and adds ISBNs and DOIs to them, etc.
The people I've seen run into problems most often are either in political hot-button areas, or have too close connection to a subject: someone writing an article on their own programming language, on their own academic contributions, their own company, or that of someone/something they have a close relationship to. If anything, that kind of CoI editing is still rather laxly tolerated, rather than too strongly policed. I know of at least one university that actually has paid staff writing puff pieces on their professors, and most of them are sadly still there, untouched, because there honestly isn't that much close scrutiny.
Imo experiences are much better if you start from the perspective of wanting to improve the encyclopedia, rather than from the perspective of wanting to get a particular thing into it and then maintain/defend that article. The way I usually work: start with a good source I have no personal connection to, and write articles based on it (and citing it!). For example, pick up Knuth's TAOCP, find some interesting subjects it discusses that are not yet covered in Wikipedia, and write a well-cited article. There are >99% odds that people are going to be happy with that kind of contribution, not try to delete it. I've recently been doing that with some books on archaeological sites, and I've gotten only positive comments about it; people are generally happy that I'm filling in articles on important sites that Wikipedia still lacks articles on, and that I'm doing so with references to good scholarly literature on them.
That's not to say every encounter with another editor I've had is positive, just that I think it works reasonably well on the whole, especially given the scope of the endeavor, which I would have guessed was frankly impossible, if you had asked me 10 years ago ("random people on the internet writing the superset of all subject encyclopedias?! won't it just be filled with cranks and nonsense?!"). Actually that's an interesting aspect of the HN reaction: HN is generally worried about Wikipedia being too closed, too deletionist, etc, whereas I'd guess that 90% of the world is worried about Wikipedia being too permissive, not fact-checking or insisting on sources strongly enough, etc. The most common criticism outside HN is pressure to add more reviewing and quality-control mechanisms, which gets especially strong in the wake of occasional hoaxes or libel scandals.
…focus my efforts on areas with the lowest ratio of editors to value of the content…
This sounds quite interesting. Do you do this by rough intuitive feel, or is there any quantitative/analytical data from Wikipedia that can be used to prioritize attention this way? (For example: a list of places where the imbalance between readers/inbound-Google-referrals and content/active-editors is greatest?)
I generally do it by feel, partly because I'm treating "value" somewhat subjectively, something like "intellectual value" or "scholarly value" rather than "number of hits". So I look for things that Wikipedians don't seem to be spending a lot of effort on, but which imo a good encyclopedia should cover.
Yep. I once had a page deleted within a minute or two after creating it for (as I recall) "duplication" and "not enough content". It wasn't a duplicate at all (it was about a metalworking tool that might look superficially similar to a woodworking tool, if you don't know anything about tools, which the editor didn't), and one reason there wasn't enough "content" was that I was still uploading the original images that I'd shot myself specifically to illustrate the article. I did get the delete rolled back, but got chided for "incivility" along the way (funny, I would have thought that deleting someone else's work when you don't have a clue about the topic of the article would count as "uncivil").
I lost enthusiasm for contributing to Wikipedia after that and haven't done much there since. Occasionally I correct minor-yet-annoying spelling errors, but that's about it.
I don't really know why they do this either. A friend of mine is a (light) sci-fi author with a few books published, not self-published, each one launched in a bookshop in Notting Hill in London, carried nationally in Waterstones etc etc.
His page, written by someone else, was deleted too, on vague notability grounds, and he was accused of being a vanity-driven self-publisher by some or other wikipedia insider who decided it had to go. Never mind that the label he was published by had booker-prize nominees, never mind the mainstream availability, it has been decided by me and my two buddies that you're out. I guess it probably saved them ~5k disk space in text, ~50k for an image and all of half a megabyte a month in hosting costs.
--edit-- this is not to say that my friend is as important as retaining the history of BBS and other pre-internet computer culture BUT it was the day I realised that something weird and tragic was afoot inside wiki. Before that I assumed the only thing deleted from such a place was spam and blatant self promotion.
Prefacing this, I'm largely with Jason Scott on this (and most issues). That said:
The deletionist argument is not and has never been that it's going to save storage or bandwidth. (Generally, it's about it being difficult to keep up the average quality of the encyclopedia when there's a lot of less notable stuff. Which is absolutely true.) You should at least try to understand the argument for the opposition before you make up your mind.
Policy debates should not appear one sided, etc, etc.
This article motivated me to go and investigate if wikipedia had got any better.
I started writing an article about a technical topic I consider important which was not present. When it was only 2 short paragraphs long, only 8 minutes after I had created it, it was gone without warning or comment. So, nothing has changed at all.
Would you mind saying what the topic of the article you started writing was? Also, did you do it in a sandbox on your account page or just out in the open?
Without more context your statement creates an extremely biased view against Wikipedia.
Why is it an extremely biased view against Wikipedia? I just stated exactly what happened when I tried to make a new article. Now, you can argue that I shouldn't just start making an article, but that's another discussion.
I started writing an article about 'C++14', the new C++ standard. I was writing a nicely referenced list of the new features which have already been voted into the standard, when a box popped up telling me there was a conflict, as someone else had wiped the page clean. So I threw my text away and left wikipedia again.
I think your experience suggests that Wikipedia's tools are not as user-friendly as they should be to new or inexperienced editors, rather than that contributing to Wikipedia is useless, because your contributions will be deleted.
That's a problem though, it got merged in as a blurb, but the article itself hadn't had a chance to be fleshed out (where it could've been enough to justify its own page) after just a few minutes.
Interesting. Just tried making up a page and saw a bulleted list of info at the top.
You can also start your new article at Special:Mypage/Erlang++.
There, you can develop the article with less risk of deletion,
ask other editors to help work on it, and move it into "article
space" when it is ready.
That points out the userspace pages. However, it's not loud enough. If the risk is so high of deletion/merging in short spans (minutes), then this should be the preferred and well advertised method of creating articles.
So actual collaboration to create content should take place outside of the context of the canonical wiki, and the wiki itself should only be used for quasi-final drafts? How is that consistent with the nature of a wiki?
What, exactly, is the problem with having work-in-progress articles present in the main wiki? What argument is there for deletionism other than some obsessive-compulsive desire to hold the wiki to some arbitrary standard of orderliness?
I agree with you, my point is merely if the deletionist types are so fast that it's actually a problem for new content (that the Wikimedia Foundation has made special notice of it), then they need to resolve it. Either ensure new pages receive a grace period, or come up with a clearly marked draft area. The grace period is my preferred solution, 8 minutes is not nearly enough to determine that a particular article should be a one paragraph blurb and not an honest to god entry.
Apologies; I'd thought you were saying that userspace was the appropriate setting for in-progress articles, and that this simply needed to gain widespread knowledge.
But I think the "grace period" idea scarcely scratches the surface. Is it better if the article is given time to mature before it's deleted entirely by people dogmatically adhering to some arbitrary criterion of notability?
If Wikipedia is to remain a useful resource and to realize its unique potential, deletionism needs to be vigorously suppressed. If the current Wikipedia culture is too far gone, then perhaps it needs to be forked and maintained in a culture that regards the fact that enough collaborative effort was made to produce a coherent article as itself sufficient evidence of "notability".
True, I did miss that. At some point I started glazing over reading that block of text at the top of the page.
While there may well be good reasons not to, I have long been surprised wikipedia has not moved to all new pages living in a starter section, until they are ready to be promoted to the full site. Multi layering would also give somewhere for less well researched topics to live.
I would note, I specifically sought that information. It blends into the background very easily. If it's really their suggested solution to a problem in their community, they need to make it more obvious or the default for new page creation.
I understand it's not final, but it's offputting to novice contributors, and the promptness (literally minutes in this case) means that the contributor will be left confused (in this case thinking the content was completely deleted). Even if this speed is not representative, that it can happen will turn people away. How a community presents itself is incredibly important to growing it. Since that's a goal for the Wikimedia Foundation it's something they need to actively work on improving.
True, I could save my text, and try to merge it into the much larger and older C++ page. My suspicion is that such a large addition to the general C++ page would not last long.
No details. Just a well formed, minor contribution to an article that was insta-deleted because of "notability".
Which was news to me because, while I am aware articles themselves need to be notable, now some folks are enforcing notability on subsets of articles. In this case, a single row in a table listing like items.
Meanwhile, for every object ever drawn in an anime, that anime MUST be mentioned in the article for the object. Because this is notable content, you see...
Wikipedia, 2030: "The Holocaust was also mentioned in episode #22 of Robot Cheerleader Girl, where Tanaka-san fell headfirst into a bucket while giving a speech about it."
The fact that all it takes is 3 people to try, decide, and kill an article has always baffled me. The internet isn't exactly known for its lack of sock puppets and mini-gangs, so the fact that Wiki allows 3 people to be the arbiters of article deletion is bizarre.
I'd say culturally BBSes are massively important to the idea of Wikipedia; to lose their history is to turn a blind eye to the roots of where it came from.
Yes. Wikipedia is fundamentally broken. As a Wikipedia admin since 2003, trying to make basic, intelligent changes to articles that are 'protected' of late has just baffled me. It is very sad that this has occurred to what was previously a strong community. The notion of 'notability' is basically obsolete in these internet days, and should be dropped from Wikipedia's (many and spurious) policies. People that spend their days deleting others' contributions should have to write three times as much accredited content to delete something.
Meta is killing wikipedia. Look at the number of username boards there are - it's bizarre. Just push a list of unaccepted stuff to developers and hard code what isn't allowed. (eg, dotted quads used to be usable as names, but are not any more.)
People re-submitting articles in AfD until they get their way is a problem. I think either articles should be banned from AfD after being kept, or resubmissions should go through mandatory arbitration process with more eyeballs and longer times to make a decision. Admin power abuse was also a small problem when I did AfD (again, years ago), ie. admins deleting articles when there was no clear consensus or the 7 days weren't over.
How could something that is not worth deleting once be worth deleting at a later point? What is the GAIN of deleting an article?
It seems the only purpose of deleting is to make Wikipedia appear more Serious Business. If Wikipedia TRULY wanted to be a repository of "human knowledge" they could send articles to "Wikipedia Ephemera" and only delete articles that are pure spam. Why not this?
You're defining the problem away (No True Scotsman fallacy). Instead of deleting articles that are not-notable (according to criteria X) you suggest deleting articles that are spam (according to criteria X). In both cases, you'll need some sort of mechanism for performing and reviewing deletions. Many of the articles in AfD are, in fact, spam.
So, the gain of deleting articles seems to be clear to you (you are in favor of deleting spam), it's only the criteria which lead to deletion that are up for debate.
Deletion of something may be worth re-examination/arbitration because both "sides" of the argument may try to game the system, particularly if a negative vote on AfD was eternally valid. Banning further discussion and modification also seems contrary to the wiki spirit.
2. The tendency for many pages to attract zealous guardians. I've had spelling errors immediately backed out. I expect that 99.9% of potential contributors quit at this point and never return.
3. Diminishing subjects to cover. All the "major" ones are done, all that's left are the billions of pieces of harmless trivia that deletionists decide, apparently at random, to do away with.
Note also that half the editors are under 22. This is probably why you can find vast articles on Pokemon and deletionists go around flushing articles on anything that happened before 1990.
> Note also that half the editors are under 22. This is probably why you can find vast articles on Pokemon and deletionists go around flushing articles on anything that happened before 1990.
Do you have any idea what you're talking about? Vast articles? No, the Pokemon articles were purged pretty early on; look at lists like http://en.wikipedia.org/wiki/List_of_Pok%C3%A9mon_characters or https://en.wikipedia.org/wiki/List_of_Pok%C3%A9mon_%28202%E2... which devote a few lines to main characters or major pokemon, and just compare some with their respective entries in Bulbapedia (where the Pokemon Wikipedia editors ultimately fled to escape the deletionists, much like the Star Wars editors earlier fled en masse to Wookieepedia).
Even the Pikachu article has been eviscerated: http://en.wikipedia.org/wiki/Pikachu (Hope you enjoy a list of awards and mentions... It's 'out of universe' material donchaknow)
Of all the things to complain about, you're going after the fact that wikipedia has 15 printed pages about important characters in a cartoon with over 750 episodes? This is a list of important, plot-affecting characters; you're just failing to appreciate the sheer scale.
At the very least complain about the species list if you want something ill-fitting for an encyclopedia.
2007 was around the Seigenthaler incident and the kneejerk panic reactions by WMF people and administrators, like Wales unilaterally turning off anonymous page creation and lying to everyone about it being an experiment; I've personally always dated the rise of the deletionists and shifts of burden of proof to around this period.
Is Wiki server space scarce? I should be able to post an article about the tree in my back yard if I want. "Notability" is simply not something an online encyclopedia should be worried about.
Not that BBS history isn't notable. I'm just saying it would be wrong to delete it even if it wasn't.
while I'd like the "everything inside wikipedia" approach to work, the problem with it is that every new page makes it less maintainable, disambiguation pages grow larger, category listings grow larger, search results and suggestions multiply etc.
And the notability argument is that you can already post an article about that tree, on the larger internet.
> while I'd like the "everything inside wikipedia" approach to work, the problem with it is that every new page makes it less maintainable, disambiguation pages grow larger, category listings grow larger, search results and suggestions multiply etc.
This argument doesn't seem helpful. All of those problems will continue to exist regardless of whether some articles pass the "Notability" requirement or not.
Nice post hoc excuse. In all the hundreds/thousands of AfDs I've participated in or closed, I don't think I've ever seen anyone say 'this article should be deleted because it will make a disambig page smaller'.
It's not an excuse, as I have never flagged anything for deletion.
It's purely external reasoning: Wikipedia as I see it is already in an unmaintained state for many non-notable articles; more articles without a change to how WP works will lead to more of those diminishing the overall quality of the project.
And no, I don't buy that not deleting articles will magically increase the number of contributors.
> more articles without a change to how WP works will lead to more of those diminishing the overall quality of the project.
You know, if you let everyone use the Internet, the average quality of the writing will go down!!!
The average is completely the wrong metric to be using here. Not as good articles do not 'infect' the good articles, and the average is meaningless. If you don't need to know about a topic covered by an article you disapprove of... you don't have to read it.
it's not, it's an argument that wikipedia _as of now_ is not suitable for an indefinitely large number of pages.
I am not a wikipedia editor and am not defending strawman deletionists, I am only giving an opinion based on my daily findings of broken links, flags raised years ago and never updated, missing links between languages et cetera.
Content is already hard enough (i.e. nearly impossible) to keep accurate and updated. If all of us wanted to record the tree in our garden... it would be too much noise compared to signal.
The "aha" moment with Wikipedia is realising that it is not an archive (which is what some people want). It summarises archives, of course, but the point is to write a compendium of useful knowledge.
Where to draw the line is not easy; I think we are too heavy on the notability front. A better way is usually to think "how many people would find this useful or interesting", and if it is a reasonably significant number then it is worth writing about.
The culture is a monolith and can be hard to break into - I've long moaned about that. Changing that is hard; I don't begrudge people moaning or criticising it, but I suggest that the most constructive approach is to take part and bring some common sense on board :)
Just don't get sucked into the politicking and game mongering.
Every time this comes up I'm reminded of a very simple idea: fork Wikipedia, in a way that articles still get synced from the original but at the same time deleted articles just stay in the database. If you want to make it really fancy, put in some code that tries to detect whether parts of an article have been deleted and, when in doubt, keep the old version live. Maybe there could even be some neat UI that makes it easy to navigate different versions of articles (no, the Wikipedia UI isn't it). I think in time a project like that could become very successful. Keep using Wikipedia as a source, but give new life to the underlying idea that is no longer being taken seriously over there.
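For what it's worth, the sync rule described above could be sketched roughly like this. It is a minimal sketch, not how any existing mirror works: the `Page` and `sync` names, the in-memory dictionaries, and the "shrank by more than half" heuristic for "parts have been deleted" are all assumptions made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    live: str                      # the version the fork actually serves
    upstream: str                  # the latest text seen upstream
    deleted_upstream: bool = False
    versions: list[str] = field(default_factory=list)  # every upstream text seen


def sync(mirror: dict[str, Page], snapshot: dict[str, str],
         shrink_ratio: float = 0.5) -> None:
    """Merge an upstream snapshot into the fork without ever dropping a page."""
    for title, text in snapshot.items():
        page = mirror.get(title)
        if page is None:
            # New upstream page: copy it as-is.
            mirror[title] = Page(live=text, upstream=text, versions=[text])
            continue
        page.deleted_upstream = False
        if text != page.upstream:
            page.upstream = text
            page.versions.append(text)
            # Hypothetical heuristic: a sharp drop in length is treated as
            # "content was cut", so the old version stays live (when in doubt).
            if len(text) >= shrink_ratio * len(page.live):
                page.live = text
    # Pages missing from the snapshot are only flagged, never removed.
    for title, page in mirror.items():
        if title not in snapshot:
            page.deleted_upstream = True
```

Calling `sync(mirror, snapshot)` once per upstream dump would keep deleted pages around with a flag, and `versions` is what a nicer history UI could browse. A real fork would need a proper diff instead of a length check.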
Inclusionist 'forks', 'annexes', or 'salvage yards' have been attempted a few times. They usually adopt the same MediaWiki software and general article-format, for familiarity and ease of starting-up.
But, since that stack has coevolved with community practices, it is essentially dependent on the same mentalities, content-standards, and critical mass of contributors in order to function at all. A site that's "exactly Wikipedia, but inclusionist" imports many of the stresses and doctrinal limitations that have fed deletionist urges, but without a vibrant-enough seed community to build the full set of alternate practices that inclusionism would allow.
Between other projects, I've been working on an alternate kind of reference wiki that I believe can capture more info without as much conflict. Variances from standard Wikipedia/MediaWiki I'm trying are:
• all information must be contributed in small capped-size chunks – think about 2-3 times as large as a tweet, or like a Google search result snippet but in complete sentences
• community scoring of chunk quality, so that rough/undersourced/needs-improvement material can live on, somewhat out of view, rather than being lost completely to deletion
• extensive use of atomic, single-click feedback (upvotes, flags, likes, thanks, etc.) for quick reinforcement/correction loops (learning from Facebook, Quora, Twitter, etc.)
Though recent updates are few, you can read more about the plans and progress at my project blog – http://blog.thunkpedia.org. Following the @thunkpedia twitter handle or otherwise contacting me will get you invited to the beta as soon as it opens.
I started to type something similar then did some research into the costs. The DB download is 8.8GB; they say it expands to terabytes when uncompressed. That's no longer an impossibly difficult amount of storage to acquire though. The Wikimedia Foundation budget for this next year is $42.1 million; I only skimmed, so I didn't find a breakdown of costs for things like electricity and bandwidth out of that. I like the idea, though I wonder how you could gain traction from potential contributors. Would those alienated by the current Wikipedia processes along with some other people interested in the novelty be enough to start a community?
The actual infrastructure would probably be relatively cheap (my el-cheapo Hetzner hosting plan has 3TB of space for instance), bandwidth is another matter entirely - such a project would have to be ad supported I think, it's the only thing that scales with the visitor count.
Community is a big issue, but it wouldn't necessarily need to be large; especially at first, a small band of moderators would be enough. Initially, such a project would base its data off Wikipedia entirely, but over time more things would need to be curated. While we're on it, voting on articles and moving discussion threads about them to the forefront might also be worth looking into. Social proof could be one of the mechanisms used to determine the "best" version of an article. I think there are hundreds of parameters worth experimenting with.
I guess there are a lot of people alienated by Wikipedia, but the beauty of the idea is that we wouldn't have to rely on that. Incoming links would in time drive most of the traffic to the site, as it would be - by definition - a better place to link to for durable and extensive information.
Sounds like a nice idea for a quirky, bootstrapped startup. :)
I like the idea, I wonder if it has to be ad revenue though. I find that idea (personally) distasteful, but don't have any better ideas besides getting donations.
Maybe this exists, MediaWiki may even support it. What I'd like to see is a wiki that allowed you to populate articles by crossreferencing other pages.
An example:
Page: Lisp <tagged as FP>
A functional programming language invented in...
<Portion marked as summary>
Page: Functional Programming
<Summary of FP>
List of languages in the FP category:
<Populated from tagged articles>
<Populated from summary in tagged articles>
You'd only edit that Lisp summary in one location, then other articles could pull from it as needed. Not sure how well it'd work in practice, and as I said maybe MediaWiki and others support this now. I haven't looked at their features in years.
If you initially constructed your wiki from Wikipedia (and maybe other sources), the initial effort would be on removing this redundancy, but you'd still provide as many articles they'd just be better synchronized with each other. Adding breadth (Scheme, Common Lisp, etc) would be straightforward by sharing whole blocks of text. Shorter articles wouldn't be penalized because they'd just be pulled into others via this same cross referencing mechanism. They could be hidden in some fashion to avoid namespace clutter, but outright deletion would be unnecessary.
Well that was a braindump, and maybe not as coherent as I thought it would be. This is why I should sleep more, and practice describing my ideas to other people for feedback.
EDIT: To continue, a thought I should've included originally, use some form of markup for those pages to autofill parts and maybe autogenerate the summary. A language has a place, person and time of invention. A set of paradigms it may be described as fitting into, a set of parent, children and cousin languages. Marked up, this allows for portions to be very consistent in presentation across pages in a category. I suppose I should look at semantic wikis again. As I said, this was an idea I had years ago and pondered but never really researched.
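To make the cross-referencing idea above a bit more concrete, here is a rough sketch of a category page that is populated from tagged articles and their summaries. The `Article` class and `render_category` function are hypothetical names invented for this example; this is not MediaWiki's API, just an illustration of "edit the summary in one place, pull it in everywhere".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    summary: str                   # the portion marked as summary
    body: str = ""
    tags: set[str] = field(default_factory=set)


def render_category(category: Article, articles: list[Article], tag: str) -> str:
    """Build a category page from its own summary plus tagged articles."""
    lines = [
        f"Page: {category.title}",
        category.summary,
        f"List of pages tagged {tag}:",
    ]
    # Each entry is pulled from the tagged article at render time, so the
    # summary text lives in exactly one place.
    for article in sorted(articles, key=lambda a: a.title):
        if tag in article.tags:
            lines.append(f"- {article.title}: {article.summary}")
    return "\n".join(lines)
```

Because the list is generated on each render, changing the Lisp article's `summary` changes every page that pulls it in. Short articles cost nothing extra here, since they only need a title, a tag, and a summary.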
> put in some code that tries to detect whether parts of an article have been deleted and, when in doubt, keep the old version live
Unlike deleted pages which are hidden from non-admins, deleted parts of articles are almost always reachable if you dig them out in the revision history of the page. The rare exceptions are for text which violates someone's copyright, or which is defamatory, all of which require manual intervention by an admin to be deleted for real. So this content is still archived by Wikipedia, and publicly available, just not very easy to browse.
Forking is one solution but wikimedia could spin up forks of its own with different standards and pull from the different sources. Ideally, you could make a wikipedia account and maybe configure which subwikis you wanted to pull from.
Tangentially related, I've suffered a quiet death of my own BBS history because nearly a decade of 90's data is jailed on old 250MB+ QIC tape backups that were recorded with various unknown programs running from DOS to Win 3.1/95 days. Has anyone had a positive experience with using a service to migrate data from media this old and unknown backup programs at a reasonable cost?
I'm assuming someone may have already said this, but content isn't locked into Wikipedia. It's Creative Commons-licensed, meaning that if someone cared about curating and maintaining this history, they could have moved those articles over to another wiki. There are even wiki hosting solutions that make these kinds of moves relatively easy.
The problem is not that the Wikipedia community was curating their content, but that that batch of content was not being actively curated by someone. This is how history works. Someone has to determine something notable and worth preserving. If no one stands up for a piece of content, then it eventually dissipates into the ether.
> The problem is not that the Wikipedia community was curating their content, but that that batch of content was not being actively curated by someone. This is how history works. Someone has to determine something notable and worth preserving. If no one stands up for a piece of content, then it eventually dissipates into the ether.
If that's the case, I don't see why anyone should value a resource where the content is decided by whoever wastes the most time judging which content is "worthwhile". It seems like it's a lot easier for a small group of people to go around flagging articles for deletion than it is for people to go around finding articles to defend against deletion.
So if group A decides they are just going to start deleting all articles older than X years old, whose job is it to stand up to them and why would that situation be a good thing?
Edit: And how is this similar to how other records of history are kept? Do you expect authors to have to physically defend a library from roving bands of book burners? "If they really thought the content was important, they would have stopped people from burning the books."
> Someone has to determine something notable and worth preserving. If no one stands up for a piece of content, then it eventually dissipates into the ether.
His entire point is that this process is broken and heavily biased against the person who wishes to stand up for the content. Did you have a specific reason why he's wrong about that?
My point is more that if the community has an issue with the content, then take it someplace else. There are more wikis than Wikipedia. If you imagine that all knowledge in the world is a library, then Wikipedia is only the two or three shelves of big, heavy encyclopedias. If you don't think there's a wiki for your domain, then start one.
Let Wikipedia be some weirdly curated thing that is often useful but not the primary source for everything. I agree wholeheartedly with the article: if you care about something, cover it and protect it separately from Wikipedia. It'll get found through search anyways.
Looking for that specific page, which they already know about? Sure, they'll just enter the URL or click the bookmark.
Looking for some page containing similar keywords to mine? Again, if I could reliably cause people with relevant interests to come to any page I create, I'd be in a great place financially. I'd just start an online poker site, then go out and buy myself a pony made of diamonds.
Perhaps not what you were hoping for, yet nevertheless: it's up to you, Mr. Scott. Keep your remembrances of that era alive at your site. If not you, who will?
wikipedia should be like github. I should be able to clone it, write whatever I want in my branch, and you can merge it into your branch if you want, or not if you don't. End of story.
It's a noise-to-signal ratio problem; when you search for any person you don't want a list of one thousand people; you probably want to read about the few popularly known ones. The same thing happens with acronyms, dates and abbreviations.
Plus the big problem of sources curation, those are more unreliable in unpopular articles for the lack of eyeballs on them. The amount of work mods have to do would grow exponentially if they didn't take notability into account.
Of course Wikipedia could do better, but it is a hard problem, not an easy one.
> It's a noise-to-signal ratio problem; when you search for any person you don't want a list of one thousand people; you probably want to read about the few popularly known ones. The same thing happens with acronyms, dates and abbreviations.
Just recently (only last century) some clever fellows came up with a workable solution for this. Look up www.google.com in your web browser and see if it works for you.
They came up with a solution for themselves, or do you happen to have the link to their github repo? Otherwise you may want to donate a few million dollars so Wikipedia can come up with a similar solution.
Google aims to "index all the world's information".
These are different but complementary goals.
That the Wikimedia Foundation hasn't improved a user interface that was slapped together in the early 2000s is not somehow a mathematical, universal constraint on the growth of the subjects that are covered. It just isn't.
I am prepared to bet honest folding money that almost nobody uses the internal mechanisms of search and navigation that Wikipedia provides. Apart from clicking an in-text link referring to another subject, I am prepared to bet that traffic to Wikipedia is dominated by search engine referrals by at least one order of magnitude. Probably two.
Google so dominates the actual usage of Wikipedia that it is ridiculous to advance the poor UI of the Wikipedia platform as some kind of serious argument in favour of deletionism.
It'd be like walking into a library circa 1950 and saying "All these index cards are a schlep, let's start throwing away books".
You'd be committed to a loony bin. And now we have unlimited search and retrieval capability and you suppose that a poor interface is the killer blow for deletionism?
"Use google please, our searching and listing algorithm is not as good as theirs".
Most people don't return to Google after the initial access to the website, they keep using the Wikipedia interface when they need to search related subjects.
And I will surely throw away some books from the library if random people were allowed to put their books in there.
> Most people don't return to Google after the initial access to the website, they keep using the Wikipedia interface when they need to search related subjects.
I'll bet you $100, AUD or USD, to be donated to the charity of your choice, that this is not so, by a factor of at least 10 to 1.
As I said, this does not include following links in articles. I am talking about the wikipedia search engine and category pages. I am talking about search engines generally, though I expect google to be the dominant one.
How are you so sure most people don't return to Google? I do it more than half the time, and just add Wikipedia as a search term to narrow the results. For years I've felt it made my search more fruitful than using Wikipedia's search. Maybe my feelings are wrong, but if I feel this way, then a lot of other people could, too!
You can tell from auto-complete many people add 'wiki' or 'wikipedia' to searches at Google. And with the growth of Google-search-from-browser fields, I would also strongly expect that most people who type reformulated queries during an extended Wikipedia session do so through Google, not Wikipedia's much weaker onsite search.
Perhaps it is related to the kind of subjects you are searching for; or it may just be personalization by Google.
Or maybe we are both biased for ultimately silly reasons and the truth lies somewhere in the middle but the lack of available data reduces everything to mere speculation.
This 'little test' on 3 arbitrary early-20th-century historical names, only looking at the top 4 suggestions, doesn't 'show otherwise'. I said 'many people', not 'most'.
Turn off 'instant' so you see the top 10 autocompletions, and watch over all your queries. You will very often see "wiki" as a suggested suffix.
Or better yet, just add " w" to the end of any of your own tests: " wiki" will be the top suggestion, which demonstrates that 'many' people add it as a suffix on Google.
Not everything on this question is reduced to bias and 'mere speculation'. I've observed many people's search behavior, not just my own, and habitual recourse to browser-based search boxes or always-requery-at-Google is growing over time (especially with the rise of Chrome and its 'onebox').
Wikipedia also did usability studies in the 2009-2010 timeframe, from which Wikimedia director/developer Erik Moeller reported: "our test subjects tended to resort to common web search engines to navigate Wikipedia instead of using the site’s own search" [1]. (Wikipedia has since moved the site search box to help it be found, and I suspect that's boosted its use, but it's still subtle compared to the always-available, always-familiar in-browser Google-powered search.)
> Plus the big problem of sources curation, those are more unreliable in unpopular articles for the lack of eyeballs on them. The amount of work mods have to do would grow exponentially if they didn't take notability into account.
Maybe you didn't intend this but it sounds like you are saying that Popularity == Notability. I don't think that's a helpful stance to take. Especially since as time progresses, older things will generally become less popular and thus trend towards deletion if what you say is true.
There are way too many people writing in Wikipedia every day for time degradation to be a viable solution, plus it doesn't do anything for the problem of source curation.
My point was that it sounded like you were basically saying that only popular articles should be allowed on Wikipedia, regardless of whether that meant anything older than a few years old should be deleted. Since I don't agree, I was pointing that out.
Between anime and pop music I am not sure which has more pages dedicated to it. You can find page after page about individual episodes of an anime series just as you can find many individual pages for one song off an album.
So what quantifies notability? I would love to know how an obscure (at least to me) anime or song qualifies versus some of what does get deleted.
The bias of individuals plays a major role in the historical and biographical accuracy of the articles they write, not so much in obscure songs. Also, the loss of credibility in the public eye is virtually nil for mistakes in such things as obscure songs.
Yep, those are the two arguments for notability criteria I found most convincing. Anybody arguing for total inclusionism will have to provide answers to those. And if you're not arguing for total inclusionism, you need to provide better notability criteria that work with all the edge cases. That's also really really hard.
As someone who recently imported Wikipedia information for a project, it's trivially easy to determine notability, either by monthly pageviews (publicly available), or by incoming links within Wikipedia.
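As a rough illustration of the ranking idea above, something like the following would order pages by monthly pageviews, with incoming-link count as a tie-breaker. The function names and the input shapes (a title-to-views mapping and a title-to-outgoing-links mapping) are assumptions for this sketch; how the numbers are fetched from Wikipedia is left out entirely.

```python
from __future__ import annotations


def inlink_counts(links: dict[str, set[str]]) -> dict[str, int]:
    """Count incoming links per title, given each page's outgoing links."""
    counts = {title: 0 for title in links}
    for source, targets in links.items():
        for target in targets:
            if target != source:   # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts


def rank_by_popularity(pageviews: dict[str, int],
                       links: dict[str, set[str]]) -> list[str]:
    """Order titles by pageviews, then by incoming links, then alphabetically."""
    inlinks = inlink_counts(links)
    titles = set(pageviews) | set(inlinks)
    return sorted(
        titles,
        key=lambda t: (-pageviews.get(t, 0), -inlinks.get(t, 0), t),
    )
```

Nothing gets deleted in this scheme; obscure pages just sort to the bottom of searches and listings. Whether popularity is a fair stand-in for notability is a separate argument.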
As long as articles always take popularity into account when ranking (in searches, etc.) there isn't any problem.
The bigger problem is verifiability via citations to third-party sources, not really notability. If you can cite a good reference, in some kind of scholarly or otherwise decent source, and not written by the actual subject of the article, it typically stays in. I've written articles on some very obscure things, but they were well-cited, so nobody challenged them. My usual methodology is actually to start from a good source: pick up a book like Knuth's TAOCP, find algorithms he discusses that have no article, do a little searching to find a 2nd or 3rd source on the algorithm, then write a little article cited to those 2-3 sources. Haven't run into any problems doing things that way.
There didn't used to be as much emphasis on citations, but a mixture of hoaxes, libel controversies, and an influx of fringe-physics people led to a significant clampdown on unverified material around 2005-06. The good side of it is that the previous focus on notability is mostly gone: if you have good sources, it's ipso-facto notable, due to having, in fact, been noted in good sources. This is quite nice when I write articles on obscure subjects which will get only a handful of visits, but which are well-referenced. The downside is that subjects which are notable but for which there isn't good material to cite end up in an awkward place. I wrote a bit on that shift last year: http://www.kmjn.org/notes/wikipedia_notability_verifiability...
In fact, you can provide citations which don't actually say what you are using them for (or say what you want while totally failing to substantiate it in any way) and it tends to work as long as the group of people who camp that particular article agree with your editorial slant on the matter.
I'm always chasing down citations on wikipedia and finding this to be the case.
That's definitely something worth chasing down. I haven't found it in the articles I edit, but I guess I tend to avoid editing articles where people have axes to grind (Israel-Palestine, U.S. politics, climate-change, alternative medicine, that kind of thing).
I do sometimes find misreferences, but in the areas I edit (archaeology, geography, engineering, history) they're most often misreads or misinterpretations by someone who was reading either too carelessly or without enough background.
Wasn't there an article on HN recently about fishing old games for gaming concepts? I wonder if someone took up that idea and thought that the next best course of action was to conceal their source before blatantly ripping off every classic game mechanic from the last 30 years? After all, "Good Artists Borrow, Great Artists Steal." I really can't imagine any other reason to target BBS games specifically. Also, the deleter happens to be a programmer as well.
The façade says "come and contribute!"
The actual operation of the site is no longer like that. There has been a total inversion of the premise.
Wikipedia should have an anti-deletion bias.
There. I said it.
It should take a massive number of people to delete a page. Hundreds. Thousands maybe. Some particular amount of registered users (5% say, or double the square root. Whatever).