In the first experiment, researchers sought to observe how parents would interact with their children (in this case, 3-year-olds) after the parents were asked to describe times in which they had recently experienced scarcity. A control group of parents were instead asked to describe other recent activities.
How many studies like this need to not replicate before we stop treating them seriously?
People are reading this excerpt from the article without reading the paper itself: this isn't a self-survey. The survey itself is a 'manipulation'. (But I won't comment on the reproducibility or value of this methodology.)
> This left us with a final sample of 84 dyads, randomly assigned to either the Scarcity (n=42) or Control (n=42) conditions. Dyads across conditions did not differ in age, child or caregiver sex, caregiver education, or family income
> The caregiver and child were seated across from one another at a table. The child completed an unrelated experiment with the researcher, while the caregiver completed the Scarcity or Control manipulation survey on a tablet. When the caregiver had finished, the researcher left the room under the guise of loading a second survey onto the iPad, leaving the caregiver and child alone with a toy the experimenter happened to offer as she left. A video camera and/or tape recorder recorded their interactions.
And there was also a second experiment:
> The second experiment used [..] tiny “talk pedometer” devices worn by children that record their conversations and count the words they hear and say.
> [..] analyses revealed that parents engaged in fewer conversational turns with their children at the month’s end, a time that typically coincides with money being tight
>>How many studies like this need to not replicate before we stop treating them seriously?
Hard to answer this, IMO. It would be hard to keep versions of such anecdata from informing our personal worldviews, for example. Business knowledge is basically made out of such stuff.
Replicability, for questions like this, may be the wrong bar. Should we expect the relationship between financial hardship and talking to three-year-olds to be the same for a group of parents in 2021 California as for one in 1997 Tokyo?
A lot of the replication crisis is shoddy work. Bad research. But... a lot of it is a problem of the subject matter itself. It's complex, and can't be isolated. Does dumping a ton of manure in a forest result in more leaves? more flies? Hard to describe the relationship between the manure and leaves in a replicable way. You can do controlled experiments, but the lab version doesn't answer the same question... the one about complexity.
IDK... I'm not excusing the replication crisis and many of the worst affected fields do need to pay attention to the gunk they've accumulated. OTOH, we do need ways of describing relationships that are complex. "We know nothing" is insufficient.
It's not "we know nothing." It's "we have low confidence in this thing." Which is exactly what you should be willing to say if you want an accurate view of the world. You can take action of knowing something with low confidence, you just make sure it's not too extreme, and it's not incompatible with being wrong.
Well, they did two studies, with different methodology, that came to the same conclusion. I would agree that it would be better to see another group try to replicate, and the actual study link (https://psyarxiv.com/byp4k) has this:
"Data Availability: For Study 1, all materials and analysis scripts, including the protocol to be used in the case of a future replication, are available on OSF..."
But, inspecting the data closer, it does seem like the effect is not huge, and that is perhaps why the Berkeley website that is summarizing the results did not choose to mention anything about the size of the effect.
Joking aside, I've seen way better studies of this kind where the parents would wear a mic at home and they would then do a word count, word analysis, etc.
That's only relevant if it affects one group more than the other, maybe you can control for this or take another approach, in any case I would find the results way more convincing than a survey.
Especially considering that going through a tough time will colour your memory and memory is extremely unreliable.
Statistical significance is a measure of the likelihood of getting that effect size by chance if the two approaches didn't actually affect the outcome. With a large enough sample size, an effect size of 1 fewer word could be "statistically significant", even if it is practically meaningless.
In this case they got a p-value of 0.1 iirc. So assuming there's nothing special about the circumstances, you'd expect to get samples this different 1 in 10 times you ran the experiment by luck alone.
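To sketch that "statistically significant but practically meaningless" point (all numbers here are made up for illustration, not taken from the paper): a two-sample z-test on a 1-word mean difference, with an assumed within-group standard deviation of 100 words, crosses p < 0.05 once the groups get big enough.

```python
import math

def two_sided_p(diff, sd, n):
    """Two-sample z-test p-value for an observed mean difference `diff`,
    per-group standard deviation `sd`, and `n` subjects per group."""
    se = sd * math.sqrt(2.0 / n)               # std. error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail

# A 1-word difference with sd = 100 words (illustrative numbers):
for n in (100, 10_000, 100_000):
    print(f"n={n:>6}: p = {two_sided_p(1.0, 100.0, n):.3f}")
# The same tiny effect drifts from nowhere near significant to p < 0.05
# purely because n grew; the effect size never changed.
```

At n = 100 per group the p-value is around 0.94; at n = 100,000 it drops below 0.05, even though one word either way is practically meaningless.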
No I know what statistically significant means, I was curious why you phrased it that way.
The authors listed many results, and noted whether each was significant. It seems weird that you keyed in on one measure (caregiver vocabulary) with a higher p-value, and not the other results that did suggest that when money is tight, parents talk less.
Oh I see, I jumped the gun. The first result seemed close to the headline of the paper but it's actually a bit different. The actual headline is from further down.
I'd still ignore it though. You see how the first 3 are under "pre registered analyses" and the significant finding is under "exploratory analyses"? That's another way of saying their experiment failed to find anything interesting so they combed the rest of the dataset to try and find something they could publish. Basically just classic p-hacking. Probably not malicious or intentionally deceiving, but p-hacking nonetheless.
If you throw your hands up and decide to check what's interesting on a 14x14 correlation matrix, you've just tested around 100 hypotheses without realizing it. If your significance threshold is 0.05, you should expect to have 5 false positives in there already.
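That 14x14 point is easy to simulate in plain Python. A minimal sketch (the critical |r| of about 0.197 is the two-sided p < 0.05 threshold assuming n = 100 observations; both figures are my assumptions, not from the paper):

```python
import math
import random

random.seed(42)

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

N_VARS, N_OBS = 14, 100
R_CRIT = 0.197  # |r| above this is "significant" at p < 0.05 for n = 100

# 14 variables of pure noise: there are no real relationships anywhere.
data = [[random.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_VARS)]

pairs = [(i, j) for i in range(N_VARS) for j in range(i + 1, N_VARS)]
hits = [(i, j) for i, j in pairs
        if abs(pearson_r(data[i], data[j])) > R_CRIT]

print(f"{len(pairs)} implicit hypotheses, {len(hits)} look 'significant'")
# With 91 pairs and alpha = 0.05, you expect roughly 91 * 0.05 ≈ 5
# "significant" correlations on average, despite the data being pure noise.
```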
I already knew you’d ignore it - in your rush to dismiss, you missed that the section of preregistered analyses did generate statistically significant results.
Exploratory analysis is not inherently p-hacking. Publishing exploratory analyses as such is the proper action. There’s nothing in their methodology that suggests they analyzed a large number of possibilities and discarded the high p-values. (How would that even work in this case? Run through all possible conversation topics? All possible time divisions of speech?) The exploratory topics extend naturally from their preregistered hypotheses, from speech and income to speech relating to income and speech on calendar days when income is an issue.
Your cynical take on publishing negative results is unhelpful, as are the accusation of bad faith and the straw man.
> I already knew you’d ignore it - in your rush to dismiss, you missed that the section of preregistered analyses did generate statistically significant results.
I'm open to being proven wrong, but I've reread this section a couple times and I'm pretty sure all four tests in the primary pre-registered section and both tests in the secondary pre-registered section are non-significant.
(looks like there's actually two studies here. I've only read the first)
> Exploratory analysis is not in inherently p-hacking. Publishing exploratory analyses as such is the proper action. There’s nothing in their methodology that suggests they analyzed a large number of possibilities and discarded the high p-values.
Exploratory analysis is p-hacking. p-hacking is not (always) an evil unprincipled scientist trying to push a story. It's usually a scientist without a lot of statistical knowledge trying to see what's interesting, then letting their personal biases confirm coincidences as they appear because they want to find SOMETHING. You can publish these results, but you'd better be very clear that, you know, it's not very good. They're relatively conservative here, as they should be in the scheme of things, and that's good. But look at the article and the discussion it's generated. Clearly some people think it's a trustworthy "scientific" result.
> How would that even work in this case? Run through all possible conversation topics? All possible time divisions of speech?
You take your data, you generate a correlation matrix of everything you've got, and point your finger at the values that look high. Then you test those hot spots and find significance. Very easy. You're implying here that in order to p-hack you need to check out every possibility. Not true.
They already tested ~6 ideas. Assuming they're independent of each other (to be fair, they're not), the likelihood of finding something significant with a p-value below .05 purely by chance is already 1 - 0.95^6 ≈ 26%. That is to say, naively, if you run 4 of these studies, one of them is expected to get a positive result even if they're all bogus. If they're allowed to do exploratory tests, they're now getting the freedom to cherry-pick additional options. Remember, you have to count both the things they actually test AND the things they decided not to test after seeing the correlation matrix.
If they consider an additional 3 ideas (9 total), your studies now have a ~37% false positive rate. At 15 extra ideas (21 total) you're at about a 66% false positive rate. At about 40 extra ideas you have a 90% chance of finding something significant. If everyone's doing this, then your entire field is probably bogus.
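The arithmetic here is just the family-wise false-positive rate under that (admittedly unrealistic) independence assumption, 1 - (1 - alpha)^k for k tests; a quick sketch:

```python
def family_wise_rate(k, alpha=0.05):
    """Chance that at least one of k independent null tests comes up
    'significant' at the given alpha, even when nothing is real."""
    return 1 - (1 - alpha) ** k

for k in (6, 9, 21, 46):
    print(f"{k:>2} tests: {family_wise_rate(k):.0%} chance of a false positive")
# 6 tests already gives about a 26% chance of a spurious "finding";
# by 46 tests you are over 90%.
```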
It is extremely easy to unintentionally cheat and write off 40 insignificant ideas glancing at a correlation matrix. Pre-registering ideas is critical.
and I'm harping on correlation matrices because that's probably a lot of people's first idea, but there's plenty of other ways. That is the whole point of exploratory analysis.
You can see, for example, that one of their exploratory analyses is: what happens to the relationships when we group by income? Well, you get to test another handful of ideas, I can guarantee you that much.
> The exploratory topics extend naturally from their preregistered hypotheses, from speech and income to speech relating to income and speech on calendar days when income is an issue.
That's not remarkable at all. That's the kinds of information that existed in their dataset.
> Your cynical take on publishing negative results is unhelpful, as are the accusation of bad faith and the straw man
Well I did say "Probably not malicious or intentionally deceiving" so if anyone's pushing a bad faith interpretation here I'd say it's you.
I'm glad you posted and took the time to check it, because intuitively I would agree with the headline: less money = more stress = less talking (probably not for everybody, but most people will talk less when under stress, especially over time).
At the limit, this is "my parents couldn't afford proper care for a debilitating disease" vs. they could. There is one group there that would talk more.
If you're poor, broke, alcoholic, a meth head, or on heroin, you're probably not gonna be talking to your kids as much as someone who is not.
To see that idea with a self survey is kinda hilarious. kutgw to get this stuff right.
> To see that idea with a self survey is kinda hilarious
It wasn't a self survey.
> This left us with a final sample of 84 dyads, randomly assigned to either the Scarcity (n=42) or Control (n=42) conditions. Dyads across conditions did not differ in age, child or caregiver sex, caregiver education, or family income
> The caregiver and child were seated across from one another at a table. The child completed an unrelated experiment with the researcher, while the caregiver completed the Scarcity or Control manipulation survey on a tablet. When the caregiver had finished, the researcher left the room under the guise of loading a second survey onto the iPad, leaving the caregiver and child alone with a toy the experimenter happened to offer as she left. A video camera and/or tape recorder recorded their interactions.
> Why are the poor in America described in terms of "meth heads"? Being poor is not a moral failure.
They're generally not referred to that way outside discussions of behaviors addicts engage in to get their fix, and those discussions are limited in scope to addicts.
Making the generalization that everyone refers to the poor as meth heads serves the same ideological needs as generalizing the poor as addicts, albeit to a different end.
I just picked one, but I noticed this more than once: either the poor are referred to as "trailer trash", "Walmart monsters", or "meth heads". It is just interesting to see how American culture is so anti-poor that it seeps into your language without you knowing.
All cultures have anti-poor sentiments as well as empathetic attitudes towards the poor. What is "interesting" here is the prejudiced attitude from people like you towards America.
Also stop calling Americans "interesting." Using a term like this is a deliberate insult on a culture or a person. You are examining the culture like it's a lab rat and commenting on how the behavior is "interesting."
You are not an idiot. People do not talk like this in real life, remarking on how behavior is "interesting" to the subject's face. You and others only use these terms behind the anonymity of a forum. Therefore you are aware this is insulting. Stop.
This is a common tactic used to get around the HN rules. You say "interesting" posing it as an innocent remark. It is not, you are conducting a deliberate and insulting attack on American culture here.
Do you hold an oracle which tells you which studies will not replicate, or is there some particular trait of the study that shows it won't replicate?
This quiz will let you test how accurate you are at predicting which psych papers replicated or not. Last time I took it, I think I was at 8 or 9 out of 10. I think something close to 50% of such papers don't replicate - so it's probably a reasonable prior to just assume any given social psychology study won't replicate.
> so it's probably a reasonable prior to just assume any given social psychology study won't replicate
I'd pay that. Though in this case it's not people reading the paper, but reading the article about the paper. (On reading the paper, the findings are less conclusive than the click-baity headline, but that's almost always the case)
It's not about an oracle. It's that mentioning a topic prior to some conclusion-generating observation has practically never produced replicable results. Still, these studies make the news all the time because they're easy to p-hack into some interesting conclusion.
Reading the study, it appears they went to some pains not to reveal their intentions:
> The caregiver and child were seated across from one another at a table. The child completed an unrelated experiment with the researcher, while the caregiver completed the Scarcity or Control manipulation survey on a tablet. When the caregiver had finished, the researcher left the room under the guise of loading a second survey onto the iPad, leaving the caregiver and child alone with a toy the experimenter happened to offer as she left. A video camera and/or tape recorder recorded their interactions.
The conclusion could be that being reminded of or queried about hardships has a chilling effect.
Also, to be specific, none of the volunteers needed to consult an oracle. The factors they cite (low N, "newsworthiness") are also the same factors that would give a layperson pause.
And a high portion of that study's cohort were people who read academic papers as an occupation. I'm not saying that the study couldn't generalize to laypeople, just that you haven't presented evidence of it.
(I believe one of the early controversies that sparked the reproducibility crisis was the discovery that an excess of studies were using college students as their participants, skewing their results.)
There was a report by a psychologist who said Native American mascots have a negative effect on the children of Native Americans.
This may or may not be the case, or may be true in some specific circumstances (the Redskins, for example) but not in all of them. However, many high schools are changing their team names based on this one study/conclusion, which was funded by a tribe. Which is fine, but usually when there is a conflict of interest one wants additional independent studies that come to a similar conclusion.
If you're referring to Fryberg, Markus, Oyserman & Stone 2008, do you think that schools deciding to change their team names/mascots might have something to do with far larger and broader cultural trends over the last couple of decades, rather than these schools' decisions resting primarily on the robustness and applicability (or otherwise) of this one paper?
Teams didn't change their names because of that tribe-sponsored paper, they changed their names because white people wanted them to change.
In 2016, when the main momentum of this started to come out, 9 in 10 Native Americans did not find it offensive [1]. But white people had already decided it was offensive, and white people have much more cultural influence than Native Americans, so in being perceived by white people as offensive it became offensive, to the point that just a few years later, a majority of Native Americans found it offensive [2].
But at all times through this, including today, a higher percentage of white people find the Redskins name offensive than of Native Americans. The Redskins changed their name because of pressure from white people, not because of pressure from Native Americans. More than a little ironic that even in this, Native Americans were barely listened to, while white people's feelings were considered very important.
This is an even worse reason to change names. Someone who has no stake in it at all then takes it upon themselves to represent someone else as an aggrieved entity.
And, to be clear, I'm okay with presenting evidence one way or another, but I object to willy-nilly claims and crusades.
That’s not the question at hand. I’m asking a specific question about who it’s been “problematic” for.
> Why is it so important to you to continue to use caricature mascots
I don’t give a shit about mascots at all. My complaint is specifically about people who concern troll on behalf of minorities that never asked for it (e.g. the Speedy Gonzales cancellation).
Whenever someone tells you something is “problematic” without giving concrete reasoning, they are likely just looking to be outraged and spread righteous indignation.
You don't think indigenous people who are offended by caricatures of themselves being used as mascots is enough of a 'concrete reason(ing)'?
But I did ask you a question. I'll ask it again: Why is it more important to you to maintain racist mascots than it is to use something neutral in its place?
> many high schools are changing their team names based on this one study/conclusion which was funded by a tribe.
I've been following the efforts to have team names changed for quite some time and have never heard of this paper, but have heard of sustained lobbying over decades from a variety of native organizations.
So Native American tribes have for the most part almost been wiped out and eradicated by something not very far from genocide (or, at the minimum, a very targeted policy of extermination).
And somehow you think that schools that have nothing to do with those people finally changing their very questionable names is due to a “conflict of interest of a psychologist”? Not because it was just racist?
It could be argued that using that symbolism is more a form of respect than a form of racism. I don't think the people who decided on those names actively participated in the genocide, and objectively, they used those names and that symbolism as an ideal to look up to; they used it as a warrior symbol for their team.
It could also be argued that it dilutes the image/culture/history and probably a bunch more things by wrapping it up in american plastic. I wouldn't argue with that.
> It could be argued that using that symbolism is more a form of respect than it is a form of racism.
Broadly speaking, the symbolism being used is usually either an obvious caricature of the real thing (see: the former Washington Redskins), or uses cultural and religious elements that are at best kind of offensive in the context of being used as a sportsball logo (see: most uses of peace pipe or war bonnet imagery).
There are exceptions, but they're of the kind that generally come with either somebody actually doing the research or people of the given group actually participating locally.
In your example there's no downside to changing the name, though. It's a low-stakes decision. In the book "Trust: A Very Short Introduction", the author argues that if the situation is low-stakes, you should need a lower level of evidence to trust someone than if it is high-stakes. While I think the author might discuss it only in the context of human interactions, I imagine you could apply the paradigm to this situation.
It didn't require a scientific paper to create the mascot, but you seem to be arguing it requires a validated "good paper" to rename it? You are demonstrating a status-quo cognitive bias.
So if someone comes up with a paper that says the opposite, namely that using Native American names has a positive impact proven by a paid study, then since it's low stakes, we should go ahead and name teams after Native American tribes?
I think we were talking about high school mascots, not team names. Are those two scenarios equally low stakes?
In one you change the high school mascot from a Native American cartoon to an animal or something. I'm assuming Native Americans were not consulted in the creation of the mascot to begin with.
In the other we change... what, all the teams' names? Is that an equivalent scenario?
To be honest, I'm finding it hard to relate to someone who would care enough about a high school mascot to even need a study at all. Low-stakes B.S. But that's me.
Since this topic is probably of zero importance to anyone not Native American, who else is going to fund it, for that matter?
> I'm assuming Native Americans were not consulted in the creation of the mascot to begin with.
Seems like you are affirming the consequent and assuming it is harmful to Native Americans.
You are ignoring the issue of the paper. If the decision is based on a scientific paper, but the paper is conflicted or unvalidated, that is harmful because of the misuse of the scientific paper.
If you want to change the mascot because of Native Americans' preferences, that is a different matter.
If not, it's kind of a catch-22 for the tribe, isn't it? Nobody wants to fund research on the topic, so when they fund research on it, it's presumed invalid because of bias?
A more repeatable study designed today might partner with smartphone megacorps and/or also use in-home microphones like Alexa, etc.
The ethics considerations of always-on microphones would need to be extremely stringent, and the recordings should be breached only for the safety of the children or others.
The study might also require the provisioning of basics like Internet access and such devices. Though these days Internet access is as necessary for a productive member of society as power and water are, and IMO it should be a universal right.
> Though these days Internet access is as necessary for a productive member of society as power and water are
No it’s not. I have an uncle who is a plumber and gets by without the Internet just fine. He banks with ATMs, branch offices, and phones. Pays bills using the mail, etc.
In the US, utilities are required to offer mail options for bills, but not email or phone. Unless you work from home over the Internet or have children remote learning, it’s absolutely just a luxury.
There are multiple performance and outcome gaps that are unequivocal in research across multiple disciplines. There’s lots of debate about what causes them. None of them necessarily implies anything permanent/immutable about any group.
Plugging your ears and crying ethical foul impoverishes the discussion and all its observers.
You could instead add real arguments to the debate. That enriches the discussion.
I agree with you 100% that more research is needed, but the person above me seemingly doesn’t, or they wouldn’t have made such a dismissive comment implying funding should be cut because the “true” answer is already known.
Don’t get mad at me for calling that person out. I’m the one who wants more research - they apparently already know the answer.