Hacker News | bmsran's comments

There are long regions on the Y chromosome that are very similar to the X chromosome, which would make the analysis difficult:

https://en.wikipedia.org/wiki/Pseudoautosomal_region


One minor correction: in step 2 the DNA is not amplified, as this would reduce the fragment length and also lose the methylation information.


Good point. Here they're starting from a cell line, so presumably just starting with as much DNA as they can get from the cells. Amplification is usually needed in other scenarios where the sample is more finite, though from what I've read, nanopore sequencing tech doesn't need much DNA.


The "ultra long" nanopore reads used in this study are often greater than 100 kbp in length and occasionally up to 1 Mbp.



Just read it and they were not disputing the existence of two strains but rather that the virus didn't jump from animals twice. The article disputes the claim that because the L strain is 70% of the cases, it must be more virulent.


The distinction is important here because this subthread started from the suggestion that there are multiple strains circulating, with different virulence. The fact that there are multiple lineages is a natural consequence of how viruses spread.


"Is here. This not a joke. We can wonder about the license though. Maybe we should ask the walking product of this source: Craig Venter."

The reference build of the human genome (provided here by Ensembl) is almost entirely derived from the public human genome sequencing project, not the private project led by Venter.


And the public reference genome is about 2/3 from a single African American individual from Buffalo, New York!


This is the most important statement in the paper.


Let's pretend I don't have a PhD and don't know what that statement means though, can I get a summary?


Shorter and less accurate than the Wikipedia page: A SNP is a single nucleotide polymorphism, a single-letter variation in a genome. This study measured about half a million SNPs for each person. A statistical test is then performed to see how well each SNP site predicts the trait, generating a p-value. If a single such test were being performed, then typical "significant" p-value levels would be 0.05 or 0.01 or 0.001; these are arbitrary but generally accepted. For data from SNPs unassociated with the trait, p-values come randomly and uniformly from the range 0 to 1. So with a hundred SNPs unassociated with the trait, a person would expect about one p-value at <= 0.01. There are many ways to correct for these multiple hypothesis tests. For GWAS, the generally accepted significance level is 5e-8, which under the rubric of the fancily named but simple Bonferroni correction would be equivalent to a 0.01 to 0.05 p-value from a single test. These two reported SNPs, when correcting for multiple testing, don't meet the standard definition of significant.
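
For anyone who wants to see the arithmetic, here is a rough Python sketch of the multiple-testing point above. It is only an illustration under simple assumptions: every SNP is treated as an independent test, null p-values are drawn as uniform random numbers, and the function names (bonferroni_threshold, expected_false_positives, simulate_null_hits) are made up for this example rather than taken from the paper or any GWAS tool.

```python
import random


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(n_tests, cutoff):
    """Expected count of null tests with p <= cutoff (null p-values are uniform on [0, 1])."""
    return n_tests * cutoff


def simulate_null_hits(n_tests, cutoff, seed=0):
    """Draw n_tests uniform p-values and count how many land at or below cutoff."""
    rng = random.Random(seed)
    return sum(rng.random() <= cutoff for _ in range(n_tests))


if __name__ == "__main__":
    n_snps = 500_000  # "about half a million SNPs", as in the comment above

    # A hundred unassociated SNPs: about one p-value <= 0.01 is expected by chance.
    print(expected_false_positives(100, 0.01))

    # Bonferroni cutoffs for single-test levels of 0.05 and 0.01;
    # the conventional 5e-8 sits between these two values.
    print(bonferroni_threshold(0.05, n_snps))
    print(bonferroni_threshold(0.01, n_snps))

    # With no real associations at all, thousands of SNPs still pass p <= 0.01.
    print(simulate_null_hits(n_snps, 0.01))
```

Real SNPs are correlated with their neighbours, so the independence assumption here is a simplification, which is part of why a fixed conventional cutoff like 5e-8 is used instead of recomputing the correction per study.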


It means the result could plausibly have occurred just by chance. The odds may be against it being chance, but it is at least plausible.


Eh, not really. What you said is true of any p-value. It's tautological.


This may help: https://en.wikipedia.org/wiki/Genome-wide_association_study

They found "support" for associations between homosexuality and some specific genetic code positions, but not strong enough evidence to be very sure of anything ("significance").


It is common to compare traits in monozygotic vs dizygotic twins to help control for shared environments.


Also a great example of the system working as intended. A group publishes a dubious study with potentially important implications and people queue up to refute it, correcting the scientific record.

