Everyone’s scared it will be used for war, but how would they break the alignment on LLMs? They don’t even allow me to generate black people with AI. How the hell will it work for war-related tasks? Or would there be a separate model, fine-tuned for government use, that allows being used to kill people?
You don’t say “find people to kill and kill them”; you say, “given this list of locations, which ones could be harboring terrorists or hidden military bases?”, etc. Or even more abstract constructs based on domain aliases, where the AI assists in pattern matching and automation but isn’t really thinking in terms of moral domains.
If you do not partake in a war when a war is waged against you, you lose, and you either get subdued, or perish altogether. This is why pacifism for some part of a society is only possible when another part of the society is willing and capable of using lethal force to defend the society as a whole.
Due to this, it's important to always have sufficient quantities of very efficient weapons, exactly so that you would never have to put them to use.
War has existed as long as humans have. If you have any ideas for how to remove fear, aggression and disagreement from humans you might just be a god or a saint.
The context makes it even worse. It's a strange kind of tribalism being promoted here: "do what you are asked to without understanding the real consequences". Btw, war usually is actually zero-sum.
> If you have a job like that, or work at a company like that, the sooner you quit the better your outcome will be.
AI will render your job rent-seeking. Self-driving cars, for example, will automate away truck drivers - do you not think they need to be laid off because of AI?
> Like self driving cars will automate away truck drivers - do you not think they need to be laid off because of AI?
geohot's point is that AI has its limitations and won't truly replace humans yet. Truck drivers, and other people who contribute net positive value, are not rent seekers at the moment.
AI could render our jobs rent-seeking; we just don't know when.
We have unions actively opposing self driving cars mainly to protect their own jobs.
In fact, I think it's much more common for a company to lay people off because of real AI impact than anything else.
I'm fairly certain Cory Doctorow does not understand the economics of enshittification.
Companies subsidise their products so that exploring them is more feasible, thanks to lower initial costs for end consumers. The initial consumers don't pay the full price; those costs are borne by later consumers, once the exploration is done and they have knowledge about that market and business.
Cory Doctorow also probably confuses democratisation with enshittification. It's usually the case that products get cheaper by also marginally reducing quality. We get cheap goods from China, but that's not enshittification - that's just efficiency. As a consumer, I'm happy I have the option of paying low prices for products.
I wouldn't take this person too seriously, because it looks like they don't understand the larger picture.
What are you talking about? Cory literally coined the term to describe this phenomenon. He is not confused by the idea of cheaper products with wider appeal. He takes issue with vendor lock-in that is weaponized first against the end-user, then against paying customers, and finally against investors themselves. This is first and foremost a criticism of online products and platforms, not mass-produced gadgets from China.
In practice, tps is a reflection of vram memory bandwidth during inference. So the tps tells you a lot about the hardware you're running on.
Comparing tps ratios- by saying a model is roughly 2x faster or slower than another model- can tell you a lot about the active param count.
I won't say it'll tell you everything; I have no clue what optimizations Opus may have, which can range from native FP4 experts to spec decoding with MTP to whatever. But considering Chinese models like DeepSeek and GLM have MTP layers (no clue if Qwen 3.5 has MTP; I haven't checked since its release), and Kimi is native int4, I'm pretty confident there is not a 10x difference between Opus and the Chinese models. I would say there's roughly a 2x-3x difference between Opus 4.5/4.6 and the Chinese models at most.
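For intuition on why tps tracks bandwidth and active param count, here's a back-of-envelope sketch. The function and every number in it are illustrative assumptions, not measured figures for any of these models:

```python
# Back-of-envelope decode speed for a memory-bandwidth-bound model:
# every generated token must stream all active weights from VRAM once.
def decode_tps(bandwidth_gb_s: float, active_params_billions: float,
               bytes_per_param: float) -> float:
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical MoE with 37B active params at FP8 (1 byte/param)
# on ~3 TB/s of aggregate bandwidth:
print(round(decode_tps(3000, 37, 1.0)))  # ~81 tokens/sec upper bound
```

Halving the active bytes per token (fewer active params, or int4 instead of FP8) doubles this ceiling, which is why tps ratios on the same hardware hint at active param counts.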
What about the VRAM requirement for the KV cache? That may matter more than memory bandwidth. With these GPUs, there is more compute capacity than memory bandwidth, and more memory bandwidth than VRAM.
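As a rough sizing sketch for that KV-cache point (all shapes here are hypothetical, not any specific model's):

```python
# Rough KV-cache size per sequence for a transformer using grouped-query
# attention: keys and values stored per layer, per KV head, per position.
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, bytes_per_elem: int = 2) -> float:
    elems = 2 * layers * kv_heads * head_dim * seq_len  # 2x for K and V
    return elems * bytes_per_elem / 1e9

# e.g. 60 layers, 8 KV heads, head_dim 128, 128k context, FP16
print(round(kv_cache_gb(60, 8, 128, 131072), 1))  # ~32.2 GB per sequence
```

At that scale a handful of concurrent long-context sequences already rivals the weights' footprint, which is part of why cache-shrinking designs like MLA matter so much for inference cost.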
DeepSeek got MLA, and then DSA. Qwen got gated delta-net. These inventions allow efficient inference both at home and at scale. If Anthropic has nothing comparable here, then their inference cost could be much higher.
DeepSeek also built https://github.com/deepseek-ai/3FS, which makes cached reads a lot cheaper, with a much longer TTL. If Anthropic didn't invent anything similar and uses some expensive solution like Redis, as the crappy TTL suggests, then that also contributes to higher inference cost.
> In practice, tps is a reflection of vram memory bandwidth during inference.
> Comparing tps ratios- by saying a model is roughly 2x faster or slower than another model- can tell you a lot about the active param count.
You sure about that? I thought you could shard between GPUs along layer boundaries during inference (but not training obviously). You just end up with an increasingly deep pipeline. So time to first token increases but aggregate tps also increases as you add additional hardware.
Hint: what's in the kv cache when you start processing the 2nd token?
And that's called layer parallelism (as opposed to tensor parallelism). It allows you to run larger models (pooling vram across gpus) but does not allow you to run models faster.
Tensor parallelism DOES allow you to run models faster across multiple GPUs, but you're limited by how fast you can synchronize the all-reduce. And in general, models get the same boost on the same hardware - so the Chinese models would have the same perf multiplier as Opus.
Note that providers generally use tensor parallelism as much as they can, for all models. That usually means 8x or so.
In reality, tps ends up being a pretty good proxy for active param size when comparing different models at the same inference provider.
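The all-reduce ceiling above can be sketched with a toy per-token latency model; every constant here is invented purely for illustration:

```python
# Toy model: tensor parallelism splits weight streaming across N GPUs,
# but each layer still pays a fixed synchronization cost per token.
def token_latency_ms(n_gpus: int, stream_ms: float = 20.0,
                     layers: int = 60, allreduce_ms: float = 0.05) -> float:
    return stream_ms / n_gpus + layers * allreduce_ms

for n in (1, 2, 4, 8):
    print(n, round(token_latency_ms(n), 2))
# latency shrinks 23.0 -> 13.0 -> 8.0 -> 5.5 ms: sublinear scaling,
# floored by the per-layer sync term no matter how many GPUs you add
```

Since two models served on the same cluster pay comparable sync overhead, the ratio of their tps values still mostly reflects the weight-streaming term, i.e. active bytes per token.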