Dad and millennial here, and this change has been very noticeable in my circle of friends, myself included. I'm all for it. Men have been doing their share of housework too. But I will say, it's not all dads, though enough that I think this will have a positive effect on the next generation.

I'm gay and because of that was disowned. My partner has a brother, "K", and K has three children. Watching K show up in basic ways for his kids, like remembering what songs they like and teaching them sports, is the fastest way to make me ugly cry.

Thanks to anyone reading this if you're trying to be a good dad. You're making the world a better place in ways you don't even see.


This is the rare time I wish HN had emoji reactions instead of just upvotes.

I can honestly say that I don't have any time for a dad who isn't all-in for their kids. I understand if the responsibilities aren't 50/50, but if you're making mom handle everything I think you're a loser.

All my millennial dad friends clean, change diapers, cook, whatever. And make no mistake all the moms are incredibly hard-working and involved with the kids.

If I happened to meet socially a dad who wasn't doing those things I would literally make fun of them. "You're a grown man who can't change a diaper or clean a bathroom?"


I'm with you mostly. Some different specifics, but the point I have in mind is this: it's a common thread of rapport and conversation. I sometimes feel like an alien on earth when I spend time with friends or other groups where there seems to be a strong "ughh, my family and home life" vibe.

I said hello to another dad at soccer for three-year-olds, and he responded with something like, "Ugh, I'd rather be ANYWHERE else".

It's 10am on a Saturday and you're running around playing games with your kid. I just stared at him and went on.


I was strongly encouraged by my own parents, particularly my dad, to play sports (baseball, a bit of basketball) as a kid, even though I wasn't very good at them and wasn't very interested in them (and got made fun of by other kids for this). At some point I realized that me playing sports was something my dad was more invested in than I was. When I was 11 or so, I finally decided that I had had enough, and quit the neighborhood little league baseball team I was on in the middle of the season; I suspect the team was happy to have me gone, and I was happy that trying to play baseball was no longer my problem. Suffice to say, I have no happy memories of playing catch with my dad at any time in my life.

My younger siblings were a bit more intrinsically interested in sports than I was, and my parents shifted their attention to their sports extracurriculars. I actually don't really remember what they did sports-wise because I did not care at all; and although I was the older sibling I was not so much older that anyone thought it was important to encourage me to take a pseudo-parental or caretaker interest in what my younger siblings were doing. I would go to the baseball field where one brother played his games because my parents were going, and then amuse myself by playing alone in the dirt beyond the bleachers, because that was more fun than paying attention to the game. By the time I was old enough to, say, drive them places in lieu of our mom, they had gotten to the age where sports were meaningfully competitive and were not actually good enough to keep playing.

So not only do I find this dad's attitude extremely sympathetic, I think I would've found it sympathetic even when I myself was a child. This makes me some kind of outlier, I'm sure. Anyway, 3 years is young enough that there's no actual soccer happening, just running around with a ball; any kid can enjoy that. It's quite possible, depending on the interests and disposition of his kid, that this dad won't be compelled to be on a soccer field at 10am much further into the future.


Exactly.

My older daughter is on a competitive cheerleading team. Not something we (parents) suggested but instead she found through school friends. She loves it. Has boosted her confidence and athletic prowess.

There aren't many dads at the meets relative to moms. Not remotely surprising. I'm the first person to admit that I don't know how to do hair or makeup.

I see quite a divergence among the men in commentary. Some are there and happy their kids are loving it - they're finding a way to make peace with the situation. Some are checked out, on phones, looking grumpy at best.

Some part of me gets it. Wild asymmetry in that sport. Performances are just a few minutes long, but there's a shit-ton of practice and weekend days/entire weekends dedicated to cheer.

It would be so so so easy to say "get me out of here", but I've found a way to enjoy it, make peace with it, and make a friend or two along the way.

Contrast with her other current sport: lacrosse. First season and it's kind of a shit-show. But I'm with her in the sun on a Friday night - and with the right weather - it is a great place to be. We (parents, dads, etc.) see our friends there too.


I always wondered if Justin Kan's Atrium closed its doors prematurely by just 2-3 years. It would have been cool to see a "technology" driven law firm and how it would have adjusted to LLMs.

There are loads of them now. Great for trivial work. Not so great at templatising more complex matters.

This is a very interesting strategy that might pay off. This model is a very good option for enterprise self-hosting. I would argue a lot of companies are VRAM constrained rather than compute constrained. You could fit 4-5 running instances of this on one H100 cluster where you could only fit 1-2 instances of Kimi K2 or GLM5.
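
Back-of-envelope sketch of that claim; every number here is an assumption (an 8x80GB H100 node, Q8/Q4 weights, and treating the Kimi K2 class as roughly 1T total parameters):

    # Weight memory ~= params x bytes/param, ignoring KV cache and
    # activations (which eat into whatever headroom is left).
    CLUSTER_GB = 8 * 80  # one 8x H100-80GB node (assumed)

    def instances(total_params_b: float, bytes_per_param: float) -> int:
        weights_gb = total_params_b * bytes_per_param
        return int(CLUSTER_GB // weights_gb)

    for name, params_b in [("128B dense", 128), ("~1T-total MoE", 1000)]:
        for quant, bpp in [("Q8", 1.0), ("Q4", 0.5)]:
            print(name, quant, "->", instances(params_b, bpp), "instance(s)")

At Q8 the 128B dense model fits about five copies on the node, while the 1T-class MoE only fits once you drop to Q4, which lines up with the 4-5 vs 1-2 numbers above.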

This is 128B dense though. The KV cache on long context is going to be massive.

Don't think KV size correlates with dense/MoE.

KV size correlates with attention parameters which are a subset of active parameters. So a typical MoE model will have way lower KV size than a dense model of equal total parameter count.
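
A quick sketch of that arithmetic; both configs below are hypothetical, purely to show the scaling:

    # Per-token KV cache = 2 (K and V) x layers x KV heads x head_dim x bytes.
    def kv_cache_gb(layers, kv_heads, head_dim, ctx, bytes_per_elem=2):
        return 2 * layers * kv_heads * head_dim * ctx * bytes_per_elem / 1e9

    # A dense model of a given total size tends to have more layers and a
    # wider attention stack than an MoE of equal total size, since most MoE
    # parameters sit in expert FFNs that contribute nothing to the KV cache.
    print(kv_cache_gb(96, 8, 128, 128_000))  # dense-ish config: ~50 GB
    print(kv_cache_gb(48, 4, 128, 128_000))  # MoE-ish config:   ~13 GB

With these made-up configs, the dense model's cache at 128K context is roughly 4x the MoE's, despite equal total parameter count.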

With turbo quant, you would reduce it by over 6X.

Is RAG dead? I would be very surprised if a small local SOTA embedding model like llama-embed-nemotron-8b doesn't outperform the Haiku layer for this application. Should be pretty cheap and easy to prove out. With a 32K context size, you can literally one-shot the whole ticket.
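
A minimal sketch of what proving that out could look like, assuming llama-embed-nemotron-8b is served behind an OpenAI-compatible embeddings endpoint (the URL, port, and model id here are all assumptions):

    import numpy as np
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")

    def embed(texts):
        resp = client.embeddings.create(model="llama-embed-nemotron-8b", input=texts)
        return np.array([d.embedding for d in resp.data])

    docs = ["ticket 101 ...", "ticket 102 ...", "runbook section ..."]  # your corpus
    doc_vecs = embed(docs)
    doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

    q = embed(["text of the incoming ticket"])[0]
    q /= np.linalg.norm(q)

    top3 = np.argsort(doc_vecs @ q)[::-1][:3]  # cosine-similarity ranking
    print([docs[i] for i in top3])  # stuff the hits into one 32K prompt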

Yea, but RAG takes effort. At the very least some kind of system to organize the documents and do the retrieval.

My theory is that the AI frenzy has reached new levels of insane, where it's literally just throw anything and everything at the model, and just burn tokens to let the AI figure everything out. Why bother paying the upfront cost for a RAG, when the models/agents are constantly evolving, so just slap in a markdown file telling it to check a folder, and call it a day.

Like in design world, people are doing minor tweaks like changing the spacing by typing in prompts instead of just changing a number in an input field. We are legitimately approaching just using llms instead of calculators, or memes like that endpoint that calls an llm to generate the code to do some business logic, rather than directly code the logic.


IMO RAG is mostly dead. The game changer with newer models like Opus is the reasoning. So instead of pushing all the context up front (RAG style), it's better to give strong primitives (e.g. bash, SQL) and let the agent figure it out.

It's what Claude Code is doing now, and it's the principle we applied for Mendral as well.

That said, you're right that some smaller models can outperform Haiku, and we're thinking of supporting OSS models at some point. But it doesn't change the core design principles IMO.


It's more accurate to say that RAG is alive and well and has simply been folded into the agent's responsibility: it's just one more tool the agent can call, instead of the user manually doing the retrieval.
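
In OpenAI-style function calling, that pattern looks roughly like this (the tool name and schema are illustrative, not from any particular product):

    # Retrieval exposed as a tool the agent calls on demand, rather than
    # context stuffed up front by the user.
    search_tool = {
        "type": "function",
        "function": {
            "name": "search_docs",  # hypothetical tool name
            "description": "Search the internal document index, return top matches.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "search query"},
                    "k": {"type": "integer", "description": "number of results"},
                },
                "required": ["query"],
            },
        },
    }
    # Pass tools=[search_tool] on the chat call; when the model emits a
    # tool_call, run your retriever and feed results back as a tool message.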

Been using Qwen 3.6 35B and Gemma 4 26B on my M4 MBP, and while it's no Opus, it does 95% of what I need, which is already crazy since everything runs fully local.

It’s good enough that I’ve been having codex automate itself out of a job by delegating more and more to it.

Very excited for the 122B version, as the throughput is significantly better for that vs the dense 27B on my M4.


You've got me curious. Two questions if I may:

- What kind of tasks/work?

- How is either Qwen/Gemma wired up (e.g. which harness/how are they accessed)?

Or to phrase it another way: what does your workflow/software stack look like?


1. Qwen is mostly coding related, through Opencode. I have been thinking about using pi agent to see if that works better for general use cases. The usefulness of *claw has been limited for me. Gemma is through the chat interface with LM Studio. I use it for pretty much everything general purpose: correcting my grammar, reading documents (LM Studio has a built-in RAG tool), and vision tasks (mentioned below, journal pictures to markdown).

2. LM Studio on my MacBook mainly. You can turn on an OpenAI API compatible endpoint in the settings. LM Studio also has a headless server called lms. Personally, I find it way better than Ollama since LM Studio uses llama.cpp as the backend. With an OpenAI API compatible endpoint, you can use any tool/agent that supports OpenAI. LM Studio/lms is Linux compatible too, so you can run it on a Strix Halo desktop and the like.
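
Once the endpoint is on, any OpenAI client can talk to it; a minimal sketch (LM Studio's server defaults to http://localhost:1234/v1, and the model id below is a placeholder for whatever you have loaded):

    from openai import OpenAI

    # LM Studio's local server speaks the OpenAI API; the key is ignored.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="qwen3.6-35b",  # hypothetical id; use the one LM Studio shows as loaded
        messages=[{"role": "user", "content": "Proofread this sentence for grammar."}],
    )
    print(resp.choices[0].message.content)

The same endpoint works from Opencode or any other OpenAI-compatible harness.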


Curious, how do you run Opencode and Qwen locally? The few times I tried, it responded back with some nonsense. Chat, say through Ollama, works well.

Which quants are you using? I had a similar issue until I used Unsloth's. I would recommend at least UD_6. Also, make sure your context length is above 65K.

https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF


Thanks I appreciate the info. I may try to spin up something like this and give it a whirl.

I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.

Can you expand more on what you mean by 95%?

There are 2 aspects I am interested in:

1. Accuracy: is it 95% of Opus in terms of output quality (4.5 or 4.6)?

2. Capability-wise: is it 95% accurate when calling your tools and performing agentic work compared to Opus, e.g. trip planning?


1. What do you mean by accuracy? Like the facts and information? If so, I use a Wikipedia/Kiwix MCP server. Or do you mean tool call accuracy?

2. 3.6 is noticeably better than 3.5 for agentic uses (I have yet to use the dense model). The downside is that there's so little personality, you'll find more entertainment talking to a wall. For anything creative, like writing or chatting, I use Gemma 4. I also use Gemma 4 as a "chat" bot only, no agents. One amazing thing about the Gemma models is the vision capabilities. I was able to pipe in some handwritten notes and it converted them into markdown flawlessly. But my handwriting is much better than the typical engineer's chicken scratch.


By accuracy I meant how close the output is to your expectations. For example, if you ask an 8B model to write a C compiler in C and it outputs the theory of how to write a compiler plus pseudocode in Python, it's off by two measures: (1) I haven't asked for theory, (2) I haven't asked for Python.

Or to put it differently: if your prompt is super clear about the actions you want it to take, does it follow them exactly as you said, or does it occasionally go off the rails?


Ironically, even though I write C/C++ for a living, I don't use it for personal projects, so I can't say how well it works for low-level coding. Python works great, but there's a limit on context size (I just don't have enough RAM, and I do not like quantizing my KV cache). Realistically, I can fit 128K max, but I aim for 65K before compacting. With Unsloth's Opencode templating, I haven't had any major issues, though I haven't done anything intense with it as of late. But overall, I have not had to stop it from an endless loop, which happened often on 3.5.

I have a Supernote and was looking at different models for handwriting recognition, and I agree that gemma4-26B is the best I've tried so far (better than qwen3-vl-8B and GLM-OCR). Besides turning off thinking, does your setup have any special sauce?

Q8 or Q6_UD with no KV cache quantization. I swear it matters even more with a small-activated-parameters MoE model, despite the minimal KL divergence drop.

Do you use it with ollama? Or something else?

llama.cpp is vastly superior. There was this huge bug that prevented me from using a model in Ollama, and it took them four months to do a "vendor sync" (what they call it), which was just updating ggml, the underpinning library used by llama.cpp (the same org makes both). LM Studio/lms is essentially Ollama but with llama.cpp as the backend. I recommend trying LM Studio since it's the lowest-friction way to start.
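
If you go straight to llama.cpp, a sketch of driving llama-server from a script (the model path is a placeholder; -m, -c, -ngl, and --port are standard llama-server flags):

    import subprocess

    # llama-server exposes an OpenAI-compatible /v1 endpoint once running.
    subprocess.Popen([
        "llama-server",
        "-m", "/models/your-model-Q6_K.gguf",  # placeholder path
        "-c", "65536",     # raise context length past the small default
        "-ngl", "99",      # offload all layers to GPU/Metal
        "--port", "8080",
    ])

Point whatever harness you use at http://localhost:8080/v1, the same way as an LM Studio endpoint.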

Yes and no. Are you using OpenRouter or local? Are the models as good as Opus? No. But 99% of the time, local models are terrible because of user error. Especially true for MoE: even though the perplexity drop is minimal with Q4 weights and q4_0 KV cache, the models get noticeably worse.

Sounds like you're accusing a professional of holding their tool incorrectly. Not impossible, but not likely either.

Inferencing is straight-up hard. I'm not accusing them of anything. There's a crap ton of variables that go into running a local model. No one runs them at native FP8/FP16 because we can't afford to. Sometimes the llama.cpp implementation has a bug (happens all the time). Sometimes the template is wrong. Sometimes the user forgot to expand the context length above the 4096 default. Sometimes they use a quantization that nerfs the model. You get the point. The biggest downside of local LLMs is that they're hard to get right. It's such a big problem that Kimi just rolled out a new tool so vendors can be qualified. Even on OpenRouter, one vendor can be half the "performance" of another.

What does "heavy RL" even mean? It's similar to how the CEO of Cursor talked up how much better the perplexity got, when perplexity is a terrible metric for fine-tune performance. Let's be real here: it's Kimi 2.5 fine-tuned for Cursor. There's nothing wrong with that, but they tried to hide it. It's some work they put in, but nothing close to training a model of their own.

60B for Composer 2… which is built from Kimi K2… whatever happened to "Grok being the best"?

Am I the only one that thinks Composer is really good, when you factor in the speed and the cost?

I don't doubt it is. At the end of the day, it's a fine-tuned Kimi. They tried to hide it and make their work sound more impressive than it is. It's easy for stuff to be cheap when you don't have to train your own model from scratch.

Composer is clearly dumber than the rest but then I only ask it dumb questions and it answers them really quickly.

yes, you are

With GitHub and Anthropic reducing subscription features, Chinese providers are looking more and more tempting.

Until you work for a company or government agency that is subject to any sort of technology audit. The moment offshore processing running in China comes up, you'll have a never-ending hole of questions to answer.

As the engineering saying goes, there's nothing more permanent than a temporary solution.
