people like to say this like they’re apples to apples, but this comparison isn’t remotely how the brain actually works - and even if it were, the brain does it automatically, without direction, and at an infinitesimal fraction of the power required.
And we’re just talking about cognition - it completely ignores the automatic processes: maintaining and regulating the body and its hormones, coordinating and maintaining muscles, visual/spatial processing that takes in massive amounts of data at a very fine scale and informs the body what to do with it - I could go on.
One of the more annoying things about this conversation is that you don’t even need to make this argument to make the point you’re trying to make, but people love doing it anyway. It needlessly reduces how amazing the human brain is to a bunch of catchy, sci-fi-sounding idioms.
It can be simultaneously true that transformer-based language models are very smart and that the human brain is also very smart. It genuinely confuses me why people need to make it an either/or.
I’ve seen you/anthropic comment repeatedly over the last several months about the “thinking” in similar ways -
“most users don’t look at it” (how do you know this?)
“our product team felt it was too visually noisy”
etc etc. But every time something like this is stated, your power users (people here for the most part) state that this is dead wrong. I know you are repeating the corporate line here, but it’s bs.
building for the loud users on a forum is generally a losing move. if we built notion for angry HN users, we'd probably be a great obsidian competitor with end to end encryption, have zero ai features, and make zero money.
Anecdotally the “power users” of AI are the ones who have succumbed to AI psychosis and write blog posts about orchestrating 30 agents to review PRs when one would’ve done just fine.
The actual power users have an API contract and don’t give a shit about whatever subscription shenanigans Claude Max is pulling today
Generalisations and angry language but I almost agree with the underlying message.
New tools, turbulent methods of execution. There's definitely something here in the way of how coding will be done in future but this is still bleeding edge and many people will get nicked.
Whatever makes you feel better about yourself, I guess. My account history on this topic is pretty easily searchable, but I guess it's easier to make driveby comments like this than be informed.
Eh, that’s not at all how I do it. I like to design the architecture and spec and let them implement the code. That is a fun skill to exercise. Sometimes I give a little more leeway in letting them decide how to implement, but that can go off the rails.
imho “tell them what you want and let them come up with a solution” is a really naive way to use these tools, nearly guaranteed to end up with slopware.
the more up-front design thought I’ve put in, the more accurately they usually deliver - to the point that I don’t need to spend much time reviewing at all. And this is a step I would have had to do anyway if working by hand, so it feels natural. It results in far more correct code, more often, than I could have produced on my own, and it allows multitasking several projects at once, which would have been impossible before.
Modern "skills" and Markdown formats of the day are no different than "save the kittens". All of these practices are promoted by influencers and adopted based on wishful thinking and anecdata.
Uh, this couldn't be more false. I've implemented these from scratch at my company and rolled them out org-wide and I've yet to watch a youtube video and don't consume any influencers. Mostly by just using the tools and reading documentation - as any other technical tool.
Perhaps your blanket statement could be wrong, and I would encourage you to keep your mind a bit more open. The landscape here is not what it was 6 months ago. This is an undeniable fact that people are going to have to come to terms with pretty soon. I did not want to be in this spot; I was forced into it out of necessity, because the stuff does work.
To be fair, if you have never watched a YouTube video in your life then how can you say the OP was wrong about what influencers are peddling? Side note, have you ever seen that Onion article on the man that can't stop telling people he doesn't own a TV?
Great, so how do you know this stuff works? Did you evaluate it against other approaches? How do you know it's actually reliable?
The Vercel team had some interesting findings[1]:
> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.
Others had different findings for commonly accepted practices[2], some you may have adopted from reading documentation, which surely didn't come from influencers.
And yet others swear by magical Markdown documents[3].
So... who is the ultimate authority on what actually works, and who is just cargo culting the trendy practice of the week? And how is any of this different from what was being done a few years ago?
Sorry, but given your first comment, I don’t particularly feel inclined to help you figure this out. I was just noting that I’ve already deployed these things at scale, with success, using many of the configuration options documented in the OP. This stuff isn’t some mystical black box, although you seem to think it is.
I measure tooling success with a suite of small prompt tests performing repeatable tasks, tracking the success rate over time, educating the broader team, and sharing my own field-tested skills, with similar successes, across the broader teams. We’ve seen a huge increase in velocity and a lower bug rate - both easily measurable (and long-tracked) stats, enough to put me in the position I’m in, which was not a reluctant one. You’re perfectly free to view my long history on this topic on this forum to see that I’m a complete skeptic here, and wouldn’t be doing this unless I had to.
everyone is still figuring this out. There is no authority; I am my own authority on what I have seen work and what hasn’t. Take from that what you will. I just wanted to provide a counterpoint to your initial claim. I’m certainly not going to expose in fine detail what has worked for my org and what hasn’t, for obvious reasons.
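For what it’s worth, the kind of prompt-test suite described above can be sketched very simply. This is a hypothetical minimal harness, not the commenter’s actual setup: `run_agent` is a placeholder for whatever tool invocation is under test, and the test names and checks are made up for illustration.

```python
# Minimal sketch of a prompt-test suite that tracks success rate over
# repeated runs. run_agent() is a stub standing in for the real tool
# (CLI call, API request, etc.); everything here is illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class PromptTest:
    name: str
    prompt: str
    # Predicate that inspects the agent's output and decides pass/fail.
    check: Callable[[str], bool]

def run_agent(prompt: str) -> str:
    # Placeholder for the real tool invocation.
    return "def add(a, b):\n    return a + b"

def success_rate(tests: list[PromptTest], trials: int = 3) -> dict[str, float]:
    """Run each test several times and record the fraction that pass."""
    rates = {}
    for t in tests:
        passed = sum(t.check(run_agent(t.prompt)) for _ in range(trials))
        rates[t.name] = passed / trials
    return rates

tests = [
    PromptTest("adds-two-numbers",
               "Write a Python function add(a, b) returning their sum.",
               lambda out: "def add" in out),
]
print(success_rate(tests))
```

Re-running the suite after each change to skills or configuration gives a per-task success-rate trend line, which is the kind of evidence the comment above is describing.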
What? Non-techies are most at risk. There are a huge number of malicious skills. Not knowing or caring how to spot malicious behavior doesn’t mean someone shouldn’t be concerned about it, no matter how little they can or want to do about it.
I am an administrator of this stuff at my company and it’s an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of the mouth of someone underneath me, I’d tell them to leave the room before I have a stroke.
And this is stuff like: if so-and-so’s machine is compromised, it could cost the company massive sums of money. For your personal use, fine, but hearing this cavalier attitude, as if it doesn’t matter, is horrifying, because it absolutely does in a lot of contexts.
I run a small local non-profit which is essentially a security hardening guide with some helper tooling that simplifies some concepts for non-techies (FDE, MFA, password managers, etc.).
LLMs have completely killed my motivation to keep running it. None of the standard practices apply anymore.
I am certainly not an expert but I agree a lot with your sentiment about the hubris - but the problem as presented in the article makes no sense to me.
If you see real value in having a receptionist, and you suspect that not having one is costing you thousands of dollars, wouldn't the normal response be, "I should think about hiring someone," rather than turning to an unproven, untested solution like this and leaving your business at the mercy of however correct it happens to be? I just cannot understand this line of thinking at all - reaching for a tool that would probably do a worse job than a human would. Is it not wanting to hire? Not wanting to manage? Hype cycle? Where does this urge come from?
To take this further, if the focus really is the "luxury" part of the market, how do they expect this sort of response to go down well with customers?!
If someone is interested in paying luxury size fees, do they really want some cobbled together chatbot? I say this as an advocate for (high quality) chatbots for various practical needs, but it just seems like it is misunderstanding the customers (or maybe luxury is a bit of a loose term new in the area this mechanic works in?)
These customers own expensive cars - or at least, cars that were expensive when they were new. The car might now be ten years old or more, and the owner bought it used. They want a prestige marque, but the customer does not have the money to buy a new prestige car. So they are looking to save on service.
All the time I see cars with expensive names - BMW, Mercedes Benz - broken down on the side of the road, while old Hondas and Toyotas keep cruising by. Those are the customers for this shop: they spent all their money buying an expensive used car, and now they can't afford to maintain it and fix looming problems; meanwhile the Toyota or Hyundai driver gets maintenance and maybe even takes it to the dealer for it.
A mechanic like this can't afford to hire someone to answer the phone. Such a person is expensive, and these customers want rock-bottom prices despite the car being expensive. So a chatbot is good enough and better than nothing.
The most trustworthy mechanic I used in England had an appointment book pretty much full for four months in advance. He didn't answer the phone, didn't have a computer, just a desk diary. If you wanted him to work on your car you turned up at his workshop and spoke to him. If you were willing to wait until he'd finished whatever thing he was doing he'd take a quick look at your car and suggest a course of action. And despite his full order book if something looked urgent enough and small enough he'd fit you in quite quickly.
He charged reasonable prices, but definitely not rock bottom. He had no need to compete with the bottom feeders because every customer acted as his public relations agent.
Business owners tend to resent having to rely on and pay their workers.
Many of them believe people should line up and volunteer/be forced to work at their companies for free, the fact that they have to pay them is an insult.
They need workers, but workers are not worthy of being needed by them, or paid, so they look for any out at all.
The word you’re looking for is greed. These systems are greed enablers. The narrative used to pump them plays on greed. And so on.
Hiring a person for the job costs $3,000 per month? Great, let’s try to do it with $500 and a tangle of vibecoded toothpick bridges!
For a luxury service with generous margins this is a failure-prone mentality.
They'd still try to replace workers, even if their chosen route of automation cost them more than hiring employees would, because of their resentment toward them.
Aside from the cost? It's also managing an actual human being and making sure they have enough work. If the place gets 5-10 calls a day, it's pointless to hire a receptionist who will do nothing for an hour and then have a two-minute chat. It used to be pointless to build software for that too, but since Claude Code it's cheap enough to make sense.
receptionist-as-a-service has been a thing for, like... forever. You are never going to solve the problem of accurately estimating and quoting with AI or an answering service, so pay someone to answer the phone and take down the details; have a mechanic or trained service rep review and estimate. Cheap code that doesn't solve the problem isn't cheap.
Yes, of course. The bot can request information and the customer can provide it if they feel like it, and then someone qualified can call them back when they have their hands free.
But there's no bot, per se, needed at all. An answering machine from 1993 can do this same information-gathering job. :)
So update the device from 1993's new-fangled digital answering machine to 2009's Google Voice, and have it do the transcription from voicemail to text.
Someone will still have to call Bill back about his Honda (which is actually the Kia he bought for his daughter -- Bill is not a very technical guy these days[1] and he confuses such concepts regularly) in order to get any trading of money for services done.
It doesn't take an LLM to get there, and Bill would probably prefer to avoid being frustrated by the bot's insistent nature.
Look, you’re kicking at an open door.
I think LLMs applied like this are just a layer of complexity that is mostly replacing lower-level programming solutions that could do the same thing.
The transcription + callback loop is honestly underrated. Most of the value here is just capturing intent accurately ("Honda" vs "Kia" aside) so the mechanic can prioritize callbacks. A dumb voicemail-to-text pipeline handles that fine. The LLM layer adds complexity without solving the actual bottleneck, which is someone qualified picking up the phone.
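The "dumb" pipeline being argued for really is this small. A hypothetical sketch, where `transcribe` is a stub standing in for any speech-to-text service (the caller name and file path are made up):

```python
# Voicemail-to-text pipeline with no prioritization logic: transcribe,
# then queue oldest-first for a human callback. transcribe() is a stub
# standing in for a real speech-to-text call.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Voicemail:
    caller: str
    audio_path: str
    received: datetime = field(default_factory=datetime.now)

def transcribe(audio_path: str) -> str:
    # Stand-in for a hosted speech-to-text API.
    return "This is Wendy, can you call me? My car is making that noise again."

def callback_queue(voicemails: list[Voicemail]) -> list[tuple[str, str]]:
    """Oldest-first list of (caller, transcript) for a human to work through."""
    ordered = sorted(voicemails, key=lambda v: v.received)
    return [(v.caller, transcribe(v.audio_path)) for v in ordered]

queue = callback_queue([Voicemail("Wendy", "/vm/001.wav")])
for caller, text in queue:
    print(f"{caller}: {text}")
```

Everything downstream of this - deciding who gets called back first - stays with the human, which is the point of the comment above.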
But I'm not sure a bot can be trusted to make good decisions about priority, either. Even if it reasons well from the context it has (which it increasingly often does, but not always), it lacks the context necessary to form the basis of good decisions in the first place.
Suppose a message comes into the box with this form: "This is Wendy, can you call me? My car is making that noise again."
The bot might deprioritize that call because it lacks actionable contextual information. "My job as a bot is to get more jobs into the shop. This call does not have enough data to do that, so I'll shove to the bottom of list of callbacks behind more-actionable jobs."
But the mechanic? The mechanic knows Wendy's Ford very well, and he also knows Wendy. She's been a good customer for over a decade. The mechanic also knows the noise, and that Wendy has 3 little kids and that she's vacationing 900 miles away on a road trip with those kids in that Ford. The context is all there inside the mechanic's brain to combine and mean that this might be the highest-priority call he gets all week.
Wendy may not have actively relayed any urgency in her message, but the urgency is real and she needs to be called back right away. She needs answers about what to do (keep driving and look into it when she gets back? pull over immediately and get a tow to a decent local shop? maybe she even needs help finding such a shop?) pretty much immediately. Not because it means more business today, but because it means more business for years.
The mechanic can spot this from a list of transcripts in an instant and give her a ring back Right Now. The bot is NFG at this.
The addition of the bot only adds noise to the process, and that noise only works to Wendy's detriment. When the bot adds detrimental noise to Wendy's situation, it also adds detriment to the shop's longevity.
The presence of the bot - even as a mere prioritization mechanism - asymptotically shifts the shop from one that knows its customers very well toward a bot-driven, customer-averse hellscape.
(And no, the answer isn't to make the bot into an all-knowing oracle that actively gets fed all context. The documentation burden would be more expensive, time-wise (and thus money-wise) than hiring a competent human receptionist who answers the phone, handles the front door traffic, and absorbs context from their surroundings. A person who chatted with Wendy last Thursday right before she left for her trip is always going to be superior to a bot.)
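The Wendy failure mode above has a concrete mechanism. A toy illustration - the keyword list and scoring are entirely hypothetical - of how a naive "actionability" ranking buries exactly this kind of message:

```python
# Toy illustration of the failure mode: score messages by how much
# concrete job information they contain, so a vague-but-urgent message
# from a long-time customer sinks to the bottom of the callback list.
KEYWORDS = ("brake", "oil", "quote", "appointment", "estimate", "tire")

def actionability(transcript: str) -> int:
    """Count job-related keywords; the bot has no customer history to use."""
    text = transcript.lower()
    return sum(word in text for word in KEYWORDS)

messages = [
    "This is Wendy, can you call me? My car is making that noise again.",
    "Hi, I'd like a quote for a brake job and an oil change on my Civic.",
]
ranked = sorted(messages, key=actionability, reverse=True)
# Wendy's message matches zero keywords and lands last, even though the
# mechanic, knowing her history, would call her back first.
print(ranked[-1])
```

Smarter scoring only moves the problem around: the decisive signal (a decade of history, the road trip, the kids) is never in the transcript at all.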
If someone put on their website and voicemail that they were available for calls only from 8-10am (for example), or that they would return my call at that time, I'd make a point to call them then. It's reasonable that people are busy too.
Because the capital owning class in America commonly has an aversion to labor.
Labor is other humans and all their social hierarchy monkey brain bullshit activates in a way that a machine doesn’t. That’s why you’ll see companies spending equivalent or even slightly more money for a tool to do a job over a human being.
Walmart employs this many workers only because it is subsidized by food stamps and other government assistance. The minute they were forced to actually pay for the labor they employ, they would fire a lot of people.
You are suggesting that if the government gives you a tax break, your boss would lower your salary? Why does your boss wait for the tax break or handout and doesn't just lower your salary now?
Also what's your counterfactual here? If Walmart fired their employees tomorrow and replaced them with robots, those ex-employees would magically no longer need food stamps nor government assistance? (Or more realistically: Walmart could pivot to the Aldi model of labour and replace many low intensity jobs with fewer higher intensity jobs. For the affected workers, the outcome is the same.)
If those ex-workers don't magically get off government assistance once Walmart is out of the picture, in what sense is Walmart to blame for their poverty?
Conversely: if Walmart laying off these workers would magically improve their welfare, why do these workers wait for Walmart to lay them off?
> Walmart could pivot to the Aldi model of labour and replace many low intensity jobs with fewer higher intensity jobs.
Yes, this is the expected change.
> For the affected workers, the outcome is the same.
No? There are two classes of affected workers:
1. Workers who have been converted to full-time with benefits. These workers benefit from the change.
2. Workers who lose their jobs. These workers are worse off.
Your argument ignores class 1.
I don't think we'll get anywhere debating the relative merits of the tradeoff of those two groups, but I personally prefer the existence of class 1. At least with that class there are some winners.
There's practically no (1). It's a different class of workers - different people - than those Walmart currently employs at low intensity and low pay.
People who prefer a higher intensity, higher paying job than the bottom rung at Walmart can already get that kind of job today. They don't need to wait for Walmart to fire everyone else.
Walmart probably has some of these jobs already. But Aldi and other companies exist. The whole Jeff Bezos workout at Amazon warehouses falls in a similar category too: Amazon pays pretty well for the sector and requires no prior experience, but they expect you to stay on your feet throughout.
> Walmart employs this amount of workers only because it is subsided by food stamps
And then those food stamps are used at Walmart - it's a win-win for Walmart and Walmart. No other country gives its poor food stamps instead of money; I wonder why?
I'm projecting, but I think you're right. Not wanting to manage is probably a large driver. I can imagine that if you've dealt with messy humans before, that a robot receptionist that's not going to show up late, call out when hungover, need an advance for a family member's surgery and then quit, is quite attractive.
Until the robot breaks for reasons unknown and you have to pay for expensive engineering time to fix it. Surprise, since the engineer vibe coded the whole thing, he also has no idea how to fix it except to get the AI to try.
> If you see a value need for a receptionist, and you suspect that it is costing you thousands of dollars, wouldn't a normal response be, "I should think about hiring someone," [...]
If you only have thousands of dollars in savings from the move, hiring someone might be too expensive.