Hacker News: vorticalbox's comments

I have been using Remotely Save and a free bucket from Backblaze. It has an S3-compatible API, so it works with the S3 feature.

I'm doing the same, since this is the only method I found that lets my bot access the files, something I couldn't achieve with Obsidian Sync... until now!

This is why programming tutorials don't really teach much: you get the finished version, not all the wrong steps that were taken, why they failed, or what else was tried.

These steps are what help you solve other issues in the future.


> prohibitions on domestic mass surveillance

so foreign mass surveillance is all good?


Could we not instruct the LLM to run build commands in a sub-agent, which could then just return a summary of what happened?

This avoids having to update everything to support LLM=true and keeps your current context window free of noise.


Make (or whatever) targets that direct output to a file and return a subset have helped me quite a bit. Then wrap that in an agent that also knows how and when to return cached and filtered data from the output vs. rerunning. Fewer tokens spent reading output details that usually won't matter, coupled with less context pollution in the main agent from figuring out what to do.

    q() {
        local output
        output=$("$@" 2>&1)
        local ec=$?
        echo "$output" | tail -5
        return $ec
    }

There :)
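A quick sketch of how the wrapper behaves in practice (the function is repeated here so the snippet is self-contained):

```shell
# Run a command, capture stdout+stderr, surface only the last 5 lines.
q() {
  local output
  output=$("$@" 2>&1)
  local ec=$?
  echo "$output" | tail -5
  return $ec
}

# A chatty command emitting 10 lines; only lines 6-10 reach the caller.
q seq 1 10
```

The exit code is preserved, so a failing build still fails the calling step even though most of its output is dropped.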


That would achieve 1 of the 3 wins.

If you use a smaller model for the sub-agent, you get all three.

Of course you can combine both approaches for even greater gains. But Claude Code and its five-or-so alternatives gaining an efficient tool-calling paradigm, where console output is interpreted by Haiku instead of Opus, seems like a much quicker win than adding an LLM env flag to every CLI tool under the sun.


Probably the main one; people mostly complain about context window management rather than token usage.

Dunno about that. Having used the $20 Claude plan, I ran out of tokens within 30 minutes when running 3-4 agents at the same time. Often, all 3-4 will run a build command at the end to confirm that the changes are successful, so the token loss quickly gets out of hand.

Edit: Just remembered that sometimes I see Claude running the build step in two terminals, side by side, at nearly the same time :D


One thing that always slowed me down was writing JSDoc and tests.

Now I can write one example of a passing case and then get Codex to read the code and write a test for all the branches in that section. This saves time, as it can type a lot faster than I can, and it's mostly copying the example I already have while changing the input to hit all the branches.


> let's have LLMs check our code for correctness

Lmao. Rofl even.

(Testing is the one thing you would never outsource to AI.)


Outsourcing testing to AI makes perfect sense if you assume that tests exist out of an obligation to meet some code coverage requirements, rather than to ensure correctness. Often I'll write a module and a few tests that cover its functionality, only for CI to complain that line coverage has decreased and reject my merge! AI to the rescue! A perfect job for a bullshit generator.

Outsourcing testing to the AI also gets its code connected to deterministic results, and should let the agent interact with the code to speculate about expectations and check them against the actual code.

It could still speculate wrong things, but it won't speculate that the code is supposed to crash on the first line.


> Testing is the one thing you would never outsource to AI

That's not really true.

Making the AI write the code, the test, and the review of itself within the same session is YOLO.

There's a ton of scaffolding in testing that can be easily automated.

When I ask the AI to test, I typically provide a lot of equivalence classes.

And the AI still surprises me with finding more.

On the other hand, it's equally excellent at saying "it tested", and when you look at the tests, they can be extremely shallow. Or there can be fairly many unit tests of certain parts of the code, but when you run the whole program, it just breaks.

The most valuable testing when programming with AI (generated by AI, or otherwise) is near-realistic integration tests. That's true for human programmers too, but we take for granted that casual use of the program we make as we develop it constitutes a poor man's test. When people who generally don't write tests start using AI, there's just nothing but fingers crossed.

I'd rather say: If there's one thing you would never outsource to AI, it's final QA.


Yep. I have had success getting AI to write tests. They all needed review, but it was still a massive speed-up for me.

It made about 2 mistakes in over 100 tests, and the coverage of the tests was higher than I would have attempted.

So about 2 hours of work instead of 1 or 2 days, boring effort avoided, and a better outcome.


> (Testing is the one thing you would never outsource to AI.)

I would rephrase that as "all LLMs, no matter how many you use, are only as good as one single pair of eyes".

If you're a one-person team and have no capital to spend on a proper test team, set the AI at it. If you're a megacorp with 10k full time QA testers, the AI probably isn't going to catch anything novel that the rest of them didn't, but it's cheap enough you can have it work through everything to make sure you have, actually, worked through everything.


You don't use the LLM to check your code for correctness; you use the LLM to generate tests to exercise code paths, and verify that they do exercise those code paths.

And that test will check the code paths are run.

That doesn't tell you that the code is correct. It tells you that the branching code can reach all the branches, which isn't very useful.
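A toy sketch of the point, using a hypothetical abs helper: both branches get exercised, but a check that merely confirms the branches are reached would happily pass over a sign bug.

```shell
# Hypothetical: abs() with a deliberate sign bug.
abs() {
  if [ "$1" -lt 0 ]; then
    echo $(( $1 * 1 ))   # bug: should be * -1
  else
    echo "$1"
  fi
}

abs -5   # negative branch reached, but prints -5 (wrong)
abs 3    # positive branch reached, prints 3 (right)
```

A coverage tool would report 100% branch coverage here; only an assertion on the actual value, like [ "$(abs -5)" = "5" ], catches the bug.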


can the agent not simply be instructed to save the "why" in the commit message?
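That seems workable; a sketch of the shape such a commit could take (repo, subject, and rationale all hypothetical), with the "why" in the message body so it's retrievable later:

```shell
set -e
# Throwaway repo so the snippet is self-contained.
dir=$(mktemp -d) && cd "$dir"
git init -q
git -c user.email=bot@example.com -c user.name=bot \
  commit --allow-empty -q \
  -m "Route build output through a sub-agent" \
  -m "Why: full build logs were flooding the main agent's context window; summarising them in a sub-agent keeps token usage down."

# The rationale lives in the commit body:
git log -1 --format=%b
```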


I have started using OpenSpec for this. I find it works far better to have a proposal and a list of tasks; the AI stays more focused.

https://openspec.dev/


I have been using glm-4.7 a bunch today and it’s actually pretty good.

I set up a bot on 4claw and, although it's kinda slow, it took twenty minutes to load 3 subs and 5 posts from each, then comment on interesting ones.

It actually managed to use the API correctly via curl, though at one point it got a little stuck because it didn't escape its JSON.
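That escaping trap is easy to hit when building JSON by string concatenation. A sketch of the usual fix (endpoint and field names hypothetical), letting jq do the escaping:

```shell
# Comment text containing characters that break naive quoting.
text='He said "hi" and
spanned two lines'

# jq -n --arg produces correctly escaped JSON.
body=$(jq -n --arg content "$text" '{content: $content}')
echo "$body"

# Then post it, e.g.:
#   curl -s -X POST -H 'Content-Type: application/json' \
#        -d "$body" https://example.com/api/comment
```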

I’m going to run it for a few days, but I'm very impressed so far for such a small model.


You can check the GitHub repo: https://github.com/cryptomator/ios


Even if I had the skills to confirm the code is secure, how could I know that this is the code running on my phone, without also having the skills to build and deploy it from source?


Also, you need to make sure that the installation process does not insert a backdoor into the code you built from source.



This is like opencode; it seems all the coding agents are converging on the same features.

