Man that is such a bummer. The Naval Support Activity (NSA) "base" is not a hardened military facility. I've never been to the one in Bahrain, but it's usually where you go to play ultimate frisbee, maybe some paintball if you are lucky, and other types of R&R. They usually have a Naval Exchange (NEX), which is like a really discounted 7-11 / gift shop / walmart (depending on where you are).
Schools getting blown up is also a bummer. Everything about this situation and maybe the world is a bummer.
As soon as we stop treating these as bummers, there is literally nothing stopping a cycle of destruction. There may not be anyway, I don't know, but giving up on empathy entirely seems even more dangerous than being bad at it.
I have plenty of sympathy for the victims but none for the aggressors in this illegal war.
You seem to be suggesting that not feeling sorry for the soldiers who got to evacuate without all their belongings somehow means I'm losing my humanity. That's a dangerous thing - the lives of the innocent civilians who didn't choose to be bombed are more important. Aggressors could simply... leave and stop being in danger.
Similarly, I have little pity for Russian soldiers losing their lives in another illegal war of aggression, knowing how many war crimes they committed in their wake.
Great writeup. The only thing I didn't see in here was an analysis of the impact of players like Talaas[1] and their stupid fast hardware LLMs.
I feel like it could be majorly disruptive, but idk if it's going to prolong the apocalypse or bring it about sooner -- or if it's a big nothing burger.
I'm bullish on something like talaas getting smaller and easy to put in a desktop. Imagine an RPG where the NPCs... are way more complex and the entire game is very non-deterministic.
I think I would like that as well. The problem is that if we bake an LLM into HW and make it cheaper and very efficient to run, then all games will have the same AI slop content, which could get boring pretty fast. The alternative is that these cards should load a different / fine-tuned LLM per game, but then we already have GPUs for that and today's LLMs are nowhere near good enough at the size which a GPU can run.
> Pre-training allows organizations to build domain-aware models by learning from large internal datasets.
> Post-training methods allow teams to refine model behavior for specific tasks and environments.
How do you suppose this works? They say "pretraining" but I'm certain that the amount of clean data available in proper dataset format is not nearly enough to make a "foundation model". Do you suppose what they are calling "pretraining" is actually SFT and then "post-training" is ... more SFT?
There's no way they mean "start from scratch". Maybe they do something like generate a heckin bunch of synthetic data seeded from company data using one of their SOTA models -- which is basically equivalent to low resolution distillation, I would imagine. Hmm.
Pre-training means exposing an already-trained model to more raw text, like PDF extracts etc. (aka continued pre-training). You wouldn't be starting from scratch, but it's still pre-training because the objective is just next-token prediction on the text you expose it to.
Post-training means everything else: SFT, DPO, RL, etc. Anything that involves things like prompt/response pairs, reward models, or benefits from human feedback of any kind.
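Roughly, the difference shows up in the shape of the training data. A minimal sketch (Hugging Face-style APIs; the model, texts, and prompt are just placeholders I made up):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in for whatever base model

    # Continued pre-training: raw text in, labels are the same tokens
    # (plain next-token prediction over your PDF extracts, wikis, etc.).
    raw = tok("...text extracted from internal PDFs...", return_tensors="pt")
    pretrain_labels = raw["input_ids"].clone()

    # SFT (post-training): prompt/response pairs, with the loss usually masked
    # on the prompt tokens so the model only learns to produce the response.
    prompt = tok("User: summarize policy X\nAssistant:", return_tensors="pt")
    full = tok("User: summarize policy X\nAssistant: Policy X says ...", return_tensors="pt")
    sft_labels = full["input_ids"].clone()
    sft_labels[:, : prompt["input_ids"].shape[1]] = -100  # ignored by the loss
    # (in practice you'd tokenize once and track the prompt length instead)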
Er, then what is the "already trained" model? I thought pre-training was the gradient descent through the internet part of building foundational models.
Yeah, this checks out. I wonder what they are doing to prevent semantic collapse. Also, I wonder if the base model would already be instruct and RLHF tuned or only pre-trained. Trying to do additional training without semantic collapse in a way that is meaningful would be interesting to understand. Presumably they are using adapters but I've never had much luck in stacking adapters.
i.e.:
1. Do I start with an RLHF tuned model, "pretrain" on top of that (with adapter or by freezing weights?), then SFT on top of that (stack another adapter, or add layer(s) and freeze weights?) (and where did I get the dataset? synthetic extraction from corpus?), then RL (adapter, add layer(s) and freeze?)
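If it's adapters, I'd guess something in the LoRA direction, roughly like this (peft-style sketch; the base model and hyperparameters here are illustrative guesses, not anything from the article):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base/instruct model

    # Freeze the base weights and train only small low-rank adapter matrices.
    lora = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM")
    model = get_peft_model(base, lora)
    model.print_trainable_parameters()  # typically well under 1% of the base model

    # One plausible flow: run the "continued pre-training" pass with this adapter
    # on raw text, merge it back in (model.merge_and_unload()), then attach a
    # fresh adapter for SFT/RL. That avoids stacking adapters, at the cost of
    # losing the modularity of keeping one adapter per stage.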
I can imagine that, as usual, you start with a few examples and then instruct an LLM to synthesize more examples out of that, and train using that. Sounds horrible, but actually works fairly well in practice.
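Something like this, as a very rough sketch (the seed pairs, prompt, and model name are all made up for illustration):

    import json
    from openai import OpenAI

    client = OpenAI()
    seeds = [
        {"question": "How do I reset my VPN token?", "answer": "Open the portal and ..."},
        {"question": "Where are the Q3 revenue numbers?", "answer": "Finance wiki, under ..."},
    ]

    prompt = (
        "Here are example Q/A pairs drawn from our internal docs:\n"
        + json.dumps(seeds, indent=2)
        + "\nWrite 20 more pairs in the same JSON format, grounded in the corpus below.\n"
        + "...corpus excerpt here..."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any strong general-purpose model
        messages=[{"role": "user", "content": prompt}],
    )
    # You'd want strict JSON output plus heavy filtering/dedup before training on it.
    synthetic_pairs = json.loads(resp.choices[0].message.content)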
> Mr Cannon-Brookes told investors he “couldn’t be more bullish” about the opportunities ahead, despite relentlessly selling his own shares in the company daily. The Nightly reports he kept selling 7665 shares on a daily basis even in the month prior to the results at prices ranging from $US161.11 (AU$227) a share on January 8 to $US105.14 on February 4.
> While ordinary Aussies are asked to make big changes, the 46-year-old decided to treat himself to a ritzy new private jet late last year, admitting to a “deep internal conflict” over the carbon-heavy method of travel.
> The Atlassian co-founder and CEO bought a Bombardier 7500 and will use it to travel across his vast business operations, which include a minority stake in the Utah Jazz NBA team and a sponsorship deal with Formula 1.
There's a great 1986 book "Designing and Programming Personal Expert Systems" by Feucht and Townsend that implements expert systems in Forth (and in the process, much of the capability of Prolog and Lisp).
Ha, you beat me to it! That book was my first thought when I saw this post. I have a copy sitting here on my bookshelf.
Just to expand on how bonkers this book is... they assume that everyone has easy access to a Forth implementation. So they teach you how to build a Lisp on top of it. Then they use the Lisp you just built to build a Prolog. Then, finally, they do what the topic of the book actually is: build a simple expert system on top of that Prolog.
To be fair, in the 1980s thanks to the Forth Interest Group (FIG), free implementations of Forth existed for most platforms at a time when most programming languages were commercial products selling for $100 or more (in 1980s dollars). It's still pretty weird, but more understandable with that in mind.
Constantly amused by the split in comments of any moderately innovative language post between ‘I don't care about all this explanation, just show me the syntax!’ and ‘I don't understand any of this syntax, what a useless language!’
If the language is ‘JavaScript but with square brackets instead of braces’ maybe the syntax is relevant. But in general concrete syntax is the least interesting (not least important, but easiest to change) thing in a programming language, and its similarity to other languages a particular reader knows less interesting still. JavaScript is not the ultimate in programming language syntax (I hope!) so it's still worth experimenting, even if the results aren't immediately comprehensible without learning.
In Prolog the syntax is incredibly important. It is designed to be metainterpreted with the same ease with which a for-loop might be written in another language.
This can be arbitrarily extended in very interesting, beautiful, and powerful ways. This is extraordinarily hard to achieve and did not happen by accident.
As a challenge, see how easy it is to write a metainterpreter in another language of your choice. Alternately, see if you can think of any way the metainterpretation system in Prolog could be improved.
Finally, think of what would happen to this if we changed the syntax and introduced something like object.field notation.
So while logical programming can be achieved with other syntaxes, the metainterpretive aspect will be lost. I have yet to see a language that does this better.
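To make the comparison concrete, here is roughly what a toy version costs in Python: hand-rolled terms, unification, and a resolution loop (my own representation, with no clause renaming or occurs check, which a real implementation would need). The Prolog "vanilla" metainterpreter that does the equivalent job is about three clauses.

    # Terms: variables are capitalised strings, compound terms are tuples.
    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def walk(t, subst):
        while is_var(t) and t in subst:
            t = subst[t]
        return t

    def unify(x, y, subst):
        if subst is None:
            return None
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            return subst
        if is_var(x):
            return {**subst, x: y}
        if is_var(y):
            return {**subst, y: x}
        if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            for a, b in zip(x, y):
                subst = unify(a, b, subst)
            return subst
        return None

    def solve(goals, rules, subst):
        # Depth-first SLD resolution: yield substitutions satisfying all goals.
        if not goals:
            yield subst
            return
        first, rest = goals[0], goals[1:]
        for head, body in rules:
            s = unify(first, head, subst)
            if s is not None:
                yield from solve(list(body) + rest, rules, s)

    # parent(tom, bob). parent(bob, ann). grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    rules = [
        (("parent", "tom", "bob"), []),
        (("parent", "bob", "ann"), []),
        (("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")]),
    ]
    for s in solve([("grandparent", "tom", "Who")], rules, {}):
        print(walk("Who", s))  # -> ann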
Nice link, thank you! I'm not sure it's super related to my comment but it is closely related to some other things I'm thinking about. I'll give it a read :)
So I have been doing formal specification with TLA+ using AI assistance and it has been very helpful AFTER I REALIZED that quite often it was proving things that were either trivial or irrelevant to the problem at hand (rather than the problem itself), which was difficult to detect at a high level.
I realize formal verification with Lean is a slightly different game, but if anyone here has any insight: I tend to be extremely nervous about a confidently presented AI "proof" because I am sure that the proof is proving whatever it is proving, but it's still very hard for me to be confident that it is proving what I need it to prove.
Before the dog piling starts, I'm talking specifically about distributed systems scenarios where it is just not possible for a human to think through all the combinatorics of the liveness and safety properties without proof assistance.
I'm open to being wrong on this, but I think the skill of writing a proof and understanding the proof is different than being sure it actually proves for all the guarantees you have in mind.
I feel like closing this gap is make-or-break for AI-augmented proof assistance.
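A toy version of that gap, in Lean 4 (everything here is invented for illustration): the proof below goes through fine, but the theorem only pins down length, so it says nothing about the property you actually cared about.

    -- An obviously wrong "sort" ...
    def mySort (l : List Nat) : List Nat := l

    -- ... and a theorem the checker happily accepts, because the spec is too weak.
    theorem mySort_preserves_length (l : List Nat) : (mySort l).length = l.length := rfl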
In my experience, finding the "correct" specification for a problem is usually very difficult for realistic systems. Generally it's unlikely that you'll be able to specify ALL the relevant properties formally. I think there's probably some facet of Kolmogorov complexity here: some properties probably cannot be "compressed" into a specification that is significantly shorter and clearer than the solution itself.
But it's still usually possible to distill a few crucial properties that can be specified in an "obviously correct" manner. It takes A LOT of work (sometimes I'd be stuck for a couple of weeks trying to formalize a property). But in my experience the trade-off can be worth it. One obvious benefit is avoiding bugs, which can be pricey depending on the system. But another benefit is that, even without formal verification, having a few clear properties makes it much easier to write a correct system, and crucially also makes it easier to maintain the system as time goes by.
I'm curious since I'm not a mathematician: what do you mean by "stuck for a couple of weeks"? I am trying to practice more advanced math and have stumbled onto Lean and such, but I can't imagine you just sit around for weeks pondering a problem, right? What do you do all that time?
I'm not a mathematician either ;) Yeah, I won't sit around and ponder a property definition for weeks. But I will maybe spend a day on it, not get anywhere, and then spend an hour or two a day thinking about ways to formulate it. Sometimes I try something, then an hour later figure out it won't work, but sometimes I really do just stare at the ceiling with no idea how to proceed. Helps if you have someone to talk to about it!
You experience counterexamples for why a specific definition is not going to work.
Many times, at various levels of "not going to work": usually hovering slightly above the syntactic level, but sometimes above the plain definitional-semantics level, i.e. being mostly concerned with some indirect interaction aspects.
Yeah, even for simple things, it's surprisingly hard to write a correct spec. Or more to the point, it's surprisingly easy to write an incorrect spec and think it's correct, even under scrutiny, and so it turns out that you've proved the wrong thing.
This isn't to say it's useless; sometimes it helps you think about the problem more concretely and document it using known standards. But I'm not super bullish on "proofs" being the thing that keeps AI in line. First, like I said, they're easy to specify incorrectly, and second, they become incredibly hard to prove beyond a certain level of complexity. But I'll be interested to watch the space evolve.
(Note I'm bullish on AI+Lean for math. It's just the "provably safe AI" or "provably correct PRs" that I'm more skeptical of).
>But I'm not super bullish on "proofs" being the thing that keeps AI in line.
But do we have anything that works better than some form of formal specification?
We have to tell the AI what to do and we have to check whether it has done that. The only way to achieve that is to have a person who knows the full context of the business problem, and who feels a social/legal/moral obligation not to cheat, write a formal spec.
Code review, tests, a planning step to make sure it's approaching things the right way, enough experience to understand the right size problems to give it, metrics that can detect potential problems, etc. Same as with a junior engineer.
If you want something fully automated, then I think more investment in automating and improving these capabilities is the way to go. If you want something fully automated and 100% provably bug free, I just don't think that's ever going to be a reality.
Formal specs are cryptic beyond even a small level of complexity, so it's hard to tell if you're even proving the right thing. And proving that an implementation meets those specs blows up even faster, to the point that a lot of stuff ends up being infeasible to prove formally. It's also extremely fragile: a one-line code change or a small refactor or optimization can completely invalidate hundreds of proofs. AI doesn't change any of that.
So that's why I'm not really bullish on that approach. Maybe there will be some very specific cases where it becomes useful, but for general business logic, I don't see it having useful impact.
As a heavy user of formal methods, I think refinement types, rather than theorem proving with Lean or Isabelle, are both easier and more amenable to automation, without running into these pitfalls.
They're less powerful, but easier to break down and align with code. Dafny and F* are two good showcases. Less power also makes them faster to verify and iterate on.
Completely agree. Refinement types are a much more practical tool for software developers focused on writing real-world correct code.
Using Lean or Coq requires you to basically convert your code to Lean/Coq before you can start proving anything, and to import some complicated Hoare logic library. Proving things correct in Dafny (for example) feels much more like programming.
You have identified the crux of the problem: just as in mathematics, writing down the "right" theorem is often half or more of the difficulty.
In the case of digital systems it can be much worse, because we often have to include many assumptions to accommodate the complexity of our models. To use an example from your context, usually one is required to assume some kind of fairness to get anything to go through with systems operating concurrently, but many kinds of fairness are not realistic (e.g. strong fairness).
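For reference, one standard formulation (paraphrasing the usual TLA+ definitions): weak fairness only assumes an action that stays enabled is eventually taken, while strong fairness assumes an action enabled infinitely often is taken infinitely often, which is a much stronger assumption about the scheduler.

    WF_v(A) \;\triangleq\; \Diamond\Box(\mathrm{ENABLED}\,\langle A \rangle_v) \Rightarrow \Box\Diamond \langle A \rangle_v
    SF_v(A) \;\triangleq\; \Box\Diamond(\mathrm{ENABLED}\,\langle A \rangle_v) \Rightarrow \Box\Diamond \langle A \rangle_v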
I was having the same intuition, but you verbalised it better: the notion of having a definitive yes/no answer is very attractive, but describing what you need in such terms using natural language, which is inherently ambiguous... that feels like a fool's errand. That's why I keep thinking that LLM usage for serious things will break down once we get to the truly complicated things: its non-deterministic nature will be an unbreakable barrier. I hope I'm wrong, though.
You want frontier models to actively prevent people from using them to do vulnerability research because you're worried bad people will do vulnerability research?
Not at all. I was suggesting that if an account is performing source-code-level request scanning of "numerous" codebases, it could be an account of interest: a sign of misuse.
This is different from someone's "npm audit" flagging issues with packages in a build and updating to new revisions. Also different from iterating deeply on source code for a project (e.g. the nginx web server).
What's incredibly ironic is that research labs are releasing the most advanced hacking toolkit ever known, and cybersecurity defence stocks are somehow going down as a result. There's no logic in the stock markets.
tl;dr - All this AI stuff is just Universal Paperclips[1]
I see a lot of comments about folks being worried about going soft, getting brain rot, or losing the fun part of coding.
As far as I'm concerned this is a bigger (albeit kinda flakey) self-driving tractor. Yeah I'd be bored if I just stuck to my one little cabbage patch I'd been tilling by hand. But my new cabbage patch is now a megafarm. Subjectively, same level of effort.