Hacker News

The article did say that they tried injecting concepts via the context window and by modifying the model's logit values.

When words were injected into its context, the model recognized that what it had supposedly said did not align with its internal state and said it didn't intend to say that. Modifying the logits, by contrast, led the model to construct a plausible justification for why it was thinking that.
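The distinction between the two interventions can be illustrated with a toy sketch (not the article's actual setup): context injection changes the prompt text while leaving the model's output distribution alone, whereas logit modification adds a bias to a chosen token's logit before sampling, steering the model without altering the prompt. The vocabulary, logits, and bias value below are all made up for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and raw logits from one decoding step.
vocab = ["cat", "dog", "bread"]
logits = [2.0, 1.0, 0.5]

# Logit modification: bias one token upward before sampling.
# (Context injection, in contrast, would edit the prompt and
# leave these logits untouched.)
target = vocab.index("bread")
steered = list(logits)
steered[target] += 5.0  # arbitrary bias for illustration

p_before = softmax(logits)[target]
p_after = softmax(steered)[target]
print(f"P(bread) before: {p_before:.3f}, after: {p_after:.3f}")
```

The biased token goes from unlikely to dominant, yet nothing in the visible context explains why, which is the situation where a model might confabulate a justification.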


