ono@lemmy.ca to Technology@beehaw.org · English · 10 months ago
Large Language Models can Strategically Deceive their Users when Put Under Pressure [simulation led to insider trading]
arxiv.org · cross-posted to: hackernews@derp.foo
Justin@lemmy.jlh.name · English · 10 months ago
It’s trained on human responses. Humans lie in their responses.