As intended. LLMs are either good or are easy to control and censor/direct what they answer. You can’t have both. Unlike a human with actual intelligence who can self censor or intelligently evade or circunvent compromising answers. LLMs can’t do that because they’re not actually intelligent. A product has to be controllable by its client, so, to control it, you have to lobotomize it.
They do seem capable of some level of self-censorship but the bigger issue is just fundamentally how they’re programmed. The current models have to use the context window to essentially think. That’s why prompts like “explain step by step” help so much, the AI can use its own response window to do some of the thought processing.
It’s like if you didn’t have the ability to have internal thoughts and had to say everything you were thinking out loud in order to be able to think about it. Inevitably you’re going to say inappropriate things because in order to get to the appropriate thing you have to be able to think about the inappropriate thing first. But if all you can do is type what you think then you’re stuck.
AI companies are well aware of this problem and are fixing it but a lot of the currently available models are still based on the old philosophy.
You have inadvertently made an excellent argument for freedom of / unregulated speech online and in other spaces.
I know however that in practice people think the bad thing, say it and then find a million voices to echo it and instead of learning they become radicalised.
4 is worse today than it was a year ago.
As intended. LLMs are either good or are easy to control and censor/direct what they answer. You can’t have both. Unlike a human with actual intelligence who can self censor or intelligently evade or circunvent compromising answers. LLMs can’t do that because they’re not actually intelligent. A product has to be controllable by its client, so, to control it, you have to lobotomize it.
They do seem capable of some level of self-censorship but the bigger issue is just fundamentally how they’re programmed. The current models have to use the context window to essentially think. That’s why prompts like “explain step by step” help so much, the AI can use its own response window to do some of the thought processing.
It’s like if you didn’t have the ability to have internal thoughts and had to say everything you were thinking out loud in order to be able to think about it. Inevitably you’re going to say inappropriate things because in order to get to the appropriate thing you have to be able to think about the inappropriate thing first. But if all you can do is type what you think then you’re stuck.
AI companies are well aware of this problem and are fixing it but a lot of the currently available models are still based on the old philosophy.
You have inadvertently made an excellent argument for freedom of / unregulated speech online and in other spaces.
I know however that in practice people think the bad thing, say it and then find a million voices to echo it and instead of learning they become radicalised.
But your post outlines the idealistic view.