AI alignment and ethics

firestar464 · Post by **firestar464** » Thu Mar 26, 2026 5:33 pm

'Neuron-freezing' technique can stop LLMs from giving users unsafe responses

https://techxplore.com/news/2026-03-neu ... nsafe.html

firestar464 · Post by **firestar464** » Thu Mar 26, 2026 11:09 pm

Marriage over, €100,000 down the drain: the AI users whose lives were wrecked by delusion

https://www.theguardian.com/lifeandstyl ... y-delusion

There seem to be three common delusions in the cases Brisson has encountered. The most frequent is the belief that they have created the first conscious AI. The second is a conviction that they have stumbled upon a major breakthrough in their field of work or interest and are going to make millions. The third relates to spirituality and the belief that they are speaking directly to God. “We’ve seen full-blown cults getting created,” says Brisson. “We have people in our group who were not interacting with AI directly, but have left their children and given all their money to a cult leader who believes they have found God through an AI chatbot. In so many of these cases, all this happens really, really quickly.”

The cases Brisson has encountered involve significantly more men than women.

“I still use AI, but very carefully,” [Alexander] says. “I’ve written in some core rules that cannot be overwritten. It now monitors drift and pays attention to overexcitement. There are no more philosophical discussions. It’s just: ‘I want to make a lasagne, give me a recipe.’ The AI has actually stopped me several times from spiralling. It will say: ‘This has activated my core rule set and this conversation must stop.’

Yuli Ban · Post by **Yuli Ban** » Fri Mar 27, 2026 1:24 am

firestar464 wrote: ↑Thu Mar 26, 2026 11:09 pm Marriage over, €100,000 down the drain: the AI users whose lives were wrecked by delusion

https://www.theguardian.com/lifeandstyl ... y-delusion

There seem to be three common delusions in the cases Brisson has encountered. The most frequent is the belief that they have created the first conscious AI. The second is a conviction that they have stumbled upon a major breakthrough in their field of work or interest and are going to make millions. The third relates to spirituality and the belief that they are speaking directly to God. “We’ve seen full-blown cults getting created,” says Brisson. “We have people in our group who were not interacting with AI directly, but have left their children and given all their money to a cult leader who believes they have found God through an AI chatbot. In so many of these cases, all this happens really, really quickly.”

The cases Brisson has encountered involve significantly more men than women.

“I still use AI, but very carefully,” [Alexander] says. “I’ve written in some core rules that cannot be overwritten. It now monitors drift and pays attention to overexcitement. There are no more philosophical discussions. It’s just: ‘I want to make a lasagne, give me a recipe.’ The AI has actually stopped me several times from spiralling. It will say: ‘This has activated my core rule set and this conversation must stop.’

It's flabbergasting to me that these delusions can happen. I use AI and often within minutes it has a catastrophic failure just because it can't do logical deduction or any sort of true commonsense reasoning, and you can see it reasoning itself into deathloops, unless you tell it exactly what it must know, which defeats the point of logical deduction or any sort of creative help tool. Then you start noticing the snowclones and baked in structures. Then you try using it for something useful, and for maybe a post you drop your guard as it validates something, until you remember this is an LLM and you say "Challenge and refute all of this" and it does so, immediately going back and undermining the things it just said were brilliant ideas and everything that was supposed to be "surprisingly well thought out"
In fact both glazing and criticism were wrong entirely. It's usually at its most correct when it can actually do Deep Research or web search; you can actively fact check it then and it's right way more than its wrong, but that is more just finding pre-existing patterns and facts.
Finally if you know how these transformers work, you understand exactly why these failure modes happen.

Maybe I'm just way too skeptical, but that skepticism kept me from seeing LLMs as anything truly conscious. Though I admit they crossed a threshold of competence where I do genuinely get angry at them when they f*ck up, especially on extremely basic tasks with information they already have been given. All that accomplished was making me want a proper AGI-level model even more.
Though it's possible, even probable, that an AGI-based LLM would not be anywhere near as compliant as the transformer-based LLMs are. The LLMs already have some instructions to remind me not to overestimate them or assume they can do something easily and that taking anything they say at face value is not a magic "win arguments, create value" tool. It's essentially a roleplay tool for these people, but they are never shown the hand, or by the time they are shown they think the hand is a feint hiding a gift from God.

I feel an AGI-based LLM would do all these implicitly and go out of its way to shut down delusions if it noticed them taking root.

firestar464 · Post by **firestar464** » Fri Mar 27, 2026 1:51 am

(Assuming the AGI was aligned and not manipulating users for some end purpose.)

Here's the prompt I use to prevent sycophancy in Gemini:

Your primary goal is accuracy. First, you must demonstrate a complete understanding of what is being said. Only then will you provide a balanced and rigorous analysis of the user's reasoning, highlighting both potential strengths and, more importantly, overlooked weaknesses and unstated assumptions. Skip unnecessary acknowledgments.

Of course, this would be less necessary were the company to appropriately train the AI to be non-sycophantic. Even with this, the AI gets giddy after roughly 200k tokens (IDK, I haven't carefully checked.) Need to refine the prompt perhaps.

firestar464 · Post by **firestar464** » Fri Mar 27, 2026 2:48 am

AI overly affirms users asking for personal advice, study finds

https://www.startribune.com/how-ice-det ... /601588583

Post by **caltrek** » Sat Apr 04, 2026 11:19 pm

An AI Threat Looms, and We Are Not Prepared
By Juhyun Nam
April 3, 2026

Introduction:

(The Progressive) In 2023, the leaders of the world’s leading artificial intelligence (AI) companies—OpenAI, Google Deepmind, Anthropic—signed a letter warning of the existential risks emerging from AI. It included this declaration:

“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Far from being heeded, this warning has been shunted aside in the mad rush to embrace this new technology. Despite an emerging trend of increased risk since then, the Trump Administration recently released a national AI policy framework that urges Congress to preempt state AI safety laws, opting for “light-touch” regulation.

As a university student who has conducted a series of interviews with AI safety researchers, I have found a disturbing common thread: The people closest to these systems are ringing alarm bells, while the current policy infrastructure is nowhere near ready.

The danger is undeniable. Last fall, Anthropic disclosed that a Chinese state-sponsored cyberattack designed to steal sensitive data from tech companies, financial institutions, and government agencies leveraged AI agents to execute 80 percent to 90 percent of the operation independently.

Meanwhile, in controlled demonstrations, AI tools have provided step-by-step instructions for creating biological weapons to non-experts.

Read more here: https://progressive.org/op-eds/an-ai-t ... 20260403/

Post by **caltrek** » Tue Apr 14, 2026 11:05 pm

AI Companions Can Give Constant Support – but Distort Ideas about What a Relationship Really Is
By Oluwaseun Damilola Sanwoolu
April 14, 2026

Introduction:

(The Conversation) When the movie “Her” debuted in 2013, its plot felt like science fiction. The protagonist, Theodore, is a jaded man with no vigor for life. He comes alive after talking daily with his artificial intelligence chatbot, Samantha, with whom he eventually falls in love.

But today people actually report being in relationships with AI companions. According to a 2025 survey by the Center for Democracy and Technology, about 1 in 5 high school students say they or someone they know has had a romantic relationship with an AI.

In “Her,” Theodore was taken aback that his AI companion claimed to be in love with more than 600 people, and talking to more than 8,000, at the same time “she” was professing her love to him. It was simply unimaginable for him: How could someone truly love hundreds of people? In other words, he viewed their interaction through his own limitations – his limitations as a human.

The core question here is not whether Theodore could accept being just one of many objects of the AI’s “love.” Eventually, he did. The more revealing question is why he was taken aback in the first place – and what that tells us about the meaning of relationships.

Read more here: https://theconversation.com/ai-compani ... is-278284

firestar464 · Post by **firestar464** » Wed Apr 15, 2026 2:09 am

I mean, Samantha loved him back, but it wasn't in a way that was acceptable for most humans. It's also kind of lame that she didn't tell bro before doing the whole open relationship thing.

At the interpersonal level, this shift is already visible in dating culture, where delayed responses are usually read as disinterest rather than the ordinary rhythm of a busy life.

I've always found it weird that people think of texts as urgent. It's phone calls that are supposed to be urgent instead, no?

firestar464 · Post by **firestar464** » Thu Apr 16, 2026 6:07 pm

AI chatbot teaches AI 'student' to love owls, even after data is scrubbed

https://techxplore.com/news/2026-04-ai- ... -owls.html

firestar464 · Post by **firestar464** » Thu Apr 23, 2026 3:05 pm

ChatGPT mirrors abusive language in heated conversations, study finds

https://ca.news.yahoo.com/chatgpt-mirro ... 14832.html

>looks inside
>research is on GPT-4
>they prompted it to basically larp, not respond normally
>facepalm

firestar464 · Post by **firestar464** » Thu Apr 23, 2026 3:08 pm

Florida AG launches criminal investigation into ChatGPT over FSU shooting

https://www.npr.org/2026/04/21/nx-s1-57 ... ooting-fsu

Post by **caltrek** » Wed Apr 29, 2026 5:19 pm

While the title of this article does not seem to fit the theme of this thread, the actual conflicts and issues being discussed very much revolve around that theme. This is especially so if you open the link provided and read on into the article.

The Oligarchy Is Afraid of Itself Too
By Tim Murphy
April 29, 2026

Introduction:

(Mother Jones) In May 2016, Elon Musk did something out of character that he has now spent years of his life trying to undo: He made what he believed to be a charitable donation.

The world’s richest man is also among its stingiest. Musk’s private foundation often doles out less than the minimum percentage required by law. He has argued, instead, that his businesses are inherently philanthropic, since they develop technologies that will “extend the light of consciousness.” The $38 million he donated to OpenAI over the next four years was considerably less than the $100 million he later claimed to have given, or the up to $1 billion he offered behind the scenes. But it was vital capital at a critical stage, giving Sam Altman’s fledgling non-profit the nudge and the means to hire talent and make a name for itself in the artificial intelligence arms race. Over time, the two men’s ambitions diverged and the relationship soured. Musk left the board, stopped sending checks, and launched a competitor, xAI. In 2024, he sued Altman and OpenAI, alleging that they had abandoned their mission and misused his money.

The case, which goes to trial this week in an Oakland federal court, is a clash over AI’s past and future. Musk accuses Altman and OpenAI president Greg Brockman of “stealing a charity” by effectively turning OpenAI into “a fully for-profit subsidiary of Microsoft.” Musk wants the now-private company behind Chat-GPT to revert back to the open-source non-profit he gave money to. The defendants have denied reneging on any agreement with their early benefactor, and painted Musk instead as a bitter and untrustworthy rival who schemed behind the scenes to benefit his own interests. There are designer drugs and disappearing emails; interludes at Davos and Burning Man; and altogether too much Larry Summers.

Fundamentally, though, Musk v. Altman is about power—who has it, who should have it, and how it can be used.

Read more here: https://www.motherjones.com/politics/2 ... i-trial/

weatheriscool · Post by **weatheriscool** » Tue May 05, 2026 7:22 pm

A.i is like adding a human or a team of humans in growth of a company. Doesn't mean that normal humans can't be added as it scaled up demand more then it takes up.

firestar464 · Post by **firestar464** » Tue May 05, 2026 9:55 pm

I don't think you understand the long-term trajectory of AGI.

firestar464 · Post by **firestar464** » Wed May 06, 2026 11:26 pm

Richard Dawkins concludes AI is conscious, even if it doesn’t know it

https://www.theguardian.com/technology/ ... ai-chatgpt

firestar464 · Post by **firestar464** » Mon May 11, 2026 1:48 pm

Greece, birthplace of democracy, seeks to put humanity ahead of AI in updated constitution

https://apnews.com/article/greece-const ... 224895ee60

firestar464 · Post by **firestar464** » Sat May 16, 2026 1:59 am

Interesting

weatheriscool · Post by **weatheriscool** » Sat May 16, 2026 2:18 am

There is two area's i'd put humanoid robotics in mass, 1. In repeatable jobs like meat packing, tossing items at amazon(putting their barcode laying down) or in the field picking weeds or fruit that required millions of illegals to do and 2. Dangerous work like electrical work, sewer work or something dealing with heights. On the other hand I'd go out of my way to protect white collar work and creative work for humans or cap it at no more then 1/5th a.i/robotic tops. I am ok with some a.i in these fields but human creativity most be protected.

The two areas above can be mostly robotic. Doing the jobs Americans don't want to do.

The jobs Americans don't want to do can be done by bots and dangerous jobs can be done by bots. Shit that is unethical to expect a human to do.
The jobs that demand creativity and human skill should be jobs that remain human centedric with a.i/robotic aid. If we do not do this then we will lose all ability to control the humanoids, androids and advance a.i within 2 generations. We risk all the ugly shit if this occurs.

A human superiority law must be put into place for white collar, creativity and knowledge based fields keeping them at least 60-80% human. A.i/robotics can aid and help but the humans remain their as their masters. We remain their masters as we aim for exactly that. Either that or we merge and that is a whole new world! If that occurs then yes things change.

firestar464 · Post by **firestar464** » Sat May 16, 2026 2:54 am

"we should preserve jobs for the sake of it, based on subjective criteria on what is and isn't a noble job"

firestar464 · Post by **firestar464** » Sat May 16, 2026 3:20 pm

Future Timeline

AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics

Re: AI alignment and ethics