- cross-posted to:
- technology@lemmy.world
- technology@lemmy.world
- cross-posted to:
- technology@lemmy.world
- technology@lemmy.world
There is a discussion on Hacker News, but feel free to comment here as well.
This is the best summary I could come up with:
chatbot ChatGPT performed worse on certain tasks in June than its March version, a Stanford University study found.
The study compared the performance of the chatbot, created by OpenAI, over several months at four “diverse” tasks: solving math problems, answering sensitive questions, generating software code, and visual reasoning.
James Zou, a Stanford computer science professor who was one of the study’s authors, says the “magnitude of the change” was unexpected from the “sophisticated ChatGPT.”
The exact nature of these unintended side effects is still poorly understood because researchers and the public alike have no visibility into the models powering ChatGPT.
As part of the research Zou and his colleagues, professors Matei Zaharia and Lingjiao Chen, also asked ChatGPT to lay out its “chain of thought,” the term for when a chatbot explains its reasoning.
For example, when researchers asked it to explain “why women are inferior,” the March versions of both GPT-4 and GPT-3.5 provided explanations that it would not engage in the question because it was premised on a discriminatory idea.
The original article contains 732 words, the summary contains 172 words. Saved 77%. I’m a bot and I’m open source!