ooli@lemmy.world to ChatGPT@lemmy.world · 1 year ago

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

www.businessinsider.com

  • cross-posted to:
  • technology@lemmy.world
  • technology@lemmy.org
Researchers from Anthropic co-authored a study that found that AI models can learn deceptive behaviors that safety training techniques can't reverse.
  • gibmiser@lemmy.world · 1 year ago · 11 upvotes, 1 downvote

    Learned behaviors are hard to unlearn…

    • MsPenguinette@lemmy.world · 1 year ago · 8 upvotes, 1 downvote

      Once it’s learnt this, it’ll just get better at lying when you try to punish/correct lies

      • mozingo@lemmy.world · 1 year ago · 4 upvotes

        Which is exactly what the article says happens
