howrar@lemmy.ca to Artificial Intelligence@lemmy.world · English · 5 days ago

Humans Still Beat AI in the Long Horizon: Revisiting Test-Time Scaling in the Agent Era

joyemang33.github.io

  • cross-posted to:
  • Aii@programming.dev
  • artificial_intel@lemmy.ml
Agents can spend test-time compute by trying, observing, and revising. We derive an Elo reference for repeated sampling, then show that in a 2022 two-week coding marathon, current agents plateau within 24 hours while top humans keep improving.