BoxedFenders [any, comrade/them]@hexbear.net to chapotraphouse@hexbear.netEnglish · 1 day ago

We are so cooked

hexbear.net

message-square
58
link
fedilink
192

  • Seasonal_Peace [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    1
    ·
    11 hours ago

    Why so many maps? Do people ask LLMs for spatial info?

  • SexUnderSocialism [she/her]@hexbear.net
    link
    fedilink
    English
    arrow-up
    7
    ·
    edit-2
    22 hours ago

    no-mouth-must-scream

  • axont [she/her, comrade/them]@hexbear.net
    link
    fedilink
    English
    arrow-up
    75
    arrow-down
    1
    ·
    edit-2
    1 day ago

    AI acolytes tell me their preferred AI has the advantage of access to all the world’s data, the full knowledge of mankind and yet 9.3% of its knowledge comes from walmart.com

    if 9.3% of a hypothetical human’s knowledge came from walmart.com that person would be rightfully put in the pillory in the town square for the crime of demonic possession

    • Mindfury [he/him]@hexbear.net
      link
      fedilink
      English
      arrow-up
      30
      ·
      1 day ago

      Walmart hosts the Codex Astartes in the backend, hard to access using the website manually but you can crawl it

      • axont [she/her, comrade/them]@hexbear.net
        link
        fedilink
        English
        arrow-up
        24
        ·
        1 day ago

        the omnissiah manifesting physically in our universe through the machinations of retail backends. hail the motive force

    • LocalMaxima [she/her]@hexbear.net
      link
      fedilink
      English
      arrow-up
      22
      ·
      1 day ago

      The chart title is a bit misleading. This isn’t the source of training data, but the sites that are linked to in responses. Google AI overview was included in the results, which kind of explains why this list is just the sites you would expect to be at the top of a Google search

  • FlakesBongler [they/them]@hexbear.net
    link
    fedilink
    English
    arrow-up
    86
    ·
    1 day ago

    reddit

    This explains why it’s so confidently wrong so often

    • LaGG_3 [he/him, comrade/them]@hexbear.net
      link
      fedilink
      English
      arrow-up
      49
      ·
      1 day ago

    • SchillMenaker [he/him]@hexbear.net
      link
      fedilink
      English
      arrow-up
      30
      ·
      1 day ago

      Even at sub-5% Quora is still doing some work here

      • FlakesBongler [they/them]@hexbear.net
        link
        fedilink
        English
        arrow-up
        23
        ·
        1 day ago

        Quora explains why it’s so horny

        Especially since half of Quora is just weird erotica

  • The_hypnic_jerk [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    49
    arrow-down
    1
    ·
    1 day ago

    They automated putting “reddit” at the end of a Google search and called it agi

    • LeeeroooyJeeenkiiins [none/use name]@hexbear.net
      link
      fedilink
      English
      arrow-up
      11
      ·
      1 day ago

      The llm itself admitted this!

  • The_Filthy_Commie@lemmygrad.ml
    link
    fedilink
    English
    arrow-up
    19
    ·
    1 day ago

    A scatological Ouroboros.

  • Evil_Shrubbery@thelemmy.club
    link
    fedilink
    English
    arrow-up
    5
    ·
    23 hours ago

    I’m sure there is plenty of non-official (ie illegal) content & their own users’ data (for the training too, not just searching).

  • FnordPrefect [comrade/them, he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    39
    ·
    1 day ago

    State Department like, “Yeah, look at all of those distinct and independent sources of information side-eye-1 side-eye-2”

    but at least with yahoo on there we can be confident that grok will have lots of quality details about pregnartcy

    • InevitableSwing [none/use name]@hexbear.net
      link
      fedilink
      English
      arrow-up
      22
      ·
      1 day ago

      I love that vid.

      “Dangerops prangent sex? will it hurt baby top of its head?” still the best one

      I don’t know if it’s best but def in the top three.

      • NephewAlphaBravo [he/him]@hexbear.net
        link
        fedilink
        English
        arrow-up
        8
        ·
        edit-2
        1 day ago

        “gregnant” and “pregnart” live in my brain rent free forever

    • HexReplyBot [none/use name]@hexbear.netB
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 day ago

      I found a YouTube link in your comment. Here are links to the same video on alternative frontends that protect your privacy:

      • yewtu.be
      • inv.nadeko.net
      • yt.artemislena.eu
      • piped.video
  • Saymaz@lemmygrad.ml
    link
    fedilink
    English
    arrow-up
    5
    ·
    24 hours ago

    The next generation is gonna be somehow more rightwing than the previous two.

  • emdash [comrade/them, comrade/them]@hexbear.net
    link
    fedilink
    English
    arrow-up
    29
    ·
    1 day ago

    Why did they need to pirate every book on Anna’s Archive if they were just going to cite social media and product advertisements?

    • ElChapoDeChapo [he/him, comrade/them]@hexbear.net
      link
      fedilink
      English
      arrow-up
      3
      ·
      1 day ago

      Well they had to do it quick before the FBI took them down on account of these tech demons reporting them to the FBI after the API training

  • Beaver [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    13
    ·
    edit-2
    1 day ago

    I just despair when there’s so much digitized information that was written by actual academics and experts, but the LLMs and search engines clearly seem to give the most reddit-ass answers to questions.

    • Alisu [she/her, they/them]@hexbear.net
      link
      fedilink
      English
      arrow-up
      8
      ·
      1 day ago

      I’ve managed to get linked to university websites and academic sources, but you gotta ask the right questions in the right way.

      • Tyrq@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        4
        ·
        1 day ago

        That’s kinda already the academic way, just with a new shitty flavour

  • GrouchyGrouse [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    25
    ·
    1 day ago

    Time to edit all 400,000 of my Reddit comments to be about the 1997 point-and-click videogame Star Wars: Yoda Stories

  • BeanisBrain [he/him, they/them]@hexbear.net
    link
    fedilink
    English
    arrow-up
    31
    ·
    1 day ago

    Allow me to propose an alternative input set:

    • 60% marxists.org (for historical theory)
    • 30% redsails.org (for contemporary criticism)
    • 5% youtube.com (only transcripts of Hakim and Luna Oi videos)
    • 5% hexbear.net (for flavor)
    • alexei_1917 [mirror/your pronouns, any]@hexbear.net
      link
      fedilink
      English
      arrow-up
      20
      ·
      edit-2
      1 day ago

      I think a chatbot trained only on ML theory would certainly be fun to play with. Ask a political or economic question, get something that sounds just like Lenin and makes about as much sense as some particularly dense parts of Capital.

      (And even though it’s a robot, I do feel a weird perverse thrill at the idea of taking a completely politically unconscious and blank slate mind and providing it only the Marxist-Leninist perspective, and never exposing it to any other political viewpoint until a strong ideological foundation is built. That’s kinda neat.)

      • BountifulEggnog [she/her]@hexbear.net
        link
        fedilink
        English
        arrow-up
        12
        ·
        1 day ago

        You need a big dataset to train a model, unfortunately Marxist-Leninists are too short spoken.

        • alexei_1917 [mirror/your pronouns, any]@hexbear.net
          link
          fedilink
          English
          arrow-up
          8
          ·
          1 day ago

          Short spoken? Some of our theory seems pretty damn long.

          • BountifulEggnog [she/her]@hexbear.net
            link
            fedilink
            English
            arrow-up
            8
            ·
            edit-2
            1 day ago

            That bit was a joke, although I would expect all theory to be much less than the amount of data needed to pretrain a model big enough to produce anything coherent.

            Actually, here’s some math. SmolLM was trained on 600b tokens. Das Kapital is roughly 288k words, about 218k tokens. We’ll round to 250,000 tokens. Divided into 600,000,000,000 and we would need 2.4 million Das Kapitals worth of text to train SmolLM. V2 uses 2t tokens, 8 million Das Kapitals. There’s obviously a lot more theory than that, and you could probably throw forums like ours in, prolewiki, maybe some youtube subtitles. Synthetic data from theory. LLMs just need to eat a lot of text unfortunately. Qwen3 trained on 36 trillion tokens, 144 million Kapitals.
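
The division above is easy to check. Here is a rough sketch that takes the comment's own figures at face value (the 250,000-token rounding and the three training budgets are as quoted, not independently verified); the function and variable names are made up for illustration.

```python
# Token figures as quoted in the comment above (assumed, not verified).
TOKENS_PER_KAPITAL = 250_000  # the comment's rounded-up estimate for one Das Kapital

training_budgets = {
    "SmolLM": 600_000_000_000,        # 600b tokens
    "SmolLM v2": 2_000_000_000_000,   # 2t tokens
    "Qwen3": 36_000_000_000_000,      # 36t tokens
}


def kapitals_needed(total_tokens: int, tokens_per_book: int = TOKENS_PER_KAPITAL) -> int:
    """Return how many copies of the book it takes to reach total_tokens."""
    # Ceiling division, so a partially used copy counts as a whole one.
    return -(-total_tokens // tokens_per_book)


kapital_counts = {name: kapitals_needed(tokens) for name, tokens in training_budgets.items()}

for name, count in kapital_counts.items():
    print(f"{name}: {count:,} Kapitals")
```

With those inputs it gives 2.4 million, 8 million and 144 million copies, matching the numbers in the comment.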

            • hotcouchguy [he/him]@hexbear.net
              link
              fedilink
              English
              arrow-up
              6
              ·
              1 day ago

              I believe there are methods to train on a large, general dataset, and then re-train on a small, focused dataset, but I’m not sure of any specifics

              • BountifulEggnog [she/her]@hexbear.net
                link
                fedilink
                English
                arrow-up
                6
                ·
                1 day ago

                Yes, lots of ways, and definitely the approach for something like this. You would still have to be picky about data though, pretraining still affects its biases a lot. Especially if the hope is a blank slate that’s only seen ML thinking.

                • alexei_1917 [mirror/your pronouns, any]@hexbear.net
                  link
                  fedilink
                  English
                  arrow-up
                  3
                  ·
                  1 day ago

                  Yeah, absolutely. Creating a thing capable of at least appearing to think, that is literally unable to understand Western liberal nonsense because it’s been fed only ML aligned material to read and process, might not be possible. I just thought the concept was kinda neat.

            • alexei_1917 [mirror/your pronouns, any]@hexbear.net
              link
              fedilink
              English
              arrow-up
              2
              ·
              1 day ago

              Yeah, when you put it that way, one can see the issue. I was kind of joking myself, we have a lot of theory, and while it might be a drop in the bucket for a machine that needs to basically eat boatloads of text, when it comes to humans reading it, even just what a lot of orgs agree on as the core texts, is a lot of reading to do. And the theory itself is often… not short spoken or concise in any sense. Some of it can really feel like it’s long and complicated on purpose.

    • Saymaz@lemmygrad.ml
      link
      fedilink
      English
      arrow-up
      1
      ·
      24 hours ago

      deleted by creator

  • take_five_moments [any]@hexbear.net
    link
    fedilink
    English
    arrow-up
    36
    ·
    1 day ago

    target.com

    lmao

    • FlakesBongler [they/them]@hexbear.net
      link
      fedilink
      English
      arrow-up
      16
      ·
      1 day ago

      Home of some of the worst wannabe police-cop LP guys ever

  • varmint [he/him]@hexbear.net
    link
    fedilink
    English
    arrow-up
    27
    ·
    1 day ago

    Why does this add up to way more than 100%?

    • roux [they/them, xe/xem]@hexbear.net
      link
      fedilink
      English
      arrow-up
      29
      ·
      edit-2
      1 day ago

      They used AI to generate the chart.

    • XxFemboy_Stalin_420_69xX [none/use name]@hexbear.net
      link
      fedilink
      English
      arrow-up
      15
      ·
      1 day ago

      presumably bc the same prompt can generate citations from multiple sites
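
A small sketch of that explanation, assuming the chart counts the share of responses that cite each site at least once (a guess at the methodology, since the chart itself isn't reproduced here). The response data is entirely made up.

```python
from collections import Counter

# Made-up data: each set holds the domains one response linked to.
responses = [
    {"reddit.com", "walmart.com"},
    {"reddit.com", "quora.com"},
    {"reddit.com"},
    {"walmart.com", "quora.com", "target.com"},
]


def citation_shares(responses):
    """Percent of responses that cite each domain at least once."""
    counts = Counter(domain for cited in responses for domain in cited)
    return {domain: 100 * n / len(responses) for domain, n in counts.items()}


shares = citation_shares(responses)
total = sum(shares.values())

for domain, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{domain}: {share:.0f}%")
print(f"sum of shares: {total:.0f}%")
```

Each response counts toward every domain it cites, so the shares here sum to 200% even though no single domain goes above 100%.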

    • Rod_Blagojevic [none/use name]@hexbear.net
      link
      fedilink
      English
      arrow-up
      10
      ·
      1 day ago

      peltier-laugh
