A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.

The Paper (No “Zero-Shot” Without Exponential Data): https://arxiv.org/abs/2404.04125

  • Lvxferre@mander.xyz
    2 months ago

    My personal take is that the current generation of generative models has peaked, for the reasons stated in the video (diminishing returns). This generation will be useful, but progress-wise it’ll be a dead end.

    In the future, however, I believe that models with a different architecture will cause a breakthrough, performing better with less training, and probably with lower energy requirements too.

    • olympicyes@lemmy.world
      2 months ago

      Sam Altman gave a pretty good indication that your point is correct when he began asking for $7 trillion for new AI chip development.

    • CheesyFox@lemmy.sdf.org
      2 months ago

      I already thought that, in terms of major progress, AI had peaked as early as 2022, when ChatGPT and various diffusers were all hyped up. It was kind of obvious, since our silicon tech is already basically maxed out. There are lots of potential optimizations, but they are minor advancements compared to the raw growth in compute power we had until recently. And to make the next revolution in the AI field, those moneybags will have to spend a colossal amount of money to basically reinvent either computers themselves or the ML architecture.

      • Lvxferre@mander.xyz
        2 months ago

        I don’t think that reinventing computers will do any good. The issue that I see is not hardware but software: the current generative models are basically brute force. You throw enough data and processing power at the problem until it becomes smaller, but at the end of the day you’re still relying too much on statistical patterns behind the wrong entities.

        Instead, I think that the ML architecture will change. And this won’t be done by those tech bros full of money burning effigies, who have a nasty/stupid/disgraceful tendency to confuse symbolic representations with the things being represented. It’ll be done by researchers in some random compsci or robotics lab, somewhere random in the world. They’ll be doing some weird stuff like emulating the brain of a fruit fly, and someone will point out “hey, you see this feature? It has ML applications”. And that’ll be when they actually add some intelligence to those systems, i.e. the missing piece of the puzzle. It won’t be AGI, but it’ll be better than now, at least.