• @GBU_28@lemm.ee
    English
    3 · 10 months ago

    Off the shelf models do this, yes.

    Sophisticated, locally trained models on expensive private hardware are already dunking on publicly available versions. The problem of hallucination is generally resolved in those contexts.

    • @amki@feddit.de
      4 · 10 months ago

      Sure, but until I see such a thing I choose not to believe in fairy tales.

      Decompiling machine code for arbitrary architectures is quite a few levels above everything I’ve seen so far, which is generally pretty basic pattern recognition paired with statistics and reinforcement training.

      I’d argue decompiling arbitrary machine code into either another machine code or legible higher-level code is in a whole other league than what AI has proven to be capable of.

      Especially because, for this task, being 90% accurate is useless.
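
      To put a rough number on the 90% point: the sketch below assumes (this is an illustration, not something stated above) that the 90% figure applies per instruction and that errors are independent. The function name is made up for the example.

```python
def whole_function_accuracy(per_instruction_accuracy: float, instruction_count: int) -> float:
    """Chance that every instruction comes out right.

    Assumption for illustration only: each instruction is decompiled
    correctly with the same probability, independently of the others.
    """
    if not 0.0 <= per_instruction_accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if instruction_count < 0:
        raise ValueError("instruction_count must be non-negative")
    return per_instruction_accuracy ** instruction_count


if __name__ == "__main__":
    # Show how quickly the chance of a fully correct result shrinks.
    for count in (1, 10, 50, 100):
        chance = whole_function_accuracy(0.9, count)
        print(f"{count:>3} instructions: {chance:.4%} chance of a fully correct result")
```

      Under those assumptions a 10-instruction snippet comes out fully correct only about 35% of the time, and the odds keep shrinking as the code gets longer.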

      • @GBU_28@lemm.ee
        English
        2 · 10 months ago

        Again, you aren’t seeing this because these models are being developed for private enterprise purposes.

        Regarding deep machine code analysis, sure, that’s gonna take work, but the whole hallucination thing is an off-the-shelf, rookie problem these days.

        • Rikudou_Sage
          English
          1 · 10 months ago

          It’s not, though. Hallucinations are inherent to the technology; it’s not a matter of training. Good training can greatly reduce the likelihood, but cannot eliminate it.

    • @sacredfire@programming.dev
      1 · 10 months ago

      Why does a pre-trained model need expensive private hardware after it was trained, other than to handle API requests faster? Is OpenAI training ChatGPT on inferior hardware compared to these sophisticated private versions you mentioned?

      • @GBU_28@lemm.ee
        English
        3 · 10 months ago

        The fine-tuning, while much more efficient than starting fresh, can still be a large amount of work.

        Then consider that your target corpus of data may also be large.

        Then consider that running your reasoning tasks across that corpus also takes strong hardware to get production-ready response times.

        No, OpenAI isn’t using inferior hardware, but their model goals, token chunking strategies and overall corpus are generalist in nature.

        Then there are processing strategies that teams are using to go beyond the “memory” limitations GPT-4 has, which provide massive benefits to coherency: essentially anti-hallucination and better overall reasoning.
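
        The comment doesn't name a specific strategy, so the following is only a guess at the general shape: split a large corpus into overlapping chunks, then hand the model just the chunks most relevant to a question instead of the whole thing. The word-overlap scoring and the names `chunk_words` and `top_chunks` are stand-ins invented for this sketch; real systems would more likely use embeddings.

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def top_chunks(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks sharing the most distinct words with the query (toy relevance score)."""
    query_words = set(query.lower().split())

    def score(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))

    # sorted() is stable, so ties keep their original document order.
    return sorted(chunks, key=score, reverse=True)[:k]


if __name__ == "__main__":
    corpus = "the invoice total is due in march and the contract renewal is due in june"
    pieces = chunk_words(corpus, size=6, overlap=2)
    print(top_chunks("when is the contract renewal due", pieces, k=1))
```

        Only the selected chunks would then go into the prompt, which is one way to work within a fixed context window on a large private corpus.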