• cheese_greater@lemmy.world
    English
    arrow-up
    56
    arrow-down
    3
    ·
    edit-2
    2 years ago

    I would be in trouble if this was a thing. My writing naturally resembles the output of a ChatGPT prompt when I’m not joke answering or shitposting.

    • TropicalDingdong@lemmy.world
      English
      arrow-up
      16
      arrow-down
      2
      ·
      edit-2
      2 years ago

      I would be in trouble if this was a thing. My writing naturally resembles the output of a ChatGPT prompt when I’m not joke answering.

      It’s not unusual for well-constructed human writing to resemble the output of advanced language models like ChatGPT. After all, language models like GPT-4 are trained on vast amounts of human text, and their main goal is to replicate and generate human-like text based on the patterns they’ve observed.

      /gpt-4

  • ReallyKinda@kbin.social
    arrow-up
    37
    ·
    2 years ago

    I know a couple of teachers (college level) who have caught several GPT papers over the summer. It’s a great cheating tool, but as with all cheating in the past, you still basically have to learn the material (at least for narrative papers) to proofread GPT’s output properly. It doesn’t get jargon right, it makes things up, and it makes no attempt to adhere to reason when it’s making an argument.

    Using translation tools is extra obvious—have a native speaker proof your paper if you attempt to use an AI translator on a paper for credit!!

      • bioemerl@kbin.social
        arrow-up
        40
        arrow-down
        2
        ·
        2 years ago

        Because you’re training a detector on something that is designed to emulate regular language as closely as possible, and human speech has so much incredible variability that it’s almost impossible to identify whether something has been written by an AI.

        You can maybe detect your typical generic ChatGPT-type outputs, but you can characterize a conversation with ChatGPT or any of the other much better local models (privacy and control are aspects which make them better), and after doing that you can get radically human-seeming outputs that are totally different from anything ChatGPT will output.

        In short, given a static block of text, it’s going to be nearly impossible to detect whether it came from an AI. It’s just too difficult a problem, and even if you solve it, the solution is going to be obsolete the next time someone fine-tunes their own model.

        • stevedidWHAT@lemmy.world
          English
          arrow-up
          8
          arrow-down
          1
          ·
          2 years ago

          Yeah, this makes a lot of sense considering the vastness of language and its imperfections (English I’m mostly looking at you, ya inbred fuck)

          Are there any other detection techniques that you know of? What about forcing AI models to have a signature that is guaranteed to be identifiable, permanent, and unique for each tuning produced? It’d have to be not directly noticeable but easy to calculate in order to prevent any “distractions” for the users.

          • Grimy@lemmy.world
            English
            arrow-up
            12
            arrow-down
            1
            ·
            2 years ago

            The output is pure text, so you would have to hide the signature in the response itself. On top of being useless, since most users slightly modify the text after receiving it, it would probably have a negative effect on the quality. It’s also insanely complicated to train that kind of behavior into an LLM.

            • stevedidWHAT@lemmy.world
              English
              arrow-up
              2
              arrow-down
              1
              ·
              2 years ago

              Your implementation of my concept might be useless, but that doesn’t mean the concept is.

              One possible solution would be to look at how responses are structured, letter frequencies, etc. The flexible, ambiguous nature of natural language means that you can word things in many, many different ways, which allows for some creative meta techniques to accomplish a fingerprint.
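A minimal sketch of the letter-frequency idea mentioned above, not a working watermark: it only compares the a-z letter distributions of two texts with cosine similarity. The function names are hypothetical, chosen for this illustration, and letter frequencies alone are far too weak a signal to identify a model.

```python
from collections import Counter
import math
import string


def letter_frequencies(text: str) -> list[float]:
    """Return the relative frequency of each letter a-z in the text."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters) or 1  # avoid division by zero on empty input
    return [counts[ch] / total for ch in string.ascii_lowercase]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fingerprint_similarity(text_a: str, text_b: str) -> float:
    """Compare two texts by their letter-frequency profiles (hypothetical helper)."""
    return cosine_similarity(letter_frequencies(text_a), letter_frequencies(text_b))
```

A real fingerprint would need a signal that survives light editing of the text, which is the objection raised earlier in this thread.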

              • Balder@lemmy.world
                English
                arrow-up
                2
                ·
                2 years ago

                The idea itself is valid, but wouldn’t that just make it more dangerous when malicious agents use the technology without fingerprinting?

                • stevedidWHAT@lemmy.world
                  English
                  arrow-up
                  2
                  ·
                  2 years ago

                  Cat’s out of the bag, my friend. Just like the nuke, the ideas are always out there. Once it’s been discovered and shared, that’s that.

                  We can huff and puff and come up with all the cute little laws we want but the fact of the matter is we know the recipe now. All we can do is dive deeper into the technology to understand it even better, make new findings and adapt as we always do.

          • bioemerl@kbin.social
            arrow-up
            4
            ·
            2 years ago

            forcing AI models to have a signature that is guaranteed to be identifiable, permanent, and unique for each tuning produced

            Either AI remains entirely in the hands of fucks like OpenAI, or this is impossible and easily removed. AI should be a free common-use tool, not an extension of corporate control.

              • bioemerl@kbin.social
                arrow-up
                3
                ·
                2 years ago

                It’s no different than owning your computer. Something as absolutely central and productivity-boosting as artificial intelligence should not be kept in the hands of the few.

                The only way that it could be is through government intervention. You don’t need to be an anarchist to be against an OpenAI monopoly.

            • stevedidWHAT@lemmy.world
              English
              arrow-up
              3
              ·
              2 years ago

              Agreed, such power should belong to everyone or has yet to be discovered. Even Oppenheimer knew, once the cat’s out of the bag…

      • sebi@lemmy.world
        English
        arrow-up
        2
        arrow-down
        3
        ·
        2 years ago

        Because generative Neural Networks always have some random noise. Read more about it here

          • PetDinosaurs@lemmy.world
            English
            arrow-up
            6
            arrow-down
            2
            ·
            2 years ago

            It almost certainly has some GAN-like pieces.

            GANs are part of the NN toolbox, like CNNs and RNNs and such.

            Basically all commercial algorithms (not just NNs, everything) are what I like to call “hybrid” methods, which means you keep throwing different tools at the problem until things work well enough.

              • PetDinosaurs@lemmy.world
                English
                arrow-up
                2
                arrow-down
                2
                ·
                2 years ago

                It doesn’t matter. Even the training process makes it pretty much impossible to tell these things apart.

                And if we do find a way to distinguish, we’ll immediately incorporate that into the model design in a GAN-like manner, and we’ll soon be unable to distinguish again.

                • stevedidWHAT@lemmy.world
                  English
                  arrow-up
                  2
                  arrow-down
                  1
                  ·
                  2 years ago

                  Which is why hardcoded fingerprints/identifications are required, to identify the individual speaker rather than to decide AI vs. human. Which is what we’re ultimately agreeing on here, outside of the pedantics of the article and scientific findings:

                  Trying to detect as an AI a model that is supposed to pass as human is counterintuitive. The two goals are direct opposites: if one works, the other can’t, so both can’t exist in this implementation.

                  The hard part will obviously be making sure that such a “fingerprint” wouldn’t be removable which will take some wild math and out of the box thinking I’m sure.

                  Tough problem!

      • learningduck@programming.dev
        English
        arrow-up
        4
        ·
        2 years ago

        Typical for generative AI. I think that during the training of the model, they must have developed another model that detects whether GPT produces more natural language. I think that other model may have reached the point where it couldn’t flag the output with an acceptable false-positive rate.

  • Boddhisatva@lemmy.world
    English
    arrow-up
    18
    ·
    2 years ago

    OpenAI discontinued its AI Classifier, which was an experimental tool designed to detect AI-written text. It had an abysmal 26 percent accuracy rate.

    If you ask this thing whether or not some given text is AI generated, and it is only right 26% of the time, then I can think of a real quick way to make it 74% accurate.
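The arithmetic behind that quip can be sketched as follows. This assumes the 26% figure is overall accuracy on a yes/no question; if it is instead a detection rate measured on AI-written text only, flipping the answers does not help in the same way. The data and function names are made up for illustration.

```python
def accuracy(predictions: list[bool], labels: list[bool]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def invert(predictions: list[bool]) -> list[bool]:
    """Flip every yes/no answer of a binary classifier."""
    return [not p for p in predictions]


# Made-up data: 50 samples, True means "AI-written".
labels = [True] * 25 + [False] * 25

# A hypothetical detector that is right on only 13 of the 50 samples (26%).
predictions = [y if i < 13 else not y for i, y in enumerate(labels)]

original_accuracy = accuracy(predictions, labels)
inverted_accuracy = accuracy(invert(predictions), labels)
```

For any binary classifier the two accuracies sum to 1, because every wrong answer becomes right when flipped and vice versa.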

    • notatoad@lemmy.world
      English
      arrow-up
      3
      ·
      2 years ago

      it seemed like a really weird decision for OpenAI to have an AI classifier in the first place. their whole business is to generate output that’s good enough that it can’t be distinguished from what a human might produce, and then they went and made a tool to try and point out where they failed.

      • Boddhisatva@lemmy.world
        English
        arrow-up
        1
        ·
        2 years ago

        That may have been the goal. Look how good our AI is, even we can’t tell if its output is human generated or not.

  • HelloThere@sh.itjust.works
    English
    arrow-up
    15
    ·
    2 years ago

    Regardless of if they do or don’t, surely it’s in the interests of the people making the “AI” to claim that their tool is so good it’s indistinguishable from humans?

    • stevedidWHAT@lemmy.world
      English
      arrow-up
      11
      arrow-down
      1
      ·
      2 years ago

      Depends if they’re more researchers or a business imo. Scientists generally speaking are very cautious about making shit claims bc if they get called out that’s their career really.

      • BetaDoggo_@lemmy.world
        English
        arrow-up
        6
        ·
        2 years ago

        OpenAI hasn’t been focused on the science since the Microsoft investment. A science-focused company doesn’t release a technical report that doesn’t contain any of the specs of the model they’re reporting on.

      • Zeth0s@lemmy.world
        English
        arrow-up
        4
        arrow-down
        1
        ·
        edit-2
        2 years ago

        A few decades ago, probably. Nowadays “scientists” make a lot of bs claims to get published. I was in the room when a “scientist” who publishes several Nature papers per year asked her student to write up research that had no results in a way that made it look like it had something important, for a publication with a relatively good IF.

        That day I decided I was done with academia. I had seen enough.

    • pewter@lemmy.world
      English
      arrow-up
      1
      arrow-down
      1
      ·
      2 years ago

      Yes, but it’s such a falsifiable claim that anyone is more than welcome to prove them wrong. There are a lot of slightly different LLMs out there. If you or anyone else can definitively show there’s a machine that can identify AI writing vs human writing, it will either result in better AI writing or it would be an amazing breakthrough in understanding the limits of AI.

      • HelloThere@sh.itjust.works
        English
        arrow-up
        2
        ·
        2 years ago

        People like to view the problem as a paradox - can an all powerful God create a rock they cannot lift? - but I feel that’s too generous, it’s more like marking your own homework.

        If a system can both write text, and detect whether it or another system wrote that text, then “all” it needs to do is change that text to be outside of the bounds of detection. That is to say, it just needs to convince itself.

        I’m not wanting to imply that that is easy, because it isn’t, but it’s a very different thing from convincing someone else, especially a human who understands the topic.

        There is also a false narrative involved here, that we need an AI to detect AI, which again serves as a marketing benefit to OpenAI.

        We don’t, because they aren’t that good, at least, not yet anyway.
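The "it just needs to convince itself" loop described in this comment can be sketched with a toy example. Everything here is hypothetical: the word list, the detector, and the rewriter are invented for illustration and bear no relation to how any real detector or model works.

```python
# Hypothetical "tells" and plainer substitutes, invented for this illustration.
REPLACEMENTS = {
    "delve": "dig",
    "tapestry": "mix",
    "moreover": "also",
}

PUNCTUATION = ".,!?"


def toy_detector(text: str) -> bool:
    """Flag text as machine-written if it contains any known tell."""
    words = text.lower().split()
    return any(word.strip(PUNCTUATION) in REPLACEMENTS for word in words)


def rewrite_once(text: str) -> str:
    """Replace the first tell found with its plainer substitute."""
    words = text.split()
    for i, word in enumerate(words):
        core = word.strip(PUNCTUATION)
        if core.lower() in REPLACEMENTS:
            words[i] = word.replace(core, REPLACEMENTS[core.lower()])
            break
    return " ".join(words)


def evade(text: str, max_rounds: int = 10) -> str:
    """Keep rewriting until the system's own detector stops flagging the text."""
    for _ in range(max_rounds):
        if not toy_detector(text):
            break
        text = rewrite_once(text)
    return text
```

The loop only shows that the text passes the same detector that guided the rewriting. It says nothing about whether a different detector, or a human who understands the topic, would be convinced.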

  • Matriks404@lemmy.world
    English
    arrow-up
    12
    ·
    2 years ago

    Did human-generated content really become so low quality that it is distinguishable from AI-generated content?

  • shameless@lemmy.world
    English
    arrow-up
    11
    ·
    2 years ago

    I just realised that, especially in teaching, people are treating these LLMs the same way that I remember teachers in school used to treat computers and later the internet.

    “Now class you need a 5 page essay on Hamlet by next Friday, it should be hand written and no copying from the internet!! It needs to be hand written because you can’t always rely on computers to be there…”

  • irotsoma@lemmy.world
    English
    arrow-up
    10
    ·
    2 years ago

    A lot of these relied on common mistakes that “AI” algorithms make but humans generally don’t. As language models are improving, it’s harder to detect.

  • Jargus@lemmy.world
    English
    arrow-up
    4
    ·
    edit-2
    2 years ago

    So democracy is basically fucked, and countries without freedom of expression/speech have an advantage, while our social media will be a cesspool and will divide and weaken our societies. The future looks bright /s

  • nucleative@lemmy.world
    English
    arrow-up
    2
    arrow-down
    1
    ·
    edit-2
    2 years ago

    We need to embrace AI written content fully. Language is just a protocol for communication. If AI can flesh out the “packets” for us nicely in a way that fits what the receiving humans need to understand the communication then that’s a major win. Now I can ask AI to write me a nice letter and prompt it with a short bulleted list of what I want to say. Boom! Done, and time is saved.

    The professional writers who used to slave over a blank Word document are now obsolete, just like the slide rule “computers” of old (the people who could solve complicated mathematics and engineering problems on paper).

    Teachers who thought a hand written report could be used to prove that “education” has happened are now realizing that the idea was a crutch (it was 25 years ago too, when we could copy/paste Microsoft Encarta articles and use them as our research papers).

    The technology really just shows us that our language capabilities are a means to an end. If a better means arises, we should figure out how to maximize it.