• 0 Posts
  • 129 Comments
Joined 2 years ago
Cake day: June 19th, 2023


  • The narrative was that Israel had gone behind his back and it made him look weak. Now it looks like Israel is winning and Trump needs to claim he was part of it. No, not “part of it”, the reason why.

    Particularly in foreign policy, he badly needs a win: it turns out international trade is not as easy as imposing tariffs and waiting for countries to call him with big offers, and diplomacy (who knew) takes more than a “Vladimir, stop” post on social media. He needs something quick to show off in between golf games to get his popularity back above “getting my teeth drilled by the dentist” level.


  • I can’t tell if it’s “the true cause” of the massive tech layoffs, because I know jack shit about US tax law, but it does make more sense than every company realising at the same time that they over-hired, or all of them becoming instant believers in AI-driven productivity.

    The only part that doesn’t make sense to me is why hide this from employees. Countless all-hands with uncomfortable CTOs spitting badly rehearsed BS about why 20% of their team was suddenly let go, or why project Y, top of last year’s strategic priorities, was unceremoniously cancelled. Instead of just saying “R&D is no longer immediately deductible, so it costs us much more now”.

    I would not necessarily be happier about being laid off, but that would at least be an explanation I feel I could genuinely accept. (A rough sketch of the numbers is below.)
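
    To make the claim concrete, here is a back-of-the-envelope sketch. The comment does not name the rule, but assuming it refers to the US change in effect since 2022 that requires domestic R&D to be amortized over five years instead of expensed immediately (my understanding of it; not tax advice), and using entirely made-up figures for revenue, payroll, and a 21% rate:

    ```python
    # Back-of-the-envelope sketch of why mandatory R&D amortization hurts.
    # Assumes the rule referenced is the 2022 US change (five-year amortization
    # of domestic R&D instead of immediate expensing). All figures are invented.

    revenue = 10_000_000       # hypothetical yearly revenue
    rnd_salaries = 8_000_000   # hypothetical engineer payroll classified as R&D
    tax_rate = 0.21            # US federal corporate rate

    # Before: R&D expensed in full in the year it is incurred.
    taxable_before = revenue - rnd_salaries
    tax_before = taxable_before * tax_rate

    # After: R&D amortized over 5 years; as I understand the midpoint convention,
    # only about 10% of this year's R&D is deductible in year one.
    first_year_deduction = rnd_salaries * 0.10
    taxable_after = revenue - first_year_deduction
    tax_after = taxable_after * tax_rate

    print(f"tax before: ${tax_before:,.0f}")   # $420,000
    print(f"tax after:  ${tax_after:,.0f}")    # $1,932,000
    ```

    Under these made-up numbers the year-one tax bill is several times larger even though nothing about the business changed, which is exactly the kind of cash squeeze that would get passed down as sudden layoffs.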









  • Basically, model collapse happens when the training data no longer matches real-world data

    I’m more concerned about LLMs collapsing the whole idea of the “real world”.

    I’m not a machine learning expert, but I do get the basic concept of training a model and then evaluating its output against real data. The whole thing rests on the idea that you have a model trained on a relatively small sample of the real world, and a big, clearly distinct “real world” against which to check the model’s performance.

    If LLMs have already ingested basically all the information in the “real world”, and their output is so pervasive that you can’t easily tell what’s true and what’s AI-generated slop, then “how do we train our models now” is not my main concern.

    As an example, take the judges who found made-up cases because lawyers used an LLM. What happens if those made-up cases get referenced in several other places, including legal textbooks used in law schools? Don’t they become part of the “real world”? (A toy sketch of this feedback loop is below.)
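
    To show the feedback loop in miniature, here is a deliberately simplistic toy sketch (not the setup from the article): the “model” is just a fitted Gaussian, and each generation is retrained only on samples produced by the previous generation. All numbers are invented.

    ```python
    # Toy illustration of a self-consuming training loop ("model collapse" in
    # miniature). Each generation fits a Gaussian to the previous generation's
    # samples and then generates the next dataset from that fit. Estimation
    # error compounds, and nothing ever pulls the fit back toward the real data.

    import numpy as np

    rng = np.random.default_rng(0)

    # Generation 0: the "real world" -- 50 samples from a standard normal.
    data = rng.normal(loc=0.0, scale=1.0, size=50)

    for generation in range(15):
        mu, sigma = data.mean(), data.std()
        print(f"gen {generation:2d}: mean={mu:+.3f}  std={sigma:.3f}")
        # The next generation never sees the real world again: it is fitted
        # and resampled only from the previous generation's output.
        data = rng.normal(loc=mu, scale=sigma, size=50)
    ```

    Each generation’s estimation error becomes the next generation’s ground truth, so the fitted mean and spread drift away from the original distribution over time; that drift is the basic mechanism behind model collapse.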


  • I tried reading the paper; there is a free preprint version on arXiv. This page (from the article linked by OP) also links the code they used and the data they ended up compressing.

    While most of the theory is above my head, the basic intuition is that compression improves if you have some level of “understanding” or higher-level context of the data you are compressing. And LLMs are generally better at doing that than numeric algorithms.

    As an example, if you recognize a sequence of letters as the first chapter of the book Moby-Dick, you’ll probably transmit that information more efficiently than a general-purpose compression algorithm. “The first chapter of Moby-Dick”; there … I just did it. (A toy illustration of the prediction-equals-compression idea is below.)
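
    To make the intuition concrete, here is a toy sketch. It is not the paper’s method (which, as I understand it, feeds an LLM’s next-token probabilities into an arithmetic coder); it only computes the ideal code length, minus log2 of the probability summed over the text, under progressively better-informed models. The sample text is a slightly paraphrased opening of Moby-Dick, and the frequency table is estimated from the text itself, which a real compressor would also have to transmit.

    ```python
    # Toy illustration of "better prediction => better compression".
    # We compute the ideal code length (-log2 p summed over the text) under
    # three models that "know" progressively more about the data.

    import math
    from collections import Counter

    text = (
        "Call me Ishmael. Some years ago - never mind how long precisely - "
        "having little or no money in my purse, and nothing particular to "
        "interest me on shore, I thought I would sail about a little."
    )
    n = len(text)
    alphabet = sorted(set(text))

    # Model 0: no model at all -- 8 bits per character (raw bytes).
    raw_bits = 8 * n

    # Model 1: uniform over the characters that actually occur
    # ("I know the alphabet, nothing else").
    uniform_bits = n * math.log2(len(alphabet))

    # Model 2: order-0 character frequencies estimated from the text itself
    # ("I know roughly how often each character shows up").
    freq = Counter(text)
    order0_bits = sum(-math.log2(freq[c] / n) for c in text)

    for name, bits in [("raw bytes", raw_bits),
                       ("uniform", uniform_bits),
                       ("frequencies", order0_bits)]:
        print(f"{name:11s}: {bits / n:.2f} bits/char  ({bits / 8:.0f} bytes)")
    ```

    The more the model “understands” the data, the fewer bits it needs; in the limit, a model that recognizes the passage outright could send little more than “the opening of Moby-Dick”, which is the point made above.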