@mm_maybe

mm_maybe@sh.itjust.works · 2 months ago

This caused me to delete Google Maps, but their removal of Black History month from Google Calendar is what’s making me contemplate migrating everything else away from them…

mm_maybe@sh.itjust.works · 2 months ago

I am partial to the rest of what you said but Census data were already publicly available and actually what they did was make it less accessible to data scientists and researchers like me working on the normal kind of regional planning stuff.

mm_maybe@sh.itjust.works · 2 months ago

Seriously, why the fuck is he still CEO of that company? He’s actively undermining them in every way on a global scale. Tesla shareholders are idiots…

mm_maybe@sh.itjust.works · 2 months ago

I’d say that violent incursion into blue states with the help of federal forces and local brownshirt provocateurs is almost a goal of Project 2025 and MAGA more broadly so I wouldn’t count on it not happening wherever you are. We’ve seen plenty of that kind of thing on the West Coast particularly. Local police unfortunately have a large membership overlap with said brownshirts.

mm_maybe@sh.itjust.works · 3 months ago

The problem with this is that Trump acting on his own, or in pure MAGA mode, is even worse than him acting under Musk’s influence. I mean I absolutely hate Musk and the bad name he’s given EV’s, but his influence on Trump is literally my only glimmer of hope that the American vehicle fleet will electrify enough–and quickly enough–to stave off the very worst version of climate catastrophe. Sadly Musk either doesn’t seem to give a shit about his own company, or is too busy making the cynical play that in a subsidy-free market Tesla wins due to sheer scale, as long as tariffs keep out cheap import EVs… it wouldn’t be the first time he had screwed the EV market at large in order to be the top dog in a smaller luxury niche.

But again, with immigration, Musk and Vivek are the only dissenting voices in a sea of xenophobia, even though, again, I hate the cynical anti-labor motivations behind their advocacy for H1B visas. Still, the alternative is Stephen Miller and full-on white supremacy with no exceptions for smart hard-working brown people.

It absolutely sucks that our glimmer of hope is that the billionaires who used to sound more liberal will feel some weird compulsion to act consistent with their past statements, and it’s a very slim chance that this will happen anyways. But given the state of affairs, it’s what we’ve got.

mm_maybe@sh.itjust.works · 3 months ago

Yes, you’re absolutely right. The first StarCoder model demonstrated that it is in fact possible to train a useful LLM exclusively on permissively licensed material, contrary to OpenAI’s claims. Unfortunately, the main concerns of the leading voices in AI ethics at the time this stuff began to really heat up were a) “alignment” with human values / takeover of super-intelligent AI and b) bias against certain groups of humans (which I characterize as differential alignment, i.e. with some humans but not others). The latter group has since published some work criticizing genAI from a copyright and data dignity standpoint, but their absolute position against the technology in general leaves no room for re-visiting the premise that use of non-permissively licensed work is inevitable. (Incidentally they also hate classification AI as a whole; thus smearing AI detection technology which could help on all fronts of this battle. Here again it’s obviously a matter of responsible deployment; the kind of classification AI that UHC deployed to reject valid health insurance claims, or the target selection AI that IDF has used, are examples of obviously unethical applications in which copyright infringement would be irrelevant.)

mm_maybe@sh.itjust.works · 4 months ago

Yeah I would be fine with this IF he also used the expanded powers granted to him by Trump’s Supreme Court to block the incoming fascist/monarchist takeover. Or, fine, don’t try to block them with anything that gives the courts a chance to clarify that ruling, but also don’t transfer power smoothly and peacefully to these bastards in any way shape or form, you know? If you’re saying “fuck it”, then fuck ALL of it; not just the parts of it that affect you personally.

mm_maybe@sh.itjust.works · 5 months ago

this is learning completely the wrong lesson. it has been well-known for a long time and very well demonstrated that smaller models trained on better-curated data can outperform larger ones trained using brute force “scaling”. this idea that “bigger is better” needs to die, quickly, or else we’re headed towards not only an AI winter but an even worse climate catastrophe as the energy requirements of AI inference on huge models obliterate progress on decarbonization overall.

mm_maybe@sh.itjust.works · 5 months ago

Well, maybe we need a movement to make physical copies of these games and the consoles needed to play them available in actual public libraries, then? That doesn’t seem to be affected by this ruling and there’s lots of precedent for it in current practice, which includes lending of things like musical instruments and DVD players. There’s a business near me that does something similar, but they restrict access by age to high schoolers and older, and you have to play the games there; you can’t rent them out.

mm_maybe@sh.itjust.works · 6 months ago

We don’t. It probably is. Mastodon is the way, but they need to fix a few things themselves.

mm_maybe@sh.itjust.works · 6 months ago

'tis true that women’s bodies hold great power, and not irrelevant at all to the discussion at hand. rather than reiterate and attempt to paraphrase jaron Lanier on the topic of how male obsession with creating artifical people is linked to womb envy, I’ll just link to a talk in which he explains it himself:

https://youtu.be/rGqiswuJuQI?si=oAKvWrtlji4yrfpd&t=42m05s

mm_maybe@sh.itjust.works · 6 months ago

Like any occupation, it’s a long story, and I’m happy to share more details over DM. But basically due to indecision over my major I took an abnormal amount of math, stats, and environmental science coursework even through my major was in social science, and I just kind of leaned further and further into that quirk as I transitioned into the workforce. bear in mind that data science as a field of study didn’t really exist yet when I graduated; these days I’m not sure such an unconventional path is necessary. however I still hear from a lot of junior data scientists in industry who are miserable because they haven’t figured out yet that in addition to their technical skills they need a “vertical” niche or topic area of interest (and by the way a public service dimension also does a lot to help a job feel meaningful and worthwhile even on the inevitable rough day here and there).

mm_maybe@sh.itjust.works · 6 months ago

My “day job” is doing spatial data science work for local and regional governments that have a mandate to addreas climate change in how they allocate resources. We totally use AI, just not the kind that has received all the hype… machine learning helps us recognize patterns in human behavior and system dynamics that we can use to make predictions about how much different courses of action will affect CO2 emissions. I’m even looking at small GPT models as a way to work with some of the relevant data that is sequence-like. But I will never, I repeat never, buy into the idea of spending insane amounts of energy attempting to build an AI god or Oracle that we can simply ask for the “solution to climate change”… I feel like people like me need to do a better job of making the world aware of our work, because the fact that this excuse for profligate energy waste has any traction at all seems related to the general ignorance of our existence.

mm_maybe@sh.itjust.works · 6 months ago

Me: I’ve cut my coffee intake down to one cup a day! Look how disciplined and restrained I am!

Also me: drinks 1.5 cans of Celsius per day

mm_maybe@sh.itjust.works · 7 months ago

I think that there are some people working on this, and a few groups that have claimed to do it, but I’m not aware of any that actually meet the description you gave. Can you cite a paper or give a link of some sort?

mm_maybe@sh.itjust.works · 7 months ago

Y’all should really stop expecting people to buy into the analogy between human learning and machine learning i.e. “humans do it, so it’s okay if a computer does it too”. First of all there are vast differences between how humans learn and how machines “learn”, and second, it doesn’t matter anyway because there is lots of legal/moral precedent for not assigning the same rights to machines that are normally assigned to humans (for example, no intellectual property right has been granted to any synthetic media yet that I’m aware of).

That said, I agree that “the model contains a copy of the training data” is not a very good critique–a much stronger one would be to simply note all of the works with a Creative Commons “No Derivatives” license in the training data, since it is hard to argue that the model checkpoint isn’t derived from the training data.

mm_maybe@sh.itjust.works · 7 months ago

Model sizes are larger than their training sets

Excuse me, what? You think Huggingface is hosting 100’s of checkpoints each of which are multiples of their training data, which is on the order of terabytes or petabytes in disk space? I don’t know if I agree with the compression argument, myself, but for other reasons–your retort is objectively false.

mm_maybe@sh.itjust.works · 7 months ago

The problem with your argument is that it is 100% possible to get ChatGPT to produce verbatim extracts of copyrighted works. This has been suppressed by OpenAI in a rather brute force kind of way, by prohibiting the prompts that have been found so far to do this (e.g. the infamous “poetry poetry poetry…” ad infinitum hack), but the possibility is still there, no matter how much they try to plaster over it. In fact there are some people, much smarter than me, who see technical similarities between compression technology and the process of training an LLM, calling it a “blurry JPEG of the Internet”… the point being, you wouldn’t allow distribution of a copyrighted book just because you compressed it in a ZIP file first.

mm_maybe@sh.itjust.works · 7 months ago

yes, I’ve extensively written about Phi and other related issues in a blog post which I’ll share here: https://medium.com/@matthewmaybe/data-dignity-is-difficult-64ba41ee9150

mm_maybe@sh.itjust.works · 7 months ago

I’m not proposing anything new, and I’m not here to “pitch” anything to you–read Jaron Lanier’s writings e.g. “Who Owns the Future”, or watch a talk/interview given by him, if you’re interested in a sales pitch for why data dignity is a problem worth addressing. I admire him greatly and agree with many of his observations but am not sure about his proposed solution (mainly a system of micro-payments to creators of the data used by tech companies)–I’m just here to point out that copyright infringement isn’t in fact, the main nor the only thing that is bothering so many people about generative AI, so settling copyright disputes isn’t going to stop all those people from being upset about it.

As to your comments about “feelings”, I would turn it around to you and ask why it is important to society that we prioritize the feelings (mainly greed) of the few tech executives and engineers who think that they will profit from such practices over the many, many people who object to them?