this latest study examined the chat logs of 19 real users of chatbots — primarily OpenAI’s ChatGPT — who reported experiencing psychological harm as a result of their chatbot use.
*Looks inside
Pretty small sample size. Even though they pulled from a large dataset, it's still the data of just 19 people.
AI sucks in a lot of ways sure, but this feels like fud.
I remember reading in my old stats book that you need a minimum of 30 data points for the normal approximation to hold. Also, small sets like these are typically about proof of concept, so yeah, you've still got a point.
It's about 300 samples for an estimate of the distribution with 95% confidence, iirc. That's assuming the samples are representative (unbiased). Also, 95% confidence doesn't mean the result is within 95% of reality; it means that 5% of tests run this way would be expected to be inaccurate. There's no way of knowing for sure whether this particular sample is one of them, because even a meta-study has such an error rate. You can increase the confidence with more samples or studies, just never to 100% unless you study every possible sample, including future ones.
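For anyone who wants to check the numbers, here's a rough sketch of the usual textbook formula for estimating a proportion from a simple random sample, n = z² · p(1 − p) / e². The function names are made up for illustration, and the worst-case p = 0.5 and z = 1.96 (95% confidence) are assumptions. It also assumes random sampling, which a self-selected group of 19 isn't, so treat the output as a best case rather than a verdict on the study.

```python
import math

Z_95 = 1.96  # two-sided z-score for 95% confidence (assumed)


def required_sample_size(margin, p=0.5, z=Z_95):
    """Samples needed to estimate a proportion within +/- margin.

    p=0.5 is the worst case (largest variance); hypothetical helper.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def margin_of_error(n, p=0.5, z=Z_95):
    """Half-width of the confidence interval for a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    print("n for +/-5%:", required_sample_size(0.05))
    for n in (19, 30, 300):
        print(f"n={n}: +/-{margin_of_error(n):.1%}")
```

Under those assumptions you need about 385 samples for ±5%, 300 samples gets you roughly ±5.7%, and 19 samples gets you roughly ±22%.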
…fud?
fud: Fear, Uncertainty and Doubt. A tactic for denigrating a thing, usually by implication of hypothetical or exaggerated harms, often in vague language that is either tautological or not falsifiable.
It’s not really ethical to just yoink people’s chats and study them
"We received chat logs directly from people who self-identified as having some psychological harm related to chatbot usage (e.g. they felt deluded) via an IRB-approved Qualtrics survey "
I wonder if the headline was written by an AI