# Audio Data Transcription
Text
At the California Institute of the Arts, it all started with a videoconference between the registrar’s office and a nonprofit.
One of the nonprofit’s representatives had enabled an AI note-taking tool from Read AI. At the end of the meeting, it emailed a summary to all attendees, said Allan Chen, the institute’s chief technology officer. They could have a copy of the notes, if they wanted — they just needed to create their own account.
Next thing Chen knew, Read AI’s bot had popped up in about a dozen of his meetings over a one-week span. It was in one-on-one check-ins. Project meetings. “Everything.”
The spread “was very aggressive,” recalled Chen, who also serves as vice president for institute technology. And it “took us by surprise.”
The scenario underscores a growing challenge for colleges: Tech adoption and experimentation among students, faculty, and staff — especially as it pertains to AI — are outpacing institutions’ governance of these technologies and may even violate their data-privacy and security policies.
That has been the case with note-taking tools from companies including Read AI, Otter.ai, and Fireflies.ai. They can integrate with platforms like Zoom, Google Meet, and Microsoft Teams to provide live transcriptions, meeting summaries, audio and video recordings, and other services.
Higher-ed interest in these products isn’t surprising. For those bogged down with virtual rendezvouses, a tool that can ingest long, winding conversations and spit out key takeaways and action items is alluring. These services can also aid people with disabilities, including those who are deaf.
But the tools can quickly propagate unchecked across a university. They can auto-join any virtual meetings on a user’s calendar — even if that person is not in attendance. And that’s a concern, administrators say, if it means third-party products that an institution hasn’t reviewed may be capturing and analyzing personal information, proprietary material, or confidential communications.
“What keeps me up at night is the ability for individual users to do things that are very powerful, but they don’t realize what they’re doing,” Chen said. “You may not realize you’re opening a can of worms.”
The Chronicle documented both individual and universitywide instances of this trend. At Tidewater Community College, in Virginia, Heather Brown, an instructional designer, unwittingly gave Otter.ai’s tool access to her calendar, and it joined a Faculty Senate meeting she didn’t end up attending. “One of our [associate vice presidents] reached out to inform me,” she wrote in a message. “I was mortified!”
Text
I had my fifth interview for my thesis today and the person talked for two and a half hours. Why. Why are you doing this to me. Do you know how long it will take me to transcribe that.
(I knew their interview would be longer. But. Two and half hours! And they could have talked longer! My 4 other interviews were between 25-45 minutes. For reference.)
#it is good because. yay data#but also they repeated points a lot#but always with like. a little new thing somewhere in between#and just omg it will take SO. LONG. to transcribe that#i use transcription software but i have to correct what that gets me#so it usually takes at least twice the time of the audio file
Text
Every Detail About the Data Annotation Service
Data annotation has become an essential stage in the development of artificial intelligence (AI). Data annotation is the practice of labeling and categorizing data to make it understandable and useful for AI models. Among the many forms of data annotation services available, Audio Annotation Services are crucial for helping AI systems handle and comprehend audio data.
The Divisions Of Data Annotation:
Audio Section:
The practice of labeling or describing audio recordings to classify and organize the data is known as audio annotation. Professional firms provide audio annotation services to help organizations annotate their audio files accurately and quickly. By outsourcing audio annotation, useful audio data can be produced for analysis rapidly and precisely.
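To make this concrete, segment-level audio annotations are commonly stored as time-stamped labels. The sketch below is purely illustrative — the field names and labels are invented for the example, not any particular vendor's schema:

```python
# A minimal, hypothetical segment-level audio annotation:
# each entry labels one time span of the recording.
annotations = [
    {"start_s": 0.0, "end_s": 2.4, "label": "speech", "speaker": "A"},
    {"start_s": 2.4, "end_s": 3.1, "label": "silence", "speaker": None},
    {"start_s": 3.1, "end_s": 7.8, "label": "speech", "speaker": "B"},
]

def total_labeled_seconds(annotations, label):
    """Sum the duration of all segments carrying a given label."""
    return sum(a["end_s"] - a["start_s"] for a in annotations if a["label"] == label)

print(total_labeled_seconds(annotations, "speech"))  # about 7.1 seconds of speech
```

Annotations in this shape can then be fed to downstream tasks such as speaker diarization or transcription training.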
Geospatial Service:
Geospatial annotation pairs suitable satellite and aerial imagery with datasets acceptable for AI. The result is an internal real-time dataset that can be used to assess and provide businesses with essential, actionable data. Expansive fields, construction sites, mines, real estate projects, disaster recovery scenarios, and geographical features are a few examples of commonly annotated geospatial imagery. Geospatial annotation is also a valuable source of input data for machine-learning algorithms, allowing efficient access to and retrieval of images from large geographical datasets.
Polygon Annotation:
A set of coordinates is drawn around a picture using the exact approach of polygon annotation. These coordinates are intended to encircle a particular object in an image closely.
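A minimal sketch of what such an annotation looks like in practice, assuming simple (x, y) pixel vertices. The object, image size, and area check are hypothetical, chosen only to illustrate the idea:

```python
# A polygon annotation is a closed list of (x, y) vertices drawn around an object.
# The shoelace formula gives the enclosed area, which is handy for
# sanity-checking labels (e.g., rejecting degenerate polygons).
def polygon_area(vertices):
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# Hypothetical annotation of an object in a 640x480 image:
car_polygon = [(100, 200), (300, 200), (300, 350), (100, 350)]
print(polygon_area(car_polygon))  # 200 * 150 = 30000.0 square pixels
```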
Lidar Annotation:
Lidar annotation labels the elements of a scene, such as vehicles, people, and traffic signs. Lidar relies mainly on machine-learning algorithms to deliver real-time interpretations of point cloud data.
Keypoint Annotation:
By identifying the locations of key points, keypoint annotation is a more thorough method of image annotation used to find small objects and shape variations. Keypoint annotations describe an object’s shape by labeling single pixels in the image.
Data Validation:
Data validation for AI is crucial to ensure that data from various sources adheres to business standards and does not become corrupted by inconsistencies in type or context while data is moved and combined. The objective is to create consistent, accurate, and complete data, avoiding data loss and errors during migration.
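As an illustrative sketch, validation of this kind often boils down to per-record rules checked before data is moved or merged. The rules and field names here are invented for the example:

```python
# Minimal per-record validation before merging data sources.
def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not isinstance(record.get("id"), int):
        problems.append("id must be an integer")
    if not isinstance(record.get("label"), str) or not record["label"].strip():
        problems.append("label must be a non-empty string")
    if record.get("duration_s") is not None and record["duration_s"] < 0:
        problems.append("duration_s must be non-negative")
    return problems

records = [
    {"id": 1, "label": "speech", "duration_s": 4.2},
    {"id": "2", "label": "", "duration_s": -1.0},  # violates all three rules
]
clean = [r for r in records if not validate_record(r)]
print(len(clean))  # only the first record survives
```

Running rules like these at the point of ingestion catches type and context inconsistencies before they silently corrupt a merged dataset.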
Waste Management:
The Waste Annotation technique aids in training AI models to identify waste materials and properly handle them. Waste management AI firms can achieve the accurate semantic segmentation of datasets using data annotation technologies.
Conclusion:
Data annotation is an essential step in developing and refining a versatile and practical ML algorithm. It can sometimes be skipped when only a small portion of an algorithm is required, but in the age of big data and intense competition, Data Annotation Services become vital because they train machines to see, hear, and write as people do.
Text
Generative AI Is Bad For Your Creative Brain
In the wake of early announcing that their blog will no longer be posting fanfiction, I wanted to offer a different perspective than the ones I’ve been seeing in the argument against the use of AI in fandom spaces. Often, I’m seeing the argument that the use of generative AI or Large Language Models (LLMs) makes creative expression more accessible. Certainly, putting a prompt into a chat box and refining the output as desired is faster than writing a 5,000-word fanfiction or learning to draw digitally or traditionally. But I would argue that the use of chat bots and generative AI actually limits - and ultimately reduces - one’s ability to enjoy creativity.
Creativity, defined by the Cambridge Advanced Learner’s Dictionary & Thesaurus, is the ability to produce or use original and unusual ideas. By definition, the use of generative AI discourages the brain from engaging with thoughts creatively. ChatGPT, character bots, and other generative AI products have to be trained on already existing text. In order to produce something “usable,” an LLM analyzes patterns within text to organize information into what the computer has been trained to identify as “desirable” outputs. These outputs are not always accurate, because computers don’t “think” the way that human brains do. They don’t create. They take the most common and refined data points and combine them according to predetermined templates to assemble a product. In the case of chat bots that are fed writing samples from authors, the product is not original - it’s a mishmash of the writings that were fed into the system.
Dialectical Behavioral Therapy (DBT) is a therapy modality developed by Marsha M. Linehan based on the understanding that growth comes when we accept that we are doing our best and we can work to better ourselves further. Within this modality, a few core concepts are explored, but for this argument I want to focus on Mindfulness and Emotion Regulation. Mindfulness, put simply, is awareness of the information our senses are telling us about the present moment. Emotion regulation is our ability to identify, understand, validate, and control our reaction to the emotions that result from changes in our environment. One of the skills taught within emotion regulation is Building Mastery - putting forth effort into an activity or skill in order to experience the pleasure that comes with seeing the fruits of your labor. These are by no means the only mechanisms of growth or skill development, however, I believe that mindfulness, emotion regulation, and building mastery are a large part of the core of creativity. When someone uses generative AI to imitate fanfiction, roleplay, fanart, etc., the core experience of creative expression is undermined.
Creating engages the body. As a writer who uses pen and paper as well as word processors while drafting, I had to learn how my body best engages with my process. The ideal pen and paper, the fact that I need glasses to work on my computer, the height of the table all factor into how I create. I don’t use audio recordings or transcriptions because that’s not a skill I’ve cultivated, but other authors use those tools as a way to assist their creative process. I can’t speak with any authority to the experience of visual artists, but my understanding is that the feedback and feel of their physical tools, the programs they use, and many other factors are not just part of how they learned their craft, they are essential to their art.
Generative AI invites users to bypass mindfully engaging with the physical act of creating. Part of becoming a person who creates from the vision in one’s head is the physical act of practicing. How did I learn to write? By sitting down and making myself write, over and over, word after word. I had to learn the rhythms of my body, and to listen when pain tells me to stop. I do not consider myself a visual artist - I have not put in the hours to learn to consistently combine line and color and form to show the world the idea in my head.
But I could.
Learning a new skill is possible. But one must be able to regulate one’s unpleasant emotions to be able to get there. The emotion that gets in the way of most people starting their creative journey is anxiety. Instead of a focus on “fear,” I like to define this emotion as “unpleasant anticipation.” In Atlas of the Heart, Brené Brown identifies anxiety as both a trait (a long-term characteristic) and a state (a temporary condition). That is, we can be naturally predisposed to be impacted by anxiety, and experience unpleasant anticipation in response to an event. And the action urge associated with anxiety is to avoid the unpleasant stimulus.
Starting a new project, developing a new skill, and leaning into a creative endeavor can all inspire anxiety. There is an unpleasant anticipation of things not turning out exactly correctly, of being judged negatively, of being unnoticed or even ignored. There is a lot less anxiety in submitting a prompt to a machine than in looking at a blank page and possibly making what could be a mistake. Unfortunately, the more something is avoided, the more anxiety is generated when it comes up again. Using generative AI doesn’t encourage starting a new project and learning a new skill - in fact, it makes the prospect more distressing to the mind, and encourages further avoidance of developing a personal creative process.
One of the best ways to reduce anxiety about a task, according to DBT, is for a person to do that task. Opposite action is a method of reducing the intensity of an emotion by going against its action urge. The action urge of anxiety is to avoid, and so opposite action encourages someone to approach the thing they are anxious about. This doesn’t mean that everyone who has anxiety about creating should make themselves write a 50k word fanfiction as their first project. But in order to reduce anxiety about dealing with a blank page, one must face and engage with a blank page. Even a single sentence fragment, two lines intersecting, an unintentional drop of ink means the page is no longer blank. If those are still difficult to approach, a prompt, tutorial, or guided exercise can be used to reinforce the understanding that a blank page can be changed, slowly but surely, by your own hand.
(As an aside, I would discourage the use of AI prompt generators - these often use prompts that were already created by a real person without credit. Prompt blogs and posts exist right here on tumblr, as well as imagines and headcanons that people often label “free to a good home.” These prompts can also often be specific to fandom, style, mood, etc., if you’re looking for something specific.)
In the current social media and content consumption culture, it’s easy to feel like the first attempt should be a perfect final product. But creating isn’t just about the final product. It’s about the process. Bo Burnham’s Inside is phenomenal, but I think the outtakes are just as important. We didn’t get That Funny Feeling and How the World Works and All Eyes on Me because Bo Burnham woke up and decided to write songs in the same day. We got them because he’s been developing and honing his craft, as well as learning about himself as a person and artist, since he was a teenager. Building mastery in any skill takes time, and it’s often slow.
Slow is an important word, when it comes to creating. The fact that skill takes time to develop and a final piece of art takes time regardless of skill is its own source of anxiety. Compared to @sentientcave, who writes about 2k words per day, I’m very slow. And for all the time it takes me, my writing isn’t perfect - I find typos after posting and sometimes my phrasing is awkward. But my writing is better than it was, and my confidence is much higher. I can sit and write for longer and longer periods, my projects are more diverse, I’m sharing them with people, even before the final edits are done. And I only learned how to do this because I took the time to push through the discomfort of not being as fast or as skilled as I want to be in order to learn what works for me and what doesn’t.
Building mastery - getting better at a skill over time so that you can see your own progress - isn’t just about getting better. It’s about feeling better about your abilities. Confidence, excitement, and pride are important emotions to associate with our own actions. It teaches us that we are capable of making ourselves feel better by engaging with our creativity, a confidence that can be generalized to other activities.
Generative AI doesn’t encourage its users to try new things, to make mistakes, and to see what works. It doesn’t reward new accomplishments to encourage the building of new skills by connecting to old ones. The reward centers of the brain have nothing to respond to, nothing to associate with the user’s own actions. There is a short-term input-reward pathway, but it’s only associated with using the AI prompter. It’s designed to encourage the user to come back over and over again, not develop the skill to think and create for themselves.
I don’t know that anyone will change their minds after reading this. It’s imperfect, and I’ve summarized concepts that can take months or years to learn. But I can say that I learned something from the process of writing it. I see some of the flaws, and I can see how my essay writing has changed over the years. This might have been faster to plug into AI as a prompt, but I can see how much more confidence I have in my own voice and opinions. And that’s not something chatGPT can ever replicate.
Note
I have a question about the Protocol transcripts. In Ep. 31, the transcript says upload data sulpher.becher, but the audio says discard data sulpher.becher. Is this a typo, or do the transcripts intentionally include details that differ from the audio?
Hey. Been getting this one a lot. To be clear: there was just a mix-up, and we'll be fixing the transcripts in due course; we're not getting all deliberately meta on you here. It's a fun mistake I'll probably talk more about later, but not worth hanging your entire red-string theory on. When in doubt, audio reigns supreme, and transcripts are intended as a tool to assist listening.
Text
A behind-the-scenes look...
Music credit:
Lord of the Land by Kevin MacLeod (incompetech.com) Licensed under Creative Commons Attribution 4.0 license: https://creativecommons.org/licenses/by/4.0/ Source: http://incompetech.com/music/royalty-free/index.html?isrc=USUAN1400022
Video description and audio transcript continue under the cut:
[Description: A get ready with me video narrated by a library employee, comprised of several short scenes.
Narration: Get ready with me to open a local library. My day typically starts at 8:30 and first I turn on the lights. Simple, but essential in banishing the dark spirits from the stacks.
The narrator walks into the library and turns on the lights. Several shadowy figures disappear behind the shelves as the lights come up.
Narration: Next I head down to boot up the computers. Libraries require a lot of data, so we always hack into a few government databases to provide top-tier reference work.
He logs into his computer and begins typing furiously, then turns to the camera with his hand on his chin and an intent look on his face.
Narration: After that, I tend to our Guardian Tree that protects the library from evil spirits like censorship and sentence fragments.
A shot of a tree in a large planter in the middle of the library.
Narration: It's been really into cozy mysteries lately, so we do our best to provide. Thank you, Tree Spirit!
The librarian lays out three cozy mysteries on the planter's rim, then bows to the tree with his hands pressed together.
Narration: Today's a bit special, since it's the monthly taming of the library bookworm. So I grab my Library of Congress blessed sword and my favorite cardigan - plus two to my AC - and head down to the dungeon.
The librarian reaches down to grab a sword and cardigan from under his desk. He shrugs on the cardigan then takes the sword into an elevator and walks through a basement hall lined with book boxes.
Narration: Down in the dungeon we've got lots of damaged items and overstocked James Patterson books to keep the worm sated. But sometimes extra care is needed. A well scourged dragon is the key to any good collection development policy. Thanks for hanging out. Tell us how your bookstore or library gets ready. Bye!
He pulls out the sword and prepares to leap into battle in a darkened room with a flowery, cheerful sign on the door reading Sorting Room. The video ends mid leap. /description]
#get ready with me#grwm#library life#public libraries#librarians#libraries#captioned video#described video#video#tumblarians#tumblrarians#fantasy#LCPL recs
Text
A clarification
As @bat-cat-reader already posted and according to C herself, McGill wasn't there, and that should be enough for us. She said it loud, on BBC Radio 4's Woman's Hour - you can listen to it here: https://www.bbc.co.uk/sounds/play/m0029hlm.
Nuala McGovern, the show's host, specifically mentions the premiere event at the Leicester Square Odeon Luxe cinema in London. The segment that interests us runs roughly between 11:08 and 13:30, and it could not be clearer. If he was there, why not mention him, but mention her sister, her sister's husband and some friends? I mean, how odd is that, anyways?
Listen for yourself. For obvious reasons related to size, I could not post the entire audio file, so I made a clip of the relevant part and, as always, transcribed it:
Nuala McGovern (N): 'But I have another guest, who has just made her way into the studio. She won a BAFTA for her performance in Kenneth Branagh's film Belfast, she's known to many fans of time-travel drama Outlander as Claire, but the Irish actress Caitriona Balfe is joining me to talk about her latest role, this is playing a Russian spy in the new film The Amateur, starring Rami Malek. Welcome to Woman's Hour!'
Caitriona Balfe (C): 'Hi, Nuala, thank you for having me, just to say it, I didn't win the BAFTA, I would love to have, but I was nominated [laughs].'
N: 'We just elevate it a bit, maybe we're sending all of that to the Universe, have it happen next time [? unclear, both laugh], but I went to see you last night, I went to the premiere. I mean, I think this is the first premiere I've ever gone to.'
C: 'Oh, well, I am very glad you've made it! I hope you had fun!'
N: 'I really did! It's such a glitzy, glamorous event, I was wondering what must it be like to be in the eye of the storm and for anybody who hasn't been, like I haven't before, apart from seeing it on TV, you know, you have these pens of journalists and fans, and you walk down a white carpet, not a red carpet last night, and people are just roaring at you and looking for attention, what does that feel like?'
C: 'Ahem, I mean it's kind of fun, I don't know. I mean, I don't think I ever had a premiere there before...'
N: 'It was Leicester Square, just to let people know, in London, on a kind of a warm evening...'
C: 'It was gorgeous, I mean, sunshine and blue skies and all of that... ahem, you know, it's kind of overwhelming, but it's also, I think, once in a while, to be able to kind of get dressed up and celebrate, you know, the hard work of a lot of people, especially when our business is sort of struggling at the moment, it feels really good.'
N: 'So, ahem, and also, the crowd that was there last night, they were a very vocal crowd, I don't know, do you watch the film, or do you come out at the beginning...?'
C: 'No, I sat at the beginning, I watched it, my sister was there with her husband and some friends, so we all sat together, uhm... and it's fun, I mean people were laughing, people were...'
N: 'Hollering! Whooping...'
C: '...so, it's good, that's always a good sign when the film gets people engaged, like that.'
Nothing left to comment, even if some would still like to cling to the absurd premise that he was somehow there and not mentioned at all, perhaps on purpose. Now why would that be? For what it's worth, she always mentioned McGill as a convenient prop of sorts every single time questions were hitting closer to home than she thought suitable or comfortable. Mentioning her sister's husband and not 'her own'? Wow. Really wow, here.
I am very glad to be able to give more substance to this positive news, which was, I think, much needed in here.
And that's all I will comment about it. I absolutely own my varying position on the matter of McGill's presence at the event. Despite what some might want to think, I sometimes also work with the data and information some of you are kindly sending me. While I may have tips, that was not the case yesterday - just a blogger who thought she saw McGill there and felt the need to tell me and others. I now think she was honest, but very probably wrong, given what C just publicly declared on a major public media outlet.
As we know, there are no coincidences. What happened today starts to sketch a very interesting story, keeping in mind that McGill's entrance was orchestrated in pretty much the same way, with allusions inserted in interviews, and so on.
Note
Another headcanon request: How would Harley do his interviews with the test subjects (children)? Is he gentle with them? What is he like? Like with the paper recording his and Quinn’s interactions, especially with y/n in the room
🧠 Harley Sawyer’s Interview Style With Test Subjects (Children) - Headcanon 👁️
📽️ Setting: Clinical but “friendly” façade
The interview rooms are always monitored with cameras and audio.
A child-friendly set design: warm lights, toys scattered subtly, maybe even posters.
On the surface, it’s meant to look like a safe space — to build trust. But it’s all fabricated. Every element in that room was calculated by Harley to manipulate response and compliance.
🧊 His Demeanor When Alone with a Subject
Unnaturally calm, with a slow and measured tone.
He smiles — but it’s too perfect. Too practiced. Like a predator learning the mask of a father.
Speaks in simplified language, almost as if reading off a script, but his eyes are too focused — not on the child, but on the results.
Often takes notes during their speech, but not in response to what they say emotionally — only in reaction to useful data: "vocal strain," "emotional resistance level," "immediate trust factor."
If the child seems nervous or shy, he’ll lean in and drop his voice to something soothing, almost fatherly. But it’s mimicry — he’s studied how empathy looks. He doesn't feel it.
🧪 When Testing Psychological Boundaries
Subtly introduces unsettling or leading questions:
“Do you ever feel lonely here?”
“Would you like it if you could stay like this forever?”
“Do you think people forget children who don’t do special things?”
He’s not just looking for answers — he’s measuring attachment styles, emotional vulnerabilities, and how far he can push loyalty.
🧍♀️ When You Are in the Room
And this is where things really change.
His tone becomes noticeably more performative.
He watches you more than the child — as if your perception of him is more important than anything the subject says.
If you disapprove or flinch, he’ll cover his more manipulative lines with sarcasm or dry humor:
“Don’t give me that look, I’m just asking questions. You’re the one who said I needed to work on my people skills.”
He’ll rein in his darker impulses if you’re visibly uncomfortable — for the moment.
You are the only person who’s ever made him question if he’s gone too far. And even then… he gets defensive.
“I’m not hurting them, Y/N. I’m understanding them. If you want to make something perfect, you have to take it apart first.”
🧒 Harley + Quinn (Yarnaby) Interactions on Paper
Quinn’s case file is thick, and most interviews with him were one-on-one, without oversight — except for a few where you insisted on being present.
In those earlier transcripts:
Harley’s questions with Quinn are oddly encouraging, even doting in a way: “You’re doing so well, Quinn. See? I knew you were special.”
Quinn often responds hesitantly at first, then more eagerly over time — Harley feeds him praise like candy, deliberately making himself the only source of validation in Quinn’s life.
Subtle red flags litter the files: isolating language, dependency conditioning, manipulation cloaked as mentorship.
If you’re in the room during those interactions:
Quinn often looks at you for reassurance, sensing something is off. Harley gets tense when that happens, his smile tightens.
“Eyes on me, Quinn. We’re working. Y/N’s just observing.”
If you challenge him after, he’ll deflect:
“You want me to stop now? After how far he’s come? Don’t act like this is cruel, Y/N. You’ve seen how happy he gets when he feels useful.”
💔 When Harley Is Feeling the Pressure
If his methods are questioned by higher-ups — or even by you — his interviews become sloppier, more emotionally volatile...
He might snap if a child doesn’t answer correctly. His voice sharpens. He might end the session abruptly.
He WON'T hurt them during interviews — but the psychological pressure rises fast.
If you confront him afterward, he’s either:
Coldly detached: “They’ll survive. The data’s clean.”
Or explosively defensive: “If you don’t like what you see, leave. But don’t stand there and pretend you understand what I’m doing.”
🧸 Personal Notes in His Files (Private)
Hidden between the formal recordings are pages of deeply personal, conflicting thoughts about certain subjects (especially Quinn).
Notes scribbled in a rush: “Why is he still scared of me?” / “Dependency reached. Don’t fuck this up.”
Mentions of you: “Y/N distracted subject. Too soft. Too… much.”
One margin note reads:
“If I’d had someone like them when I was his age... Would I have turned out the same?”
Harley is not gentle — but he knows how to act gentle. His interviews are manipulative, emotionally strategic, and designed to gain loyalty or extract data.
With you in the room, he modulates himself — sometimes even pretends to care — but it’s not fully altruistic; it’s because you see through him and that unnerves him more than he admits.
Despite himself, part of him wants you to believe he’s good. That he’s not a monster. But under that mask, it’s still Harley: desperate for recognition, control, and the illusion of love through obedience.
#poppy playtime x reader#poppy playtime#harley sawyer x reader#harley sawyer#the doctor#the doctor x reader#dr harley x reader#dr harley sawyer#the doctor poppy playtime#ppt chapter 4#ppt 4#ppt#poppy playtime chapter 4 x reader#poppy playtime chapter 4#poppy playtime headcanon#my headcanons#fandom headcanons#imagine#x reader insert#╰₊✧ ゚⚬𓂂➢ 👁📺💉🩸
Text
TMAGP 31 Spoilers
Fatal programmer error
Extension BECHER compromised
Administrator privilege revoked
Extension BECHER isolated/resolved
Upload data <sulphur.BECHER> complete
.jmj error not resolved
New administrator permission assigned
So... did Freddie just upload Colin's soul (sulphur in alchemy represents the soul) to the servers? Imma shit myself if Colin's voice starts reading out incidents.
Also ".jmj error not resolved". Is Freddie having trouble processing Jon, Martin and Jonah? They're from a different universe, so the coding is all different and not compatible or something? lololol
EDIT: I just noticed it's only the Transcript saying "upload" sulphur data. The audio says "discard". Idk what to think of that :(
#tmagp spoilers#also i fucking loved the sound design of colin getting swallowed by the servers#the screams turning into computer noises#chefs kiss#the magnus protocol#tmagp#spoilers#tmagp season 2#tmagp 31#the magnus protocol spoilers
Text
On Saturday, an Associated Press investigation revealed that OpenAI's Whisper transcription tool creates fabricated text in medical and business settings despite warnings against such use. The AP interviewed more than 12 software engineers, developers, and researchers who found the model regularly invents text that speakers never said, a phenomenon often called a “confabulation” or “hallucination” in the AI field.
Upon its release in 2022, OpenAI claimed that Whisper approached “human level robustness” in audio transcription accuracy. However, a University of Michigan researcher told the AP that Whisper created false text in 80 percent of public meeting transcripts examined. Another developer, unnamed in the AP report, claimed to have found invented content in almost all of his 26,000 test transcriptions.
The fabrications pose particular risks in health care settings. Despite OpenAI’s warnings against using Whisper for “high-risk domains,” over 30,000 medical workers now use Whisper-based tools to transcribe patient visits, according to the AP report. The Mankato Clinic in Minnesota and Children’s Hospital Los Angeles are among 40 health systems using a Whisper-powered AI copilot service from medical tech company Nabla that is fine-tuned on medical terminology.
Nabla acknowledges that Whisper can confabulate, but it also reportedly erases original audio recordings “for data safety reasons.” This could cause additional issues, since doctors cannot verify accuracy against the source material. And deaf patients may be highly impacted by mistaken transcripts, since they would have no way to check whether the transcript matches what was actually said.
The potential problems with Whisper extend beyond health care. Researchers from Cornell University and the University of Virginia studied thousands of audio samples and found Whisper adding nonexistent violent content and racial commentary to neutral speech. They found that 1 percent of samples included “entire hallucinated phrases or sentences which did not exist in any form in the underlying audio” and that 38 percent of those included “explicit harms such as perpetuating violence, making up inaccurate associations, or implying false authority.”
In one case from the study cited by AP, when a speaker described “two other girls and one lady,” Whisper added fictional text specifying that they “were Black.” In another, the audio said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.” Whisper transcribed it to, “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”
An OpenAI spokesperson told the AP that the company appreciates the researchers’ findings and that it actively studies how to reduce fabrications and incorporates feedback in updates to the model.
Why Whisper Confabulates
The key to Whisper’s unsuitability in high-risk domains is its propensity to sometimes confabulate, or plausibly make up, inaccurate outputs. The AP report says, "Researchers aren’t certain why Whisper and similar tools hallucinate," but that isn't true. We know exactly why Transformer-based AI models like Whisper behave this way.
Whisper is based on technology that is designed to predict the next most likely token (chunk of data) that should appear after a sequence of tokens provided by a user. In the case of ChatGPT, the input tokens come in the form of a text prompt. In the case of Whisper, the input is tokenized audio data.
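As a toy illustration of that mechanism (not Whisper's actual decoder, which computes these probabilities with a neural network over audio tokens), greedy next-token prediction just picks the highest-probability continuation of the tokens seen so far. The phrases and probabilities below are invented for the example:

```python
# Toy greedy next-token predictor. A real model would score every token
# in its vocabulary with a neural network; here the conditional
# distributions are hard-coded.
model = {
    # context (tuple of tokens) -> {candidate next token: probability}
    ("thank", "you"): {"for": 0.85, "so": 0.10, "all": 0.05},
    ("thank", "you", "for"): {"watching": 0.70, "listening": 0.20, "coming": 0.10},
}

def predict_next(context):
    """Return the single most probable next token given the context."""
    dist = model[tuple(context)]
    return max(dist, key=dist.get)

print(predict_next(["thank", "you"]))         # -> for
print(predict_next(["thank", "you", "for"]))  # -> watching
```

The decoder never asks "is this what was said?", only "what token usually comes next here?" — which is exactly why its failures sound fluent.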
The transcription output from Whisper is a prediction of what is most likely, not what is most accurate. Accuracy in Transformer-based outputs is typically proportional to the presence of relevant accurate data in the training dataset, but it is never guaranteed. Whenever the audio doesn't give Whisper enough contextual information to make an accurate prediction about how to transcribe a particular segment, the model falls back on what it “knows” about the relationships between sounds and words it has learned from its training data.
According to OpenAI in 2022, Whisper learned those statistical relationships from “680,000 hours of multilingual and multitask supervised data collected from the web.” But we now know a little more about the source. Given Whisper's well-known tendency to produce certain outputs like "thank you for watching," "like and subscribe," or "drop a comment in the section below" when provided silent or garbled inputs, it's likely that OpenAI trained Whisper on thousands of hours of captioned audio scraped from YouTube videos. (The researchers needed audio paired with existing captions to train the model.)
There's also a phenomenon called “overfitting” in AI models where information (in this case, text found in audio transcriptions) encountered more frequently in the training data is more likely to be reproduced in an output. In cases where Whisper encounters poor-quality audio in medical notes, the AI model will produce what its neural network predicts is the most likely output, even if it is incorrect. And the most likely output for any given YouTube video, since so many people say it, is “thanks for watching.”
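That frequency-driven fallback can be sketched in a few lines. The phrase counts and the confidence threshold below are invented for illustration, not measured from Whisper:

```python
from collections import Counter

# Invented training-phrase frequencies, mimicking a model trained on
# captioned YouTube audio.
training_phrases = Counter({
    "thanks for watching": 5000,
    "like and subscribe": 3000,
    "the patient reports chest pain": 10,
})

def transcribe(acoustic_confidence, best_acoustic_guess):
    """Return the acoustic guess when the audio is clear; otherwise fall
    back to the single most frequent phrase seen in training."""
    if acoustic_confidence >= 0.5:
        return best_acoustic_guess
    return training_phrases.most_common(1)[0][0]

print(transcribe(0.9, "the patient reports chest pain"))  # clear audio: the real text
print(transcribe(0.1, "the patient reports chest pain"))  # garbled audio: "thanks for watching"
```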
In other cases, Whisper seems to draw on the context of the conversation to fill in what should come next, which can lead to problems because its training data could include racist commentary or inaccurate medical information. For example, if many examples of training data featured speakers saying the phrase “crimes by Black criminals,” when Whisper encounters a “crimes by [garbled audio] criminals” audio sample, it will be more likely to fill in the transcription with “Black."
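A crude stand-in for that learned context-conditioning: count which word most often fills the garbled slot in a (here invented, deliberately neutral) training corpus, and emit that word.

```python
from collections import Counter

# Tiny invented corpus standing in for training data.
corpus = [
    "crimes by violent criminals",
    "crimes by repeat criminals",
    "crimes by repeat criminals",
]

def fill_garbled(prev_word, next_word):
    """Return the word that most frequently appears between prev_word and
    next_word in the corpus -- a stand-in for learned statistics."""
    candidates = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i in range(1, len(words) - 1):
            if words[i - 1] == prev_word and words[i + 1] == next_word:
                candidates[words[i]] += 1
    return candidates.most_common(1)[0][0]

print(fill_garbled("by", "criminals"))  # -> repeat (twice as common in the corpus)
```

Swap in a biased corpus and the fill-in inherits the bias, which is the failure mode the researchers observed.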
In the original Whisper model card, OpenAI researchers wrote about this very phenomenon: "Because the models are trained in a weakly supervised manner using large-scale noisy data, the predictions may include texts that are not actually spoken in the audio input (i.e. hallucination). We hypothesize that this happens because, given their general knowledge of language, the models combine trying to predict the next word in audio with trying to transcribe the audio itself."
So in that sense, Whisper "knows" something about the content of what is being said and keeps track of the context of the conversation, which can lead to issues like the one where Whisper identified two women as being Black even though that information was not contained in the original audio. Theoretically, this erroneous scenario could be reduced by using a second AI model trained to pick out areas of confusing audio where the Whisper model is likely to confabulate and flag the transcript in that location, so a human could manually check those instances for accuracy later.
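The flagging idea above can be sketched with segment-level confidence scores. The open-source whisper package does report an average log-probability per transcribed segment; the threshold and sample segment data below are made up for illustration, not calibrated values:

```python
SUSPECT_LOGPROB = -1.0  # assumed cutoff; would need tuning on real data

def flag_suspect_segments(segments, threshold=SUSPECT_LOGPROB):
    """Return segments whose decoder confidence is low enough that the
    text may be confabulated rather than transcribed."""
    return [s for s in segments if s["avg_logprob"] < threshold]

# Example segments shaped like whisper's output (values are invented):
segments = [
    {"start": 0.0, "end": 4.2, "text": "two other girls and one lady", "avg_logprob": -0.3},
    {"start": 4.2, "end": 7.9, "text": "who were, um,", "avg_logprob": -1.6},
]

for seg in flag_suspect_segments(segments):
    print(f"CHECK {seg['start']:.1f}-{seg['end']:.1f}s: {seg['text']!r}")
```

A human reviewer would then listen to just the flagged spans instead of re-checking the whole recording.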
Clearly, OpenAI's advice not to use Whisper in high-risk domains, such as critical medical records, was a good one. But health care companies are constantly driven by a need to decrease costs by using seemingly "good enough" AI tools—as we've seen with Epic Systems using GPT-4 for medical records and UnitedHealth using a flawed AI model for insurance decisions. It's entirely possible that people are already suffering negative outcomes due to AI mistakes, and fixing them will likely involve some sort of regulation and certification of AI tools used in the medical field.
87 notes
Text

— last image captured by drone 315 seconds before disaster, image recovered by accessing sunken servers via diving team, incident referenced: Odyssey’s Sinking, access file J16T for further information, section: confirmed incidents
“-obviously the radar’s hung up. There’s no rock field in the area big enough to-
Did you hear that?
… No, I don’t know.
It sounded like we might have hit something. Go look… then turn on the lights. Jesus, it’s like we’re short on tech or something…
Huh? What do you mean the anchor’s gone? How can it be gone-?
Shit! Turn on the lights! Hurry! The radar’s going crazy!
Wh- No it’s not whales you dumbass! Now turn on the goddamn lights! Maybe we can see what’s out there with-
Oh holy mother of- are you seeing this? The hell is that thing? No it’s not a squid- where the fuck did you get your degree, Harold? Squid don’t have scales. Start the drone, we need to get an idea of how large it is. This is phenomenal. I’m estimating a good twenty meters with that scale size. We could be looking at an undiscovered species of sea snake here, if we bring this back to the- ha! Got it! Drone’s up and about! Harold, tell Jess to flip the beamers up to a hundred. Maybe we can spot the thing’s head—
Hold on, hey, Ed, are you seeing this-?
Oh- oh my god. Oh, Jesus Christ, oh fuck, no, fuck- Everyone, abandon ship! Get off the ship! Get off the ship right now! I repeat, get off-!”
— Voyage Data Recorder, audio transcript, recovered from the wrecked remains of “The Odyssey”, time stamp: 01:32, no survivors
File Notes:
Second notable incident with Subject J16T (category class: Leviathan) responsible. For first notable incident, access file: [REDACTED] Research Facility
#mer au#mermaid au#log entry#lore#world building#mer Jason Todd#abyssal mer Jason Todd#leviathan Jason Todd#au#alternate universe#merfolk#deep sea horror#digital art#found footage#subnautica vibes#batfamily#sketch#drawing#wip
152 notes
Text
Meta is taking after Amazon by no longer allowing Ray-Ban Meta owners to opt out of having their voice recordings stored in the cloud. “The option to disable voice recordings storage is no longer available, but you can delete recordings anytime in settings,” the company wrote. In its voice privacy notice, Meta states that “voice transcripts and stored audio recordings are otherwise stored for up to one year to help improve Meta’s products.”
Meta (Facebook) is now forcing users of its smart glasses to store voice recordings in their cloud so they can train their AI with it. Customers no longer have a say in the matter, and the feature cannot be turned off.
THIS IS A HUGE PROBLEM because (1) the glasses record everything not just the wearer's voice, (2) random strangers you interact with have not given consent to be recorded, and (3) certainly have not given consent to be used as training data for an AI.
And yes, the wearer must choose to activate the recording feature; the glasses do not record 24/7. But (as we learned with the Google Glass debacle of 2014) assholes absolutely will walk around in public recording everyfuckingthing without consent.
Be very careful around anyone wearing Ray-Ban glasses.
38 notes
Text
TMAGP 31 Thoughts: Extended Sounds of Brutal Crowbar Damage
And we're back again, after quite a wait, but it's a nice easy one to get back into the swing of things. Nothing explosive happened this episode really but a lot of foundation setting. However we've finally hit the part of the show that is now a sequel to The Magnus Archives. So, if any of you have somehow not listened to that and are interested to hear why things are so fucked, that would be how you go about it.
Spoilers for TMA, and TMP episode 31 below the cut.
I didn't cover it elsewhere so I'm going to start with Season 2's trailer. It's a nice, short, and sweet trailer so there isn't a whole lot to get into. There are a few bits in the transcript that are worth pointing out though. Firstly, it's referred to as the "London Exclusion Zone, Primeline" and "Primeline" doesn't appear anywhere else in this trailer nor in episode one. That's likely a portmanteau of Prime and Timeline, which I would take to mean this is the universe from Archives. Given the warden's worry about tapes and a few other notable bits of text from the premiere's transcript, I would say it's all but confirmed. The only other thing I think is worth mentioning here is that the scuttling creatures are described as having "too many legs". Which isn't incredibly relevant but does at least show they're supernatural in some sense.
Okay, onto the episode proper and now we can all say goodbye to the number 3 blorbo, Colin. I'll always remember the way he called me a gobshite because I sent him an email during the ARG, and the way he lost his mind because gays were in the computer. RIP, Colin, rest in processors.
There isn't really a load to say on this ep in general IMO. I think it's all pretty surface level but as with the trailer there are some interesting bits and pieces to pick out of it. In general though, I thought it was a very solid start to a season. Picks up right where things left off and lays a lot of groundwork for what's to come and isn't a load of info dumping.
So there are a couple of things to pick out from Colin's very messy and unearned death. During the long string of "Discard data"s there is one that reads "upload data" in the transcript which is for sulphur. Sulphur being one of the tria prima and an incredibly important element to alchemy. Now, the actual audio does say "discard data" and it might not be anything more than a mistake but it's an interesting coincidence if that's all it is. The elements listed are also in order of abundance in the human body.
hardware damage_crowbar/DPHW 4600
I believe this joke was written purely for me. No one can convince me otherwise. It's going in the masterdoc.
I don't think there is much to say on Gwen's, Alice's, or Celia's showing in the episode. They're all more or less doing "normal" stuff. The only thing I would point out is that Celia does do some lying in the episode without the usual distortions around those in the audio. At least not that I heard.
Sam is bringing the wet cat energy the Primeline was missing since TMA's finale. It's being met with mixed reception. Most of what goes on here is all pretty obvious I think. We meet yet another version of Georgie who is a little more rugged and generally done with everyone's shit. She's introduced in the text as "Georgie P" which I can only assume is Georgie Prime. This is further reinforced by Heidi's statement describing exactly what we saw of London post-Change. With the additional talks of domains and circuses I think it's fairly hard to argue this isn't TMA's universe post-season 5. Which has some fairly strong implications for exactly how that all went down and how much the world both remembers and has changed, but I feel like that might be best to get into elsewhere. And likely by other people. Them naming a van after Gertrude is very sweet tho.
I think that's about all I've got to say on this one. Nothing too mindblowing and not a lot of crumbs to follow but it's a great start to a season.
------------------------------------------------------------------------------
Incident/CAT#R#DPHW Master Sheet and Terminology Sheet
DPHW Theory: 5555 sounds about right to me. It's not exceptionally spooky in any single sense but is pretty broad spectrum. Pretty standard stuff. Might as well mention that Hardware Damage (Crowbar) being at 4600 also lines up very well.
CAT# Theory: Our very first 123 which is something I've personally been waiting on. I've been very vocal about how I don't think the Person/Place/Object theory makes a lot of sense. However, this is one of the ones I wouldn't argue for there if you want to stretch it to Colin still being a person after "Integration", or you want to say that JMJ also count. Not that I buy the idea any more. Although it should be noted that Johnny says in the Q&A that the first few cases are wrong. Which means if it is P/P/O it should match up perfectly if you start from the bottom until you hit a point where the wrong ones end. I don't think it would from what I recall on my essay about why it's not P/P/O but it might. I was supposed to use the break to do some more work on CAT# but then I didn't. So I've got no real insights into this one.
R# Theory: B lines up pretty well. It would be confirmable that Colin is at least missing, but getting eaten by a server rack isn't particularly likely to be why.
Header talk: Integration (organic) -/- Computer (Hardware) is a fairly standard description IMO. I can't see much to really dig into there.
44 notes
Text
AMETHYST PEARL
[ NEOCITIES MIRROR ]
TSAC: ...
TSAC: Little animal.
TSAC: I'm rather busy at the moment. Unless you have something to show me, please leave.
TSAC: ...
TSAC: Ah, a pearl. Where did you find this one, I wonder? Hopefully you didn't take it from my Data Archives.
TSAC: ...
TSAC: This one appears to be from my city. You seem to be spending a lot of time in Zenith, considering your last pearl was also from there.
TSAC: Hm... this contains another data buffer from a dying citizen ID drone. I suppose it makes sense that so many drones malfunctioned, considering their owners were no longer around to maintain them.
TSAC: The drone this data is from belonged to an individual named Six Motes, Ten Tranquil Mists. Aside from basic census data, I can't find any more information about them in my archives. They must not have interacted with me or the House of Spheres at any point.
TSAC: There's a section of discernible audio here... I'll repeat it out loud for you.
[ AUDIO TRANSCRIPTION - AMETHYST PEARL ]
...
???: Ouch!
...
???: Damn blackouts. They always catch me off guard.
???: Greetings, Mists! Can't sleep?
Ten Tranquil Mists: Agh!!! Three Winds, is that you? Void below, don't scare me like that!
Fourteen Flowers in Three Winds: Hah! If you don't want to be startled by figures in the dark, perhaps you shouldn't be wandering the streets late at night.
TTM: Speak for yourself. I'm not the only one out past midnight.
TW: At least I have a proper excuse!
TTM: Late night nectar cravings?
TW: Ha, not tonight. Come, follow me.
TTM: Where?
TW: Up! I was just about to climb up to the roof. You should join me.
TTM: To do what?
TW: You'll see!
...
TTM: And there he goes.
TTM: ...well, if I go home I know I won't be able to sleep. I might as well climb up ten flights of stairs. [sigh]
...
...
...
TW: There you are! I was beginning to think you'd just up and left.
TTM: huff... no... huff... of course not...
TW: Come, sit.
...
[rustling]
TTM: What's that glowing thing you've got?
TW: Sliced lilypuck! They're fresh! Here, try one.
...
TTM: A bit too sweet for me. What, did you dip this in nectar?
TW: Maybe!
...
TTM: You still haven't told me why you climbed all the way up here. And late at night, no less.
TW: Look! You can see the Spires clearly from up here.
TTM: So? You can see them much more easily in the daytime.
TW: That right there!
TTM: Hm?
TW: You're not looking! Blink and you'll miss it.
...
TTM: What, that green light?
TW: Yes!
TTM: What's so special about that?
TW: Do you know what that is?
TTM: Well, it's coming from the Spires, so I guess it's something those stuffy old scholars are amusing themselves with late at night. I never paid too much attention. This late at night I'm usually staring down at the ground so I don't trip on things in the dark. There's only so far ahead you can see with such a dim lantern...
TW: A friend of mine who interns at the Institute explained it to me once. Those lights are called laser guide stars, and they help calibrate the telescopes!
TTM: Ah, so it is the scholars after all.
TW: Not quite! Those lights are actually the iterator!
TTM: The iterator?
TW: Yes! Three Stars Above Clouds controls the telescopes. And the lasers, too. If you watch the laser beams, you can tell where the iterator is looking!
TTM: Looks like they're pretty busy.
TW: Yes. They have a lot of work to do, I'm sure.
...
TTM: ... have you ever met the iterator?
TW: Not personally, no. Though, I was once fortunate enough to attend one of their public lectures.
TTM: And? What were they like?
TW: Three Stars Above Clouds is very intelligent. Though, I guess one can assume that about an iterator. They appeared to be very passionate as well, but they were also... distant.
TTM: Distant? In what way?
TW: I'm not sure how to describe it. They speak just like you or I, but there's something... missing... behind their words.
TTM: What do you mean, "missing"?
TW: I... mean no offense to the iterator, but their behavior seemed... artificial. Detached.
TW: An iterator's mind is nothing like our own, their thoughts enigmatic and unknowably complex. Their manner of speech seemed almost like a mimicry, an attempt to impersonate something they are not.
...
TTM: What does this have to do with the lights?
TW: I suppose the lights are a reminder that, despite our differences, we're both looking at the same night sky.
TW: Your eyes should be sufficiently dark adapted now, look up.
...
TTM: Oh, wow.
TW: That's the reason for the nightly blackouts. It's not just the House of Spheres trying to inconvenience us all; without the streetlights, you can see the stars with incredible clarity.
TTM: ...
TW: That cloudy thing up there is the Great River. It wraps all around the sky.
TTM: ... I guess I can understand why people would stay up late to see this.
TW: Iterators might be nothing like us, but I think a fascination with the night sky is something we both share. When I sit up here, I like to think that we're stargazing together.
TTM: The Iterator probably has no idea you're even up here, you know.
TW: I wouldn't be so sure. Hello!
TTM: Wh- who are you talking to?
TW: Ha! There was an overseer looking over your shoulder. It's gone now. Maybe Three Stars is feeling shy.~
TTM: You need to stop scaring me like that, Winds.
TW: And you need to stop wandering around at night all alone! Come, I'll help you back to your block. I can tell you're tired now.
TTM: Hmph.
TTM: ... I guess this did help. Thanks.
...
TSAC: The planned blackouts in my city were a nightly occurrence. They allowed me to collect data without the interference of light pollution. The blackouts were supposed to occur when the majority of the population was meant to be asleep, but I suppose that didn't stop some of them from walking around at night.
TSAC: The House of Spheres received many complaints about this; some cited the darkness as inconvenient or even dangerous. Streetlights were turned off, lights in homes dimmed, and citizens walking the streets were required to use the dimmest settings on the lanterns of their drones. Very few of the complaints ever made their way to me; the House of Spheres usually attempted to respond to them as professionally as possible.
TSAC: Despite the pushback, many enjoyed the opportunity to see the stars. Some would even visit Zenith specifically for this purpose.
TSAC: My city is eternally dark, now. There's no need to light the streets if my citizens are gone.
TSAC: That said, I would advise against wandering around my streets at night, little animal. You never know what could be lurking in the dark.
TSAC: ...now then, I have work I must attend to. Farewell.
TSAC: ...
TSAC: ...and stay safe.
49 notes
Text
A very young Vessel talking about the music that he likes.
You'll find a few more words and an audio file under the cut with a transcription if you don't want to hear his voice :) And also his real name is not in this!
Taken from one of those rediscovered YouTube videos that he had uploaded to his channel. If that is something that you would like to listen to, then do it :) If not, that's okay as well. I love his voice so I'm sharing this. It is taken from an audition that he uploaded to YouTube. Someone wanted to make some sort of band project and Vessel applied. Thank God that no one took him XD. Who knows if we would have Sleep Token then.
He was 18 when he recorded that. Idk what he listens to now. If someone wants the link to this then I can send it to you. Please don't send asks with his real name in them or something. I can't answer those.
Transcript:
"In terms of music, music is, a lot of people say this, but music is effectively my life in the sense that I live in this small room in my house and I just sort of do it every day or most of the time. I like a lot of different music, like I'm quite into a lot of what you'd describe as progressive metal really, so like Tool and a lot of modern stuff like Periphery, but some other stuff you might be more familiar with that I like, I like things like Radiohead and, more in common with your tastes, I like a lot of bands like A Day to Remember and Blink-182, you know, stuff like that, a lot of stuff really."
23 notes
Text
RESEARCH LOG 007
RESEARCHER M. Florez
AUDIO FILE DETECTED, LOADING TRANSCRIPT...
"Well, let's see if this thing will really work. This is Maria Florez, researcher on the relationships between the Affini and the Terrans they keep as pets. I am here interviewing an Affini on how they treat their florets, difficulties in ensuring sophont safety, and the daily life they lead together. I know I asked you before I began recording, but I need to make sure I get formal consent to record. Do I have consent to record you and include you in my research? This will include giving the details of your name and the names of the florets you have dominion over."
Oh but of course cutie. This is all so very exciting.
"Yes, quite so. In that case, please state your name for the record as well as the names of your florets."
Well, my name is Tritoma Iberis, Fifth Bloom, and these are my darling little pets Coralline Iberis, First Floret, and Blair Iberis, Second Floret.
"Hi! You're really really pretty!"
"I would prefer they not interrupt the interview if possible, I must get all this data collected properly."
I understand completely. Go along now my darlings, have some fun playing with each other~
"Hehe, ok Mistress! C'mon! Let's play some games!"
"Thank you. Now, I would like to start by asking you about what a typical day is like for you with your florets. Don't need too much detail, just wanna get a basis for understanding."
Well, on a normal day, my Florets will ask me to cuddle with them, to have my vines all over them as they squirm and moan~ They're so very cute when they do that~ I like to administer a Class-M for each of them so I can make them my cute little dolls~ Just thinking about it now makes me wish they could be here so I could give you a proper demonstrat-
"That is quite enough please. You have made it clear it is enjoyable for you, but do they truly find it as wonderful as you describe?"
Oh but of course! I would never do anything to harm them or do anything they wouldn't wish for~
"I see. Well, that is rather reassuring for my research then. Now, I'd like to move onto talking abo-"
Are you sure it's just your research you're reassured about?~
"I'm unsure what you mean."
You weren't worried that if you decided to give into those wonderful little thoughts of yours that you would be at the complete mercy of a beautiful Affini like Ms. Verdianthos?~
"Wh-what kind of terran do you take me for!? I am here strictly on research, not to end up as some mindless dumbass pet to a damn we-"
What did you just say about my Florets?
"...s-sorry, I didn't mean it like that. I just don't want to talk about Verdia- Ms. Ve- the other Affini right now. I'm stressed about it and I don't want to think about her right now."
...I see. I apologize for my outburst, but I do not take kindly to anyone insulting my darling florets. You may continue with your questions.
"Thank you. Only got one question left anyway since I now know how you treat your florets. What can you tell me about the difficulties that come with ensuring a sophont is truly safe and taken care of?"
Well, we first evaluate the risks that the sophonts are in. We check for any dangers they can put themselves in and do our best to remove that risk as soon as possible. We then take the sophonts with us. For them to come willingly is preferable, but as you have seen, there are many instances where we have to use xenodrugs to ensure their safety.
"I see. What about once you have captured them? Is there anything that is normally done to keep them in check then? I can't imagine most are happy after being knocked out cold and waking up somewhere strange."
After they have had a chance to acclimate to their new environment, we simply take care of them with a combination of xenodrugs, our biorhythms, and all the love and affection we can give~
"This is very fascina- Wait, I'm sorry, I don't think I've heard that term before. You said biorhythms?"
Why yes! All Affini produce biorhythms that we follow. I have also heard that they are rather hypnotic to sophonts, particularly to Terrans. It helps in the domestication process and lets our darling Florets know we are always with them~
"...oh. Oh no. No no no no, this can't be it, right? I-it has to be some kinda joke or something."
Are you alright? You seem to be getting rather anxious darling, perhaps I could help you rela-
"Shut up! I can't be here any longer! If what you're saying is true, then I really can't stay anywhere near here any longer. I have to go, I have to- ow!"
Oh my, are you alright? Please, allow me to help you back u-
"Please! I'm fine, just tripped. I need to go, this has given me too much to think about. Take care Ms. Tritoma Iberis, and let your Florets know that I have taken my leave. Terminate audio recording."
35 notes