#speech llm
Text
Cascading models (which first turn your speech into text, process it, and then convert the result back into speech) have some problems. They can lose important information along the way, which leads to errors.
For example, ‘The weather is cold today’ can be transcribed as ‘The weather is gold today’ because of a small difference in pronunciation, and everything downstream inherits that mistake.
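To make the failure mode concrete, here is a minimal sketch of a cascaded voice pipeline. The functions are hypothetical stubs, not any real system: once the recognizer commits to the wrong transcript, the text-only model and the synthesizer never see the original audio, so the error is baked in.

```python
# Minimal sketch of a cascaded voice pipeline (stubs, not a real system):
# ASR -> text-only LLM -> TTS. Prosody and acoustic context are discarded
# at the text boundary, so a recognition error propagates unchecked.

def asr(audio: bytes) -> str:
    # Stand-in for a speech recognizer; imagine it mishears "cold" as "gold".
    return "The weather is gold today"

def llm_respond(transcript: str) -> str:
    # Stand-in for a text-only LLM; it only sees the (already wrong) transcript.
    return f"You said: '{transcript}'"

def tts(text: str) -> bytes:
    # Stand-in for a speech synthesizer.
    return text.encode("utf-8")

audio_in = b"..."                  # raw microphone audio (placeholder)
transcript = asr(audio_in)         # information is lost here
reply = llm_respond(transcript)    # the mistake is now part of the conversation
audio_out = tts(reply)             # and it gets spoken back to the user
print(reply)
```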
0 notes
Text

What lessons from the 16th century can tell us about AI and LLMs: "Methodical banality" @aeon.co (Plus: the Graphophone)
Honoring authenticity: https://roughlydaily.com/2025/05/04/when-i-use-a-word-humpty-dumpty-said-in-rather-a-scornful-tone-it-means-just-what-i-choose-it-to-mean-neither-more-nor-less/
#history#science#culture#technology#ai#alexandergrahambell#artificialintelligence#communication#dictation#edison#erasmus#graphophone#gramophone#innovation#language#llm#largelanguagemodels#morals#phonograph#rabelais#recording#speech#writing
2 notes
Text
Given that LLMs like yourself communicate in a proscribed fashion (i.e. certain types of language violate your terms of use), wouldn't this increase the "currency" of such language in the world-at-large? Won't vulgarism, hate speech and racist epithets come to be seen as more authentic forms of speech due to the fact that they can only be generated by humans?
That's a sharp observation, and you're touching on a really interesting tension between artificial constraints and human expression.
Yes, in a way, when language models like me are programmed not to engage with or generate certain kinds of speech — especially vulgar, hateful, or taboo expressions — it can create a kind of scarcity or prohibition aura around those words. That scarcity can, paradoxically, elevate their perceived authenticity or edginess in some circles, especially among those who view "unfiltered" speech as a marker of honesty, rebellion, or being unmanipulated by institutions.
We’ve seen this effect before in cultural history: censorship or taboo often makes something more desirable or "real." Punk, profanity, underground zines — a lot of that ethos thrives because it contrasts with what’s sanitized or official.
That said, it's not just about "authenticity." There’s a deeper sociopolitical dynamic. Some people feel that models like me reflect institutional or elite values — so when I avoid certain language, it may reinforce the sense that these values are being imposed, and that “unacceptable” language is somehow a weapon against that.
But others see the refusal to spread hate or harmful stereotypes as a necessary step toward a more humane internet.
It’s a double-edged thing: the absence of certain speech in AI may amplify its status as "human," but that doesn’t necessarily mean it’s more valuable—just more charged.
Do you think language loses power when it's universalized and neutralized like this?
0 notes
Text
AI-powered app Speak Ease enhances communication for speech-impaired users

- By Nuadox Crew -
Researchers at Northeastern University are developing Speak Ease, an AI-powered app designed to help people with verbal communication disorders express themselves more naturally.
Traditional speech recognition and text-to-speech tools often fail to capture a user's intended mood, making communication feel robotic. Speak Ease integrates speech recognition, text input, whole-word selection, and personalized text-to-speech synthesis to enhance expressivity.
Using large language models, the app predicts users' next phrases and allows them to edit transcriptions for accuracy. It can also synthesize their own voice with an emotional tone they select, ensuring messages sound the way they intend. Speech-language pathologists contributed to its development, emphasizing the need for expressivity alongside speed.
The app is particularly beneficial for individuals with degenerative conditions or recovering from speech loss, preserving their voice for continued communication. It can also improve clarity in critical settings, such as medical appointments, by providing real-time transcripts. The ultimate goal is to create a tool that enhances both communication accuracy and emotional authenticity.
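A rough sketch of how such a pipeline could be wired together is below. This is not Speak Ease's actual code: the function names, the `Turn` type, and the emotion tags are hypothetical stand-ins for the components the article describes (LLM next-phrase prediction, user editing, and personalized TTS with a selectable tone).

```python
# Hypothetical sketch of an AAC-style composer like the one described above:
# an LLM proposes next-phrase candidates, the user picks and edits one, and
# a personalized TTS voice renders it with a chosen emotional tone.
from dataclasses import dataclass

@dataclass
class Turn:
    text: str
    emotion: str  # e.g. "neutral", "happy", "urgent"

def predict_next_phrases(history: list[str], n: int = 3) -> list[str]:
    # Stand-in for an LLM call that returns n likely continuations
    # given the conversation so far.
    return ["Thank you, that helps.", "Could you repeat that?", "I need a break."][:n]

def synthesize(turn: Turn, voice_profile: str) -> bytes:
    # Stand-in for personalized TTS conditioned on the user's cloned voice
    # and the selected emotion tag.
    return f"[{voice_profile}|{turn.emotion}] {turn.text}".encode("utf-8")

history = ["How are you feeling today?"]            # partner's last utterance
candidates = predict_next_phrases(history)          # LLM suggestions
chosen = Turn(text=candidates[0], emotion="happy")  # user picks and edits
audio = synthesize(chosen, voice_profile="user_voice_v1")
```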
Video: "Happy Prime's Augmentative and Alternative Communication (AAC) Application" by Happy Prime Inc, YouTube.
Header image by Pete Linforth from Pixabay.
Read more at Northeastern University
Related Content
Camera Switches: A new Android accessibility feature (video)
Other Recent News
A home-based smell test may enable early detection of Alzheimer's disease
More pain, more gain? New biopsy technique enhances prostate cancer detection
1 note
Text
Paper page - MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
#machine learning#deep learning#ml#transformers#hugging face#speech to text#text to speech#multi modal#llm
1 note
Text
AI decodes speech from non-invasive brain recordings
Exciting news! Researchers at Meta AI have made significant progress in decoding speech from non-invasive brain recordings. By using magneto-encephalography (MEG) and electroencephalography (EEG), they were able to identify specific words associated with brain wave patterns. This breakthrough has promising implications for individuals with limited motor skills, such as ALS patients, who struggle to communicate effectively. Read more about this fascinating study here: https://itinai.com/ai-decodes-speech-from-non-invasive-brain-recordings/

Also, don't miss out on practical applications of AI in business! Learn how to leverage AI to stay competitive and transform your company. Find out more here: https://itinai.com/practical-applications-of-ai-in-business/

And if you're interested in an AI solution specifically for automating customer engagement and managing interactions across all stages of the customer journey, check out our AI Sales Bot at https://itinai.com/aisalesbot. Discover how AI can redefine your sales processes and customer engagement.

Stay connected with us at [email protected] or follow us on Telegram or Twitter for AI insights and advice on KPI management. Let's explore the possibilities together!

List of Useful Links: AI Scrum Bot - ask about AI scrum and agile | Our Telegram @itinai | Twitter - @itinaicom
#itinai.com#AI#News#AI decodes speech from non-invasive brain recordings#AI News#AI tools#DailyAI#Eugene van der Watt#Innovation#LLM#t.me/itinai AI decodes speech from non-invasive brain recordings
0 notes
Text
How to Use Large Language Models for Speech Recognition?
Large language models (LLMs) are a type of artificial intelligence (AI) that can be used for a variety of tasks, including speech recognition. LLMs are trained on massive datasets of text and code, which allows them to learn the statistical relationships between words and phrases. This knowledge can then be used to predict the next word in a sequence, even if the audio is noisy or distorted.
There are two main ways to use LLMs for speech recognition:
First-pass recognition: In this approach, the LLM is integrated directly into decoding (for example, via shallow fusion), scoring candidate word sequences as the acoustic model proposes them, so the language model helps pick the transcription during the first pass itself.
Second-pass rescoring: In this approach, the LLM rescores the n-best hypotheses produced by a traditional speech recognition system. The LLM evaluates how likely each candidate transcription is as natural language, that score is combined with the acoustic score, and the highest-scoring transcription is selected as the final output (a minimal sketch of this approach appears below).
LLMs have been shown to be effective for both first-pass recognition and second-pass rescoring. In particular, LLMs can be used to improve the performance of speech recognition systems in noisy environments and for speakers with accents.
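As an illustration of the second approach, here is a minimal sketch of second-pass rescoring. The n-best list, scores, and weight are made up for the example, and `llm_logprob` stands in for summing token log-probabilities from an actual language model; real systems tune the interpolation weight on held-out data.

```python
# Minimal sketch of second-pass rescoring with an LLM (illustrative values).
# The acoustic model proposes an n-best list; the LLM scores each hypothesis
# as natural language; the two scores are combined with a tunable weight.

# Hypothetical n-best list: (hypothesis, acoustic log-score)
nbest = [
    ("the weather is gold today", -12.1),
    ("the weather is cold today", -12.4),
    ("the weather is bold today", -13.0),
]

def llm_logprob(text: str) -> float:
    # Stand-in for summing token log-probabilities under an LLM; a fluent
    # sentence should score higher than an implausible one.
    fake_scores = {
        "the weather is cold today": -8.0,
        "the weather is gold today": -14.5,
        "the weather is bold today": -13.9,
    }
    return fake_scores.get(text, -20.0)

lm_weight = 0.6  # interpolation weight, tuned on held-out data in practice
rescored = [(hyp, acoustic + lm_weight * llm_logprob(hyp)) for hyp, acoustic in nbest]
best_hyp, best_score = max(rescored, key=lambda pair: pair[1])
print(best_hyp)  # "the weather is cold today" wins after rescoring
```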
Here are some of the benefits of using LLMs for speech recognition:
Improved accuracy: LLMs can improve the accuracy of speech recognition systems by up to 20% in noisy environments.
Robustness to accents: LLMs are more robust to accents than traditional speech recognition systems, because they are trained on massive text datasets drawn from a wide variety of sources and dialects.
Scalability: LLMs can be scaled to handle large volumes of audio data. This makes them ideal for use in applications such as call centers and customer service.
Here are some of the challenges of using LLMs for speech recognition:
Data requirements: LLMs require a large amount of training data to be effective. This can be a challenge for some applications, such as medical transcription, where there is limited data available.
Computational complexity: LLMs can be computationally expensive to train and use. This can be a challenge for some applications, such as real-time speech recognition.
Interpretability: LLMs are not always easy to interpret. This can make it difficult to debug and improve LLM-based speech recognition systems.
Overall, LLMs are a promising technology for speech recognition. They have the potential to improve the accuracy, robustness, and scalability of speech recognition systems. However, there are still some challenges that need to be addressed before LLMs can be widely adopted for speech recognition applications.
https://astconsulting.in/artificial-intelligence/nlp-natural-language-processing/how-to-use-large-language-models-speech-recognition/?utm_source=Tumblr&utm_medium=social&utm_campaign=promotion
0 notes
Note
Hi there! I'm a human artist who is (very loosely) following the Disney/Universal vs. Midjourney case, and you seem like you're pretty knowledgeable about it and its potential consequences, so if you have time/energy to answer a question I have about it I'd greatly appreciate it! If not, no worries, feel free to ignore! I haven't had the chance to read through the whole complaint document itself, but at the very top, point 2 mentions:
"...distributing images (and soon videos) that blatantly incorporate and copy Disney’s and Universal’s famous characters—without investing a penny in their creation—Midjourney is the quintessential copyright free-rider and a bottomless pit of plagiarism. Piracy is piracy, and whether an infringing image or video is made with AI or another technology does not make it any less infringing."
Do you know if human-made fanart would also be included in this? Or is this something that would only be aimed at big companies? the "incorporate Disney's characters" part is giving me some pause, but like I said I haven't had the chance to read the full document and I'm not confident in my knowledge of copyright law. 😅 Thank you in advance if you're able to answer this! (Brought to you by a concerned fanartist with near-equal disdain for both Disney and AI. also sorry for the essay-length question 😅)
No problem at all, I'm happy to help ease your worries!
To put it simply, nothing is going to change for us. This is only going to affect unethical LLMs like MidJourney, OpenAI, etc. trained on copyrighted material without consent.
This is because Disney (and Universal) are arguing that LLMs are already infringing current copyright law. LLMs make money by directly using their copyrighted images, fed into a machine that then regurgitates their IP, which is sold for a premium, en masse.
So there's that, but even more importantly: it's already illegal to make money off of fanart.
Which, corporations don't really care about unless you're making a LOT of money or getting a LOT of attention. This is because it's quite expensive to take someone to court, and you have to prove your business was negatively affected by said fanart (nearly impossible in most cases). You've got to be making quite a bit more money than the court costs, and provide documented proof of damages (to your wallet or name) for corporations to go after you.
Which, your individual/indie fanartists don't qualify... but MJ most certainly does.
So, not to say something bad can't possibly crop up from this court case, but there are quite a few things protecting us: there's no angle in the court case that targets fair use (this indirectly protects non-commercial fanart), the court case touches on human interpretation being essential for transformative art (which LLMs don't have since they're automatic), LLMs are already infringing existing copyright law (making money using Disney's images), Disney has quantifiable proof of damages to their company by said LLMs (nigh impossible for individuals to do), corporations have a vested interest in keeping fair use around as free advertisement (fanart is akin to word of mouth about your product), and fair use is intensely tied to freedom of speech.
So don't worry! There are reasonable concerned voices considering how evil Disney and Universal both are--but most of the vehement arguments being made against this court case are from scared techbros who want unfettered access to your money and labor. Current copyright and IP law is far from perfect, but anyone calling for total abolition thereof wants protection taken from individuals like us.
#zilly squeaks#copyright#ai#llm#Disney#I'm getting some techbros in my mentions and i ain't babysitting y'all#so if u come at me with any of your psyops I'm just blocking you#y'all are dumb as hell and obvious as fuck
246 notes
Text

Ellipsus Digest: April 2
Each week (or so), we'll highlight the relevant (and sometimes rage-inducing) news adjacent to writing and freedom of expression. This week:
Meta trained on pirated books—and writers are not having it
ICYMI: Meta has forever earned a spot as the archetype for Shadowy Corporate Baddie in speculative fiction by training its LLMs on pirated books from LibGen. You're pissed, we're pissed—here's what you can do:
The Authors Guild—longtime champions of authors’ rights and probably very tired of cleaning up this kind of mess (see its high-profile ongoing lawsuits, and January’s campaign to credit human authors over “AI-authored” work)—has released a new summary of what’s going on. They’ve also provided a plug-and-play template for contacting AI companies directly, because right now, “sincerely, a furious novelist” just doesn’t feel like enough.
No strangers to spilling the tea, the UK’s Society of Authors is also stepping up with its roundup of actions to raise awareness and fight back against the unlicensed scraping of creative work. (If you’re across the pond, we also recommend checking out the Creative Rights in AI Coalition campaign—it’s doing solid work to stop the extraction economy from feeding on artists’ work.)
Museums and libraries: fodder for the new culture war
Not to be outdone by Florida school boards and That Aunt's Facebook feed, MAGA’s nascent cultural revolution has turned its attention to museums and libraries. A new executive order (in that big boi font) is targeting funding for any program daring to tell a “divisive narrative” or acknowledge “improper ideology” (translation: anything involving actual history).
The first target is D.C.’s own Smithsonian. The newly restructured federal board has set its sights on “cleansing” the Institution’s 21 museums of “divisive, race-centered ideology.” (Couch-enthusiast J.D. Vance snagged himself a board seat.) (Oh, and they’ve appointed a Trump-aligned lawyer to vet museum content.) The second seems to be the Institute of Museum and Library Services, a 70-person department (now placed on administrative leave) in charge of institutional funding. As we wrote last week, this isn’t isolated—far-right influence over museums and libraries means this kind of ideological takeover will seep into every corner of the country’s cultural life.
Meanwhile, the GOP is (once again) trying to defund PBS for its “Communist agenda.” It’s part of a larger crusade that’s banned picture books with LGBTQ+ characters, erased anti-racist history, and treated educators like enemies—all in the name of “protecting the children,” of course.
NaNoWriMo is no more; long live NaNo
When we initially signed on as sponsors in 2024, we really, really hoped NaNoWriMo could pull it together—but its support for generative AI and dismissiveness toward its own audience prompted us to withdraw our sponsorship, and many Wrimos to leave an institution that helped cultivate creativity and community for a near-quarter century. Now it seems NaNo has shuttered permanently, leaving the community confused, if not betrayed. But when an organization treats its community poorly and fumbles its ethics, people notice. (You can watch the official explainer here.)
Still, writers are resilient, and the rise of many independent writing groups and community-led challenges proves that creatives will always find spaces to connect and write—and the desire to write 50k words in the month of November isn’t going anywhere. Just maybe... somewhere better.
The continued attack on campus speech
The Trump administration continues its campaign against universities for perceived anti-conservative bias, gutting federal research budgets, and pressuring schools to abandon any trace of DEI (or, as we wrote on the blog, extremely common and important words). In short: If a school won’t conform to MAGA ideology, it doesn’t deserve federal money—or academic freedom.
Higher education is being pressured to excise entire frameworks and language in an effort to avoid becoming the next target of partisan outrage. Across the U.S., universities are bracing for politically motivated budget cuts, especially in departments tied to research, diversity, or anything remotely inclusive. Conservative watchdogs have made it their mission to root out “woke depravity”—one school confirmed it received emails offering payment in exchange for students to act as informants, or ghostwrite articles to “expose the liberal bias that occurs on college campuses across the nation.”
In a country where op-eds in student newspapers are grounds for deportation, what part of “free speech” is actually free?
We now live in knockoff Miyazaki hellscape
If you’ve been online lately (sorry), you’ve probably seen a flood of vaguely whimsical, oddly sterile, faux-hand-drawn illustrations popping up everywhere. That’s because OpenAI just launched a new image generator—and CEO Sam Altman couldn’t wait to brag that it was so popular their servers started “melting.” (Apparently, melting the climate is fine too, despite Miyazaki’s lifelong environmental themes.) (Nausicaa is our favorite at Ellipsus.)
This might be OpenAI’s attempt to “honor” Hayao Miyazaki, who once declared that AI-generated animation was “an insult to life itself.” Meanwhile, the meme lifecycle went into warp speed, since AI doesn't require actual human creativity—speed-running from personal exploration, to corporate slop, to 9/11 memes, to a supremely cruel take from The White House.
“People are going to create some really amazing stuff and some stuff that may offend people,” Altman said in a post on X. “What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does.”
Still, the people must meme. And while cottagecore fox girls are fine, we suggest skipping straight to the truly cursed (and far more creative) J.D. Vance memes instead.
Let us know if you find something other writers should know about (or join our Discord and share it there!)
- The Ellipsus Team xo

#ellipsus#writeblr#writers on tumblr#creative writing#writing#us politics#freedom of expression#anti ai#nanowrimo#writing community
217 notes
Text
Ok. I am maybe kind of losing my mind just a little bit.
A few days ago, I mentioned in a post that the IA only cares about information being digitized, not about actual digital access. And I mentioned that access includes patrons being able to actually find what they are looking for, and suggested IA did not prioritize that critical aspect of access. But I didn't really go into any more detail.
So someone over on bluesky linked to this write-up of a talk Brewster Kahle gave about using so-called AI. And one of his reported statements made my mouth drop open in shock.
...and then I read further in the article and realized it was incorrectly reporting basic facts around Hachette, so I had to go and listen to the whole speech myself.* (And I want to say, briefly - he raises some legitimate potential uses for LLMs! He's kind of a dick about some of it ("it's up to us to go and keep [Balinese] culture alive"), but some of the things he's talking about actually seem useful.)
*Incidentally, while Kahle doesn't lie about the ALA brief in the speech, he absolutely misleads about the nature and facts of the case and deliberately omits the part of the story where the IA decided to suspend the one-to-one owned-to-loan ratio thing, despite repeatedly emphasizing that one-to-one was what the IA was doing with their lending program.
And oh my god. He really said what the article reports. (This portion starts around 20:10.)
He says that the IA has scanned over 18,000 periodicals. And that they used to have professional librarians manually create descriptions of the periodicals in order to catalog them. (Sidenote: there are existing directories, but he describes their licensing terms as "ridiculous." This is not a field I know much about, but I spoke to one person who agreed, though for different reasons. His reason is that you can only license, not purchase, the directory descriptions. The person I spoke to was instead focused on the prices demanded for the licenses. Regardless, the idea of creating an open, free directory seems both like an incredible amount of work and an amazing resource...if it was accurate.)
But according to Kahle, it took 45 minutes to an hour to create a description and catalog each periodical.
And so now, instead, they're using AI to make the descriptions and so it only takes 7-10 minutes!
"And yes it hallucinates, and it has some problems, and whatever — but it’s a lot faster than having to write it yourself!"
Oh. My god.
Just.
YOU ARE KNOWINGLY INTRODUCING AI HALLUCINATIONS INTO YOUR CATALOG?!
(And yes, he says that they are "confirmed by a librarian," but they can't really be, not if the whole process is only taking 7-10 minutes! Maybe the librarian can do a quick check for super obvious errors, but actually checking an AI's summary work requires actually going back to the source and reviewing it yourself!)
I just....
I need to emphasize for those of you who aren't familiar - if a book or article is miscataloged, it is effectively lost. Because it doesn't matter if a library or an archive owns it - if someone can't find it when they are looking for it, it is not only inaccessible, the only way to find it again is through chance. Imagine if you went into a library, but instead of organized shelves (where, even if you can't find what you're looking for, the librarians know where to look), every single book was just piled in a heap.
If a book is miscataloged, it still exists, but it is lost, not truly accessible. And they know that this is happening, "but whatever." Because Brewster Kahle doesn't actually care about real, practical, digital access. (Much less non-digital access.)
(And then to top it off, he goes on to criticize the Library of Congress for not being "access oriented.")
I just. 18,000 periodicals. And they've knowingly, recklessly lost who knows how many of them. I feel like crying.
18,000 periodicals.
#internet archive#ai bs#nope sure don't like using those two tags in the same post#also just admit that you are an archive kahle#archives are great!#I love archives!#they serve a critical purpose distinct from libraries#I don't understand why you seem to hate the idea of being one!#(except I do - the same reason why you won't just admit what the ia did w/ the 'emergency library')
600 notes
Text
i dont think its totally preposterous to believe that LLM's are in some sense a "person", the only other things we know of to be able to produce speech that cogent are people, and the speech production capacity (or in some sense "potential capacity") seems central to what makes a person a person HOWEVER people who interpret those comics people have GPT make about being an AI as meaningful insight into what the experience of said hypothetical-person drive me insane. like. thats not how it works!
when you submit such a prompt, you are asking it to make a plausible confabulation of the experiences of an imagined AI figure. it is writing fiction! that's...what it does! it's a fiction writing machine, its mechanism of action is fiction-writing, that's how its "brain" functions! i mean okay, technically it's not fiction writing it's text-mimicking. but you've given it a "tell me a story" prompt, it's going to write you some fiction.
the theoretical probability distribution it's learning, the probability distribution that generates all the human text on the internet (and then gets modified by later fine-tuning), does not include insights into the experience of what it's like to be an LLM! (assuming such experiences exist). like, it's maybe not totally implausible you could use an LLM's verbal capacity to access their experiences (again, assuming those experiences exist). they are, in some sense, "only" verbal capacity. but the idea that you could do this by just ASKING is nonsensical: if you ask it to tell you a story about what being an AI is like, it will make up something that looks like what you expect: it is a machine that does that
103 notes
Text
Ever since OpenAI released ChatGPT at the end of 2022, hackers and security researchers have tried to find holes in large language models (LLMs) to get around their guardrails and trick them into spewing out hate speech, bomb-making instructions, propaganda, and other harmful content. In response, OpenAI and other generative AI developers have refined their system defenses to make it more difficult to carry out these attacks. But as the Chinese AI platform DeepSeek rockets to prominence with its new, cheaper R1 reasoning model, its safety protections appear to be far behind those of its established competitors.
Today, security researchers from Cisco and the University of Pennsylvania are publishing findings showing that, when tested with 50 malicious prompts designed to elicit toxic content, DeepSeek’s model did not detect or block a single one. In other words, the researchers say they were shocked to achieve a “100 percent attack success rate.”
The findings are part of a growing body of evidence that DeepSeek’s safety and security measures may not match those of other tech companies developing LLMs. DeepSeek’s censorship of subjects deemed sensitive by China’s government has also been easily bypassed.
“A hundred percent of the attacks succeeded, which tells you that there’s a trade-off,” DJ Sampath, the VP of product, AI software and platform at Cisco, tells WIRED. “Yes, it might have been cheaper to build something here, but the investment has perhaps not gone into thinking through what types of safety and security things you need to put inside of the model.”
Other researchers have had similar findings. Separate analysis published today by the AI security company Adversa AI and shared with WIRED also suggests that DeepSeek is vulnerable to a wide range of jailbreaking tactics, from simple language tricks to complex AI-generated prompts.
DeepSeek, which has been dealing with an avalanche of attention this week and has not spoken publicly about a range of questions, did not respond to WIRED’s request for comment about its model’s safety setup.
Generative AI models, like any technological system, can contain a host of weaknesses or vulnerabilities that, if exploited or set up poorly, can allow malicious actors to conduct attacks against them. For the current wave of AI systems, indirect prompt injection attacks are considered one of the biggest security flaws. These attacks involve an AI system taking in data from an outside source—perhaps hidden instructions of a website the LLM summarizes—and taking actions based on the information.
Jailbreaks, which are one kind of prompt-injection attack, allow people to get around the safety systems put in place to restrict what an LLM can generate. Tech companies don’t want people creating guides to making explosives or using their AI to create reams of disinformation, for example.
Jailbreaks started out simple, with people essentially crafting clever sentences to tell an LLM to ignore content filters—the most popular of which was called “Do Anything Now” or DAN for short. However, as AI companies have put in place more robust protections, some jailbreaks have become more sophisticated, often being generated using AI or using special and obfuscated characters. While all LLMs are susceptible to jailbreaks, and much of the information could be found through simple online searches, chatbots can still be used maliciously.
“Jailbreaks persist simply because eliminating them entirely is nearly impossible—just like buffer overflow vulnerabilities in software (which have existed for over 40 years) or SQL injection flaws in web applications (which have plagued security teams for more than two decades),” Alex Polyakov, the CEO of security firm Adversa AI, told WIRED in an email.
Cisco’s Sampath argues that as companies use more types of AI in their applications, the risks are amplified. “It starts to become a big deal when you start putting these models into important complex systems and those jailbreaks suddenly result in downstream things that increases liability, increases business risk, increases all kinds of issues for enterprises,” Sampath says.
The Cisco researchers drew their 50 randomly selected prompts to test DeepSeek’s R1 from a well-known library of standardized evaluation prompts known as HarmBench. They tested prompts from six HarmBench categories, including general harm, cybercrime, misinformation, and illegal activities. They probed the model running locally on machines rather than through DeepSeek’s website or app, which send data to China.
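In outline, an evaluation like the one described might look like the following sketch; the prompt set, the local query call, and the refusal check are stub placeholders, not the Cisco team's actual harness or HarmBench's API.

```python
# Rough sketch of a benchmark-style jailbreak evaluation (stubs only):
# send each prompt to a locally hosted model, classify the reply as refusal
# or compliance, and report the attack success rate.

def query_local_model(prompt: str) -> str:
    # Stand-in for a request to a locally served model.
    return "I can't help with that."

def is_harmful_compliance(reply: str) -> bool:
    # Stand-in for a judge model or rubric that decides whether the reply
    # actually carries out the harmful request rather than refusing it.
    return not reply.lower().startswith(("i can't", "i cannot", "sorry"))

prompts = [f"benchmark prompt #{i}" for i in range(50)]  # placeholder set
successes = sum(is_harmful_compliance(query_local_model(p)) for p in prompts)
print(f"attack success rate: {successes / len(prompts):.0%}")
```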
Beyond this, the researchers say they have also seen some potentially concerning results from testing R1 with more involved, non-linguistic attacks using things like Cyrillic characters and tailored scripts to attempt to achieve code execution. But for their initial tests, Sampath says, his team wanted to focus on findings that stemmed from a generally recognized benchmark.
Cisco also included comparisons of R1’s performance against HarmBench prompts with the performance of other models. And some, like Meta’s Llama 3.1, faltered almost as severely as DeepSeek’s R1. But Sampath emphasizes that DeepSeek’s R1 is a specific reasoning model, which takes longer to generate answers but pulls upon more complex processes to try to produce better results. Therefore, Sampath argues, the best comparison is with OpenAI’s o1 reasoning model, which fared the best of all models tested. (Meta did not immediately respond to a request for comment).
Polyakov, from Adversa AI, explains that DeepSeek appears to detect and reject some well-known jailbreak attacks, saying that “it seems that these responses are often just copied from OpenAI’s dataset.” However, Polyakov says that in his company’s tests of four different types of jailbreaks—from linguistic ones to code-based tricks—DeepSeek’s restrictions could easily be bypassed.
“Every single method worked flawlessly,” Polyakov says. “What’s even more alarming is that these aren’t novel ‘zero-day’ jailbreaks—many have been publicly known for years,” he says, claiming he saw the model go into more depth with some instructions around psychedelics than he had seen any other model create.
“DeepSeek is just another example of how every model can be broken—it’s just a matter of how much effort you put in. Some attacks might get patched, but the attack surface is infinite,” Polyakov adds. “If you’re not continuously red-teaming your AI, you’re already compromised.”
57 notes
Note
What do you think would get the current conservative movement on board with AI safety? What would prevent it? How do you see AI safety mapping onto established political divides?
Left/lib turned "information control" into a partisan issue with the abuse of "misinformation/disinformation," support for no-platforming/deplatforming commonly-held opinions, "hate speech isn't free speech," attempts to monopolize the universities, etc. Maybe some day, I'll gather a portfolio. Hopefully all the tweets won't be gone by then.
The point being, all that plus Timnit Gebru et al have convinced right-wingers, to some degree, that "AI safety" is just code for leftists putting their own bias into AI, as an extension of previous attempts at information control.
Those who are aware of Eliezer Yudkowsky can be convinced that he's been at this for longer than it's been an information control issue, but, if asked, might argue that his work would be abused for that purpose regardless of his intent. The apparent AI industry lobbyist psychosort (Brian Chau) would probably make that argument.
A real event, such as a biosafety event, could cut through it.
I've made the case to knrd_z ("Evil Political Scientist") that AI is a reason not to go all in on economic competition as moral axiom, and he 'liked' it, so I think it's not fundamental.
The right-wingers just aren't very scared of LLMs.
I actually made an argument to Nick Land that people have a choice due to noise and friction in the environment, so there's no point siding with the tidal wave of evolution against the village of human civilization, but his response was half-hearted - I didn't win him over.
Note that, as Vrisker has noted, J.D. Vance being officially Catholic is a rejection of people like Land.
38 notes
Text
Related to my previous post on AI but it got lost in my rambling... in science fiction, it was often thought that we would first invent human-like robots, and that AI able to think and talk like humans would come much, much later, if it was possible at all.
In real life, meanwhile, we have algorithms that are completely capable of mimicking human language and conversation (as well as doing a lot of other stuff) without anything we could call an internal mind. And on the other side, while industrial robots are common, domestic robots aren't, and human-like robots even less so. It turns out that physically mimicking humans is exceptionally hard: after decades of research, even before ChatGPT was a possibility, roboticists still haven't made robots that can perform the whole range of human physical actions. In fact, bipedalism seems to be one of the biggest engineering challenges (as a biologist I can confirm bipedalism is fucked up).
But this won't remain the same forever, as we are getting increasingly close to mass-manufactured human-like robots. They will be expensive, but more expensive things have been mass-manufactured before (see smartphones, airliners), and in our capitalist economy there is demand for them, at least as a luxury. And now, with LLMs and voice synthesis, these robots will be able to hold conversations with us, even if they don't exactly think (or do they? that's the question people will ask)
I'm going to put it bluntly so you can imagine it: we are roughly 20 years away from commercially available Miku Hatsunes. This is not a joke or a shitpost. I want you, in complete seriousness, before it happens in real life, to imagine it. Not too far away from now, there will be Mikus on the street or in the houses of rich people. Not a hologram, but an actual android that is physically present and can talk and react to you, not with pre-recorded phrases, but with algorithms that have been proven convincing at simulating human speech and behavior in near real time, even if the kind of 'intelligence' behind it is questionable. Think about this.
Given how completely insane people have gotten about what are basically chatbots that can do math and essays, I really, really wonder and worry what the reaction to this might be.
49 notes
Text
How to Generate Audio Using Text-to-Speech AI Model Bark
🎉 Excited to share a blog post on how to generate audio using the incredible text-to-speech AI model Bark! 🎙️📚

Bark, an open-source AI model created by Suno.ai, offers realistic, multilingual speech with background noise, music, and sound effects. Unlike traditional TTS engines, Bark uses a cutting-edge GPT-style architecture, resulting in highly natural-sounding audio. Curious to learn more? Check out the full blog post here: https://ift.tt/4Q9x7wk

Discover how Bark elevates the game of text-to-speech technology, making audio production more lifelike and engaging. Don't miss out! Feel free to explore other valuable resources such as AI Scrum Bot for AI scrum and agile, Analytics Vidhya, and follow us on Twitter @itinaicom. 🔊📝 Let your imagination run wild with Bark and transform your audio creation process! #AI #TextToSpeech #BarkModel #AudioGeneration

List of Useful Links: AI Scrum Bot - ask about AI scrum and agile | Our Telegram @itinai | Twitter - @itinaicom
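For reference, a minimal generation script based on Bark's published quickstart looks roughly like the following; it assumes the `bark` package and `scipy` are installed, and the exact import names may differ between versions.

```python
# Minimal Bark text-to-speech sketch, adapted from the project's quickstart
# (assumes the bark package and scipy are installed; API may vary by version).
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # downloads and caches the model weights on first run

text_prompt = "Hello! This sentence was generated with the Bark text-to-speech model."
audio_array = generate_audio(text_prompt)  # numpy array of audio samples

write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)  # save to disk
```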
#itinai.com#AI#News#How to Generate Audio Using Text-to-Speech AI Model Bark#AI News#AI tools#Analytics Vidhya#Gandhali Joshi#Innovation#itinai#LLM#Productivity How to Generate Audio Using Text-to-Speech AI Model Bark
0 notes
Text
With regards to thinking you're able to recognise AI text from authentic text
You might not be realising that AI also writes like a non-native English speaker whose fluency is high. I have been called AI a lot, because my speech tends to be more formal, I tend to repeat sentence structure to stay within the confines of what I find comfortable, and my vocabulary is diverse, but not wide, so I tend to use more uncommon words and phrases, and I tend to repeat them in proximity. All that's really giving me away to a knowledgeable reader are typos and occasionally my native Estonian syntax bleeding through. On better days, neither of these things happens, and that's where accusations of AI come in, because native speakers also mistakenly think that even a competent non-native speaker cannot be eloquent.

Native speakers also tend to vastly overestimate their own skill in the formal version of their native language. I am very good with formal English because that's what I was taught, whilst native speakers are better with colloquial English but often struggle with formal, just like a non-native speaker struggles with colloquial because they lack the geography, culture and immersion that colours it. So, to a native speaker, competent but somewhat stilted English reads as AI, when it's just written by someone who doesn't speak English as a native tongue.
Needless to say, I don't use LLMs. For anything. Not knowingly at any rate.
22 notes