for once i'm not gonna talk about fanfic today, i just want to rant a bit, sorry it's going to be a big one
this year, 2023, we're having the worst summer of all time, every day is the hottest day of the year, we're breaking so many temperature and water level records, it's quite tough to live through
i live in manaus, in the state of amazonas in brazil, in the middle of the amazon rainforest, a big city of 2 million people and very few trees (yes.), we basically have 2 seasons (influenced by the equator line that runs above us), which are the rainy season (december to may) and summer (june to november), the average temperatures during these seasons are, i think, 23-29°C and 35-40°C, plus high humidity
well, not anymore!!! because of el niño, climate change etc, things changed and now we're living through hell with little to no action from local government
the temperatures are high (39°C one of these days, and the feeling is way, way higher than that, like 47°C) and ok, we're kinda used to it, even if it's not healthy at all, BUT the humidity is really really low (45% right at this moment, i'm used to more or less 70-80% on average), and the river is at its lowest level since they began to record it
you can see how low it is here
the sun is so hot it's making the water hotter, and it's literally killing fish and our river dolphins! and in some places the water is so shallow they can't breathe and are dying too
not only that, but we're having to deal with forest fires, which are not natural, it's NOT normal. we don't have spontaneous fires. we don't. it's usually humans who start them, either accidentally (very rare), since for as long as i can remember people have burned their trash (it's actually against the law now, but...) and they throw cigarette butts out the car window etc, or for capitalism purposes (farm owners, illegal logging, etc etc). also, people have no sense of environmental protection, so they throw their garbage anywhere (river, sidewalk, forest, you name it), so we have stray pieces of glass that, with this damn heat and low humidity, are starting fires.
and the smoke isn't going anywhere. it's here, around us, in the city (and we're not the only city going through it), we're breathing smoke. all i can see through my window right now is smoke. i can barely see. (here)
lol i just received a text from the government, it's the first since the smoke started a month or so ago
lit. translation: "civil defense: forest fire alert, with impacts on air quality in the metropolitan area of manaus. follow the instructions from the local defense."
btw this is from google few minutes ago:
36°C = 96.8°F
air quality of this morning, from local journalist, Mário Adolfo:
apparently, because of el niño, there's no wind, so the smoke isn't moving, which heavily impacts air quality and our health. i don't know what it's like to live a day without feeling like i'm either about to faint or throw up, my nose hurts, my throat hurts, my eyes are dry, my lips are cracking....
HOW CAN WE LIVE LIKE THIS FOR FUCKS SAKE
i want to cry, i really want to cry, or throw myself through the window. i can't, i just can't live like this anymore, my whole apartment smells like smoke and the windows are closed fUCK
btw, big national newspaper last week: how the dry season of the amazon is going to impact your black friday shopping
what a joke
How AI Solves the ‘Cocktail Party Problem’ and Its Impact on Future Audio Technologies
New Post has been published on https://thedigitalinsider.com/how-ai-solves-the-cocktail-party-problem-and-its-impact-on-future-audio-technologies/
Imagine being at a crowded event, surrounded by voices and background noise, yet you manage to focus on the conversation with the person right in front of you. This ability to isolate a specific sound amid a noisy background is known as the Cocktail Party Problem, a term coined by British scientist Colin Cherry in 1958 to describe this remarkable ability of the human brain. AI experts have been striving to mimic this human capability with machines for decades, yet it remains a daunting task. However, recent advances in artificial intelligence are breaking new ground, offering effective solutions to the problem and setting the stage for a transformative shift in audio technology. In this article, we explore how AI is advancing in addressing the Cocktail Party Problem and the potential it holds for future audio technologies. Before delving into how AI approaches it, we must first understand how humans solve the problem.
How Humans Decode the Cocktail Party Problem
Humans possess a unique auditory system that helps us navigate noisy environments. Our brains process sound binaurally, meaning we use input from both ears to detect slight differences in timing and volume, which helps us pinpoint the location of a sound. This ability allows us to orient toward the voice we want to hear, even when other sounds compete for attention.
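The binaural timing cue described above can be sketched numerically: cross-correlating the two ear signals recovers the interaural time difference (ITD). Below is a minimal numpy illustration with a synthetic delayed tone; the signals, sample rate, and delay are illustrative stand-ins, not values from the article.

```python
# Sketch: estimating a sound's direction cue from interaural time
# difference (ITD) via cross-correlation. All signals are synthetic.
import numpy as np

def estimate_itd(left, right, sample_rate):
    """Estimate the delay (in seconds) of `right` relative to `left`
    by locating the peak of their cross-correlation."""
    corr = np.correlate(right, left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)
    return lag / sample_rate

sr = 44_100
t = np.arange(sr // 10) / sr              # 100 ms of audio
source = np.sin(2 * np.pi * 440 * t)      # a 440 Hz tone
lag_samples = 20                          # simulated head-shadow delay
left = source
right = np.concatenate([np.zeros(lag_samples), source[:-lag_samples]])

itd = estimate_itd(left, right, sr)
# itd ≈ 20 / 44100 s, i.e. about 0.45 ms, consistent with a source
# positioned off to one side of the head
```

The brain does something far richer than a single cross-correlation, but the peak-lag idea captures why two ears (or two microphones) are enough to localize a source.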
Beyond hearing, our cognitive abilities further enhance this process. Selective attention helps us filter out irrelevant sounds, allowing us to focus on important information. Meanwhile, context, memory, and visual cues, such as lip-reading, assist in separating speech from background noise. This complex sensory and cognitive processing system is incredibly efficient, but replicating it in machine intelligence remains daunting.
Why Does It Remain Challenging for AI?
From virtual assistants recognizing our commands in a busy café to hearing aids helping users focus on a single conversation, AI researchers have continually been working to replicate the ability of the human brain to solve the Cocktail Party Problem. This quest has led to developing techniques such as blind source separation (BSS) and Independent Component Analysis (ICA), designed to identify and isolate distinct sound sources for individual processing. While these methods have shown promise in controlled environments—where sound sources are predictable and do not significantly overlap in frequency—they struggle when differentiating overlapping voices or isolating a single sound source in real time, particularly in dynamic and unpredictable settings. This is primarily due to the absence of the sensory and contextual depth humans naturally utilize. Without additional cues like visual signals or familiarity with specific tones, AI faces challenges in managing the complex, chaotic mix of sounds encountered in everyday environments.
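The classical ICA approach mentioned above can be sketched with scikit-learn's FastICA. The two "voices" below are synthetic waveforms and the mixing matrix is invented for illustration; real multi-speaker recordings are far messier, which is exactly the limitation the paragraph describes.

```python
# Sketch of Independent Component Analysis (ICA), a classical
# blind-source-separation technique. Two synthetic sources are mixed
# as if recorded by two microphones, then recovered.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
s1 = np.sin(2 * np.pi * 5 * t)              # stand-in for "voice" 1
s2 = np.sign(np.sin(2 * np.pi * 3 * t))     # stand-in for "voice" 2
S = np.c_[s1, s2] + 0.01 * rng.standard_normal((8000, 2))

A = np.array([[1.0, 0.5], [0.4, 1.0]])      # unknown mixing matrix
X = S @ A.T                                 # the two microphone signals

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(X)            # estimated sources
# `recovered` matches the originals only up to scale, sign, and
# ordering -- the inherent ambiguities of blind source separation.
```

Note that ICA needs at least as many microphones as sources and assumes the mixture is static, which is why it degrades in the dynamic, overlapping conditions described above.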
How WaveSciences Used AI to Crack the Problem
In 2019, WaveSciences, a U.S.-based company founded by electrical engineer Keith McElveen in 2009, made a breakthrough in addressing the cocktail party problem. Their solution, Spatial Release from Masking (SRM), employs AI and the physics of sound propagation to isolate a speaker’s voice from background noise. As the human auditory system processes sound from different directions, SRM utilizes multiple microphones to capture sound waves as they travel through space.
One of the critical challenges in this process is that sound waves constantly bounce around and mix in the environment, making it difficult to isolate specific voices mathematically. However, using AI, WaveSciences developed a method to pinpoint the origin of each sound and filter out background noise and ambient voices based on their spatial location. This adaptability allows SRM to deal with changes in real-time, such as a moving speaker or the introduction of new sounds, making it considerably more effective than earlier methods that struggled with the unpredictable nature of real-world audio settings. This advancement not only enhances the ability to focus on conversations in noisy environments but also paves the way for future innovations in audio technology.
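WaveSciences has not published SRM's internals, so the following is not their method; it is a textbook delay-and-sum beamformer, the simplest form of the spatial filtering idea the paragraphs above describe: signals from multiple microphones are aligned and summed so sound from one direction adds coherently while off-axis sound partially cancels. All signals, frequencies, and delays below are synthetic.

```python
# A hedged sketch of spatial filtering with two microphones
# (delay-and-sum beamforming), NOT the proprietary SRM algorithm.
import numpy as np

def delay(signal, samples):
    """Delay a signal by an integer number of samples (zero-padded)."""
    return np.concatenate([np.zeros(samples), signal[:len(signal) - samples]])

sr = 16_000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 300 * t)     # the talker we want to keep
noise = np.sin(2 * np.pi * 470 * t)      # an off-axis interferer

# Two microphones: the target arrives at both at the same time
# (broadside), while the interferer reaches mic 2 a few samples later.
arrival_lag = 17
mic1 = target + noise
mic2 = target + delay(noise, arrival_lag)

# Summing the microphones keeps the in-phase target at full strength,
# while the out-of-phase interferer largely cancels.
beamformed = 0.5 * (mic1 + mic2)
```

Real systems steer this focus adaptively and for broadband speech rather than a single tone, which is where the AI component comes in, but the geometry is the same: spatial location becomes a handle for separating sounds.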
Advances in AI Techniques
Recent progress in artificial intelligence, especially in deep neural networks, has significantly improved machines’ ability to solve the cocktail party problem. Deep learning algorithms, trained on large datasets of mixed audio signals, excel at identifying and separating different sound sources, even in overlapping voice scenarios. Projects like BioCPPNet have successfully demonstrated the effectiveness of these methods by isolating animal vocalizations, indicating their applicability in various biological contexts beyond human speech. Researchers have shown that deep learning techniques can adapt voice separation learned in musical environments to new situations, enhancing model robustness across diverse settings.
Neural beamforming further enhances these capabilities by utilizing multiple microphones to concentrate on sounds from specific directions while minimizing background noise. This technique is refined by dynamically adjusting the focus based on the audio environment. Additionally, AI models employ time-frequency masking to differentiate audio sources by their unique spectral and temporal characteristics. Advanced speaker diarization systems isolate voices and track individual speakers, facilitating organized conversations. AI can more accurately isolate and enhance specific voices by incorporating visual cues, such as lip movements, alongside audio data.
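The time-frequency masking idea above can be sketched as follows. Here the mask is "ideal" (computed from the known sources, since the signals are synthetic tones standing in for voices); in a real system a neural network predicts this mask from the mixture alone.

```python
# Sketch of time-frequency masking: compute a spectrogram of a
# mixture, keep only the bins dominated by the target, resynthesize.
import numpy as np
from scipy.signal import stft, istft

sr = 8_000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 200 * t)       # low-pitched "talker"
interferer = np.sin(2 * np.pi * 1500 * t)  # higher-pitched interferer
mixture = target + interferer

f, times, Zmix = stft(mixture, fs=sr, nperseg=256)
_, _, Ztgt = stft(target, fs=sr, nperseg=256)
_, _, Zint = stft(interferer, fs=sr, nperseg=256)

# Ideal binary mask: 1 where the target dominates a bin, 0 elsewhere.
mask = (np.abs(Ztgt) > np.abs(Zint)).astype(float)
_, separated = istft(Zmix * mask, fs=sr, nperseg=256)
# `separated` is close to `target`, with the interferer largely removed.
```

Because the sources here occupy distinct frequency bands, a binary mask works almost perfectly; overlapping voices share bins, which is why modern systems learn soft masks and lean on the spatial and visual cues mentioned above.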
Real-world Applications of the Cocktail Party Problem
These developments have opened new avenues for the advancement of audio technologies. Some real-world applications include the following:
Forensic Analysis: According to a BBC report, Spatial Release from Masking (SRM) technology has been employed in courtrooms to analyze audio evidence, particularly in cases where background noise complicates the identification of speakers and their dialogue. Often, recordings in such scenarios become unusable as evidence. However, SRM has proven invaluable in forensic contexts, successfully decoding critical audio for presentation in court.
Noise-canceling headphones: Researchers have developed a prototype AI system called Target Speech Hearing for noise-canceling headphones that allows users to select a specific person’s voice to remain audible while canceling out other sounds. The system uses source-separation techniques developed for the cocktail party problem, engineered to run efficiently on headphones with limited computing power. It’s currently a proof-of-concept, but the creators are in talks with headphone brands to potentially incorporate the technology.
Hearing Aids: Modern hearing aids frequently struggle in noisy environments, failing to isolate specific voices from background sounds. While these devices can amplify sound, they lack the advanced filtering mechanisms that enable human ears to focus on a single conversation amid competing noises. This limitation is especially challenging in crowded or dynamic settings, where overlapping voices and fluctuating noise levels prevail. Solutions to the cocktail party problem can enhance hearing aids by isolating desired voices while minimizing surrounding noise.
Telecommunications: In telecommunications, AI can enhance call quality by filtering out background noise and emphasizing the speaker’s voice. This leads to clearer and more reliable communication, especially in noisy settings like busy streets or crowded offices.
Voice Assistants: AI-powered voice assistants, such as Amazon’s Alexa and Apple’s Siri, can become more effective in noisy environments by handling the cocktail party problem more efficiently. These advancements enable devices to accurately understand and respond to user commands, even amid background chatter.
Audio Recording and Editing: AI-driven technologies can assist audio engineers in post-production by isolating individual sound sources in recorded materials. This capability allows for cleaner tracks and more efficient editing.
The Bottom Line
The Cocktail Party Problem, a significant challenge in audio processing, has seen remarkable advancements through AI technologies. Innovations like Spatial Release from Masking (SRM) and deep learning algorithms are redefining how machines isolate and separate sounds in noisy environments. These breakthroughs enhance everyday experiences, such as clearer conversations in crowded settings and improved functionality for hearing aids and voice assistants, and they also hold transformative potential for forensic analysis, telecommunications, and audio production. As AI continues to evolve, its ability to mimic human auditory capabilities will lead to even more significant advancements in audio technologies, ultimately reshaping how we interact with sound in our daily lives.
Here is this week's 90's Fest Amazon Music Preferred Artists...
1. Guns N' Roses, Nirvana (9 appearances)
2. Soundgarden, Jon Bon Jovi (8 appearances)
3. Third Eye Blind, Collective Soul, The Proclaimers (7 appearances)
4. Lonestar, Scorpions, 4 Non Blondes, Rob Thomas, Wilson Phillips, Chris Isaak, Tom Petty & The Heartbreakers (6 appearances)
5. Depeche Mode, Extreme, Savage Garden, TLC, Sinéad O'Connor, Bryan Adams, Sir Mix-A-Lot, Metallica, Vanilla Ice, Aerosmith (5 appearances)
#Amazon #amazonmusic #90s #90sfest #durandurantulsas4thannual90sfest #gunsnroses #nirvana #RIPKurtCobain #soundgarden #ripchriscornell #jonbonjovi #thirdeyeblind #collectivesoul #theproclaimers #lonestar #Scorpions #4nonblondes #robthomas #wilsonphillips #chrisisaak #tompetty #RIPTomPetty #tompettyandtheheartbreakers #depechemode #extreme #savagegarden #TLC #riplisalefteyelopes #sineadoconnor #ripsineadoconnor #bryanadams #sirmixalot #Metallica #vanillaice #Aerosmith