#AVAudioRecorder
Record audio on a real iOS device
To record audio on a real iOS device, remember to configure the audio session before starting a recording.

An audio session is the intermediary between an app and iOS, used to configure the app's audio behaviour. Setting the category AVAudioSessionCategoryPlayAndRecord tells iOS to allow both audio input (recording) and audio output (playback).

To record audio in iLBC format, note that you need to remove the AVSampleRateKey entry from the settings and change the file extension to "ilbc".
let audioSession = AVAudioSession.sharedInstance()
do {
    try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
} catch let error as NSError {
    print(error.description)
}

let recordSettings = [
    AVSampleRateKey : NSNumber(float: Float(44100.0)),
    AVFormatIDKey : NSNumber(int: Int32(kAudioFormatMPEG4AAC)),
    AVNumberOfChannelsKey : NSNumber(int: 1),
    AVEncoderAudioQualityKey : NSNumber(int: Int32(AVAudioQuality.Max.rawValue))
]

do {
    recorder = try AVAudioRecorder(URL: getFileURL(), settings: recordSettings)
    recorder.delegate = self
    recorder.prepareToRecord()
} catch let error as NSError {
    print(error.description)
}
How my app, Tomorrow, records and plays audio using AVAudioRecorder and AVAudioPlayer (Part I)

The name of my next app is going to be Tomorrow. Record inspiring messages today, get them tomorrow. The next day, they're gone. Very excited. I've created a quick and dirty website: http://tomorrow.gives. My brother, David, will be redesigning the splash page to make it much better.

Today shall be a day where we discuss AVAudioRecorder. There are quite a few unique situations with AVFoundation. While AVAudioRecorder and AVAudioPlayer are very easy to use, it took a really long time to make sure that I was doing everything correctly and that everything was performing the way I wanted it to. On Stack Overflow, you'll usually find the code dropped straight into the view controller in its simplest form. I'll show you a refactored version of AVAudioRecorder through a singleton design pattern.
First, let's set up some private properties:
@interface AudioController ()

@property (nonatomic, strong) AVAudioRecorder *recorder;
@property (nonatomic, strong) AVAudioPlayer *player;

@end
We want to make sure these are in the .m file and not the .h file because other classes do not need to know what's happening with these particular properties.
The next thing that needs to be done is to initialize the AVAudioRecorder. This is the press-and-hold of the green button in the previous gif.

The way that I accomplished this is by using a singleton handler:
+ (AudioController *)sharedInstance
{
    static AudioController *sharedInstance = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        sharedInstance = [[AudioController alloc] init];
    });
    return sharedInstance;
}
The way that AVAudioRecorder is initialized: - initWithURL:settings:error:
The URL is the file system location the recording is written to. The settings dictionary holds the settings for the recording session. The error returns, by reference, a description of the error if one occurs. It is best to declare NSError *error = nil and pass &error into the parameter so that you can detect an error if one exists.

So first what I did was add an NSURL property to the class:
@property (nonatomic, strong) NSURL *url;
The reason for this is that we're going to be using the same URL for both starting and stopping the recording. One way to initialize an NSURL is -fileURLWithPathComponents:. This is a class method that returns a newly created NSURL object as a file URL with the specified path components. This is what we need, because the path components are separated by forward slashes (/) in the returned URL.
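Just to illustrate what -fileURLWithPathComponents: does, here's a tiny sketch of my own with made-up components (not the ones the app actually uses):

// Hypothetical components, purely to show how the pieces get joined with "/"
NSArray *components = [NSArray arrayWithObjects:@"/var/mobile/Documents", @"recording.aac", nil];
NSURL *fileURL = [NSURL fileURLWithPathComponents:components];
// fileURL is now file:///var/mobile/Documents/recording.aac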
So here are a few private methods I used to create an easy name for distinguishing recordings, get a directory for the file to live in, and set up the recorder settings:
- (NSString *)nowString
{
    NSDate *now = [NSDate date];
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    [formatter setDateFormat:@"MMMdyyyy+HHmmss"]; // lowercase mm = minutes; MM would repeat the month
    NSString *nowString = [formatter stringFromDate:now];
    NSString *destinationString = [NSString stringWithFormat:@"%@.aac", nowString];
    return destinationString;
}

- (NSArray *)documentsPath
{
    NSArray *documentsPath = [NSArray arrayWithObjects:
                              [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject],
                              [self nowString],
                              nil];
    return documentsPath;
}

- (NSDictionary *)getRecorderSettings
{
    NSMutableDictionary *recordSettings = [[NSMutableDictionary alloc] init];
    [recordSettings setValue:[NSNumber numberWithInt:kAudioFormatMPEG4AAC] forKey:AVFormatIDKey];
    [recordSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSettings setValue:[NSNumber numberWithInt:2] forKey:AVNumberOfChannelsKey];
    [recordSettings setValue:[NSNumber numberWithInt:AVAudioQualityHigh] forKey:AVEncoderAudioQualityKey];
    [recordSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
    [recordSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
    return recordSettings;
}
Let's break it down:
I wanted my audio files to be named "Month/Day/Year-Hour/Min/Sec.aac." While I could've used a UUID, this gives me a much simpler, easier way to spot any timing issues or delays in other parts of my code, since the name is a timestamp of when the recording occurred.
The documents path was confusing for me initially, and I'm not quite certain I've fully grasped it yet. There are numerous spots where one can save files on the iPhone: a temporary directory, the app's Documents directory, and so on.

When you look in the documentation for NSSearchPathForDirectoriesInDomains, it says:
"Creates a list of directory search paths. Creates a list of path strings for the specified directories in the specified domains. The list is in the order in which you should search the directories. If expandTilde is YES, tildes are expanded as described in stringByExpandingTildeInPath."
I wanted to put it in the app's Documents directory, so I used NSDocumentDirectory and NSUserDomainMask.
And the other object we're putting into the array is the date filename string we created earlier.
Finally, the settings. We want to make sure we are using key-value coding, so we create a dictionary that can hold a bunch of values. Two keys were crucial to making sure that it worked correctly:
[recordSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[recordSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
Then I create a public method that records the audio to a directory:
- (AVAudioRecorder *)recordAudioToDirectory
{
    NSError *error = nil;
    self.url = [NSURL fileURLWithPathComponents:[self documentsPath]];
    self.recorder = [[AVAudioRecorder alloc] initWithURL:self.url
                                                settings:[self getRecorderSettings]
                                                   error:&error];
    [self.recorder prepareToRecord];
    self.recorder.delegate = self;
    self.recorder.meteringEnabled = YES;

    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord error:nil];
    [[AVAudioSession sharedInstance] setActive:YES error:&error];

    [self.recorder record];
    return self.recorder;
}
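With that in place, kicking off a recording from a view controller (the press of the green button) is a one-liner. A usage sketch:

[[AudioController sharedInstance] recordAudioToDirectory];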
Hooray! All of that just to record the audio. Stopping has a bunch more to it, and I'll do a part two because it uses another singleton handler for saving into Core Data! :)
Audio Processing silly mistake -.-
So I'm making this app that records the player's voice using AVAudioRecorder, processes the audio using Dirac LE, and then plays it back, but the played file always seemed cut off.

My silly mistake was that the recording's sample rate did not match the processed audio's!

A short post as a reminder to always check the sample rates.
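A minimal sketch of the fix, assuming you define the sample rate once and feed that same value to both the recorder settings and the processing/playback side (the constant name is mine, and the Dirac call is omitted since its exact API isn't shown here):

static const float kSampleRate = 44100.0f;   // hypothetical name; use whatever rate you actually record at

NSMutableDictionary *recordSettings = [[NSMutableDictionary alloc] init];
[recordSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
[recordSettings setValue:[NSNumber numberWithFloat:kSampleRate] forKey:AVSampleRateKey];   // the recorder's rate...
[recordSettings setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];
// ...and the same kSampleRate must be handed to the audio-processing (and playback) setup,
// otherwise the processed file comes out cut off or at the wrong speed.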
Tutorial: Step two of making a 'Talking' iPhone app: when to record and when to stop recording
This post is related to the following posts:
Tutorial: The first step to making a ‘Talking’ iPhone app, chipmunkifying your voice!
Tutorial: Other ways to chipmunkify your voice
In a 'Talking' app, you say something and an animal repeats what you said in a cute voice.

Well, we can't really ask the player to tap the animal to make it record; we want the animal to simply start recording when the player says something, stop recording when the player stops talking, and then play it back. So how do we detect that the player has stopped talking?

How do we start recording when sound is detected, and stop recording when silence is detected?
From Stack Overflow:
Perhaps you could use the AVAudioRecorder's support for audio level metering to keep track of the audio levels and enable recording when the levels are above a given threshold. You'd need to enable metering with:
[anAVAudioRecorder setMeteringEnabled:YES];
and then you could periodically call:
[anAVAudioRecorder updateMeters];
power = [anAVAudioRecorder averagePowerForChannel:0];
if (power > threshold && anAVAudioRecorder.recording == NO) {
    [anAVAudioRecorder record];
} else if (power < threshold && anAVAudioRecorder.recording == YES) {
    [anAVAudioRecorder stop];
}
Or something like that.
Source: http://stackoverflow.com/questions/3855919/ios-avaudiorecorder-how-to-record-only-when-audio-input-present-non-silence
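The answer just says to call this "periodically"; one way to do that (my assumption, not part of the answer) is a repeating NSTimer whose callback runs the check above:

// checkAudioLevel: is a hypothetical selector containing the updateMeters / averagePowerForChannel check
[NSTimer scheduledTimerWithTimeInterval:0.1
                                 target:self
                               selector:@selector(checkAudioLevel:)
                               userInfo:nil
                                repeats:YES];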
According to the API, averagePowerForChannel returns the average power, in decibels, of the sound being recorded. If it returns 0, the recording is at full scale, the maximum power (like when someone shouts really, really loudly into the mic?), while -160 is the minimum power, or near silence (which is what we want, right, near silence?).

In another tutorial (Tutorial: Detecting When a User Blows into the Mic, by Dan Grigsby), you can also use peakPowerForChannel. He made an algorithm to compute lowPassResults, a low-pass-filtered level of the audio input:
From the tutorial:
Each time the timer’s callback method is triggered the lowPassResults level variable is recalculated. As a convenience, it’s converted to a 0-1 scale, where zero is complete quiet and one is full volume.
We’ll recognize someone as having blown into the mic when the low pass filtered level crosses a threshold. Choosing the threshold number is somewhat of an art. Set it too low and it’s easily triggered; set it too high and the person has to breath into the mic at gale force and at length. For my app’s need, 0.95 works.
- (void)listenForBlow:(NSTimer *)timer
{
    [recorder updateMeters];

    const double ALPHA = 0.05;
    double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
    lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;

    if (lowPassResults > 0.95)
        NSLog(@"Mic blow detected");
}
Source: http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/
So I am using Dan's algorithm, except for the threshold number; I'm still testing that out, and it really is somewhat of an art.

Okay, now we know when the player STOPS talking, but what about when the player starts talking? We wouldn't be able to know that, since we stopped recording after the player stopped talking, right? We won't be able to get the power for the channel from a stopped recorder.

And Stack Overflow comes to the rescue again: I read somewhere that you should have TWO AVAudioRecorders instead of ONE. One AVAudioRecorder to monitor the power for the channel at all times, and one to actually record your player's voice.
So we have:
NSURL *monitorTmpFile;
NSURL *recordedTmpFile;
AVAudioRecorder *recorder;
AVAudioRecorder *audioMonitor;
And some variables to keep track of whether we are recording or playing, how long the current silence has lasted, and the smoothed monitor level:

BOOL isRecording;
BOOL isPlaying;
ccTime silenceTime;
double audioMonitorResults;
We have to initialize both recorders; somewhere in your init, add:
[self initAudioMonitor]; [self initRecorder];
The functions:
-(void) initAudioMonitor
{
    NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
    [recordSetting setValue:[NSNumber numberWithInt:kAudioFormatAppleIMA4] forKey:AVFormatIDKey];
    [recordSetting setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSetting setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];

    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *fullFilePath = [[documentPaths objectAtIndex:0] stringByAppendingPathComponent:@"monitor.caf"];
    monitorTmpFile = [NSURL fileURLWithPath:fullFilePath];

    NSError *error = nil;
    audioMonitor = [[AVAudioRecorder alloc] initWithURL:monitorTmpFile settings:recordSetting error:&error];
    [audioMonitor setMeteringEnabled:YES];
    [audioMonitor setDelegate:self];
    [audioMonitor record];
}

-(void) initRecorder
{
    NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
    [recordSetting setValue:[NSNumber numberWithInt:kAudioFormatAppleIMA4] forKey:AVFormatIDKey];
    [recordSetting setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSetting setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];

    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *fullFilePath = [[documentPaths objectAtIndex:0] stringByAppendingPathComponent:@"in.caf"];
    recordedTmpFile = [NSURL fileURLWithPath:fullFilePath];

    NSError *error = nil;
    recorder = [[AVAudioRecorder alloc] initWithURL:recordedTmpFile settings:recordSetting error:&error];
    [recorder setMeteringEnabled:YES];
    [recorder setDelegate:self];
    [recorder prepareToRecord];
}
And then we have a function that will be called all the time to monitor your AVAudioRecorders; call it somewhere in your update loop:
-(void) monitorAudioController:(ccTime)dt
{
    if (!isPlaying) {
        [audioMonitor updateMeters];

        // as a convenience, it's converted to a 0-1 scale, where zero is complete quiet and one is full volume
        const double ALPHA = 0.05;
        double peakPowerForChannel = pow(10, (0.05 * [audioMonitor peakPowerForChannel:0]));
        audioMonitorResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * audioMonitorResults;

        NSLog(@"audioMonitorResults: %f", audioMonitorResults);

        if (audioMonitorResults > AUDIOMONITOR_THRESHOLD) {
            NSLog(@"Sound detected");
            if (!isRecording) {
                [audioMonitor stop];
                [self startRecording];
            }
        } else {
            NSLog(@"Silence detected");
            if (isRecording) {
                if (silenceTime > MAX_SILENCETIME) {
                    NSLog(@"Next silence detected");
                    [audioMonitor stop];
                    [self stopRecordingAndPlay];
                    silenceTime = 0;
                } else {
                    silenceTime += dt;
                }
            }
        }

        if ([audioMonitor currentTime] > MAX_MONITORTIME) {
            [audioMonitor stop];
            [audioMonitor record];
        }
    }
}
Okay, lemme explain...
You have to call [audioMonitor updateMeters], because (according to AVAudioRecorder class reference):
Refreshes the average and peak power values for all channels of an audio recorder.
And then, do you see Dan's algorithm?
const double ALPHA = 0.05;
double peakPowerForChannel = pow(10, (0.05 * [audioMonitor peakPowerForChannel:0]));
audioMonitorResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * audioMonitorResults;
NSLog(@"audioMonitorResults: %f", audioMonitorResults);
If audioMonitorResults is greater than our threshold AUDIOMONITOR_THRESHOLD (getting this value requires many hours of testing and monitoring, which is why I have an NSLog there), that means we have detected sound. And we start recording!
if (!isRecording) {
    [audioMonitor stop];
    [self startRecording];
}
If it isn't already recording, we stop the audio monitor and start recording:
-(void) startRecording
{
    NSLog(@"startRecording");
    isRecording = YES;
    [recorder record];
}
Okay then, if audioMonitorResults is less than AUDIOMONITOR_THRESHOLD and we are recording, it means that silence has been detected, but but but, we do not stop the recording at once. Why...? Because when people are speaking, we speak like this: "Hello, how are you?" instead of "Hellohowareyou"; the spaces between words are also detected as silences, which is why:
if (isRecording) {
    if (silenceTime > MAX_SILENCETIME) {
        NSLog(@"Next silence detected");
        [audioMonitor stop];
        [self stopRecordingAndPlay];
        silenceTime = 0;
    } else {
        silenceTime += dt;
    }
}
MAX_SILENCETIME is the threshold for the silence time between words.
And then to make sure the size of our audioMonitor output will not explode:
if ([audioMonitor currentTime] > MAX_MONITORTIME) {
    [audioMonitor stop];
    [audioMonitor record];
}
Stopping and restarting the monitor after MAX_MONITORTIME keeps its output file from growing indefinitely.
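The tuning constants are never shown in the post, so here's a hedged sketch of how they might be defined; the values are my guesses and need tuning for your own app:

#define AUDIOMONITOR_THRESHOLD 0.1   // 0-1 level above which we treat input as speech (hypothetical value)
#define MAX_SILENCETIME        2.0   // seconds of silence before we stop recording (hypothetical value)
#define MAX_MONITORTIME        30.0  // restart the monitor after this many seconds so its file stays small
#define MIN_RECORDTIME         0.5   // discard recordings shorter than this (hypothetical value)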
And then stopRecordingAndPlay:
-(void) stopRecordingAndPlay
{
    NSLog(@"stopRecording Record time: %f", [recorder currentTime]);

    if ([recorder currentTime] > MIN_RECORDTIME) {
        isRecording = NO;
        [recorder stop];
        isPlaying = YES;
        // insert code for playing the audio here
    } else {
        [audioMonitor record];
    }
}
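The playback itself is left as a placeholder in the code above; a minimal sketch of that step using AVAudioPlayer might look like this (player is an AVAudioPlayer ivar of my own naming, kept around so it isn't deallocated mid-playback):

player = [[AVAudioPlayer alloc] initWithContentsOfURL:recordedTmpFile error:nil];
[player setDelegate:self];   // so we get audioPlayerDidFinishPlaying:successfully:
[player play];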
After the audio is played, call:
-(void) stopPlaying
{
    isPlaying = NO;
    [audioMonitor record];
}
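One natural place to call it from (my assumption) is the AVAudioPlayerDelegate callback that fires when playback finishes:

- (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag
{
    [self stopPlaying];
}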
And there we go! :)
To summarize:
1 AVAudioRecorder to monitor when the player starts talking and stops talking
1 AVAudioRecorder to record the player's voice
Use peakPowerForChannel to detect talking or silence
And that's about it!
#app#avaudiorecorder#check metering#detect silence#detect sound#openal#stack overflow#talking#ios#iphone#sdk