AudioStream API
Introduction

Provides APIs for audio streams.

Groups

AudioStream

Plays or records a stream of audio.

Discussion

The general flow is that the audio stream is created; configured with input or output flags, format of the samples, preferred latency, audio callback, etc.; and then started. When started, the implementation will generally set up 2 or 3 buffers of audio samples (often silence at first). The size of each buffer depends on the preferred latency, but is generally less than 50 ms.

Smaller and fewer buffers reduce latency, since the minimum latency is the number of samples in each buffer times the number of buffers, plus any latency introduced by the driver and/or hardware. Smaller buffers can increase CPU usage because the CPU must wake up more frequently to supply data. They also increase the likelihood of the hardware running dry and dropping audio if thread scheduling delays prevent the audio thread from running in time to provide new samples to the hardware.

As each buffer completes in the hardware, the implementation reuses the buffer by calling the AudioStream callback to provide more data. It then reschedules the buffer with the audio hardware to play when the current buffer is finished.

For systems that don't provide direct access to the audio hardware, the implementation may use another mechanism, such as a file descriptor. The AudioStream implementation waits for the file descriptor to become writable, calls the AudioStream callback to fill a buffer with audio data, and then writes that buffer to the file descriptor. This process repeats for as long as the audio stream is started. The audio driver wakes up the audio thread when buffers complete by indicating that the file descriptor is writable.

Timing is important for proper synchronization of audio. When the AudioStream callback is invoked, it is given the sample number of the first sample to be filled in and a host time in the future at which that first sample will be heard. The host time should be as close as possible to when the sample will really be heard. If the hardware or driver supports it, the sample time should come directly from the hardware's playback position within the buffer. This can be correlated with a host time by getting an accurate host time at the beginning of the audio buffer the hardware is playing.
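The latency relationship described in the Discussion can be made concrete with a little arithmetic. The following is a minimal sketch and not part of the AudioStream API: the helper name MinLatencyMicros and its parameters are hypothetical, and it assumes buffers are measured in frames at a fixed sample rate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the AudioStream API).
 * Minimum latency in microseconds: the frames held in all buffers,
 * converted to time, plus any driver/hardware latency. */
static uint32_t MinLatencyMicros( uint32_t inFramesPerBuffer, uint32_t inBufferCount,
                                  uint32_t inSampleRate, uint32_t inDriverMicros )
{
    uint64_t queuedFrames;

    assert( inSampleRate > 0 );

    /* Use 64-bit math so the multiplication by 1,000,000 cannot overflow. */
    queuedFrames = (uint64_t) inFramesPerBuffer * inBufferCount;
    return (uint32_t)( ( queuedFrames * 1000000u ) / inSampleRate ) + inDriverMicros;
}
```

For example, 3 buffers of 1024 frames at 48000 Hz give 64000 microseconds before any driver latency is added, while 2 buffers of 256 frames give about 10666 microseconds, at the cost of more frequent wakeups.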
Functions
AudioStreamCreate

Creates a new AudioStream.

    OSStatus AudioStreamCreate( AudioStreamRef *outStream );

AudioStreamGetLatency

Gets the minimum latency the system can achieve (may be higher if samples are already queued).

    uint32_t AudioStreamGetLatency( AudioStreamRef inStream, OSStatus *outErr );

Discussion

This function should return the minimum latency (in microseconds) the system can achieve, which should include any DSP delays introduced by the accessory hardware after the audio samples have been retrieved from AirPlay Core. AirPlay uses this latency information to keep AirPlay audio synchronized when streaming simultaneously to multiple accessories.

AudioStreamGetVolume

Gets the current volume of the stream as a linear 0.0-1.0 volume.

    double AudioStreamGetVolume( AudioStreamRef inStream, OSStatus *outErr );

AudioStreamPrepare

Prepares the audio stream so things like latency can be reported, but doesn't start playing audio.

    OSStatus AudioStreamPrepare( AudioStreamRef inStream );

AudioStreamSetAudioCallback

Sets a function to be called for audio input or output (depending on the direction of the stream).

    void AudioStreamSetAudioCallback( AudioStreamRef inStream, AudioStreamAudioCallback_f inFunc, void *inContext );

AudioStreamSetFlags

Sets the input or output flags of the AudioStream. inFlags will be kAudioStreamFlag_Output.

    OSStatus AudioStreamSetFlags( AudioStreamRef inStream, AudioStreamFlags inFlags );

AudioStreamSetFormat

Sets the format provided to the callback for input or the format provided by the callback for output.

    OSStatus AudioStreamSetFormat( AudioStreamRef inStream, const AudioStreamBasicDescription *inFormat );

AudioStreamSetPreferredLatency

Sets the lowest latency the caller thinks it will need. Defaults to 100 ms.

    OSStatus AudioStreamSetPreferredLatency( AudioStreamRef inStream, uint32_t inMics );

AudioStreamSetThreadName

Sets the name of threads created by the clock.

    OSStatus AudioStreamSetThreadName( AudioStreamRef inStream, const char *inName );

AudioStreamSetThreadPriority

Sets the priority of threads created by the clock.

    void AudioStreamSetThreadPriority( AudioStreamRef inStream, int inPriority );

AudioStreamSetVarispeedRate

Sets the fine-grained sample rate for use when varispeed is enabled for skew compensation.

    OSStatus AudioStreamSetVarispeedRate( AudioStreamRef inStream, double inHz );

Discussion

This function is called by AirPlay when it detects skew in the stream being played out. The input parameter inHz specifies the new sample rate the platform should use to normalize the detected skew. The platform should adjust the audio hardware to play out the stream at the new sample rate. Note that this function is called only on platforms that declare their own skew compensation ability.

AudioStreamSetVolume

Sets the volume of the stream to a linear 0.0-1.0 volume.

    OSStatus AudioStreamSetVolume( AudioStreamRef inStream, double inVolume );

AudioStreamStart

Starts the stream (callbacks will start getting invoked after this).

    OSStatus AudioStreamStart( AudioStreamRef inStream );

AudioStreamStop

Stops the stream. No callbacks will be received after this returns.

    void AudioStreamStop( AudioStreamRef inStream, Boolean inDrain );

Typedefs
AudioStreamAudioCallback_f

Callback function to be called by the platform to retrieve audio samples from AirPlay.

    typedef void ( *AudioStreamAudioCallback_f )( uint32_t inSampleTime, uint64_t inHostTime, void *inBuffer, size_t inLen, void *inContext );

Discussion

The platform should call this AirPlay audio callback function from a separate platform audio thread to retrieve the audio samples from AirPlay, and then render the audio on the audio output path. If there isn't enough audio buffered to satisfy the request, the missing data is filled in with silence.

It is recommended that the platform use available hardware mechanisms (for example, a low buffer threshold notification) as the trigger to invoke the audio callback function when the audio hardware needs the next chunk of audio. The periodicity of this callback invocation should be chosen based on the following requirements:

- It should be invoked frequently enough that there is very little likelihood of the hardware running dry due to thread scheduling delays, which could prevent the audio thread from running in time to provide new samples to the hardware. It should therefore be based on platform properties such as thread scheduling, thread priorities, CPU usage, etc.
- It should not be invoked with too many audio samples already queued in the audio HW buffer, as this could lead to silence samples being returned to the platform if the AirPlay buffer is close to empty. This leads to audio drops that could have been avoided.

Timing information is important for proper synchronization of audio. The inSampleTime parameter should specify the sample number of the first sample to be filled in. The inHostTime parameter should specify an UpTick() compatible timestamp in the future at which the first sample will be heard. The developer should account for the audio samples still pending in the audio HW buffer to ensure that the inHostTime timestamp is as close as possible to when the sample will really be heard.

As mentioned earlier, buffering too many audio samples in the HW buffer can lead to silence samples being returned in the platform buffer if the AirPlay buffer is close to empty. It is therefore recommended that the platform buffer only 2 x "low buffer threshold" worth of audio samples in the audio HW buffer. When the hardware plays out the audio samples and hits the "low buffer threshold", the audio callback function should be invoked again. This ensures that:

+ the audio callback function is called with sufficient audio samples in the HW buffer to prevent underflows.
+ too many audio samples are not buffered in the HW buffer when invoking the callback function.

Note that the size of the buffer provided to the callback function affects the additional latency before audio is heard, so it should not be so large that it introduces too much latency. The size of the buffer should depend on the preferred latency, but should generally be around 50 ms.
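The timing and buffering recommendations above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the platform implementation: PlatformAudioState, HostTimeForNextFrame, FramesToRequest, RefillFromAirPlay, and StandInCallback are hypothetical names; inLen is assumed to be a byte count; the tick rate of the UpTick() compatible clock is passed in as a plain value; and any extra output-path latency beyond the queued samples is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callback type as declared in the Typedefs section. */
typedef void ( *AudioStreamAudioCallback_f )( uint32_t inSampleTime, uint64_t inHostTime,
                                              void *inBuffer, size_t inLen, void *inContext );

/* Hypothetical platform-side state; none of these fields come from the AirPlay headers. */
typedef struct
{
    AudioStreamAudioCallback_f callback;  /* set via AudioStreamSetAudioCallback */
    void *     context;
    uint32_t   sampleRate;                /* frames per second */
    uint32_t   bytesPerFrame;
    uint32_t   lowThresholdFrames;        /* hardware "low buffer threshold" */
    uint32_t   nextSampleTime;            /* sample number of the next frame to request */
    uint64_t   ticksPerSecond;            /* assumed rate of the UpTick() compatible clock */
} PlatformAudioState;

/* Host time at which the next requested frame will be heard:
 * now plus the time needed to play the frames still queued in hardware. */
static uint64_t HostTimeForNextFrame( uint64_t inNowTicks, uint32_t inQueuedFrames,
                                      uint32_t inSampleRate, uint64_t inTicksPerSecond )
{
    assert( inSampleRate > 0 );
    return inNowTicks + ( (uint64_t) inQueuedFrames * inTicksPerSecond ) / inSampleRate;
}

/* Frames to request so the hardware holds at most 2 x the low threshold. */
static uint32_t FramesToRequest( uint32_t inQueuedFrames, uint32_t inLowThresholdFrames )
{
    uint32_t target = 2u * inLowThresholdFrames;
    return ( inQueuedFrames < target ) ? ( target - inQueuedFrames ) : 0;
}

/* Intended to run on the platform audio thread when the low threshold is hit.
 * Returns the number of frames fetched into ioBuffer. */
static uint32_t RefillFromAirPlay( PlatformAudioState *ioState, uint64_t inNowTicks,
                                   uint32_t inQueuedFrames, void *ioBuffer, size_t inBufferLen )
{
    uint32_t frames = FramesToRequest( inQueuedFrames, ioState->lowThresholdFrames );
    uint64_t hostTime;

    assert( ioState->bytesPerFrame > 0 );

    /* Never ask for more than the platform buffer can hold. */
    if( (size_t) frames * ioState->bytesPerFrame > inBufferLen )
    {
        frames = (uint32_t)( inBufferLen / ioState->bytesPerFrame );
    }
    if( frames == 0 ) return 0;

    hostTime = HostTimeForNextFrame( inNowTicks, inQueuedFrames,
                                     ioState->sampleRate, ioState->ticksPerSecond );
    ioState->callback( ioState->nextSampleTime, hostTime, ioBuffer,
                       (size_t) frames * ioState->bytesPerFrame, ioState->context );
    ioState->nextSampleTime += frames;  /* uint32_t wraps naturally */
    return frames;
}

/* Stand-in for the AirPlay callback, used only to exercise the sketch: fills silence. */
static void StandInCallback( uint32_t inSampleTime, uint64_t inHostTime,
                             void *inBuffer, size_t inLen, void *inContext )
{
    (void) inSampleTime; (void) inHostTime; (void) inContext;
    memset( inBuffer, 0, inLen );
}
```

With a low threshold of 480 frames at 48000 Hz and 480 frames still queued, the sketch requests another 480 frames and reports a host time 10 ms after the current time, which is when the queued frames will have finished playing.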