Introduction
At the February 2000 NAMM show, Cakewalk invited representatives from Microsoft and over 30 major hardware and software vendors to the first annual "Windows Professional Audio Roundtable." The purpose of the roundtable was to work together towards solutions that will make Windows the ideal platform for professional audio. This paper presents the results of the roundtable discussions.
Latency: What's Required vs. What's Possible
The most important performance criterion of a DAW is latency, i.e., the delay between when the software changes a sound and when that change is actually heard. Latency affects the overall responsiveness of a DAW's user interface to input gestures as well as the suitability of a DAW for live input monitoring. The present trend towards software synthesis also highlights the influence of latency on the playability of a software-based instrument. Unfortunately, latency happens to be the exact place where external factors influence performance the most.
How low must latency be? A skilled audio engineer can hear subtle differences in the "feel" of a drum recording simply by moving a microphone 1 foot, a distance equal to roughly 1 msec of delay. Studies have shown that humans can perceive interaural (stereo) differences as low as 10 usec (0.01 msec). Obviously, lower is better.
What's the best we can deliver? Despite claims by hardware and software vendors, no one has ever scientifically measured audio latency in a DAW. However, we do know for certain that there are 3 hard limitations that put a fixed lower bound on the latency that a host application can deliver.
An analysis of interrupt latency in Windows was presented at OSDI'99 by Erik Cota-Robles and James P. Held. Their results show that the best-case latency on Win9x or WinNT is about 1 msec, and that the worst case (on Win9x) can be as long as 100+ msec.
When you consider the effects of converter latency and interrupt latency, it becomes clear that the lowest latency you can ever hope to achieve under Windows is about 2 msec. In reality, the influence of system load on interrupt latency and the scheduler will lead to inconsistent performance (manifested by random audio drop-outs), so in most practical cases the audio latency will be much higher.
For real-world usage scenarios, minimizing the uncertainty that arises under heavy system loads is tantamount to reducing audio latency. Since WinNT (and Win2k) have tightly bounded interrupt latencies, these platforms should be better suited to the task of audio streaming. We believe an obtainable target for audio latency under Win2k is 5 msec, even under heavy system loads.
Software and Hardware Development
Software vendors face a daunting set of challenges. Customers demand the lowest latency possible, but delivering this requires knowledge of O/S issues that are neither well documented nor well understood. As demonstrated by the WavePipe™ technology introduced in Cakewalk Pro Audio 9, it is possible to get low latency out of standard drivers, but this is still very much dependent on the quality of the driver.
Hardware vendors are challenged even further. On the Windows platform, there are a variety of driver models to consider: VxD, NT drivers and WDM. On top of these drivers live a multitude of user-mode APIs: MME, DirectX, ASIO and EASI. Audio hardware vendors are writing too much code to support too many driver models and too many APIs. As a result, driver performance is suffering overall.
Consider the steps a hardware vendor takes when planning which drivers to build:
Observation 1: Too many drivers
Supporting both Win9x and WinNT requires writing 2 different kernel-mode drivers (a .VxD and a .SYS driver). On top of that, supporting MME, ASIO and EASI requires writing 3 different user-mode drivers.
Observation 2: Not enough kernel-mode support
Some vendors never leave kernel mode to do their processing. Obvious examples of this are the WDM KMixer and DirectMusic software synthesizers. Furthermore, DAW vendors need the option of moving more of their mixing and DSP into kernel mode.
Observation 3: The term "driver" is misunderstood
Referring back to the 4 steps of driver development, we see that all paths of driver development lead through the DDK. Only the DDK provides the tools for interfacing to hardware in a standard way. The majority of interfacing to hardware must be done in kernel mode, within a VxD or SYS file.
Conclusion
The best way to manage driver complexity while providing adequate support for future technologies is to provide a single kernel-mode audio driver. A single kernel-mode driver is in fact the hallmark of the Win32 Driver Model.
The Win32 Driver Model (WDM)
WDM is Microsoft's vision of simplifying driver development, providing a unified driver model for both consumer and commercial O/S's, and providing a migration path towards future O/S offerings. In this section we shall examine how close WDM comes to achieving this ideal, and the relevance of WDM to audio streaming.
WDM Overview
WDM works across the Win9x and Win2k platforms. A driver written to the WDM specifications will be source-code compatible on all Win9x platforms (starting with Win98SE) and Win2k. Most drivers are even binary compatible across these platforms. This implies that hardware companies can develop a single kernel-mode driver, period.
WDM provides considerable leverage to audio applications. It provides an audio mixing and resampling component that runs in kernel mode, known as "KMixer." KMixer facilitates multiclient access to the same hardware and provides the illusion of limitless audio streams that are mixed in realtime. Due to its layered architecture, WDM also provides automatic support for the MME and DirectX APIs. A vendor simply needs to implement a WDM mini-port driver, and other layers in the system's driver stack provide MME and DirectX support.
Unfortunately, this power comes at a price. Due to internal buffering, KMixer nominally adds 30 msec of latency to audio playback streams. (At present, Microsoft does not provide a method to allow host applications to bypass KMixer.)
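To put the 30 msec KMixer figure in context, the latency numbers quoted in this paper can be added up as a rough budget. The Python sketch below is only illustrative: the function names are hypothetical, the stages are assumed to add linearly, and the converter latency of about 1 msec is an assumption inferred from the 2 msec floor (converter plus best-case interrupt latency) rather than a figure the paper states directly.

```python
# Figures quoted in this paper, in milliseconds.
INTERRUPT_BEST_MS = 1.0         # best-case interrupt latency (Win9x or WinNT)
INTERRUPT_WORST_WIN9X_MS = 100.0  # worst case on Win9x ("100+ msec")
KMIXER_MS = 30.0                # nominal latency added by KMixer buffering
TARGET_MS = 5.0                 # proposed target under Win2k

# ASSUMPTION: not stated directly in the paper. Inferred from the ~2 msec
# floor given for converter latency plus best-case interrupt latency.
CONVERTER_MS = 1.0


def total_latency_ms(*stages_ms):
    """Sum per-stage latencies (assumes the stages simply add up)."""
    return sum(stages_ms)


def meets_target(total_ms, target_ms=TARGET_MS):
    """Return True if the total latency is within the target."""
    return total_ms <= target_ms


if __name__ == "__main__":
    floor = total_latency_ms(CONVERTER_MS, INTERRUPT_BEST_MS)
    via_kmixer = total_latency_ms(CONVERTER_MS, INTERRUPT_BEST_MS, KMIXER_MS)
    print(f"floor without KMixer: {floor} msec, meets target: {meets_target(floor)}")
    print(f"best case via KMixer: {via_kmixer} msec, meets target: {meets_target(via_kmixer)}")
```

Under these assumptions, the best-case path without KMixer sits at the 2 msec floor, while any path through KMixer starts at 32 msec, well above the 5 msec target.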
Copyright © 2008 by Twelve Tone Systems, Inc. All rights reserved.