Video editing servers should not be treated as just data file servers. This informative white paper from Mercury Computer Systems explains how to get the best from your hardware.
Challenges of video editing servers
Non-centralised I/O management architecture
A Systems solution
Compiled by Will Garside
(c) B.Beck & C.Stakutis, Mercury Computer Systems
From 30,000 feet, a video server is not much different from a data-world file server. Both support access by dozens of clients and store large volumes of data. On closer examination, three important differences emerge: video storage routinely approaches terabytes; uninterrupted video delivery tolerates latency of no more than a few milliseconds; and the "sweet spot" data rates required by editing clients range from 30Mb/s to 270Mb/s.

The most important difference between video playback servers and video editing servers is the extreme data rates that editing demands. These data rates impose severe challenges on conventional computer and network architectures. The most common data-world LAN technologies today are 10Mb/s Ethernet and Token Ring. These networks are still suitable for most engineering and business LAN users. Data-centric file servers in this environment typically have a 100Mb/s link to a 10Mb/s distribution hub or switch. 100Mb/s support to the desktop is a nascent technology in most enterprises. 100Mb/s LAN technology is attractive because it doesn't require expensive cabling and is supported by mass-market manufacturers. However, aside from video editing, few applications need high bandwidth to the desktop (client). Even highly publicised video-on-demand applications typically demand under 5Mb/s of highly compressed MPEG video, which can be supported within a switched 10Mb/s architecture.

In contrast, the full uncompressed digital video data rate is 270Mb/s, or about 33Mbytes/s, so every minute of video requires about 2GB of storage. A 30-second commercial production requires many minutes of source material. Although there are large databases in the conventional corporate data world, the video world demands storage orders of magnitude larger.
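The storage figure quoted above follows from simple arithmetic, which the following minimal sketch reproduces. The function names are hypothetical, and the sketch assumes a constant bit rate with decimal units (1Mb = 1,000,000 bits, 1GB = 1,000,000,000 bytes).

```python
def mbytes_per_second(rate_mbit_per_s: float) -> float:
    """Convert a bit rate in Mb/s to Mbytes/s (8 bits per byte)."""
    return rate_mbit_per_s / 8


def video_storage_bytes(rate_mbit_per_s: float, seconds: float) -> float:
    """Bytes needed to hold `seconds` of video at a constant bit rate."""
    bits = rate_mbit_per_s * 1_000_000 * seconds
    return bits / 8


if __name__ == "__main__":
    # 30 and 270 Mb/s bound the editing "sweet spot" named in the article;
    # 60 Mb/s is the per-client rate the article uses in its server example.
    for rate in (30, 60, 270):
        gb_per_minute = video_storage_bytes(rate, 60) / 1e9
        print(f"{rate:>3} Mb/s = {mbytes_per_second(rate):6.2f} Mbytes/s, "
              f"{gb_per_minute:.3f} GB per minute")
```

At 270Mb/s the sketch gives 33.75Mbytes/s and just over 2GB per minute, in line with the rounded figures in the text.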
The impact of delivering this data rate on typical computer architectures highlights another major difference between the data and video worlds. Supporting merely 10 potential clients at a 60Mb/s data rate creates a significant challenge. The resulting effective data rate of 600Mb/s, or 75Mbytes/s, requires a bus architecture of many times that performance (because data is moved twice, from device to device, and because of other overheads). This I/O demand far exceeds the resources of all but the most expensive computer systems in the marketplace. Thus, serving video editing users is an expensive and complex task because of the need for terabyte storage architectures with fault tolerance; state-of-the-art network performance requirements; and high-performance computer bus architectures.

Typical computer systems address internal device support with more enthusiasm than network connectivity. Even in a high-performance file server, it is common to find that the network interfaces are concentrated into an external network switch, which is then interfaced to the computer. Similarly, even when a terabyte-size RAID system is needed, the typical interface back to the computer is a single SCSI cable. These two interfaces quickly become bottlenecks in a high-bit-rate video world. However, if we eliminate the bottlenecks by doing network switching inside the computer and adding more I/O paths to the storage system, the computer's precious adapter space is exhausted quickly. A video editing server supporting 10 clients will likely consume 14 to 16 I/O adapter slots in the host (to say nothing of the overall data rate these connections impose on the bus). The video world needs a computer architecture that offers dozens of adapter slots; a bus architecture to support them; and inexpensive, standard interfaces.

PCI is a wonderful technology that has enabled many manufacturers to produce high-performance adapters sold in tremendous volumes.
However, the PCI bus itself is typically limited to three or four slots in a computer and has a theoretical bandwidth of a mere 100Mbytes/s. A system offering 10 to 20 PCI slots would capitalise on the value and availability of standard, non-proprietary PCI adapters, provided it could solve the PCI data-rate problems.

We now enter the world of "switched fabrics". For several years, the networking world has been enchanted with switched-network architectures like ATM. In the ideal ATM world, each packet passes through a fabric that is much like a road network with intersections. At each intersection, packets are routed onto a new path based on congestion and quality-of-service requirements. This is in contrast to a bus architecture (like Ethernet or computer buses), where packets must politely wait for quiet moments and then monopolise the entire bus for their transaction time. Switching technology is seldom applied to computer buses, and is typically expensive when it is. Furthermore, specialised computer buses imply specialised adapters, further increasing system cost.

RACEway is an ANSI-standard switched-bus architecture that has been in use since 1993. Designed and developed for moving large image data between processing elements in 100-node systems, it provides a high-performance and resilient fabric for a computer video architecture. A system that accepts standard PCI cards yet "adapts" the connector to RACEway provides an inexpensive and highly scalable architecture. RACEway offers 160Mbytes/s (about 1280Mb/s) per channel, and each PCI end-point is a separate functioning channel (using 480Mbytes/s RACEway chips as interconnects). Thus, a 20-slot chassis delivers one gigabyte per second of I/O and, most importantly, does so in a real-time, low-latency, deterministic and scalable manner.

Typical computer architectures are memory-centric. That is, a CPU (or several SMP CPUs) surrounds a single pool of memory. Data coming in from a SCSI channel funnels into this memory.
Once there, it is moved around for operating system reasons and then built into a network structure (protocol layers). Small blocks are then sent to network adapters, with protocol acknowledgements going back and forth. Obviously, such a scheme places heavy demands on the shared access points (memory and bus), and systems need to be overbuilt if they are to scale at all.

A new combination of technologies allows 95 per cent of the overhead of dragging data from a disk out to a network to be eliminated. The patent-pending disk-to-LAN technology is unique in its ability to off-load system resources. With intelligent network adapters that carry a modest amount of buffering RAM and an inexpensive RISC CPU, data is transferred directly from the SCSI interface to the adapter (no host memory involved). This single transfer of the data is alone a significant benefit (compared with the multiple transfers needed in regular architectures). Furthermore, once the data is on the card, the adapter performs all the protocol processing, completely removing this load from the host CPU.

SCSI adapters often attempt to optimise their transactions for use in a high-performance environment. Multi-port adapters may be knowledgeable about attached RAID systems, but seldom are they privy to the overall task at hand. A close relationship with a SCSI vendor provides an opportunity to offer driver- and firmware-level enhancements for disk-to-LAN technology without leaving the world of standards and interoperability. Similarly, a partnership with a high-performance RAID manufacturer brings the whole picture together by allowing end-to-end performance optimisations. Even extremely expensive computer solutions often use second-sourced components such as adapters and storage sub-systems, but simple integration of such components is not sufficient to meet the extreme demands of the video editing world.
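To illustrate why a single disk-to-adapter transfer matters, the sketch below compares the sustained bus traffic of a memory-centric path with that of a direct disk-to-LAN path. It is a rough model only: the class and function names are hypothetical, and the crossing counts (two for the memory-centric path, one for the direct path) are assumptions chosen for illustration. Real systems add operating system copies and protocol traffic that this model ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPath:
    name: str
    bus_crossings: int  # assumed number of times each payload byte crosses the bus


# Assumed crossing counts, for illustration only.
MEMORY_CENTRIC = DataPath("disk -> host memory -> network adapter", 2)
DISK_TO_LAN = DataPath("disk -> network adapter", 1)


def bus_load_mbytes(clients: int, rate_mbit_per_s: float, path: DataPath) -> float:
    """Sustained bus traffic in Mbytes/s for `clients` streams on `path`."""
    payload_mbytes = clients * rate_mbit_per_s / 8  # bits to bytes
    return payload_mbytes * path.bus_crossings


if __name__ == "__main__":
    # The article's example: 10 clients at 60 Mb/s each.
    for path in (MEMORY_CENTRIC, DISK_TO_LAN):
        load = bus_load_mbytes(10, 60, path)
        print(f"{path.name}: {load:.1f} Mbytes/s on the bus")
```

Under these assumptions the direct path halves the payload traffic on the bus before any saving from adapter-local protocol processing is counted.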
The buyer of a video editing server wants the ultimate in a tuned and integrated high-performance package, yet one that is standards-based and affordable.

Currently, the video editing world is typically a world of "islands of automation". Online and offline edit suites are seldom interconnected. Yet the value of interconnected suites is enormous: amortisation of large and expensive storage systems; centralised access to projects; the ability to interoperate dissimilar suites (recompression on the fly); asset management; and digitising without consuming an edit station.

In practice, however, video editing servers are not common. Until recently, physical networking technologies were not fast enough to deliver the bit rate needed by an editing application. Now, 100Mb/s Ethernet, 155Mb/s ATM, and other much faster technologies are readily available. However, computer architectures have never had to deal with such tremendous sustained data rates. Protocol processing, internal bus bandwidth and, most importantly, high connectivity (a large number of adapters) have all slowed the migration to a centralised digital solution.

A breakthrough set of technologies changes this picture. Systems that accommodate 20 PCI slots using a switched-bus PCI interconnect solve the connectivity problem. The switched nature of RACEway provides the scalability and overall bandwidth needed. Adhering to the popular PCI standard assures affordable access to key networking and I/O adapters. Specialised direct disk-to-LAN transfer with adapter-local protocol processing greatly reduces the strain on the overall system. Finally, tight partnerships with key I/O and sub-system suppliers provide the optimised video delivery solution that is needed.
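The difference between a shared bus and a switched fabric can be expressed as a simple capacity check, sketched below. The function names are hypothetical. The figures are taken from the article with the units assumed to be Mbytes/s: about 100 for a shared PCI bus, 160 per RACEway channel, and roughly 1,000 aggregate for a 20-slot chassis. The model also assumes one stream per adapter and ignores arbitration and protocol overheads.

```python
from typing import Sequence


def fits_shared_bus(loads_mbytes: Sequence[float], bus_mbytes: float) -> bool:
    """On a shared bus, every transfer competes for the same bandwidth."""
    return sum(loads_mbytes) <= bus_mbytes


def fits_switched_fabric(loads_mbytes: Sequence[float],
                         channel_mbytes: float,
                         aggregate_mbytes: float) -> bool:
    """On a switched fabric, each end-point has its own channel, so a stream
    must fit its channel and the total must fit the fabric aggregate."""
    per_channel_ok = all(load <= channel_mbytes for load in loads_mbytes)
    return per_channel_ok and sum(loads_mbytes) <= aggregate_mbytes


if __name__ == "__main__":
    PCI_BUS = 100.0            # Mbytes/s, shared by all adapters (article figure)
    RACEWAY_CHANNEL = 160.0    # Mbytes/s per end-point (unit assumed)
    FABRIC_AGGREGATE = 1000.0  # Mbytes/s for a 20-slot chassis (article figure)

    for rate_mbit in (60, 270):
        loads = [rate_mbit / 8] * 10  # 10 clients, one stream each
        shared = fits_shared_bus(loads, PCI_BUS)
        switched = fits_switched_fabric(loads, RACEWAY_CHANNEL, FABRIC_AGGREGATE)
        print(f"10 clients at {rate_mbit} Mb/s: "
              f"shared bus fits={shared}, switched fabric fits={switched}")
```

Ten 60Mb/s streams fit a 100Mbytes/s shared bus only if each byte crosses it once, which leaves no room for double-moving. Ten uncompressed 270Mb/s streams do not fit at all, while the switched fabric carries both loads.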