Joe Pranevich -- Wonderful World of Linux 2.4 (10/02/99 "Final" Edition)
Oct 03, 1999, 07:10
[ This is the fourth edition of this article. -- lt ed ]
A long time ago, in a galaxy not too far from this one, I wrote an article. It wasn't much, just a laundry list of the new and improved features of Linux 2.2. That was a long time ago and the newness of Linux 2.2 is beginning to wear off; it's even lost that new kernel smell. And so with anxious eyes, we set our sights towards the future of Linux: Linux 2.4. In the Linux world, it is uncommon to announce release dates; new versions are released when they are ready and not before. With that said, however, Linux 2.3 has entered into a feature freeze and looks to be ready for the masses sometime around Christmas.
What, at the core, is the Linux kernel? Just as the kernel is the heart of the Linux (or GNU/Linux or whatever) Operating System, the kernel itself can be divided into core and non-core parts. Linux is much more than just a collection of assorted device drivers, as any operating system must be. It's what binds these drivers together into a cohesive unit that matters. It's the scheduler, the resource allocator, the virtual filesystem layer, memory management, and so many other unsung features that are the real heroes of the Linux world. These are the portions of the Linux operating system that really define what Linux is, because on every platform that Linux has been ported to, from i386 (Intel-compatible PC) to ARM (embedded devices) to Sparc64 (high-end servers), this code is the same. In many ways, this "heart" of Linux 2.4 is different from Linux 2.2's, and most of the subsystems that I just listed have been changed in one way or another.
Linux 2.2 and earlier Linuxes included a basic resource management system which was used rather bluntly to allocate and keep track of IO ports, IRQ lines, and the other limited niceties of computer architectures. Unfortunately, it was deficient in a number of ways which proved crucial to the needs of a modern desktop operating system. The new system under Linux 2.4 is a much more generic implementation which allows for nested resource groups, removes the dependencies on pre-defined resource types, and is otherwise easier to use for a majority of the tasks required by driver developers. Additionally, this has laid the groundwork for ISA PnP support, which is discussed more fully later in this article. This quick hack by Linus will probably be one of the most influential changes to go into the 2.4 kernel.
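To make the idea of nested resource groups concrete, here is a minimal sketch in Python. It is a toy model only, not the kernel's implementation: the `Resource` class, its `request` method, and the port ranges are all hypothetical names and values chosen for illustration. A parent range hands out non-overlapping child ranges, and each child can in turn hand out ranges of its own.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    """A named, inclusive address range that may contain child ranges."""
    name: str
    start: int
    end: int
    children: List["Resource"] = field(default_factory=list)

    def request(self, name: str, start: int, end: int) -> Optional["Resource"]:
        """Claim [start, end] inside this resource; return None on conflict."""
        if start > end or start < self.start or end > self.end:
            return None  # the request does not fit inside the parent range
        for child in self.children:
            if start <= child.end and end >= child.start:
                return None  # the request overlaps an existing claim
        new = Resource(name, start, end)
        self.children.append(new)
        self.children.sort(key=lambda r: r.start)
        return new


# Hypothetical layout: the names and port numbers are illustrative only.
ioports = Resource("I/O ports", 0x0000, 0xFFFF)
bridge = ioports.request("example bridge window", 0x1000, 0x1FFF)
card = bridge.request("example card", 0x1000, 0x10FF)
clash = bridge.request("second driver", 0x1080, 0x10BF)  # overlaps the card
```

In this sketch `clash` comes back as `None` because the range is already claimed, while the card's claim nests cleanly inside the bridge window. Nothing in the model cares whether the numbers are IO ports, memory addresses, or something else, which is the sense in which the scheme is not tied to pre-defined resource types.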
The virtual filesystem layer (VFS) has also been heavily modified from earlier Linuxes. Linux 2.2 featured a number of wonderful changes to this layer that allowed for better caching and a much more efficient system overall. However, the system in Linux 2.2 still had a number of important limitations which were resolved in time for Linux 2.4. One major limitation to the way Linux 2.2 handled things was its use of two buffers for caching: one for reading and one for writing. As you can imagine, this made things very complicated, as the kernel developers had to code with kid gloves to ensure that these caches were in sync whenever they had to be. Linux 2.4 brings this wall completely down by removing the multiple cache system and putting all the work into a single page caching layer. This change makes Linux 2.4 more efficient, makes the code easier for developers to understand, and cuts the amount of memory needed for the caches roughly in half. During the course of this rewrite, many race conditions (errors caused when multiple processes "race" for access to unprotected variables) were removed, and the code was streamlined to allow significantly better scaling to higher-end systems and faster disk writes when multiple volumes are involved.
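For readers who have not met a race condition before, the following Python sketch shows the general shape of the fix: every update to a shared variable goes through a lock. This is a userspace illustration with hypothetical names (`SharedCounter`, `run_workers`); it is not kernel code and says nothing about how the page cache itself is locked.

```python
import threading


class SharedCounter:
    """A counter whose updates are serialized by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below could interleave
        # between threads and lose updates: that is the race.
        with self._lock:
            self.value += 1


def run_workers(workers, increments_each):
    """Start `workers` threads that each increment one shared counter."""
    counter = SharedCounter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run_workers(4, 1000)` always adds up to 4000. Removing the `with self._lock:` line leaves the variable unprotected, and the total is then no longer guaranteed.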
One common problem with Linux 2.2 that interfered with high-end (Intel?) machines was its process limitations. Linux 2.2 only allowed you to have 1024 processes or threads running at once. With high-end systems with many thousands of users, this could become a problem very quickly. Linux 2.4 has gotten rid of this relic and implemented a scalable limit which can be configured at run time and is only limited by the amount of memory in the system. On high-end servers with as little as half a gigabyte of RAM installed, it is easily possible to support as many as 16 thousand processes at once. Other users have reported being able to run many more than that on their specific systems. This was one of the major bottlenecks that kept Linux out of the Enterprise markets.
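To get a feel for how a memory-based limit scales, here is a back-of-the-envelope sketch. Both the 8 KB per-task cost and the rule that tasks may consume at most one quarter of RAM are assumptions made purely for illustration (they happen to reproduce the 16 thousand figure above); the kernel's actual formula may well differ.

```python
def max_tasks(ram_bytes, per_task_bytes=8 * 1024, budget_divisor=4):
    """Estimate a task limit from RAM size.

    Assumptions (illustrative, not taken from the kernel source):
    each task costs `per_task_bytes` of kernel memory, and tasks may
    use at most 1/`budget_divisor` of all RAM.
    """
    return (ram_bytes // budget_divisor) // per_task_bytes


MIB = 1024 * 1024
half_gigabyte = max_tasks(512 * MIB)
two_gigabytes = max_tasks(2048 * MIB)
```

Under these assumptions the limit grows linearly with installed memory: quadrupling the RAM quadruples the number of tasks, with no fixed table size in the way.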
In terms of memory consumption, Linux 2.4 will require approximately the same amount of memory as does Linux 2.2. Some subsystems have been added or expanded and some have been streamlined. Some obsolete code has been removed. There are even certain cases where some systems will require less memory than Linux 2.2! It should also be noted that Linux 2.4 will support /more/ memory than its predecessor. As of Linux 2.4, up to 4 gigabytes of RAM will be supported on Intel machines. This additional RAM will not be treated in exactly the same way as lower RAM (due to Intel design features) but will nevertheless be used by many in-kernel structures.
Linux 2.2 and Linux 2.0 included support for running Java applications (or rather, starting a Java interpreter/compiler when necessary), and Linux was the first OS to do so at the kernel level. When a Java application was executed, the Java binary loader would load up your Java interpreter with the proper arguments. Naturally, it would be easy to implement this functionality using the newer "misc." loader type, and instructions were provided with Linux 2.2 on how to do so. Linux 2.4 will finally put the old binary loader to rest, and all users who used the old module will have to upgrade their configuration to make the new association.
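As a rough illustration of what the new association looks like, the sketch below builds a registration rule for the "misc." loader in the colon-separated `:name:type:offset:magic:mask:interpreter:` layout, matching on the four magic bytes that begin every Java class file. The helper names and the wrapper path `/usr/local/bin/javawrapper` are assumptions for the example; your interpreter wrapper will live wherever you installed it. The code only builds the string, since actually registering it means writing it, as root, to the loader's `register` file under `/proc/sys/fs/binfmt_misc`.

```python
# Every Java class file starts with these four bytes.
JAVA_CLASS_MAGIC = bytes([0xCA, 0xFE, 0xBA, 0xBE])


def binfmt_misc_rule(name, magic, interpreter):
    """Build a magic-number ('M') rule with empty offset and mask fields."""
    escaped = "".join("\\x%02x" % b for b in magic)
    return ":%s:M::%s::%s:" % (name, escaped, interpreter)


def looks_like_class_file(header):
    """Mimic the loader's check: do the first bytes match the magic?"""
    return header.startswith(JAVA_CLASS_MAGIC)


# The wrapper path below is a hypothetical example.
rule = binfmt_misc_rule("Java", JAVA_CLASS_MAGIC, "/usr/local/bin/javawrapper")
```

The resulting `rule` string carries the same information the old dedicated Java loader had compiled in: what to look for at the start of the file and which program to hand it to.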
Linux 2.4 will be much more dependent on the ELF format than Linux 2.2 was, although Linux 2.2 was the first version of Linux to require the kernel to be compiled as ELF. (ELF is an advanced binary format that includes support for multiple code and data sections, easier support for shared libraries, and other niceties. It is approximately akin to the Win32 format, however better designed and without nearly as much cruft.) By more fully exploiting the ELF binary format, the kernel developers could make some pieces of code more modular and easier to maintain. Many types of drivers will become more "plug and play" (if I may be so bold as to abuse that term) as they will be initialized based on how they are linked rather than by having an explicit initialization line in the core code.
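The "initialized based on how they are linked" idea can be sketched by analogy in Python. In the kernel, the linker gathers driver init functions into a dedicated ELF section and the core code simply walks that section at boot; here a plain list and a decorator stand in for the section and the linker. Every name below (`INITCALLS`, `module_init`, the two example drivers) is hypothetical and chosen only for the illustration.

```python
# Stands in for the linker-collected section of init function pointers.
INITCALLS = []


def module_init(func):
    """Register an init function, much as linking a driver in would."""
    INITCALLS.append(func)
    return func


@module_init
def example_serial_init():
    return "example_serial ready"


@module_init
def example_net_init():
    return "example_net ready"


def do_initcalls():
    """Core code: run whatever was registered, without naming any driver."""
    return [init() for init in INITCALLS]
```

Note that `do_initcalls` never mentions a driver by name. Adding a third driver means adding one decorated function and touching nothing in the core, which is the maintainability win described above.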
In addition, there are some other noteworthy changes to Linux 2.4 that I should mention before we move on into the specific subsystems. Linux 2.4 will be in some ways more standards compliant than previous Linuxes with the adoption of support for POSIX clocks and timers, allowing for non-RTC devices to be used as clocks internally. (This would be specialized hardware, generally.) NFS, the standard network filesystem used under most UNIXes, now supports most of the features of version 3 of the protocol, and Linux will be better able to communicate with machines which use this standard. (This will be discussed further in the filesystem section, below.) In addition, some minor changes were made to the threading model and elsewhere to make things more compatible.
The Many Flavors of Linux
On Intel-compatible hardware, Linux 2.4 includes the same excellent support for processors as did Linux 2.2. This includes optimizations for 386, 486, 586 (Pentium), and 686 (Pentium Pro / Pentium II / Pentium III) processors, as well as "compatible" counterparts such as those made by AMD and Cyrix. Additionally, Linux 2.4 will include additional support for hardware present with modern chips. While Linux 2.2 includes support for Intel's Memory Type Range Registers (MTRRs) to increase performance to some kinds of high-bandwidth devices, Linux 2.4 has taken this support even further by supporting variants common to the compatible chips. (This includes both the double MTRRs present with AMD K7 processors and the MCR variant preferred by Cyrix.) Linux 2.2 also included support for the IO-APIC (Advanced Programmable Interrupt Controller) which allowed interrupts to be spread across multiple processors in a multi-processing system. Linux 2.4 will, as expected, take this to the next level and support some high-end systems which actually contain multiple IO-APIC controllers; this will allow these machines to scale even better than before.
To my knowledge, the only multi-processor systems which we still do not completely support are some very old 486 ones that mix 486DX and 486SX chips in the same system. This is mainly because the SX chips did not contain math coprocessors and there is some difficulty in making sure that applications that need floating point math get to work on the right one. As you can imagine, there isn't much call for this feature. (There may be other buggy chipset combinations that are unsupported, however I am not personally aware of them.) Considering that no one in their right mind will still be using such a system (they could easily upgrade to a chip with an FPU), I don't consider this much of a limitation.
Linux 2.4 and Merced (ia64)
I do not want to imply that there are no remaining problems that keep Linux from being "perfect" on 64-bit systems; that is not the case. However a vast majority of the difficult and subtle parts are completed and all that remains are problems derived from a legacy world.
Linux 2.4 and Pre-386 Intel Chips
Buses - ISA, PCI, USB, MCA, etc.
There is more exciting news from this front however. Universal Serial Bus, a new external bus type just now coming into prominence for devices such as keyboards, mice, sound systems, and scanners is now supported in the Linux kernel. At the time of this writing, the support is not 100% and many individual and common USB devices are not supported or not completely supported. I would be confident however that the number of devices which are supported will only rise over time, just as we observed a similar rise in the number of framebuffer devices that are now supported. (The framebuffer was a new feature to Linux 2.2, see below.) Currently, keyboards and mice are working mostly as you would expect. Support for sound systems is coming along rapidly. Other devices, such as modems and network cards, already have preliminary support however their drivers are not complete.
In addition to USB, I2O device (Intelligent Input/Output) support, an extension of PCI, has been added in Linux 2.4. In theory, this will allow for more operating system independent devices and drivers to exist. Many I2O devices are already functioning and more will be added before Linux 2.4.
PCMCIA, the semi-external bus common in laptop computers, is now supported from within the standard kernel distribution. No longer will PCMCIA users need to download and install separate packages to get their systems to work properly.
Linux and ISA Plug-and-Play
Block Devices - Disk Drives, RAID Controllers,
Block devices are hardware whose data can best be expressed as an array of bytes that can be accessed individually. (This is simplified a bit.) To use a more computer-savvy term, block devices are devices that support random access, allowing a user to seek to a specific place anywhere on the device to read from or write to. (This is also simplified a bit.) Common examples of block devices are hard disks, floppy drives, ramdisks, and mostly anything else that you can imagine as a "drive". If a device has special features (for example, it can be ejected), it will support these extras through ioctls (I/O controls) which any program can use. Linux 2.2 already supports the most common types of storage media for enterprise and desktop use including RAID controllers, IDE and SCSI disks, and many others. Linux 2.4 will build on this in a number of important ways.
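Random access is easy to show in a few lines of Python. The sketch below uses an in-memory `io.BytesIO` buffer as a stand-in for a device, so it can be run safely by anyone; the 512-byte sector size and the helper names are assumptions for the example, and real devices add details (alignment, permissions, ioctls) that are ignored here.

```python
import io

SECTOR = 512  # assumed sector size for this illustration


def read_sector(dev, n):
    """Seek straight to sector n and read it, skipping everything before."""
    dev.seek(n * SECTOR)
    return dev.read(SECTOR)


def write_sector(dev, n, data):
    """Overwrite sector n in place."""
    if len(data) != SECTOR:
        raise ValueError("data must be exactly one sector long")
    dev.seek(n * SECTOR)
    dev.write(data)


# An 8-sector, zero-filled "disk" held in memory.
disk = io.BytesIO(bytes(SECTOR * 8))
write_sector(disk, 5, b"\x42" * SECTOR)
```

After the write, sector 5 holds the new data and its neighbours are untouched; no other part of the "disk" had to be read or written to get there. That is the property that separates a block device from a stream such as a serial port.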
IDE is the most common type of disk used in PCs today. Each IDE controller actually supports two separate disks (hard drives, CD-ROM drives, etc.) which appear under Linux as separate block devices. Linux 2.4 has improved on Linux 2.2's support of IDE by more than doubling the number of IDE controllers allowed in a system to 10. (Previously, 4 was the maximum allowed.) This boosts Linux to a theoretical limit of 20 IDE devices. There have also been some changes to allow for better support for DVDs and CD-ROM changers. While it may not be ready for Linux 2.4, there is ongoing work to allow Linux to fully support rewritable CDs and DVDs in a transparent fashion. For the time being, however, these should be considered read-only under normal circumstances, although a previously formatted disk image can be copied out to the disk directly. And finally, Linux 2.3 has access to the UDMA features of many new hardware chipsets and can work better around the bugs present in some pieces of hardware.
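The arithmetic behind the 20-device limit can be spelled out with a short sketch. It assumes the conventional Linux naming, in which controllers are called ide0, ide1, and so on, and their disks take consecutive letters starting from hda; the article does not state the names, so treat them as background knowledge rather than something new in 2.4.

```python
import string


def ide_device_names(controllers=10, per_controller=2):
    """Map each controller to the block device names of its two disks."""
    letters = string.ascii_lowercase
    names = {}
    for c in range(controllers):
        first = c * per_controller
        names["ide%d" % c] = [
            "hd" + letters[first + unit] for unit in range(per_controller)
        ]
    return names


devices = ide_device_names()
total = sum(len(disks) for disks in devices.values())
```

Ten controllers with two disks each gives `total == 20`, running from hda and hdb on the first controller through hds and hdt on the last.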
The SCSI subsystem has advanced in Linux 2.4, the most obvious example being in the number of new SCSI controllers supported. The long awaited SCSI rewrite has not happened for Linux 2.4 although a major cleanup effort is underway.
One idea adopted from the commercial UNIX world into Linux is the concept of a "raw" I/O device. A raw device is one whose accesses are not handled through the caching layer and whose actions are immediately and always synchronous with the "hard" data on the disk or elsewhere. This idea has many enterprise uses as it allows Linux to better maintain data integrity in the case of a system failure for ultra-important data. Also, this capability has been exploited by database applications which feel that they can do a better caching job than the native filesystem. What kept this idea from being adopted before was that commercial UNIXes did not provide a scalable process to allocate and access these devices, rather they required that a "raw" device node be allocated for each and every block device on the system. After much thought and many rejected ideas, this functionality was finally allowed in by creating a pool of "raw" device nodes which then can be associated with any arbitrary block device. Thus, we need only have nodes allocated for the number of raw devices that we will be using at any one time.
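The pool approach can be modelled in a few lines. This is a toy sketch of the bookkeeping only: the class, its methods, and the device names are hypothetical, and no real device nodes are touched.

```python
class RawDevicePool:
    """A fixed pool of raw nodes, each bindable to any block device."""

    def __init__(self, size):
        # Node numbers start at 1; None means "not bound to anything".
        self._bindings = {node: None for node in range(1, size + 1)}

    def bind(self, block_device):
        """Attach the first free raw node to block_device and return it."""
        for node, bound in self._bindings.items():
            if bound is None:
                self._bindings[node] = block_device
                return node
        raise RuntimeError("no free raw device nodes")

    def unbind(self, node):
        """Release a raw node so it can be reused for another device."""
        self._bindings[node] = None

    def lookup(self, node):
        return self._bindings[node]


# Hypothetical device names, for illustration only.
pool = RawDevicePool(size=2)
first = pool.bind("/dev/sda1")
second = pool.bind("/dev/hdc")
```

The pool here has only two nodes no matter how many block devices the system has. A database that is finished with one volume can `unbind` its node and the same node can then be pointed at a different device, which is what makes the scheme scale where one raw node per block device did not.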
Linux 2.4 includes all of the new filesystems present in Linux 2.2. These filesystems include FAT (for MSDOS), NTFS (for Windows NT/2000), VFAT and FAT32 (for Windows 9x), HFS (for Macintoshes), and many, many others. All of these filesystems have been rewritten to some extent, sometimes a very large extent, to support the new page caching system and will be more efficient because of it. On the flip side however, binary-only filesystem modules designed for Linux 2.2 will not work with Linux 2.4. (Unlike some software firms, Linux does not generally provide for back-compatibility at the module level. Generally, open source modules can adapt quickly enough and binary module providers are expected to do the same or release the code.)
Some users will, however, notice major improvements that allow for better compatibility with other systems. OS/2 users will finally be able to both read and write to their disks under Linux. (This change has been a long time in coming.) NT users unfortunately don't yet have that luxury unless they wish to use an "experimental" driver which may lead to disk corruption under certain situations. Linux 2.4 will also include a couple of improvements designed to make it interoperate better with other UNIX-like operating systems. Key to this is Linux 2.4's upcoming support for the IRIX efs filesystem and the IRIX disklabel (partition table) format. Support for NextStep has also improved, as the UFS driver now supports its CDROMs.
Users who mount Windows shared drives via SMB (Server Message Block protocol) will be pleased that there will no longer be a compile time option for enabling workarounds for (released broken) Win9x systems. Instead, Linux will be able to detect what kind of system it is connecting to and enable bug fixes as needed. This will make Linux a considerably better option for heterogeneous networks. (This is an SMB client only; the popular Samba package can be used if server features or access to printers is desired.)
Of special importance to many Linux users is Linux's ability to mount the shared drives of UNIX operating systems. Linux 2.4 includes for the first time the ability to access NFS shares which conform to version 3 of the NFS protocol. NFS version 3 includes many advantages over previous versions and it has been one of Linux's most often requested features for the enterprise user.
There are still some pieces of support that are currently lacking in Linux 2.4. There is no support for journaling filesystems, for instance. Because of the relatively low fsck times and the ease of data recovery that journaling filesystems provide, this is considered by many to be an entrance requirement to the enterprise. HFS+, the successor to HFS and the filesystem used on some Macintosh disks, is not yet supported. Also not supported is the UDF format, the format commonly used on DVDs. It is hoped that these and other "missing" features will be completed before 2.4 is ready for release; however, there will be a code freeze coming soon.
Linux 2.4 includes a number of new drivers and improvements to old drivers. Especially important here is Linux's support for many more "standard" VGA cards and configurations, at least in some mode. (Probably less than optimally.) Please remember that this feature can be bypassed and (on i386) is only necessary for people with certain systems which cannot be supported in any other way. At this time, the XFree86 project provides many more drivers for many more video cards than the kernel can support, so it is not necessary to use this feature to get X Windows support. (SVGAlib and other libraries allow you to do direct video manipulation on supported hardware; however, these libraries must be used carefully as there are some security concerns.)
Character Devices - Keyboards, Mice, Consoles,
The biggest news on this front is that Linux 2.4 will support for the first time keyboards and mice attached to the Universal Serial Bus. When plugged in, these input devices will behave just as if they were "normal" keyboards and mice. Additionally, Linux will now work on more systems, including broken (or specially embedded) ones where the keyboard is not pre-initialized by the BIOS. Also, better support is provided for machines without keyboards in some cases. (Mostly for buggy machines that don't handle the lack of a keyboard as well as one would like.)
As much as it may not appear so, all versions of Linux output to the screen in character mode. (Linux supports a built-in extended vt100 interface to handle cursor positioning. This is done using a very small text-mode only frame-buffer device.) In the case of a frame-buffer, Linux 2.2 and later support overlaying the framebuffer driver with a terminal driver, providing features identical to (and sometimes even better than) the built-in text mode.
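Cursor positioning on a vt100-style console is done with short escape sequences embedded in the character stream. The sketch below builds two of the standard ones: `ESC [ row ; col H` to move the cursor and `ESC [ 2 J` to clear the screen. The helper names are hypothetical, and the functions only return strings, so nothing is printed unless you choose to.

```python
ESC = "\x1b"


def move_cursor(row, col):
    """Return the sequence that moves the cursor to a 1-based row/column."""
    return "%s[%d;%dH" % (ESC, row, col)


def clear_screen():
    """Return the sequence that erases the whole display."""
    return ESC + "[2J"


def draw_at(row, col, text):
    """Return text prefixed with the move that places it on screen."""
    return move_cursor(row, col) + text


banner = clear_screen() + draw_at(5, 10, "Linux 2.4")
```

Printing `banner` on a Linux console (or most terminal emulators) should clear the screen and place the text on row 5, column 10, all without leaving character mode.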
Linux 2.4 does not include many major changes to this subsystem; however, it does for the first time support redirecting the console (the primary display used for Linux kernel messages) to the parallel port, where it can drive a printer, for example. (Earlier versions of Linux already supported redirecting messages to serial ports.) This functionality will be of primary interest to some developers and server applications which want to maintain a hard copy of the kernel and debug messages that Linux emits.
Of course, Linux would not go far without excellent support for ports, the truest form of character device. These can generally be divided between serial and parallel varieties.
Serial support for Linux 2.4 has not changed much and many of the same limitations from 2.2 still apply. (In particular, setting module options is generally done with an external utility rather than the standard parameters passed to modules.) Later versions of Linux 2.2 and all versions of Linux 2.4 will allow one to share IRQs on PCI serial boards; previously this was only allowed on ISA cards and on-board serial ports. Some other pieces of multiport hardware will be better supported under Linux 2.2. More updates and new drivers are flowing in regularly.
In contrast, the parallel port subsystem has undergone some major overhauls since 2.2. There is now a generic parallel port driver for abstracted communication with "unknown" types of parallel devices. This could be used, for example, by programs that want to poll the parallel port for Plug-and-Play information as we described earlier. It is these changes that allow us the side-effect of being able to use the parallel port as a console. Also, Linux 2.4 supports using all the different modes of modern parallel ports, including writing to the parallel ports using DMA, if supported in the hardware. This will speed up access to printers and other parallel devices.
Infra-red support has progressed since Linux 2.2 and there have been many changes in this area, including better network support.
In a separate department, there has been some (but little) work since 2.2 on supporting so-called "WinModems" (or "soft modems" or "Linmodems"). These are modems which exist largely in software and whose drivers are often only provided by the manufacturer for Windows. (Hence the common name.) While no code has been submitted to Linus for the support of these beasts, it is possible that we may see some support for them before 3.0. One major obstacle here is that each and every WinModem is different; it is unlikely that a driver for one would be applicable to another, and the sheer number of different types of WinModems would make it difficult or impossible to ever get a decent selection of hardware supported. Impossible odds have never fazed open source developers, however, and I for one will not be surprised when the first driver makes it into the kernel, someday. Much of the legwork has already been completed.
There are some other places where some people feel that Linux 2.4 could improve, of course. With the addition of USB we have the chance to have multiple keyboards and mice attached to the same bus. Linux 2.4, however, does not have internal multi-heading of these devices; you cannot assign one keyboard and one mouse to one terminal and another set to a different terminal. Support for this is provided in the GGI project (a project to provide multi-heading, frame-buffering, and other features to the Linux kernel); however, this project's code has not yet been (and may never be) merged into the mainstream kernel. (It is, however, a good place to check if you need this functionality.)
Multimedia: Sound, TV, Radio, etc.
Networking and Protocols
The Linux model of network sockets is one which is standard across most UNIX variants. Unfortunately, the standard does have some deficiencies, but these deficiencies can be corrected without breaking the standard altogether. Under Linux 2.2 and previous versions, if you have a number of processes all waiting on an event from a network socket (a web server, for instance), they will all be woken up when activity is detected. So, for every web page request received, Linux would wake up a number of processes which would each try to get at the request. As it does not make sense for multiple processes to serve the same request, only one will get to the data; the remainder will notice that they don't have anything to process and fall back asleep. Linux is quite efficient at making this all happen as quickly as possible, however it is still very inefficient... but there is a better way. Linux 2.4 includes changes which implement "wake one", which will allow us to completely remove the "stampede effect". In short, "wake one" does exactly as its name indicates: it wakes up only one process in the case of activity. This will allow applications such as Apache to be even more efficient and make Linux an even better choice as a web server.
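The cost of the stampede is easy to count. The following sketch is a deliberately simple model rather than a benchmark: it assumes requests arrive one at a time while every waiting process is asleep, and the function name and parameters are hypothetical.

```python
def serve(requests, waiters, wake_one):
    """Count wakeups for `requests` events with `waiters` sleeping processes.

    Returns (total_wakeups, wasted_wakeups). A wakeup is "wasted" when the
    process finds nothing to do and goes straight back to sleep.
    """
    if waiters < 1:
        raise ValueError("need at least one waiting process")
    wakeups = 0
    wasted = 0
    for _ in range(requests):
        woken = 1 if wake_one else waiters
        wakeups += woken
        wasted += woken - 1  # everyone except the winner did no useful work
    return wakeups, wasted


stampede = serve(requests=100, waiters=8, wake_one=False)
polite = serve(requests=100, waiters=8, wake_one=True)
```

In this model, eight waiting processes and one hundred requests cost 800 wakeups under the old behaviour, 700 of them wasted, against exactly 100 wakeups and none wasted with "wake one". The waste grows with the number of waiting processes, which is why busy servers with large worker pools stand to gain the most.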
Linux 2.4 also includes a completely rewritten networking layer. In fact, it has been made as unserialized as possible so that it will scale far better than any previous version of Linux. In addition, it contains many optimizations to allow it to work with the particular quirks of the networking stacks in use in many common operating systems, including Windows. It should also be mentioned at this point that Linux is still the only operating system completely compatible with the IPv4 specification (Yes, IPv4) and Linux 2.4 boasts an IPv4 implementation that is much more scalable than its predecessor. As Linux 2.2 became completely compatible with the specification, the use of "colon mode" for aliasing was deprecated. This functionality was completely removed in Linux 2.4 and may require some advanced users to partially rewrite scripts.
Next to the new network layer, the next most important improvement in the Linux 2.4 network layer is the addition of code to handle the DECNet protocols. This allows for better interoperation with specialized Digital/Compaq systems.
For the low-end desktop users, PPP is an important part of day to day life. Linux 2.4 includes some major rewrites and modularization of much of the code, including a long awaited combination of the PPP layers from the ISDN layer and the PPP layer used on serial devices, such as modems.
All in all, I feel that Linux 2.4 will probably be known as the "desktop" Linux for all its new desktop features. I hope someone doesn't quote me on that as Linux 2.4 includes many features that are great for servers and embedded systems. At the heart of the matter, is there really a difference?
This is the final version. If you haven't commented on it by now and I missed something that you feel is important, please email me at email@example.com and maybe I'll put out a patch. :)