System initialization starts where the kernel bootup ends. The first program that the kernel runs is the init process. The init process reads the system initialization table (/etc/inittab) to see how various daemons need to be initialized and started. This section examines how system initialization works in Linux.
The various Linux distributions have adopted two system initialization styles. Red Hat and Debian use the System V initialization type. Others, such as Slackware, use the BSD style.
The Filesystem Hierarchy Standard (FHS) v2.0 for Linux states that either BSD- or System V-style initialization is acceptable. The standard stopped short, however, of outlining exactly where the rc scripts would go, except to say that they would be below /etc. Future revisions to the standard might provide further guidance.
Under system initialization, the biggest difference between BSD and System V is the init scripts. With BSD style, all daemons are started essentially by only a few scripts. For example, the init process in Slackware, which adopts the BSD style, runs the system script (/etc/rc.d/rc.S) to prepare the system. The rc.S file enables the system's virtual memory, mounts necessary file systems, cleans up certain log directories, initializes Plug and Play devices, loads kernel modules, configures PCMCIA devices, and sets up serial ports. The local script (rc.local) is available for system administrators to tailor to the specific system on which it is running. The system and local scripts can, in turn, call other scripts to accomplish their objectives, but they are called by the init process sequentially.
Many other Linux distributions make use of System V style instead of the BSD style. Unlike the BSD style, the System V scripts are independent, stand-alone initialization scripts. They make use of runlevels that correspond to different groups of processes or tasks to be executed. The scripts are run from runlevels 0 to 6 by default, even though several runlevels are not used. In other words, each runlevel is given a subdirectory for init scripts that allows for maximum flexibility in initializing the system and necessary daemons. The BSD style, by having only a few scripts to start everything, does not allow for the kind of flexibility System V brings. It does, however, make things easier to find.
It should be noted that, even though Slackware adopts the BSD-style system initialization, it does provide System V initialization compatibility. In fact, many Linux distributions use the same init binary, so the difference is not that great between various Linux distributions when it comes to system initialization.
Initialization Table (/etc/inittab)
As mentioned earlier, the system initialization table (/etc/inittab) specifies to the init process how to initialize and start various daemons during system bootup. Comments in /etc/inittab are preceded by # and are skipped over by the init process. Non-comment lines in /etc/inittab have the following format:
id:runlevel:action:process
* id is a unique identifier for the rest of the line. It is typically limited to two characters.
* runlevel can be null or contain a valid runlevel, which defines the state that the system will be running in. Runlevels are essentially groups of processes or actions to be executed during system initialization. They can be used as follows:
o Runlevel 0: System halt
o Runlevel 1: Maintenance mode (single-user mode)
o Runlevel 6: System reboot
o Runlevels 2-5: Can be customized
* action can be several different commands, the most common being respawn, but it can also be any one of the following: once, sysinit, boot, bootwait, wait, off, ondemand, initdefault, powerwait, powerfail, powerokwait, ctrlaltdel, or kbrequest.
* process is the specific process or program to be run.
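As an illustration, a minimal /etc/inittab might contain entries like the following. The script paths and getty program are typical of Red Hat-style systems and are shown only as examples; the exact names vary between distributions.

```
# Default runlevel is 3:
id:3:initdefault:
# Run the system initialization script once at boot:
si::sysinit:/etc/rc.d/rc.sysinit
# Enter runlevel 3; init waits for the rc script to finish:
l3:3:wait:/etc/rc.d/rc 3
# Respawn a getty on the first virtual console in runlevels 2 through 5:
1:2345:respawn:/sbin/mingetty tty1
# Trap Ctrl-Alt-Del:
ca::ctrlaltdel:/sbin/shutdown -t3 -r now
```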
Now let's see how the system initialization table (/etc/inittab) works in BSD and System V flavors. We'll use Slackware Linux as an example for BSD style and Red Hat for System V style.
BSD inittab (Slackware)
Slackware Linux uses the BSD-style file layout for its system initialization files. All of the system initialization files are stored in the /etc/rc.d directory. Remember, the /etc/rc.d/rc.S script is called by the init process to enable the system's virtual memory, mount necessary file systems, clean up certain log directories, initialize Plug and Play devices, and then call other scripts in the /etc/rc.d directory to complete other work. This includes loading kernel modules (/etc/rc.d/rc.modules), configuring PCMCIA devices (/etc/rc.d/rc.pcmcia), and setting up serial ports (/etc/rc.d/rc.serial).
After system initialization is complete, init moves on to runlevel initialization. As described previously, a runlevel describes the state in which your machine will be running. The files listed in Table 1.2 define the different runlevels in Slackware Linux.
Table 1.2. Runlevels in Slackware Linux
rc.0 Halt the system (runlevel 0). By default, this is symlinked to rc.6.
rc.4 Multiuser startup (runlevel 4), but in X11 with KDM, GDM, or XDM as the login manager.
rc.6 Reboot the system (runlevel 6).
rc.K Start up in single-user mode (runlevel 1).
rc.M Multiuser mode (runlevels 2 and 3), but with the standard text-based login. This is the default runlevel in Slackware.
Runlevels 2, 3, and 4 start up the network services if enabled. The files listed in Table 1.3 are responsible for the network initialization.
Table 1.3. Network Initialization
rc.inet1 Created by netconfig, this file is responsible for configuring the actual network interface.
rc.inet2 Runs after rc.inet1 and starts up basic network services.
rc.atalk Starts up AppleTalk services.
rc.httpd Starts up the Apache web server.
rc.samba Starts up Windows file- and print-sharing services.
rc.news Starts up the news server.
rc.sysvinit The rc.sysvinit script searches for any System V init scripts in /etc/rc.d and runs them if the runlevel is appropriate. This is useful for certain commercial software packages that install System V init scripts rather than BSD-style ones.
In addition to the rc scripts listed here, rc.local contains specific startup commands for a system. rc.local is empty after a fresh install because it is reserved for local administrators. This script is run after all other initialization has taken place.
Sys V inittab (Red Hat)
Under Red Hat, all the system initialization scripts are located in /etc/rc.d. Because Red Hat uses the System V style, the /etc/rc.d subdirectory has even more subdirectories, one for each runlevel: rc0.d to rc6.d and init.d. Within the /etc/rc.d/rc#.d subdirectories (where # is replaced by a single-digit number) are links to the master scripts stored in /etc/rc.d/init.d. The scripts in init.d take an argument of start, stop, reload, or restart.
The links in the /etc/rc.d/rc#.d directories all begin with either an S or a K for start or kill, respectively, a number that indicates a relative order for the scripts, and the script name (generally, the same name as the master script found in init.d to which it is linked). For example, S20lpd runs the script lpd in init.d with the argument start, which starts up the line-printer daemon.
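The naming convention can be sketched in a few lines of shell. This is an illustration of the parsing only, not code taken from the actual rc script:

```shell
# Example link name from the text above:
link=S20lpd

# An S prefix means "start", a K prefix means "kill" (stop):
case "$link" in
    S*) action=start ;;
    K*) action=stop ;;
esac

# Strip the S/K prefix and the two-digit sequence number to get the
# name of the master script in init.d:
script=${link#[SK][0-9][0-9]}

echo "$script $action"
```

Running this prints `lpd start`, which is exactly the invocation init performs against /etc/rc.d/init.d/lpd.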
The nice part about System V initialization is that it is easy for root to start, stop, restart, or reload a daemon or process subsystem from the command line, simply by calling the appropriate script in init.d with the argument start, stop, reload, or restart. For example, the script lpd can be called from the command line, as follows:
/etc/rc.d/init.d/lpd start
Red Hat defines the runlevels as follows (the default runlevel is 3):
* Runlevel 0: System halt
* Runlevel 1: Single-user mode
* Runlevel 2: Multiuser mode without NFS (the same as runlevel 3 if you do not use networking)
* Runlevel 3: Full multiuser mode
* Runlevel 4: Unused
* Runlevel 5: X11
* Runlevel 6: System reboot
When a script is invoked through one of these links rather than directly from the command line, the rc script derives the argument from the link name. For example, when running K20lpd, it runs the lpd init script with a stop argument. When init follows the entry in /etc/inittab to rc.d/rc3.d, it first runs all scripts that start with a K in numerical order from lowest to highest, then does the same for the S scripts. This ensures that the correct daemons are running in each runlevel and are stopped and started in the correct order. For example, sendmail or bind/named (the Berkeley DNS, or Domain Name System, daemon) cannot be started before networking is up. In contrast, the BSD-style Slackware starts networking early in the rc.M script, so you must always be cognizant of the order of the entries when modifying Slackware startup scripts.
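The ordering logic can be sketched as a small shell function. This is a simplification of what the real /etc/rc.d/rc script does, shown only to make the K-then-S ordering concrete:

```shell
# Run all kill scripts, then all start scripts, from a runlevel directory.
# Lexical glob order gives the lowest-numbered scripts first.
run_rc_scripts() {
    dir=$1
    # Kill scripts first, lowest number to highest:
    for script in "$dir"/K*; do
        [ -x "$script" ] && "$script" stop
    done
    # Then start scripts, likewise in order:
    for script in "$dir"/S*; do
        [ -x "$script" ] && "$script" start
    done
    return 0
}

# Example invocation for runlevel 3:
run_rc_scripts /etc/rc.d/rc3.d
```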
All of the initialization scripts are simple ASCII text files that can be easily modified with vi or any text editor. As noted earlier, many Linux distributions use the same init binary, so the difference is not that great between various Linux distributions when it comes to system initialization. In fact, symbolic links can be added to make a BSD-style initialization look like a System V-style initialization, and vice versa.
Tuesday, March 10, 2009
Linux Logging Facility
One of the key requirements of enterprise systems is to log pertinent events that happen on the system to aid in system management and post failure system debugging. Fortunately, Linux provides an excellent, fully configurable, and simple logging facility.
All Linux logs are in plain text, so any text tool can be used to view them, such as vi, tail, more, or less. A browser, such as Mozilla, can be used to display a log file and provide search capability. Scripts can also be written to scan through logs and perform automatic functions based on the contents.
The main location for Linux logs is the /var/log directory. This directory contains several log files maintained by the system, but other services and programs can put their log files here as well. Most log files require root privilege to read; this can be changed by loosening the access rights on the files, although doing so has security implications.
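For example, recent entries can be inspected with the standard text tools. The file name assumes a syslog-style setup, and the snippet degrades gracefully if the file is not readable:

```shell
logfile=/var/log/messages
if [ -r "$logfile" ]; then
    # Show the most recent entries, then count lines mentioning "error":
    tail -n 20 "$logfile"
    grep -ci error "$logfile" || true
else
    echo "$logfile is not readable on this system" >&2
fi
```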
/var/log/messages
The /var/log/messages log is the core system log file. It contains the boot messages when the system comes up as well as other status messages as the system runs. Errors with I/O subsystem, networking, and other general system errors are logged in this file. Messages from system services, such as DHCP servers, are also logged in this file. Messages indicating simple actions on the system, such as when someone becomes root, are also listed here.
/var/log/XFree86.0.log
The /var/log/XFree86.0.log shows the results of the last execution of the XFree86 X Window server. If there are problems getting the graphical mode to come up, this file usually provides an answer as to what is failing.
In addition to these two log files, there might be other log files in the /var/log directory that are maintained by other services and applications running on the system. For example, there might be log files associated with running a mail server, resource sharing, or automatic tasks.
Log Rotation
Log files can become large and cumbersome, especially on systems that have been running for long periods of time. To solve this problem, Linux provides a tool, logrotate, to rotate the logs so that the current log information does not get mixed up with older messages. The logrotate command can be run manually as needed, or it can be run automatically on a periodic basis. When executed, logrotate takes the current version of the log files and adds a sequence number to the end of the log filename. The larger the sequence number after the log filename, the older that file is. For example, messages.2 is older than messages.1, which is older than the current messages file. The automatic behavior for logrotate can be configured using the /etc/logrotate.conf file. More details are available on the logrotate man page.
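As an illustration, a logrotate stanza for /var/log/messages might look like the following. The rotation schedule and the postrotate command are examples, not the defaults of any particular distribution:

```
/var/log/messages {
    weekly
    rotate 4
    compress
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2>/dev/null` 2>/dev/null || true
    endscript
}
```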
In addition to /var/log/messages, dmesg provides a quick view of the kernel messages, which can be helpful when you want to know what happened during the last system boot.
Logger
The logger facility generates system log messages out of your own scripts and programs that are recognized and processed by the syslogd daemon. This lets you send messages to the log files without worrying about the format of the log files or whether the logging facility has been customized.
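For example, a script can wrap logger in a small helper. The tag and message below are illustrative, and the helper falls back to stderr if syslog cannot be reached:

```shell
# log_msg TAG MESSAGE: send a message to syslog via logger, or to stderr
# if logger is unavailable (for example, no /dev/log socket).
log_msg() {
    tag=$1
    msg=$2
    if logger -t "$tag" "$msg" 2>/dev/null; then
        echo "sent to syslog: $tag: $msg"
    else
        echo "$tag: $msg" >&2
    fi
}

log_msg backup.sh "nightly backup finished"
```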
Customized Logging
The Linux logging facility consists of two daemons: klogd for kernel messages and syslogd for user-space messages. These daemons can be configured through the /etc/syslog.conf and /etc/sysconfig/syslog files. You can edit /etc/syslog.conf to specify what you want to do with a particular type of message. For example, you can specify that critical kernel messages be sent to a remote host for security reasons.
Here is an example of customized logging taken from the /etc/syslog.conf man page:
kern.* /var/adm/kernel
kern.crit @finlandia
kern.crit /dev/console
kern.info;kern.!err /var/adm/kernel-info
The first statement directs any message from the kernel to the file /var/adm/kernel.
The second statement directs all kernel messages of the priority crit and higher to the remote host finlandia. Sending critical log messages to the remote host can help prevent malicious users from modifying the message log files on the local system to cover their tracks. It can also be useful in the event the local system crashes and the disks get irreparable errors.
The third statement directs these messages to the actual console, so the person who works on the console will see them.
The fourth line tells the syslogd to save all kernel messages that come with priorities from info up to warning in the /var/adm/kernel-info file. Everything from err and higher priority is excluded.
The ability to customize logging like this provides a great deal of flexibility and control over the Linux environment.
Configurable 2.6 Kernel Features
Two new features available in the 2.6 kernel merit serious consideration because they can impact Linux system performance for some workloads: I/O elevators and huge TLB (Translation Look-aside Buffer) page support. These features must be explicitly enabled (for example, in the kernel configuration file or on the kernel boot command line).
I/O Elevators
An elevator is a queue in which I/O requests are ordered as a function of their sector on disk. Two I/O elevators are available in the 2.6 kernel: anticipatory and deadline. The default is anticipatory. In anticipatory mode, synchronous read operations are scheduled together with a delay of a few milliseconds, in anticipation of the "next" read operation. This mode should help read performance for commands that require multiple synchronous reads, especially during streamed writes. However, workloads that seek all over the disk while performing reads and synchronous writes, such as database operations, can actually suffer with anticipatory I/O. For these workloads, the deadline I/O scheduler is better and can deliver up to a 10% performance improvement over the anticipatory scheduler. Select the deadline I/O scheduler by booting with elevator=deadline on the kernel command line.
Huge TLB Page Support
Huge TLB page support is present in the 2.6 kernel as well as in the latest 2.4 kernel-based Linux distributions, such as RHEL 3. The TLB is the processor's cache of virtual-to-physical memory address translations. As such, the number of TLB entries is very limited, and a TLB miss is very costly in terms of processor cycles. With huge TLB page support, each dedicated, large TLB entry can map a 2MB or 4MB page, reducing the number of TLB misses; this can increase performance by a few percent for database operations. This becomes even more important as systems with gigabytes of physical memory become common. Huge pages are reserved inside the kernel, are mapped by dedicated, large TLB entries, and are not pageable, making them very attractive for large database applications. A user application can use these pages either via the mmap system call or via shared memory system calls. Huge pages must be preallocated by the superuser (for example, the system administrator) before they can be used, preferably during system initialization while large contiguous memory blocks are still available. More specifically, to use huge TLB page support on the 2.6 kernel, you need to consider the following:
* The kernel must be built with the CONFIG_HUGETLB_PAGE (under the Processor types and features section of the kernel configuration file) and CONFIG_HUGETLBFS (under the File systems section) configuration options.
* /proc/meminfo should be able to show the huge page size supported, the total number of huge pages in the system, and the number of huge pages that are still available.
* /proc/filesystems should also show a file system of type hugetlbfs configured in the kernel.
* /proc/sys/vm/nr_hugepages indicates the current number of configured huge pages in the system.
* Huge pages can be preallocated by using the following command:
echo x >/proc/sys/vm/nr_hugepages
where x is the number of huge pages to be preallocated. This command can be inserted into one of the local rc initialization files so it can be executed during system initialization. (On RHEL 3 systems that are based on the 2.4 kernel technology, this can be done by echoing the value in megabytes into /proc/sys/vm/hugetlb_pool or by putting the value in the /etc/sysctl.conf file.)
* To use huge pages via mmap system calls, the superuser must first mount a file system of type hugetlbfs on the directory /mnt/huge. Any files created on /mnt/huge will use huge pages. User applications can use mmap system calls to request huge pages.
* To use huge pages via shared memory system calls (shmat / shmget), there is no need to mount hugetlbfs. However, it is possible for the same user application to use any combination of mmap and shared memory system calls to use huge pages.
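The read-only checks from the list above can be combined into a quick status script. It needs no root privileges and only reads the /proc interfaces already described:

```shell
# Report huge-page status on a 2.6-style kernel (read-only, no root needed).
grep -i huge /proc/meminfo || echo "no huge-page counters found"

if [ -r /proc/sys/vm/nr_hugepages ]; then
    # Number of huge pages currently configured:
    cat /proc/sys/vm/nr_hugepages
fi

if grep -q hugetlbfs /proc/filesystems; then
    echo "hugetlbfs is available"
else
    echo "hugetlbfs is not configured in this kernel"
fi
```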
It should be noted that the use of huge pages is most effective on large memory systems. Using huge pages on systems with limited physical memory can adversely affect system performance because huge pages, if they can be allocated, must be physically contiguous and are not pageable, thus making memory swapping ineffective.
In the event of a problem with your system, Linux can log the event, which makes for better system management.
(taken from Linux Server Performance Tuning)
Preinstallation Planning
Before installing Linux on the system, there are several things worth considering that might help optimize the performance of the operating system and the applications that run on it later. These areas include the following:
* Placing partitions
* Using multiple hard drives
* Selecting file systems
* Converting file systems
* Configuring RAID
Partition Placement
At a minimum, Linux requires a root and a swap partition. Where these and other frequently accessed partitions reside on disks ultimately impacts system performance. Following are some of the recommendations for placement of the root, swap, and other frequently accessed partitions that take advantage of the disk geometry:
* Use separate partitions for root, swap, /var, /usr, and /home.
* Most drives today pack more sectors on the outer tracks of the hard drive platter than on the inner tracks, so it's much faster to read and write data from the outer tracks. Lower-numbered partitions are usually allocated at the outer tracks (for example, /dev/hda1 is closer to the drive's outer edge than /dev/hda3), so place partitions that require frequent access first.
* The first partition should be the swap partition (to optimize memory swap operations).
* The next partition should be /var because log entries are frequently written to /var/log.
* The next partition should be /usr, because base system utilities and commands are placed in /usr.
* The root and /home partitions can reside near the end of the drive.
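Put together, the recommendations above might translate into a layout like the following on a single IDE drive. The device names, sizes, and ordering are illustrative only:

```
/dev/hda1   swap    (outermost tracks: fastest access)
/dev/hda2   /var    (frequently written logs)
/dev/hda3   /usr    (frequently read binaries)
/dev/hda4   extended partition containing:
  /dev/hda5   /     (root)
  /dev/hda6   /home
```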
Now that we have considered how best to place the most frequently used partitions on a hard drive, we will look at how to take advantage of multiple hard drives, if you have more than one in your system.
Using Multiple Hard Drives
Most systems today have more than one hard drive. If your system has only one drive, and if performance is really important to you (which is why you are reading this book in the first place!), you may need to seriously consider adding more drives to your system to improve performance. To take full advantage of multiple drives, you'll need to do the following:
* Place frequently accessed partitions on the faster drives.
* If the drives are relatively equal in performance, place frequently used partitions on alternate drives. For example, place /var on one drive and /usr on another drive. The swap partition should be on its own drive.
* Consider using RAID if you have multiple drives with relatively equal performance. (This will be discussed in more detail later.)
* Place each drive as the master device on its own I/O channel (for example, IDE) to maximize bus throughput. You will need to modify the file system table (/etc/fstab) after moving drives across I/O channels because the device name will change. If the drive contains the root or /boot partition, you need to edit the grub /boot/grub/menu.lst file as well.
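For example, after spreading partitions across two drives (here hda and hdc, each a master on its own IDE channel), the /etc/fstab entries might look like the following. The device names, file system types, and options are illustrative:

```
/dev/hda1   /       ext3    defaults    1 1
/dev/hda2   /usr    ext3    defaults    1 2
/dev/hdc1   swap    swap    defaults    0 0
/dev/hdc2   /var    ext3    defaults    1 2
```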
When using multiple hard drives, you need to make some decisions in modifying the file system table. In the next section, we'll discuss selecting file systems.
Selecting File Systems
In addition to the original ext2 file system, new enterprise Linux distributions, such as RHEL 3, RHEL 4, and SLES 9, also support journaled file system technology, such as ext3 and ReiserFS. XFS is also included in several Linux distributions but may not be fully supported. Table 1.1 shows the general advantages and disadvantages of each type of file system.
Table 1.1. File System Types
File System Type    Comment
ext3                Easy to upgrade from an existing ext2 file system
ReiserFS            Best performance with small files; fully supported by major enterprise distributions
XFS                 Best performance, especially with large files
Some Linux distributions, such as Red Hat and SUSE, also include the IBM JFS (Journaled File System), which is designed for high-performance e-commerce file servers and is used on many IBM enterprise servers supporting high-speed corporate intranets. The selection of file system(s) ultimately depends on the role and the expected workload the system is supposed to handle. Careful planning before installation is highly recommended. Making the right decisions during installation can save you headaches later on.
Several mkfs and mount options might yield file system performance improvements under specific circumstances. See Chapter 11, "File System Tuning," for a complete discussion of tuning file systems for improved performance on Linux.
* Placing partitions
* Using multiple hard drives
* Selecting file systems
* Converting file systems
* Configuring RAID
Partition Placement
At a minimum, Linux requires a root and a swap partition. Where these and other frequently accessed partitions reside on disks ultimately impacts system performance. Following are some of the recommendations for placement of the root, swap, and other frequently accessed partitions that take advantage of the disk geometry:
* Use separate partitions for root, swap, /var, /usr, and /home.
* Most drives today pack more sectors on the outer tracks of the hard drive platter than on the inner tracks, so it's much faster to read and write data from the outer tracks. Lower-numbered partitions are usually allocated at the outer tracks (for example, /dev/hda1 is closer to the drive's outer edge than /dev/hda3), so place partitions that require frequent access first.
* The first partition should be the swap partition (to optimize memory swap operations).
* The next partition should be /var because log entries are frequently written to /var/log.
* The next partition should be /usr, because base system utilities and commands are placed in /usr.
* The root and /home partitions can reside near the end of the drive.
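Taken together, these placement rules map onto a file system table like the following sketch. The device names, sizes, and the ext3 choice are illustrative only, and /dev/hda4 is assumed to be the extended partition holding hda5 and hda6:

```
/dev/hda1   none    swap   sw         0 0    # swap on the outermost tracks
/dev/hda2   /var    ext3   defaults   1 2    # frequent writes to /var/log
/dev/hda3   /usr    ext3   defaults   1 2    # base utilities and commands
/dev/hda5   /       ext3   defaults   1 1    # root near the end of the drive
/dev/hda6   /home   ext3   defaults   1 2
```

Lower partition numbers sit closer to the drive's outer edge, so the entries run from the fastest region to the slowest.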
Now that we have considered how best to place the most frequently used partitions on a hard drive, we will look at how to take advantage of multiple hard drives, if you have more than one in your system.
Using Multiple Hard Drives
Most systems today have more than one hard drive. If your system has only one drive, and if performance is really important to you (which is why you are reading this book in the first place!), you may need to seriously consider adding more drives to your system to improve performance. To take full advantage of multiple drives, you'll need to do the following:
* Place frequently accessed partitions on the faster drives.
* If the drives are relatively equal in performance, place frequently used partitions on alternate drives. For example, place /var on one drive and /usr on another drive. The swap partition should be on its own drive.
* Consider using RAID if you have multiple drives with relatively equal performance. (This will be discussed in more detail later.)
* Place each drive as the master device on its own I/O channel (for example, IDE) to maximize bus throughput. You will need to modify the file system table (/etc/fstab) after moving drives across I/O channels because the device name will change. If the drive contains the root or /boot partition, you need to edit the GRUB configuration file (/boot/grub/menu.lst) as well.
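As a sketch of the fstab edit mentioned in the last bullet: suppose, hypothetically, that moving a drive across channels renames it from /dev/hdb to /dev/hdc. The rename can be applied to every matching entry with sed; the example below works on a scratch copy rather than the live /etc/fstab:

```shell
# Build a scratch copy of a hypothetical file system table.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/hdb1  /var  ext3  defaults  1 2
/dev/hdb2  /usr  ext3  defaults  1 2
EOF

# The drive moved I/O channels: rewrite /dev/hdb* references as /dev/hdc*.
sed -i 's|^/dev/hdb|/dev/hdc|' "$fstab"
cat "$fstab"
```

Double-check the result before rebooting; a wrong device name for the root partition can leave the system unbootable.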
When using multiple hard drives, you need to make some decisions in modifying the file system table. In the next section, we'll discuss selecting file systems.
Selecting File Systems
In addition to the original ext2 file system, new enterprise Linux distributions, such as RHEL 3, RHEL 4, and SLES 9, also support journaled file system technology, such as ext3 and ReiserFS. XFS is also included in several Linux distributions but may not be fully supported. Table 1.1 shows the general advantages and disadvantages of each type of file system.
Table 1.1. File System Types
File System Type    Comment
ext3                Easy to upgrade from existing ext2 file system
ReiserFS            Best performance with small files; fully supported by major enterprise distributions
XFS                 Best performance, especially with large files
Some Linux distributions, such as Red Hat and SUSE, also include the IBM JFS (Journaled File System), which is designed for high-performance e-commerce file servers and is used on many IBM enterprise servers supporting high-speed corporate intranets. The selection of file system(s) ultimately depends on the role and the expected workload the system is supposed to handle. Careful planning before installation is highly recommended. Making the right decisions during installation can save you headaches later on.
Several mkfs and mount options might yield file system performance improvements under specific circumstances. See Chapter 11, "File System Tuning," for a complete discussion of tuning file systems for improved performance on Linux.
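As one hedged example of the kind of option Chapter 11 discusses: the noatime mount option stops the kernel from updating a file's access time on every read, which can reduce write traffic on read-heavy partitions. In /etc/fstab it would look like this (hypothetical entry):

```
/dev/hda3   /usr   ext3   defaults,noatime   1 2
```

Whether noatime is appropriate depends on whether any of your applications rely on access times.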
Configuring RAID
RAID (Redundant Array of Inexpensive Disks) lets you configure multiple physical disks into a single virtual disk, thereby taking advantage of multiple disks and I/O channels working in parallel on a disk I/O operation. Many Linux distributions, especially enterprise versions, now provide RAID support. The easiest way to configure RAID is during installation. However, RAID can be configured on a preinstalled system as well. Here's how:
1. If new partitions are created, modify /etc/fstab appropriately. If the root and /boot partitions are on these new partitions, modify the /boot/grub/menu.lst file accordingly.
2. If existing partitions are combined to create a new RAID partition:
* Verify that the raidtools package is present:
mkraid -V
* Verify that RAID support is compiled into the kernel:
cat /proc/mdstat
* Create or modify /etc/raidtab. Create an entry like the following for each of the RAID devices:
# Create RAID device md0
raiddev /dev/md0              # new RAID device
raid-level 0                  # RAID 0 as an example here
nr-raid-disks 2               # assume two disks
persistent-superblock 1       # autodetect RAID devices on boot
chunk-size 32                 # write 32 KB of data to each disk
device /dev/hda1
raid-disk 0
device /dev/hdc1
raid-disk 1
Large chunk sizes are better when working with larger files; smaller chunk sizes are more suitable for working with smaller files.
3. Create the RAID device:
mkraid /dev/md0
4. View the status of the RAID devices:
cat /proc/mdstat
5. Format the RAID device with the ReiserFS file system, for example:
mkreiserfs /dev/md0
6. Modify /etc/fstab to indicate which partition(s) are on the RAID device. For example, to have /usr on the RAID device /dev/md0, make sure that the /usr line in /etc/fstab points to /dev/md0.
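The chunk-size value in the raidtab example above can be sanity-checked with simple arithmetic; this is plain shell arithmetic, not a raidtools command:

```shell
# Full-stripe size for the example array: 32 KB chunks across 2 disks.
chunk_kb=32
disks=2
stripe_kb=$((chunk_kb * disks))
echo "full stripe: ${stripe_kb} KB"   # prints: full stripe: 64 KB
```

A sequential write of at least one full stripe (64 KB here) keeps both disks busy at once, while a 4 KB write lands on a single disk, which is why larger chunks favor larger files.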
To squeeze the most performance out of the disk I/O subsystem, make sure that DMA and 32-bit transfers are enabled. This can be done via the hdparm utility, as follows (all commands are examples only):
1. Verify that DMA is enabled:
hdparm -d /dev/hda
2. If DMA is not enabled, enable it by issuing the following command:
hdparm -d1 /dev/hda
3. Verify that 32-bit transfers are enabled:
hdparm -c /dev/hda
4. If 32-bit transfers are not enabled, enable them by issuing the following command:
hdparm -c1 /dev/hda
5. Verify the effectiveness of the options by running simple disk read tests as follows:
hdparm -Tt /dev/hda
Next, we look at converting an existing file system to one of the journaled file systems.
Converting File Systems
If the existing file system is ext2, converting it to ext3 can be done using the tune2fs command. For example, if you want to convert the existing ext2 partition /dev/hda1 to ext3, issue the following command:
tune2fs -j /dev/hda1
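After tune2fs adds the journal, the partition's entry in /etc/fstab should be changed from ext2 to ext3 so that the journaling driver is used at mount time. A sketch of that one-field edit against a scratch copy of a hypothetical entry, not the live file:

```shell
# Scratch copy of a hypothetical fstab entry for the converted partition.
fstab=$(mktemp)
echo '/dev/hda1  /home  ext2  defaults  1 2' > "$fstab"

# Flip the file system type field from ext2 to ext3.
sed -i 's/\bext2\b/ext3/' "$fstab"
cat "$fstab"   # prints: /dev/hda1  /home  ext3  defaults  1 2
```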
Converting to a file system type other than ext3 is more time-consuming. For example, to convert /usr, which is on /dev/hdb2, to ReiserFS, do the following:
1. Choose an empty partition that is larger than /dev/hdb2 (say, /dev/hdb3) as a temporary partition.
2. Format the temporary partition:
mkreiserfs /dev/hdb3
3. Create a temporary directory and mount the temporary partition on it:
mkdir /mnt/tempfs
mount /dev/hdb3 /mnt/tempfs
4. Copy the contents of /usr to the temporary partition:
cp --preserve=all -R /usr/. /mnt/tempfs
5. Unmount /usr:
umount /usr
6. Unmount /mnt/tempfs:
umount /mnt/tempfs
7. Mount /usr on /dev/hdb3:
mount /dev/hdb3 /usr
8. Reformat the old /usr partition and mount it on the temporary directory:
mkreiserfs /dev/hdb2
mount /dev/hdb2 /mnt/tempfs
9. Copy the contents of /usr back to its original partition:
cp --preserve=all -R /usr/. /mnt/tempfs
10. Unmount /usr:
umount /usr
11. Unmount /mnt/tempfs:
umount /mnt/tempfs
12. Remount /usr on its original partition:
mount /dev/hdb2 /usr
Update the /usr entry in /etc/fstab to use the reiserfs file system type, and repeat this process for other directories you want to convert.
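One subtlety in the copy steps above: the exact source path decides whether cp copies a directory's contents into the destination or creates a nested subdirectory there. A minimal demonstration with GNU cp, using scratch directories:

```shell
# Two throwaway directories stand in for /usr and the temporary partition.
src=$(mktemp -d)
dst=$(mktemp -d)
echo data > "$src/file"

# A trailing "/." copies the contents of src directly into dst.
cp --preserve=all -R "$src/." "$dst"
ls "$dst"   # prints: file
```

Naming the directory itself (cp -R "$src" "$dst") would instead create a subdirectory inside the destination, which is not what the conversion procedure wants.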
So far, we have discussed how best to set up the disk I/O subsystem for optimal performance. Next, we need to look at two key configurable kernel features available in the 2.6 kernel. These features can impact performance for some application workloads.
Wednesday, November 19, 2008
fedora10
What is bosang?
What is borim?
Ahh, how long until those phrases disappear from this world...
Hmm, I had better dig up some info on Fedora 10. Besides, in 7 more days Fedora 10 reaches its General Availability release..
Only then can I upgrade F9 to F10...
For more info, click [at] fedora10 Task
I feel sort of addicted to Fedora.. Fedora.. I dream of Fedora.. aha ha..