Apache tuning : Beyond the configuration files

When we think of Apache tuning, our focus goes straight to httpd.conf: how do I tune the configuration file to achieve that perfect setting? That is the wrong approach.

An inverted pyramid approach is the best way to analyze the factors that get maximum performance out of your web server. Many factors affect the performance of a server application, and fine-tuning the configuration file should come only after considering the external factors that affect it.

Processor/CPU :

If you install a default configuration of Apache on a 386 and on a dual-core P4, the performance difference is huge. Therefore, whenever possible, give your server the best equipment to work with. If the cost model permits, run Apache on a dedicated server; running it alongside other applications will affect its performance. Some control panels allow you to adopt this kind of clustered model.

RAM :

Having considered the processor, the next factor that stands out is RAM. RAM is precious; it is the real estate of the server. Every running process is battling to use RAM in order to complete its task. So, if you have many static pages to serve, performance is higher if you allow Apache to cache them in RAM, thereby avoiding repeated reads from the hard disk. Disk or RAM, caching definitely helps performance by avoiding repeated work on every request. Use disk-based caching through mod_cache if you are short of RAM, or mod_mem_cache if you have enough RAM to play with.
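
As a rough sketch of what this looks like in an Apache 2.2-era configuration (mod_mem_cache no longer exists in Apache 2.4), something like the following could be used. The module file names, the cache path and all the sizes are assumptions for illustration, not recommendations:

```apache
# Apache 2.2-era sketch; module file names and paths are assumptions.
LoadModule cache_module modules/mod_cache.so

# Option 1: RAM to spare, keep cached objects in memory.
LoadModule mem_cache_module modules/mod_mem_cache.so
<IfModule mod_mem_cache.c>
    CacheEnable mem /
    # Total cache size, in KBytes.
    MCacheSize 4096
    MCacheMaxObjectCount 100
    # Object size limits, in bytes.
    MCacheMinObjectSize 1
    MCacheMaxObjectSize 2048
</IfModule>

# Option 2: short of RAM, cache on disk instead (use one option, not both).
#LoadModule disk_cache_module modules/mod_disk_cache.so
#<IfModule mod_disk_cache.c>
#    CacheRoot /var/cache/apache2/mod_disk_cache
#    CacheEnable disk /
#    CacheDirLevels 5
#    CacheDirLength 3
#</IfModule>
```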

Hard drive :

The hard drive is next on the list. We want as fast a disk as possible and, where possible, hardware RAID, to improve the speed at which files are served. In short, the entire hardware stack contributes to how quickly and efficiently Apache handles requests.

Nature of files served :

We need to keep in mind that the type of files served is a determining factor. If Apache needs to do a lot of thinking before a file is served, you are asking it to use more CPU and memory. In other words, the more dynamic your web pages, the more resource intensive they are. So, consider what is really essential before deciding on the trade-off.

Server environment :

Even if you run Apache on a dedicated server, plenty of other applications come along with a default installation. Be sure to disable them. For instance, NFS, the print server and so on are generally enabled by default. Each of these small processes takes up memory, and together they can eventually run the server out of RAM. Disable every single process that you don’t require to serve the web pages.

Don’t use the system for anything else!

Yes, don’t log in to the system unless it is really essential. Use automated scripts to alert you to changes happening on the server, and choose the interval for such checks carefully. Avoid logging in frequently, test-compiling new applications, running other tests that involve editing or moving files around, or checking the server load every so often. All these activities affect web serving performance. If necessary, use remote monitoring plugins and set an interval for the checks that does not compromise the server's resources.

Keep the software versions up to date :

Newer stable versions of any software are more likely to perform better. So, whenever possible, upgrade your server to the latest stable version, with proper planning. Most control panels that automate hosting apply these updates automatically. Still, it is always good to cross-check.

Apache application environment :

You make compromises on which modules to include when you first compile Apache. These compromises can prove costly in the long run if the modules are not carefully selected. Many custom scripts that automate the build simply configure it with default settings. It is extremely important to understand the environment in which the web pages are going to be served before deciding which modules should be compiled into the initial build.

An important decision during the build is whether to compile a given module as static or as dynamic (shared). Dynamic is always more convenient, but a shared module has to be loaded and its symbols resolved when the server starts, and on some platforms it also runs slightly slower than the same code compiled in. This is a small performance hit. So, if a module is commonly and frequently used, compiling it as static makes more sense. In all other cases, compile the module as shared.

However, choose the static modules carefully, as they eat into the server's RAM. When multiple child processes run, each carrying the same modules, consider the total amount of memory used. Even though statically compiled modules are faster, you should enable them carefully so that they don’t create a RAM crunch, which may keep requests waiting in a queue and eventually degrade performance.
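
In the configuration file, the difference shows up roughly like this. It is only a sketch: mod_rewrite and mod_deflate are picked as examples of modules built as shared, and the module paths are assumptions:

```apache
# Shared (DSO) modules are loaded at startup with LoadModule.
# mod_rewrite and mod_deflate are only examples here.
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule deflate_module modules/mod_deflate.so

# Statically compiled modules need no LoadModule line.
# Running "httpd -l" lists the modules compiled into the binary.
```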

Configuring the httpd.conf file :

a) Remove everything you don’t need!

Yes, it is normal practice to leave everything we don’t use in the file as comments. Not only is this distracting and confusing, it also adds unused data to the file, which works against the principle of optimization.

b) Make the configuration simple

In trying to achieve high performance, we tend to add a host of features without checking whether the settings conflict with each other. These eventually cost additional CPU cycles and affect performance.

c) MaxClients :

The MaxClients setting determines how many simultaneous connections Apache can handle. On a high-traffic server, a MaxClients value well below what the hardware can handle will definitely hurt the performance of your web server: it leaves many requests waiting in a queue, and if the processor and RAM could serve those requests simultaneously, they should be allowed to. Study the maximum number of simultaneous requests you expect before changing this setting.
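
A minimal sketch for the prefork MPM follows, assuming a server with about 2000 MB of RAM free for Apache and about 20 MB used per child process. Both figures are made up for illustration and should be measured on the real machine. From Apache 2.4 onwards the directive is called MaxRequestWorkers, with the old name still accepted:

```apache
# Prefork MPM sketch. Rough sizing, with assumed numbers:
#   about 2000 MB of RAM free for Apache / about 20 MB per child = 100 children
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          100
    MaxRequestsPerChild   0
</IfModule>
```

Setting the value higher than the RAM can support pushes the server into swap, which is worse than making a few requests wait.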

d) ServerSignature Off, Server status and info :

Disable these features unless you are monitoring, deploying applications or debugging an error, as serving these requests adds overhead to your server.
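
As a sketch, the relevant lines could look like this. The commented-out block uses Apache 2.2 access-control syntax and the localhost restriction is an assumption about how you would want to expose it while debugging:

```apache
ServerSignature Off
ExtendedStatus Off

# Leave the status and info handlers out unless you are actively using them.
# If enabled, they look roughly like this (Apache 2.2 access syntax):
#<Location /server-status>
#    SetHandler server-status
#    Order deny,allow
#    Deny from all
#    Allow from 127.0.0.1
#</Location>
```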

e) HostnameLookups :

HostnameLookups slows down the server, because it makes Apache look up the hostname for the client's IP address each time the client requests a page. It is not necessary: even with it disabled, you always have the option of processing the logs later to get this information. Turn it off with the "HostnameLookups Off" directive.
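
In the configuration file this is a single line:

```apache
# Log client IP addresses only; no DNS lookup per request.
HostnameLookups Off
```

The logresolve utility that ships with Apache can then resolve the addresses offline, reading a log on standard input and writing the resolved version to standard output.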

f) FollowSymLinks :

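This one is easy to get backwards. If FollowSymLinks is not set for a directory, or if SymLinksIfOwnerMatch is set, Apache has to make extra system calls to check every component of the requested path for symlinks. With FollowSymLinks turned on, it can skip those checks. So, unless you have a security reason to restrict symlinks, enable FollowSymLinks for the web directory and avoid SymLinksIfOwnerMatch.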
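
A minimal sketch of the symlink option for a document root, assuming /var/www/html is where the pages live. The comment states the behaviour as described in the Apache performance notes:

```apache
# /var/www/html is an assumed document root.
<Directory "/var/www/html">
    # With FollowSymLinks set and SymLinksIfOwnerMatch not set, Apache can
    # skip the per-component lstat() checks on the requested path.
    Options FollowSymLinks
</Directory>
```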

g) Custom settings for DirectoryIndex :

This decides which page is displayed as the index page, in order of priority, when someone accesses a web directory. As far as possible, avoid fancy custom index pages that override the main configuration. Let the server use a short, explicit list; index.html and index.php are the standard.
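
For example, a short explicit list in the main configuration, assuming the site uses only these two index file names:

```apache
# Checked in order, left to right; keep the list short.
DirectoryIndex index.html index.php
```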

h) Put all CGI components into one single directory and configure that directory as the CGI directory, so that Apache does not spend time determining how each file should be handled.
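
One way to sketch this is with ScriptAlias, which tells Apache that everything under the directory is a CGI program. The paths here are assumptions and should match the local layout:

```apache
# Paths are assumptions; adjust to the local layout.
ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"
<Directory "/usr/local/apache2/cgi-bin">
    AllowOverride None
    Options None
</Directory>
```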

Server Logs :

This is an area that takes a lot of disk writes. Disabling logging is not recommended, but you will be surprised at the performance gain when you do. For security reasons you may not be able to do this, so at least make sure you have disabled hostname lookups.
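
If turning logging off entirely is not an option, a possible middle path is to log less. The sketch below keeps the access log but skips requests for images. The file pattern, the log path and the "dontlog" variable name are assumptions for illustration:

```apache
# Standard Common Log Format, defined here so the fragment is self-contained.
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Mark requests for images, then leave them out of the access log.
SetEnvIf Request_URI "\.(gif|jpe?g|png)$" dontlog
CustomLog logs/access_log common env=!dontlog
```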

.htaccess files :

.htaccess gives everyone a convenient way to make changes specific to a user account. It has a significant drawback, though. On each request, the server has to check for an .htaccess file in the requested directory and in all of its parent directories, and then work out which settings override which. This is a time-consuming process and it will definitely affect performance. Therefore, as far as possible, put the changes you require in the main configuration file itself.

.htaccess can be disabled by using AllowOverride None in the main configuration file; the directive's name speaks for itself.
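
A minimal sketch, assuming /var/www/html is the document root and that the Options line stands in for whatever used to live in .htaccess:

```apache
# Stop Apache from looking for .htaccess files anywhere on the filesystem.
<Directory />
    AllowOverride None
</Directory>

# Settings that used to live in .htaccess go here instead.
<Directory "/var/www/html">
    AllowOverride None
    Options FollowSymLinks
</Directory>
```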

I think the list above covers most areas that affect server performance. When it comes to the Apache server itself, the main reasons for slowness are, unfortunately, often factors outside this configuration. In most cases, Apache spends its time waiting because a script or some dynamic content that is supposed to produce the data is taking time to execute. So changes such as moving from CGI to mod_perl speed up execution and can improve performance drastically, by as much as 70%. It is interesting to note that once the data is ready, it takes only milliseconds for Apache to serve the request :)
