I’m sure even ESRI would admit that cache generation in ArcGIS 9.3.1 is a bit of a dark art, and has a tendency to be unreliable at the best of times.

No doubt there are numerous enhancements to the caching process in versions 10 and beyond of ArcGIS, but many organisations are still in version 9 land.

After a fair amount of trial and error, I have finally gotten around to compiling a small best practice guide to making the caching process in ArcGIS 9.3.1 as smooth as possible.

Always Cache an Extent

Unless you are working at very small scales (1:1,000,000 and beyond), you should always use a feature class to define the areas of your map to cache. Even at 1:1,000,000, you will waste a lot of time caching ocean if you are building a world service.

It makes most sense to keep a few different update extent feature classes for working at different scales, to streamline this process as much as possible. However, do not make your features too complex; simple rectangles are best.

Also beware of using multiple polygons in a feature class to define your cache extent if you are caching at scales in the millions. For example, if you used the Ordnance Survey 1:50k tiles as your caching update extents, it would be perfect for working on the 1:50k rasters, but at any smaller scale (further zoomed out) it would cache the same image over and over, once for each shape in the feature class that the image covers.
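To see why this happens, it helps to compare the ground size of a single cache tile with the size of your extent polygons. The sketch below is only a rough illustration: the 512 pixel tile size and 96 DPI are assumptions (check your own tiling scheme), the function names are mine, and the 20 km polygon width is a made-up example figure.

```python
import math


def tile_ground_size_m(scale_denominator, tile_px=512, dpi=96):
    """Ground width in metres covered by one square cache tile.

    tile_px and dpi are assumed values; read the real ones from your
    tiling scheme.
    """
    metres_per_pixel = scale_denominator * 0.0254 / dpi  # 0.0254 m per inch
    return tile_px * metres_per_pixel


def tiles_across(extent_width_m, scale_denominator, tile_px=512, dpi=96):
    """Minimum number of tiles needed to span a width.

    Assumes the extent lines up with the tile grid; a misaligned extent
    can touch one extra tile.
    """
    tile_size = tile_ground_size_m(scale_denominator, tile_px, dpi)
    return int(math.ceil(extent_width_m / tile_size))


# Hypothetical 20 km wide extent polygon, checked at two scales.
polygon_width_m = 20000
for scale in (50000, 1000000):
    print("1:%d tile is %.0f m wide, polygon spans %d tile(s)" % (
        scale, tile_ground_size_m(scale), tiles_across(polygon_width_m, scale)))
```

Under these assumptions, a polygon that is several tiles wide at 1:50k fits well inside a single tile at 1:1,000,000, so a grid of such polygons would ask for the same tile again and again.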

If you expect to run your caching (at larger scales, in the thousands) in one go, I would settle for one or two shapes defining the cache area. However, if you think the process could be interrupted, you might want to use more blocks and skip existing tiles when you have to recache.

Measure Twice, Cut Once

As with construction work, make sure you are happy with a small area of your cache at every scale before hitting Go on a worldwide caching run. You can fine-tune labels, symbology and levels in ArcMap first, and it will save you a lot of back and forth.

Consider Your Scale Levels

In the same vein as the above, make sure that you are happy with your scale levels before generating your complete cache. If you decide later on that, actually, you’d quite like that 1:25k layer after all, you may end up deleting your entire cache and having to start again.

Fiddling around with directory structures in a cache folder is apparently like playing with fire, and it’s only a matter of time before you get burnt (although this has not happened to me yet).

Optimise Your Running Processes and Threads

If you have a 16 core server to cache on, that’s great. However, the advice for the number of instances and caching threads is based on the number of processors and not the number of cores.

I found some advice (somewhere) that you should run N+1 instances for the service and N+1 threads for the caching process, where N is the number of physical processors in the machine.

I have run my own tests on this and found the advice to hold true. No matter how tempted you are to try a caching run with 16 threads, don’t bother. You will be disappointed.
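As a trivial sketch of that rule of thumb (the function name is mine, and the two-socket server is a hypothetical example), note that the input is the number of physical processors, not the core count that most system tools report:

```python
def recommended_count(physical_processors):
    """N+1 rule of thumb for service instances and caching threads.

    physical_processors is the number of physical CPUs (sockets),
    not the number of cores.
    """
    if physical_processors < 1:
        raise ValueError("need at least one physical processor")
    return physical_processors + 1


# Hypothetical server: 2 physical processors with 8 cores each (16 cores).
print("instances and threads to try: %d" % recommended_count(2))
```

So a 16 core machine built from two physical processors would get 3 instances and 3 threads under this rule, not 16 or 17.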

Leave the Machine to Cache

Because we are considering the physical processor as the limit of performance when it comes to caching, it is better to leave the machine alone whilst it caches. I have seen caching threads take twice as long when I’ve put the machine under additional load. So, if you can afford to, just let the machine churn away on its own.

Choose Your File Formats Wisely

In a perfect world I would cache everything as PNG. However, PNGs disregard the MXD/MSD background colour and leave you with a transparent background. Great for many uses, but not good for mine, so I’m stuck with JPEG.

Whilst ESRI advocate JPEG for base layer caching, it makes little sense to use a lossy compression format for cached vector data; even at the highest quality setting it is a poor substitute for PNG.

Do note that PNGs are usually a little larger than JPEGs, though.

Deleting Blank Tiles

This is especially useful for PNG caches, as the lossless compression creates blank tiles all at the exact same size. If, for example, you are caching an overlay layer (say, orthos to overlay on vector), there will inevitably be quite a few tiles that are created as blanks (hopefully not too many because you have a good caching extent).

In the case of the PNG8 cache format, these are black PNGs with a file size of 393 bytes. After your cache is created you can write a process to pick out and delete these files from the cache folders and substitute them on the fly with ESRI’s blank.png workaround via IIS.

Whilst this doesn’t save you much drive space, it can save you the overhead of storing thousands of useless files.
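A minimal sketch of such a clean-up process is shown here. It assumes, as described above, that blank PNG8 tiles are exactly 393 bytes; the function names and the dry-run flag are my own, and a file-size match alone could in principle catch a real tile that happens to be the same size, so run it in dry-run mode first and check the list.

```python
import os

# Size of a blank PNG8 tile as reported above; verify against your own cache.
BLANK_TILE_BYTES = 393


def find_blank_tiles(cache_root, blank_size=BLANK_TILE_BYTES):
    """Return paths of all .png files under cache_root with the blank size."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            if not name.lower().endswith(".png"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == blank_size:
                matches.append(path)
    return matches


def delete_blank_tiles(cache_root, blank_size=BLANK_TILE_BYTES, dry_run=True):
    """List blank tiles and, unless dry_run is set, delete them."""
    paths = find_blank_tiles(cache_root, blank_size)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Calling delete_blank_tiles on your cache folder with the default dry_run=True only reports what would be removed; pass dry_run=False once you are happy with the list and have the blank.png substitution in place.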

Use Python to Run Caching

The caching toolbox in ArcCatalog is an amazingly vague and unresponsive beast. You can save yourself a lot of time by writing a Python script using the ArcGIS modules (installed with ArcMap) to run the caching. You can then leave this running in a command window or a scheduled task.

It is far easier to pick and choose your scales this way, and it helps you automate the process.
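As a rough sketch of the shape such a script might take, the loop below caches one scale at a time, times each run and carries on if one level fails. The actual ArcGIS call is deliberately left out: cache_one_level is a hypothetical callable that you would write yourself to wrap the caching tool through the geoprocessor (in 9.3.x this is normally obtained via the arcgisscripting module), with the tool's parameters taken from ESRI's documentation for your version. The code avoids newer Python syntax because ArcGIS 9.3.1 ships with an old Python 2 interpreter.

```python
import sys
import time


def run_cache_levels(scales, cache_one_level, log=None):
    """Cache each scale in turn and record status and elapsed seconds.

    cache_one_level is a hypothetical callable taking a scale denominator;
    it stands in for your own wrapper around the ArcGIS caching tool.
    """
    results = []
    for scale in scales:
        started = time.time()
        try:
            cache_one_level(scale)
            status = "ok"
        except Exception:
            # sys.exc_info() keeps this compatible with old and new Pythons.
            status = "failed: %s" % sys.exc_info()[1]
        elapsed = time.time() - started
        results.append((scale, status, elapsed))
        if log is not None:
            log("1:%d %s (%.1f s)" % (scale, status, elapsed))
    return results
```

You would call it with your chosen list of scale denominators and your own wrapper function, for example from a scheduled task, and pass a logging function if you want a record of how long each level took.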

Practice Makes Perfect

You could read advice all day, but as with most things, it is a case of just getting stuck in and seeing how you get on. No doubt there will be times when you want to pull your hair out as you have to delete your cache yet again, but you will get there eventually! How’s that for cheery optimism?

P.S. Yes I have played around with MapServer and, yes, I loved it and would choose to use it for any of my own projects.