There comes a time when you have to fix a bug that reproduces in some obscure environment but not on your development PC. Don’t wait until it happens to make your emergency plan. Here are a few basic tools and tips that will help you prepare.

(Everything I’m about to tell you applies equally to native and managed development.)

You will need to start by addressing two core questions:
“What is the state of the system?” and “How did it get there?”

Logs

The most effective way to know how the system got to the point of failure is by writing good logs.
You must be able to enable logging and you have to be prepared for the results. If you never read your logs, you will not get much out of them the first time you try. Analyzing logs takes practice.

Even so, a good framework can’t hurt. There are many awesome logging frameworks, but I would encourage you to consider one of the Apache Log4X family: Log4j for Java, log4cxx for C++ and log4net for C# and the other .NET languages. Because the family spans languages and is widely adopted on many platforms, the experience you gain is more likely to be valuable at your next job.

Let’s look at an example in C++:

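#include <log4cxx/logger.h>
#include <log4cxx/xml/domconfigurator.h>

using namespace log4cxx;
using namespace log4cxx::xml;

// Root logger; its level and appenders come from the XML configuration.
LoggerPtr logger(Logger::getRootLogger());

int main()
{
    // Load appenders and log levels from the XML configuration file.
    DOMConfigurator::configure("Log4cxxConfig.xml");

    // Emitted only if the configuration enables TRACE for the root logger.
    LOG4CXX_TRACE(logger, "Program started");

    return 0;
}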

Dump Files

As for answering “What is the state of the system?”, nothing is as helpful as a dump file.

A full user-mode dump file is essentially a snapshot of your process’s memory space. Among other things, it contains the following key pieces of information:

  • Loaded modules
  • Heaps
  • Thread information, including call-stacks for each thread

Common techniques for creating a dump file:

  • Configure Windows to generate a dump file every time any application crashes:
    If you are trying to investigate the origin of an unhandled exception, this feature will let you look inside the application at the moment the exception was thrown. Here’s how you enable it.
  • Create a dump file manually:
    When the program is stuck and you can’t attach a debugger at that moment, you can take a snapshot of the system and inspect it later:

  • Create a dump file with a custom trigger using ProcDump:
    ProcDump has you covered in case of a crash or a hang, and it also provides many other triggers to help you identify the problem. It’s free, requires no installation and is self-contained in a single file. In a nutshell, here is how you use it:

For more, check out “How To Capture A Minidump: Let Me Count The Ways”.

Once you’ve retrieved a dump file and copied it to your development environment, you can analyze it using pretty much any debugger. Here is how you do it in Visual Studio:

Symbol Server

You need PDB files for the debugger to map memory addresses to source lines. This mapping is unique – the debugger will only accept the exact PDB file that was produced by the same build as the DLL.

Re-compiling the same source code will not do. This means one thing – you must keep PDB files for each release. You will probably not deploy them with your product – these files can get quite big and you don’t want an extra 100 MB for something most users will not use.
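
To see why a rebuild does not help, here is a minimal sketch of the kind of check a debugger performs. It assumes the common scheme in which the build stamps the same GUID and "age" counter into both the binary's debug information and the PDB. The type PdbIdentity and the function pdbMatchesBinary are hypothetical names for illustration, not a real debugger API.

```cpp
#include <array>
#include <cstdint>

// Hypothetical, simplified view of the identity that is stamped into both
// the binary and the PDB when they are produced by the same build.
struct PdbIdentity {
    std::array<std::uint8_t, 16> guid;  // typically regenerated on every full build
    std::uint32_t age;                  // bumped when the PDB is updated
};

// Debugger-style check: the PDB is accepted only if both fields match.
// Source code never enters the comparison, which is why recompiling
// identical sources still yields a PDB that is rejected.
inline bool pdbMatchesBinary(const PdbIdentity& fromBinary,
                             const PdbIdentity& fromPdb) {
    return fromBinary.guid == fromPdb.guid && fromBinary.age == fromPdb.age;
}
```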

But given an environment without a PDB file, how do you go about finding the correct one?
You can (and probably should) embed version information inside every release, but you would still need to manually maintain and navigate the repository of your PDB files. A symbol server does that for you. The word “server” is a bit misleading, as you don’t need to configure an actual server. All you need is a shared folder with appropriate read permissions.
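
The shared folder works because each PDB is filed under a key derived from its identity, so the debugger can compute the path instead of searching. The sketch below builds such a path, assuming the two-tier layout commonly used by symbol stores (file name, then the GUID followed by the age in hex, then the file name again). The function name symbolStoreRelativePath and the sample values are hypothetical; check the layout of your own store before relying on the details.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Builds the path, relative to the store root, under which a PDB is filed:
//   <pdb name>\<GUID as 32 hex digits><age in hex>\<pdb name>
// guidHex is assumed to be already formatted as 32 hex digits without dashes.
inline std::string symbolStoreRelativePath(const std::string& pdbName,
                                           const std::string& guidHex,
                                           std::uint32_t age) {
    std::ostringstream path;
    path << pdbName << '\\'
         << guidHex << std::hex << std::uppercase << age << '\\'
         << pdbName;
    return path.str();
}
```

For a hypothetical MyApp.pdb with age 1, this yields MyApp.pdb\<GUID>1\MyApp.pdb, which the debugger appends to the root of the shared folder.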

If you are using Team Foundation Server, putting the PDBs in the right location can be automated as part of your build process:

Once you have your PDBs in place, you need to set up the debugger to work with your newly configured symbol server:

Source Indexing

The last piece of the puzzle is finding the appropriate version of the source code. Manually, you would do this by fetching the exact version from your source control system.

By default, a PDB file contains file names relative to the build location. Usually this is some local path on the build server. If only there was a way to embed the exact revision under source control into the PDB. Oh wait, there is! This is what is usually referred to as Source Indexing. In the latest versions of Team Foundation Server, it is automatically enabled along with the symbol server.

When trying to debug a properly indexed DLL, you will get the following message from Visual Studio:

Once you accept, you will automatically get the right PDB file from the symbol server and the right source files from your source control system:

You are ready to go. Happy debugging!