Shared Memory Logging

Overview

There are numerous ways of logging information while an app is running, and every programmer has a favorite technique for logging. One of my favorites is to write messages into shared memory, something that is useful when large amounts of data need to be written into the log with a minimum of latency.

The classic way for logging is to printf messages to a console window, or fprintf to a file. A console window will usually have a fairly limited history, while a file will end up being truncated if the app crashes, since any unwritten data in the file write buffer will be lost. The lost information may be preserved by frequent calls to fflush, but this can have a significant impact on performance if large numbers of messages are being written. And for legacy reasons, fflush may not do anything, so it may not be possible to flush information through to the disk.

On Windows, a more common approach is to write the messages to OutputDebugString, which transfers the messages to a debugger. But this requires that a debugger (or a separate utility like DbgView) be attached to the process. And the logging app may expend lots of CPU processing the data to keep the data displayed in a window.

For many apps, using a system call like OutputDebugString is more than adequate for maintaining a trail of breadcrumbs to help trace down what was happening when the app crashed.

However, problems start arising when writing real-time software that is constantly stressing the system, using all available CPU cycles for processing, and performing constant disk I/O operations. Crashes or other aberrant behavior can turn into Heisenbugs as soon as a logging utility is attached to the process, or as a result of the programmer adding more logging messages to try to narrow down where any why a crash is occurring.

For a real-time system, interactive debugging is often not an option. And the modest amount of CPU time being consumed by a debug message window (as it constantly formats messages and redraws itself) is all it takes to mask timing bugs or change the way in which the bug manifests itself.

The trick I use most often for logging in time-critical applications is to allocate a memory buffer that is shared between the app being debugged and a separate logging app. This allows the app being tested to dump messages into a flat memory buffer, without the expense of interprocess communication, file I/O, or having a display app that is trying to redraw itself after every message.

Granted, shared memory is not a zero-expense option. Messages must still be copied into the buffer, possibly subjected to a printf-like formatting of text, and protected with critical sections if multiple threads are writing to the same buffer.

However, in my experience, shared memory is the least performance-invasive technique for leaving a trail of breadcrumbs for that inevitable crash.

Refer to the cross-platform library in crossplat.zip for the full version of the code discussed in this article.

On Windows

My approach to shared memory logging is to use a small, separate app. This app will create the buffer that will be used for logging, then it does absolutely nothing else. It does not monitor the contents of the shared memory, nor attempt to display any kind of status information. The app exists for the sole purpose of holding a copy of the shared memory in case the app being debugged should crash.

Under Windows, refer to the QzLogUtilWin project. This is a simple little MFC app. Yeah, that's right, MFC. This is code I've been using in one form or another for many years.

QzLogUtilWin offers three simple options: quit, save the contents of shared memory to a file, or toggle a flag in shared memory that disables whether log messages are written to shared memory.

Shared memory is created with CreateFileMapping and MapViewOfFile. It is possible to create shared memory that is backed by a physical file, but that is not necessary for purposes of passing data between processes.

The logging utility is responsible for creating the shared memory. As covered later in this article, the app being debugged will first check to see if the shared memory exists. If not, the app will fall back to fwriteing messages to a file.

The following section of code demonstrates how the QzLogUtil creates the shared memory buffer:

bool QzSharedMemory::Initialize(char name[], U32 byteCount, bool allowCaching)
{
    Free();

    // This is the default permission flag required to read and write to
    // shared memory.
    U32 protection = PAGE_READWRITE;

    // Add the following flags to prevent page caching.  Disabling caching
    // will deliver a slightly better performance, but prevents data changes
    // from immediately being visible to other processes that are accessing
    // this memory (meaning that the last range of data written to shared
    // memory before a crash may end up being lost -- exercise caution
    // when commenting out these flags).
    //
    if (false == allowCaching) {
        protection |= SEC_COMMIT | SEC_NOCACHE;
    }

    m_ByteCount = byteCount;

    // Attempt to open a shared, memory-mapped file that will be used for
    // the shared buffer.
    //
    // The two important values here are the name and byte count.  All
    // apps that are mapping the same memory space must use the exact
    // same "file name" to create the memory file.  This file exists in
    // memory only, without any disk space backing it (since we're passing
    // in INVALID_HANDLE_VALUE as the handle to the file), but it is still
    // a file as far as the OS is concerned.
    //
    // The other important value is m_ByteCount.  This specifies the
    // maximum size of the buffer.  This defines how much memory space is
    // reserved for the virtual file.  The memory will not be committed
    // until it is mapped with MapViewOfFile(), but the value provided
    // here will limit the maximum amount of memory that can be accessed
    // by MapViewOfFile().
    //
    // Note that we're calling the ASCII version of the function, since
    // we were given a char[] containing the file name.
    //
    m_hFile = ::CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, protection,
                0, m_ByteCount, name);

    // Immediately test GetLastError().  This will indicate whether the
    // buffer was created successfully, or if it "failed" because the
    // buffer already exists.  Either way, a valid handle to the memory
    // file was returned, so we can use that handle for shared memory.
    // The only distinction drawn here is that if the memory file already
    // exists, we know that some other app has already created the memory
    // file.  Since shared memory is only being used by the logging code,
    // this indicates that a separate logging utility has reserved the
    // memory and is waiting to catch any log info in the event that this
    // process crashes.
    //
    m_AlreadyExisted = (ERROR_ALREADY_EXISTS == GetLastError());

    // Now check the return value.  If it returned the error code we gave
    // to the function (which was INVALID_HANDLE_VALUE), then the buffer
    // could not be created for some reason.
    //
    // However, we're going to be overly paranoid here.  Although testing
    // of the code shows it returns INVALID_HANDLE_VALUE, the docs say it
    // will return NULL if it fails.  Test both possible values to avoid
    // any chance of being screwed over by the inconsistent use of the
    // INVALID_HANDLE_VALUE value.
    //
    if ((INVALID_HANDLE_VALUE == m_hFile) || (NULL == m_hFile)) {
        return false;
    }

    // Now map the shared file into virtual memory so we can read/write
    // it.  Note that we are mapping the full range of memory created by
    // the call to CreateFileMapping().
    //
    m_pAddress = reinterpret_cast<U08*>(MapViewOfFile(m_hFile,
                 FILE_MAP_ALL_ACCESS, 0, 0, m_ByteCount));

    // If for some reason the file could not be mapped, we need to close
    // the handle so the file will be freed, then return failure so the
    // caller does not attempt to access the file.
    //
    if (NULL == m_pAddress) {
        CloseHandle(m_hFile);
        m_hFile = NULL;

        return false;
    }

    return true;
}

Both the log utility and the code in QzSystemWin.cpp perform the same operations to create the shared memory. They first allocate a range of physical memory with CreateFileMapping, then map that physical memory into the application's address space using MapViewOfFile.

To free the shared memory, the code first needs to release the virtual address space associated with the memory using UnmapViewOfFile, then free the physical address space itself by closing the shared file handle with CloseHandle:

void QzSharedMemory::Free(void)
{
    if (m_pAddress != NULL) {
        UnmapViewOfFile(reinterpret_cast<void*>(m_pAddress));
        m_pAddress = NULL;
    }

    if (m_hFile != NULL) {
        CloseHandle(m_hFile);
        m_hFile = NULL;
    }

    m_ByteCount      = 0;
    m_AlreadyExisted = false;
}

As an aside, when multiple processes map the same shared memory, they will not be given the same void* pointer. Each process will map the same shared memory to a different virtual address. Do not attempt to store pointers within shared memory. Attempting to dereference a pointer to another process's address space will at best cause the app to promptly crash with an access violation, and at worst cause lingering memory damage that will not immediately manifest itself.

On Mac

Creating shared memory on OSX works similar to Windows, except for using the shm_open and mmap routines to create the memory file and map it into the process's virtual address space.

bool QzSharedMemory::Initialize(char name[], U32 byteCount, bool allowCaching)
{
    Free();

    // For generality, we are given the name of a virtual file that will
    // be used to share memory.  However, for shm_open() to work, the name
    // needs to start with '/', so we will make a copy of the name that
    // starts with the required symbol.
    //
    char sharedName[64];
    strncpy(sharedName + 1, name, ArraySize(sharedName) - 2);
    sharedName[0] = '/';
    sharedName[ArraySize(sharedName)-1] = '\0';

    // Call shm_open() to create a virtual file that will be used to back
    // the shared memory.  This file is created without any physical
    // backing, since we don't want to persist the data to disk, nor do we
    // want to run the performance risk of writing to a physical device
    // when logging large amounts of data.
    //
    // Note that for cross-platform compatibility, the return result is
    // being stored in a void* pointer instead of an int.
    //
    m_hFile = (Handle_t)shm_open(sharedName, O_RDWR);

    // Failure returns -1 instead of NULL.
    if (-1 == (int)m_hFile) {
        m_hFile = NULL;
        return false;
    }

    // Next we need to map the system memory into the virtual address
    // space of this process.  Again we need to specify read and
    // write permission to avoid page faults when writing to the
    // address space.
    //
    m_pAddress = (U08*)mmap(NULL, byteCount, PROT_READ | PROT_WRITE,
                 MAP_SHARED, (int)m_hFile, 0);

    if (NULL == m_pAddress) {
        close((int)m_hFile);
        m_hFile = NULL;
        return false;
    }

    m_ByteCount = byteCount;

    return true;
}

Unlike the Windows version of the code, where either the app itself or the logging utility are allowed to create the shared memory, on OSX only the logging utility is allowed to create the buffer. To permit this to work, QzLogUtilMac needs to include the O_CREAT bit flag, and add a third parameter to shm_open that allows creation: S_IRWXU. Once the shared file has been created, the code needs to call ftruncate to set the size of the shared buffer.

    // Use the O_CREAT and S_IRWXU flags to force the shared file to be
    // created.
    int hShare = shm_open(shareName, O_RDWR | O_CREAT, S_IRWXU);

    // Then call ftruncate to set the initial size of the file.
    ftruncate(hShare, byteCount);

Other than the different flags to shm_open and the call to ftruncate, both the logging code and the logging utility create the shared memory buffer in the same way. This difference is necessary since we want to track who owns the buffer. In Windows, the ERROR_ALREADY_EXISTS code can be used to tell whether a logging utility is waiting to catch data in shared memory. There is no equivalent to this code in OSX, so we restrict shared memory creation to the logging utility. The app can only access the shared memory if the logging utility is started first.

Once we are done using the shared memory, we free it by calling munmap and close:

void QzSharedMemory::Free(void)
{
    if (NULL != m_pAddress) {
        munmap(m_pAddress, m_ByteCount);
        m_pAddress = NULL;
    }

    if (NULL != m_hFile) {
        close((int)m_hFile);
        m_hFile = NULL;
    }

    m_ByteCount = 0;
}

The logging utility also does things a little different when freeing the shared memory. On Windows, the shared memory will automatically go away when all apps referencing it are closed. However, on OSX the shared memory will persist until the system shuts down. The only way to get rid of the shared memory file is to explicitly call shm_unlink. This will cause the shared buffer to go away after all references to it are closed.

Writing Log Data

My logging code is fairly agnostic about content. All it is really doing is memcpying blocks of data into a ring buffer. Some care is needed when a block of data wraps around the end of the ring buffer. Although it is writing UTF-8 strings into the buffer, it could just as easily be writing fixed-format message packets (although serializing binary data for access by another app can take extra processing to serialize or deserialize, and extreme caution needs to be taken to avoid storing any kind of pointer in shared memory).

To track where data is located in the ring buffer, a short header is stored at the beginning of shared memory:

struct QzLogInfo_t
{
    U32 Enable;        // flag used to disable the writing of log messages
    U32 ErrorCount;    // how many error messages have been logged
    U32 MaxByteCount;  // total size of shared memory buffer
    U32 ByteCount;     // total number of bytes in the ring buffer
    U32 ByteOffset;    // offset to the start of the ring
};

All of the logging data is written immediately following the header, all the way through the end of shared memory. The buffer is treated as a ring buffer, so once the buffer fills up, it wraps around and overwrites the oldest range of memory. This means that the oldest chunk of data left in memory is often a partial line of text, possibly starting in the middle of a multi-byte UTF-8 symbol. The logging code ignores these issues. It simply memcpys the data into the ring buffer without any additional processing. It is the responsibility of the logging app to clean up any fragmentary data at the beginning of the ring buffer.

The following chunk of code is used to append data to the end of the ring buffer in shared memory.

    // If the size is unknown, call a string function that will
    // determine the length of the string to write.  Normally, the
    // string formatting code will report the byte count, so this
    // extra step will be skipped.
    if (byteCount < 0) {
        byteCount = UtfByteCount(buffer);
    }

    // Get a pointer to the header at the start of shared memory.
    QzLogInfo_t *pInfo = reinterpret_cast<QzLogInfo_t*>(m_pMappedAddress);

    // Sync error count up with shared memory.
    pInfo->ErrorCount = g_ErrorCount;

    // Check to see if the logger has disabled logging.
    if (!pInfo->Enable) {
        return;
    }

    // The logging data starts immediately after the header.
    Utf08_t *pStart = reinterpret_cast<Utf08_t*>(m_pMappedAddress + sizeof(QzLogInfo_t));

    // Once enough data has been written into the buffer to fill it
    // up, we're now dealing with ring buffer logic.  The block of
    // data we're about to write may need to wrap around the end of
    // the buffer.
    //
    if ((byteCount + pInfo->ByteOffset) >= pInfo->MaxByteCount) {
        U32 count2 = byteCount + pInfo->ByteOffset - pInfo->MaxByteCount;
        U32 count1 = byteCount - count2;

        // Copy first part of message to end of ring buffer.
        if (count1 > 0) {
            memcpy(pStart + pInfo->ByteOffset, buffer, count1);
        }

        // Remaining portion of message (if any) wraps to start of buffer.
        if (count2 > 0) {
            memcpy(pStart, buffer + count1, count2);
        }

        pInfo->ByteOffset = count2;
        pInfo->ByteCount  = pInfo->MaxByteCount;
    }

    // Otherwise we have not yet filled up the ring buffer.  No wrap
    // condition can yet exist, so we can memcpy the data in one call.
    //
    else {
        memcpy(pStart + pInfo->ByteOffset, buffer, byteCount);

        pInfo->ByteOffset += byteCount;

        // Check the resulting size.  Initially, the byte count will grow
        // until the buffer fills, after that it overwrites the oldest
        // data and remains a fixed size.
        if ((pInfo->ByteCount + byteCount) < pInfo->MaxByteCount) {
            pInfo->ByteCount += byteCount;
        }
        else {
            pInfo->ByteCount = pInfo->MaxByteCount;
        }
    }

Reading It Back

Now we get to the point where what my code does will probably not be what your code does. All of my logging code works by writing UTF-8 data into the shared memory buffer. The log util app needs to extract the UTF-8 data and write it to a text file that can be viewed by any standard text editor.

The fact that the data is UTF-8 adds some complexity, since UTF-8 symbols are of variable length, between one and four bytes. This means that the code needs to discard any partial symbols from the start of memory and it needs to write a Byte Order Mark (BOM) to the start of the file.

However, the messages that get logged are almost always 7-bit ASCII strings. The only exceptions are file names or other strings/labels that were typed in by a user. In most cases, storing the log as a UTF-8 file is unnecessary, especially since some common apps (such as North American versions of WordPad) exhibit problems with UTF-8 files.

To avoid these problems, the code will first scan the text that needs to be written to disk. If all of the symbols are 7-bit ASCII, it can save the file as a plain old text file. However, if any 8-bit value is detected, then we have a multi-byte UTF-8 symbol, so the whole file needs to be saved in UTF-8 format with the three-byte BOM at the start of the file.

void SaveLog(char filename[], char *pAddress)
{
    QzLogInfo_t *pInfo = reinterpret_cast<QzLogInfo_t*>(pAddress);

    FILE *pFile = fopen(filename, "w");

    if (pFile != NULL) {
        // Shared memory is treated as a ring buffer.  The header at
        // the start of shared memory tells us what range of data in
        // the buffer contains out logging information.
        U32 byteOffset = pInfo->ByteOffset;
        U32 byteCount  = pInfo->ByteCount;
        U32 maxCount   = pInfo->MaxByteCount;

        // Skip over the header at the start of shared memory to get
        // to the start of the actual log text.
        char *pStart = reinterpret_cast<char*>(pAddress + sizeof(QzLogInfo_t));

        // Make life easier by allocating a temp buffer, with extra
        // space for a '\0' at the end of the buffer.  This allows
        // us to ignore any logic problems that will arise if a
        // multi-byte UTF-8 symbol is split across the end of the
        // ring buffer.
        char *pTemp = new char[byteCount + 1];

        // If the log data never wrapped around the end of shared
        // memory, we can simply memcpy() the string.
        if (byteCount < maxCount) {
            memcpy(pTemp, pStart, byteCount);
        }

        // Otherwise the data wraps around the end of the ring buffer,
        // so two memcpy's are required to piece the whole block of
        // log data into a continuous buffer.
        else {
            U32 count1 = maxCount - byteOffset;
            U32 count2 = byteOffset;
            memcpy(pTemp, pStart + byteOffset, count1);
            memcpy(pTemp + count1, pStart, count2);
        }

        // Terminate the buffer.  In this case it is not strictly
        // necessary, doing so can avoid any craziness that may arise
        // from trying to view a non-terminated string in a debugger.
        pTemp[byteCount] = '\0';

        U32  skip   = 0;
        bool isUTF8 = false;

        // It's possible that the oldest byte in the buffer is from the
        // middle of a multi-byte sequence.  This will skip any middle
        // bytes, looking for the first 7-bit ASCII byte, or the first
        // byte of a multi-byte sequence.  This should not be more than
        // three bytes into the buffer.  Normally, it should be the very
        // first byte, since almost no multi-byte symbols are used in
        // logging.
        //
        for (U32 i = 0; i < byteCount; ++i) {
            if (0x80 != (0xC0 & pTemp[i])) {
                skip = i;
                break;
            }

            if (0 != (0x80 & pTemp[i])) {
                isUTF8 = true;
            }
        }

        // This three-byte sequence denotes a UTF-8 file.  But we only
        // need to write it if there are any 8-bit symbols.  If all of
        // the text is 7-bit ASCII, we can skip the marker, since WordPad
        // has problems with reading UTF-8 files (it appears to read data
        // in 8KB chunks, and glitches one of the bytes near the end of
        // each chunk).
        //
        if (isUTF8) {
            U08 marker[3];
            marker[0] = 0xEF;
            marker[1] = 0xBB;
            marker[2] = 0xBF;

            fwrite(marker, 1, 3, pFile);
        }

        // Write all of the log to disk, skipping any partial UTF-8 symbol
        // from the start of the buffer.
        fwrite(pTemp + skip, 1, byteCount - skip, pFile);

        fclose(pFile);

        printf("Log Size: %d bytes\n", byteCount + 3 - skip);

        delete [] pTemp;
    }
    else {
        printf("Unable to open file \"%s\"\n", filename);
    }

    if (0 != pInfo->ErrorCount) {
        if (pInfo->ErrorCount > 1) {
            printf("There were %d errors recorded in the log.\n", pInfo->ErrorCount);
        }
        else {
            printf("There was 1 error recorded in the log.\n");
        }
    }
}

Refer to QzLogger.cpp more examples of how the shared memory code is used. However, that code is rather busy to be a good example, since it supports writing to shared memory as well as writing directly to disk. Furthermore, it uses custom string handling code to deal with UTF-8 strings in a completely platform-independent manner, so it is not code that could be directly copy-n-pasted into someone else's project.