Use gperf for efficient C/C++ command line processing
The GNU tool gperf is a "perfect" hash function that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. Learn how to use gperf for effective command-line processing in your C/C++ code.
Command-line processing and the need for gperf
Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated software has dozens of available command-line options. In fact, it’s not uncommon to find hundreds of lines of if-else statements coded to process user input, and maintenance of such legacy code becomes a time-consuming affair even for seasoned programmers. In such circumstances, most C developers commonly go for a rather long (and often nested) if-else statement with ANSI C library functions such as strcmp, strcasecmp, and strtok thrown in for good measure, as Listing 1 shows.
Listing 1. C-style command-line processing
if (strtok(cmdstring, "+dumpdirectory"))
{
// code for printing help messages goes here
}
else if (strtok(cmdstring, "+dumpfile"))
{
// code for printing version info goes here
}
Instead of the ANSI C-based application program interface (API), a C++ developer would probably use strings from the Standard Template Library (STL). Still, there’s no escaping the nested sequence of if-else statements. Clearly, this approach isn’t scalable as the number of options increases. For a typical program invocation with N options, the code would end up doing O(N2) comparisons. To produce code that runs faster and is easy to maintain, it makes far more sense to use a hash table that stores the command-line options, then uses the hash to validate the user-specified input.
This is where gperf comes in. It generates a hash table from the predetermined list of valid command-line options and a lookup function whose time complexity is O(1). Thus, for a typical program invocation with N options, the code needs only O(N) [N*O(1)] comparisons—an order of magnitude improvement over the legacy code.
Leave a Reply