Use gperf for efficient C/C++ command line processing

Posted by suvi under Programming

The GNU tool gperf is a "perfect" hash function that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. Learn how to use gperf for effective command-line processing in your C/C++ code.

Command-line processing and the need for gperf

Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated software has dozens of available command-line options. In fact, it’s not uncommon to find hundreds of lines of if-else statements coded to process user input, and maintenance of such legacy code becomes a time-consuming affair even for seasoned programmers. In such circumstances, most C developers commonly go for a rather long (and often nested) if-else statement with ANSI C library functions such as strcmp, strcasecmp, and strtok thrown in for good measure, as Listing 1 shows.

Listing 1. C-style command-line processing

                
if (strtok(cmdstring, "+dumpdirectory"))
  {
  // code for printing help messages goes here
  }
else if (strtok(cmdstring, "+dumpfile"))
  {
  // code for printing version info goes here
  }

Instead of the ANSI C-based application program interface (API), a C++ developer would probably use strings from the Standard Template Library (STL). Still, there’s no escaping the nested sequence of if-else statements. Clearly, this approach isn’t scalable as the number of options increases. For a typical program invocation with N options, the code would end up doing O(N2) comparisons. To produce code that runs faster and is easy to maintain, it makes far more sense to use a hash table that stores the command-line options, then uses the hash to validate the user-specified input.

This is where gperf comes in. It generates a hash table from the predetermined list of valid command-line options and a lookup function whose time complexity is O(1). Thus, for a typical program invocation with N options, the code needs only O(N) [N*O(1)] comparisons—an order of magnitude improvement over the legacy code.

Read more at IBM 

Leave a Reply

*
To prove you're a person (not a spam script), type the security word shown in the picture. Click on the picture to hear an audio file of the word.
Click to hear an audio file of the anti-spam word