Generating formatted std::string objects in C++: example code

2020-06-19 11:18:56
OfStack

Two ways to format strings

As is well known, C++'s std::string is feature-poor and lacks many conveniences, such as string formatting.

Python 3 supports two ways to format strings. One is the C style, where each placeholder begins with % followed by a type specifier (for example, %s for a string and %d for an integer). The other is the type-independent style, where {0} stands for the first argument, {1} for the second, and so on.


>>> "{0}'s age is {1}".format(" Latosolic red ", 11)
" Latosolic red 's age is 11"
>>> "%s's age is %d" % (" Latosolic red ", 11)
" Latosolic red 's age is 11"

In C++, you can only borrow from C and use snprintf to format into a buffer:


#define BUFFSIZE 512
 char buf[BUFFSIZE];
 snprintf(buf, BUFFSIZE, "%s's age is %d\n", " Latosolic red ", 11);

Or use the type-independent stream operators:


 std::ostringstream os;
 os << " Latosolic red " << "'s age is " << 11 << "\n";
 std::string s = os.str();

Efficiency aside, using << to splice together several objects of different types takes a lot of code, and it is harder to control the exact output format, such as the field width or the number of decimal places. At least it is too complicated for me to remember, so I prefer C-style snprintf for that kind of control. For example:


 double d = 3.1415926;
 snprintf(buf, BUFFSIZE, " PI : %-8.3lf It was discovered by Zu Chongzhi \n", d);


$ ./a.out 
 PI : 3.142    It was discovered by Zu Chongzhi

In %-8.3lf, lf denotes a double (for the printf family the l is optional, so %f means the same thing), 8 is the minimum field width, 3 is the number of digits after the decimal point, and the minus sign requests left alignment. It is a very simple and compact notation.

Doing the same with the C++ <iomanip> header took me some time to look up:
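To make the effect of each part of the conversion visible, here is a minimal sketch. The helper name show is hypothetical and exists only for this illustration; the format strings wrap the number in brackets so the padding can be seen.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only: formats one double with the
// given printf-style format string and returns the result.
std::string show(const char* fmt, double d) {
  char buf[64];  // large enough for the short formats used here
  std::snprintf(buf, sizeof buf, fmt, d);
  return buf;
}

// show("[%-8.3f]", 3.1415926) pads on the right (left-aligned, width 8)
// show("[%8.3f]",  3.1415926) pads on the left  (right-aligned, width 8)
// show("[%.3f]",   3.1415926) uses no padding, only 3 decimal places
```

The width is a minimum: a value that needs more than 8 characters is printed in full rather than cut off.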


 double d = 3.1415926;
 os << " PI : " << std::setw(8) << std::fixed
  << std::setprecision(3) << std::left
  << d << " It was discovered by Zu Chongzhi \n";

Besides the code being long and std::fixed being easy to forget, there is another problem: setprecision changes the persistent state of the stream. If another floating-point number is later written to os with <<, it is still printed with three decimal places. The same holds for std::fixed and std::left; only std::setw applies to the next output alone.

One might argue that an advantage of the manipulators is that setprecision and setw accept a variable as their argument, which is very flexible. In fact, snprintf can do the same:
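The persistence of the stream state can be seen in a small sketch. The function names below are hypothetical, and saving and restoring flags() and precision() is just one common way to undo the change, not something the original code does.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Shows that fixed/precision persist on the stream while setw does not.
std::string sticky_demo() {
  std::ostringstream os;
  os << std::setw(8) << std::fixed << std::setprecision(3) << std::left
     << 3.1415926 << '|';
  os << 2.5 << '|';  // width is back to 0, but fixed + precision 3 remain
  return os.str();   // "3.142   |2.500|"
}

// Saves the formatting state first and restores it afterwards.
std::string restored_demo() {
  std::ostringstream os;
  const std::ios_base::fmtflags old_flags = os.flags();
  const std::streamsize old_precision = os.precision();
  os << std::fixed << std::setprecision(3) << 3.1415926 << '|';
  os.flags(old_flags);          // undo std::fixed
  os.precision(old_precision);  // undo setprecision(3)
  os << 2.5 << '|';
  return os.str();              // "3.142|2.5|"
}
```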


 double d = 3.1415926;
 int n1 = 8, n2 = 3;
 snprintf(buf, BUFFSIZE, " PI : %-*.*lf It was discovered by Zu Chongzhi \n", n1, n2, d);

Wrapping snprintf in C++ to generate formatted std::string objects

Books on C programming under Linux, such as APUE, UNP, and TLPI, each ship their own error-handling library that wraps snprintf to produce formatted output, so that you do not have to define a buffer and call snprintf over and over.

One disadvantage of this approach is that the buffer (a character array) has a fixed length. In practice the buffer is usually defined large enough, since a formatted string that is too long is better split across several calls anyway.

On the other hand, these functions only print the message, and often exit the program right after printing, so they do not return the error string. That is not enough if you want to pass the error message up to the caller as a C++ exception, so some simple changes are needed.
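When the buffer is too small, snprintf does not overflow it: it writes as much as fits and returns the length the full output would have had. The sketch below uses a deliberately tiny buffer to show this; the names FormatResult and format_small are made up for the illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical result type for this illustration.
struct FormatResult {
  std::string text;  // what actually fit into the buffer
  bool truncated;    // true if the complete output needed more room
};

FormatResult format_small(const char* name, int age) {
  char buf[16];  // deliberately small to provoke truncation
  const int n = std::snprintf(buf, sizeof buf, "%s's age is %d", name, age);
  if (n < 0) return {std::string(), true};  // encoding error
  // n is the length of the full output, not counting the terminator, so
  // n >= sizeof buf means the output was cut off.
  return {buf, static_cast<std::size_t>(n) >= sizeof buf};
}
```

Comparing the return value with the buffer size is what makes it possible to retry with a larger buffer instead of silently losing output.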


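#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

inline std::string format_string(const char* format, va_list args) {
 constexpr size_t oldlen = BUFSIZ;
 char buffer[oldlen]; // default buffer on the stack
 va_list argscopy;
 va_copy(argscopy, args); // the first vsnprintf call consumes args
 const int len = vsnprintf(&buffer[0], oldlen, format, args);
 if (len < 0) { // encoding error: nothing usable was written
  va_end(argscopy);
  return std::string();
 }
 const size_t newlen = static_cast<size_t>(len) + 1; // count the terminator '\0'
 std::string result;
 if (newlen > oldlen) { // the default buffer is too small, so allocate on the heap
  std::vector<char> newbuffer(newlen);
  vsnprintf(newbuffer.data(), newlen, format, argscopy);
  result = newbuffer.data();
 } else {
  result = buffer;
 }
 va_end(argscopy); // every va_copy needs a matching va_end
 return result;
}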

inline std::string format_string(const char* format, ...) {
 va_list args;
 va_start(args, format);
 auto s = format_string(format, args);
 va_end(args);

 return s;
}

This implementation is modeled after UNP and provides two overloads, one taking a va_list and one taking "...". The va_list version can also be reused by other functions, because a C-style variable argument list "..." cannot itself be passed on as an argument. In addition, va_list is not guaranteed to be copyable by plain assignment (it may be an array type), and its value is indeterminate after vsnprintf has used it, so va_copy is needed to make a copy for the second use.

C++11 adds variadic templates, which simplify the code above:
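As a sketch of how a va_list-based formatter can serve a second caller, the following turns the formatted message into an exception. The names vformat and throw_formatted are hypothetical, and vformat is a simplified stand-in that measures the output with a null buffer first instead of using a stack buffer.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for a va_list-based formatter (hypothetical name).
std::string vformat(const char* format, va_list args) {
  va_list copy;
  va_copy(copy, args);  // args is consumed by the measuring pass below
  const int n = std::vsnprintf(nullptr, 0, format, args);  // measure only
  if (n < 0) {
    va_end(copy);
    return std::string();
  }
  std::vector<char> buf(static_cast<std::size_t>(n) + 1);  // +1 for '\0'
  std::vsnprintf(buf.data(), buf.size(), format, copy);
  va_end(copy);
  return std::string(buf.data(), static_cast<std::size_t>(n));
}

// Hypothetical second caller: formats the message, then throws it.
[[noreturn]] void throw_formatted(const char* format, ...) {
  va_list args;
  va_start(args, format);
  const std::string msg = vformat(format, args);
  va_end(args);  // clean up before leaving the function via throw
  throw std::runtime_error(msg);
}
```

A caller can then write throw_formatted("open %s failed", path) and catch std::runtime_error further up, reading the message from what().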


template <typename ...Args>
inline std::string format_string(const char* format, Args... args) {
  constexpr size_t oldlen = BUFSIZ;
  char buffer[oldlen]; // default buffer on the stack

  const int len = snprintf(&buffer[0], oldlen, format, args...);
  if (len < 0) return std::string(); // encoding error
  const size_t newlen = static_cast<size_t>(len) + 1; // count the terminator '\0'

  if (newlen > oldlen) { // the default buffer is too small, so allocate on the heap
    std::vector<char> newbuffer(newlen);
    snprintf(newbuffer.data(), newlen, format, args...);
    return std::string(newbuffer.data());
  }

  return buffer;
}
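One thing a variadic template can do that a C-style "..." cannot is adapt each argument before it reaches snprintf. The sketch below is an optional extension, not part of the original code: the names printf_arg and format_any are hypothetical, and the only adaptation shown is passing a std::string where %s expects a const char*.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Ordinary arguments pass through unchanged...
template <typename T>
T printf_arg(T value) { return value; }

// ...but a std::string is turned into the const char* that %s expects.
inline const char* printf_arg(const std::string& s) { return s.c_str(); }

template <typename... Args>
std::string format_any(const char* format, const Args&... args) {
  // First pass with a null buffer only measures the required length.
  const int n = std::snprintf(nullptr, 0, format, printf_arg(args)...);
  if (n < 0) return std::string();  // encoding error
  std::vector<char> buf(static_cast<std::size_t>(n) + 1);  // +1 for '\0'
  std::snprintf(buf.data(), buf.size(), format, printf_arg(args)...);
  return std::string(buf.data(), static_cast<std::size_t>(n));
}
```

The format string is still not checked against the argument types, so a mismatched specifier remains undefined behavior just as with plain snprintf.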

Passing variadic template arguments on to another function is also very convenient (perfect forwarding with std::forward), as shown below:


xyz@ubuntu:~/unp_practice/lib$ cat test.cc 
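#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <utility>
#include "format_string.h"

template <typename ...Args>
void errExit(const char* format, Args&&... args) {
  const int saved_errno = errno; // formatting may overwrite errno
  auto errmsg = format_string(format, std::forward<Args>(args)...);
  errmsg = errmsg + ": " + strerror(saved_errno) + "\n";
  fputs(errmsg.c_str(), stderr);
  exit(1);
}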

int main() {
  const char* s = "hello world!";
  int fd = -1;
  if (write(fd, s, strlen(s)) == -1)
    errExit("write \"%s\" to file descriptor(%d) failed", s, fd);
  return 0;
}
xyz@ubuntu:~/unp_practice/lib$ g++ test.cc -std=c++11
xyz@ubuntu:~/unp_practice/lib$ ./a.out 
write "hello world!" to file descriptor(-1) failed: Bad file descriptor

conclusion