wprintf() is not portable.
The `wprintf()` function looks like a very useful function for modern application software. It speaks wide characters, allowing one to (assuming that one's implementation means something like UTF-16 or UTF-32 by "wide character") Unicodify one's application yet further, in a way that ports easily from C/C++ compiler to C/C++ compiler; and it is standardized.
(See the page for `fwprintf()` in the Single Unix Specification version 6 for one of the two standards that define it.)
That's the theory from the standardization perspective, at least. Unfortunately, real implementations of the C and C++ languages diverge disastrously from the standards, which makes the function non-portable across implementations. These incompatibilities, moreover, exist in some fundamental and often-used parts of the function: the output of characters and strings.
It turns out that it is impossible to be both standards-conformant and portable when calling `wprintf()`.
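To make the claim concrete, here is a minimal sketch of a call that follows the standards. It uses `swprintf()`, which shares its format-string rules with `wprintf()`, so that the result lands in a buffer that can be inspected; this assumes a `swprintf()` with the standard size parameter, which some older libraries may not provide. The helper name `format_conformant` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes "<narrow> <wide>" into out, using only the
 * specifiers that the standards define for the wprintf() family.
 * Returns the number of wide characters written, or a negative value
 * on failure. */
int format_conformant(wchar_t *out, size_t cap,
                      const char *narrow, const wchar_t *wide)
{
    assert(out != NULL && cap > 0);

    /* %s  : const char *    (a multibyte string, per the standards)
     * %ls : const wchar_t *
     * A library that reads %s as const wchar_t * in the wide functions
     * would misinterpret the first argument here. */
    return swprintf(out, cap, L"%s %ls", narrow, wide);
}
```

On a library that follows the standards, `format_conformant(buf, 32, "abc", L"def")` should leave `abc def` in `buf`. The `%ls` half is the part that every implementation listed in this article agrees on; the `%s` half is where they part company.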
These are the main variant behaviours across implementations of the C and C++ languages. Each cell gives the argument type that the format specifier at the head of its column implies:

C/C++ implementation(s) | `%hs`¹ | `%s` | `%ls` | `%hS`² | `%S` | `%lS`² | `%hc`¹ | `%c` | `%lc` | `%hC`² | `%C` | `%lC`²
---|---|---|---|---|---|---|---|---|---|---|---|---
Microsoft Visual C/C++ 7.1 (see the type and the size/distance documentation) | `const char *` | `const wchar_t *` | `const wchar_t *` | `const char *` | `const char *` | `const wchar_t *` | `int`⁴ | `wint_t`³ | `wint_t`³ | `int`⁴ | `int`⁴ | `wint_t`³
OpenWatcom C/C++ | `const char *` | `const wchar_t *` | `const wchar_t *` | `const wchar_t *`⁵ | `const wchar_t *` | `const wchar_t *`⁵ | `int`⁴ | `wint_t`³ | `wint_t`³ | `wint_t`³ ⁵ | `wint_t`³ | `wint_t`³ ⁵
GNU libc, and thus C/C++ compilers that use it (see the type and the conversion documentation) | undocumented | `const char *` | `const wchar_t *` | undocumented | `const wchar_t *` | undocumented | undocumented | `int`⁴ | `wint_t`³ | undocumented | `wint_t`³ | undocumented
OpenVMS C library, and thus C/C++ compilers that use it (see the output conversion documentation) | undocumented | `const char *` | `const wchar_t *` | undocumented | `const wchar_t *` | undocumented | undocumented | `int`⁴ | `wint_t`³ | undocumented | `wint_t`³ | undocumented

1. See footnote #1 to the next table.
2. These, where they are documented, are all implementation extensions.
3. Formally, because it is a variable-arguments function, any `wchar_t` argument is promoted to `wint_t` to be passed to the function.
4. Formally, because it is a variable-arguments function, any `char` argument is promoted to `int` to be passed to the function.
5. These format specifiers are not documented in the OpenWatcom C library reference (in contrast, notice, to Microsoft's documentation, which does document them); however, the OpenWatcom C library supports them in practice. The behaviour of `%hS` in OpenWatcom C/C++ is quirky, and in practice varies according to what actual string data are passed to it to be printed. OpenWatcom is clearly aping the non-standards-conformant behaviour of the Microsoft compiler, but it does so quite badly.
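The table suggests one workaround, sketched below, which gives up standards conformance to gain portability: select the narrow-string specifier per implementation at compile time. The sketch assumes that the predefined macros `_MSC_VER` and `__WATCOMC__` identify the Microsoft and OpenWatcom compilers, and that every other implementation behaves like GNU libc and the OpenVMS C library; `NARROW_STR` and `format_name` are hypothetical names. It is based only on the table above and has not been checked against other compiler versions.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Assumption: these predefined macros identify the two implementations
 * whose wide functions read %s as const wchar_t *. */
#if defined(_MSC_VER) || defined(__WATCOMC__)
#  define NARROW_STR L"%hs"  /* extension: const char * per the table */
#else
#  define NARROW_STR L"%s"   /* standard: const char * */
#endif

/* Hypothetical helper: writes "name=<narrow string>" into out.
 * Returns the number of wide characters written, or a negative value
 * on failure. */
int format_name(wchar_t *out, size_t cap, const char *name)
{
    assert(out != NULL && cap > 0);

    /* Adjacent wide string literals are concatenated by the compiler,
     * so the format becomes L"name=%s" or L"name=%hs". */
    return swprintf(out, cap, L"name=" NARROW_STR, name);
}
```

The first branch relies on `%hs`, which neither standard defines, so the code is portable across the four implementations only by ceasing to be standards-conformant on two of them.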
The same information, arranged by what one wants to print:

To print | the standards say to use | but in these C/C++ implementations | you actually have to use these format specifiers | so code that is both standards-conformant and portable
---|---|---|---|---
an SBCS/MBCS character | `%c`¹ | Microsoft Visual C/C++ 7.1 | `%C`, `%hc`, or `%hC`² | cannot exist
an SBCS/MBCS character | `%c`¹ | OpenWatcom C/C++ | `%hc` | cannot exist
an SBCS/MBCS character | `%c`¹ | GNU libc, and thus C/C++ compilers that use it | `%c` | cannot exist
an SBCS/MBCS character | `%c`¹ | OpenVMS C library, and thus C/C++ compilers that use it | `%c` | cannot exist
an SBCS/MBCS character string | `%s`¹ | Microsoft Visual C/C++ 7.1 | `%S`, `%hs`, or `%hS`² | cannot exist
an SBCS/MBCS character string | `%s`¹ | OpenWatcom C/C++ | `%hs` | cannot exist
an SBCS/MBCS character string | `%s`¹ | GNU libc, and thus C/C++ compilers that use it | `%s` | cannot exist
an SBCS/MBCS character string | `%s`¹ | OpenVMS C library, and thus C/C++ compilers that use it | `%s` | cannot exist
a wide character | `%C`³ or `%lc` | Microsoft Visual C/C++ 7.1 | `%c`, `%lc`, or `%lC`² | must use `%lc`
a wide character | `%C`³ or `%lc` | OpenWatcom C/C++ | `%c`, `%lc`, `%C`, `%hC`², or `%lC`² | must use `%lc`
a wide character | `%C`³ or `%lc` | GNU libc, and thus C/C++ compilers that use it | `%C` or `%lc` | must use `%lc`
a wide character | `%C`³ or `%lc` | OpenVMS C library, and thus C/C++ compilers that use it | `%C` or `%lc` | must use `%lc`
a wide character string | `%S`³ or `%ls` | Microsoft Visual C/C++ 7.1 | `%s`, `%ls`, or `%lS`² | must use `%ls`
a wide character string | `%S`³ or `%ls` | OpenWatcom C/C++ | `%s`, `%ls`, `%S`, `%hS`², or `%lS`² | must use `%ls`
a wide character string | `%S`³ or `%ls` | GNU libc, and thus C/C++ compilers that use it | `%S` or `%ls` | must use `%ls`
a wide character string | `%S`³ or `%ls` | OpenVMS C library, and thus C/C++ compilers that use it | `%S` or `%ls` | must use `%ls`

1. Strictly, and certainly if portability is one's aim, one cannot apply the `h` size modifier to the `c` or `s` format specifiers, as the results of doing so are not defined by either standard.
2. Strictly, and certainly if portability is one's aim, one cannot apply the `h` and `l` size modifiers to the `C` or `S` format specifiers, as the results of doing so are not defined by either standard. These are all implementation extensions.
3. The C language standard, ISO/IEC 9899:1999, does not define either `%C` or `%S`. The Linux C library documentation rather overcautiously says "Do not use." for these specifiers. In fact, you can use them quite happily if portability among POSIX systems is all the portability you need, because they are defined by the Single Unix Specification and so, as the SUS itself notes, will be supported by any POSIX-conformant system.
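
Since `%ls` and `%lc` are the only specifiers on which all four implementations agree, one possible approach, offered here as a suggestion rather than anything the standards or the vendors prescribe, is to widen narrow text before printing and then use only those two specifiers. The sketch below uses the standard `mbstowcs()` for the conversion; the helper name `format_portable` and the fixed 64-element scratch buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: writes "<narrow string><wide char>" into out using
 * only %ls and %lc. The narrow string is widened first, so %s and %c are
 * never needed. Returns the number of wide characters written, or a
 * negative value on failure. */
int format_portable(wchar_t *out, size_t cap, const char *narrow, wchar_t ch)
{
    wchar_t wide[64];   /* assumption: short strings only; longer input is truncated */
    size_t n;

    assert(out != NULL && cap > 0);

    /* Convert at most 63 wide characters, leaving room for a terminator. */
    n = mbstowcs(wide, narrow, 63);
    if (n == (size_t)-1)
        return -1;      /* invalid multibyte sequence in the current locale */
    wide[n] = L'\0';    /* n <= 63, so this index is always in range */

    /* Pass the character as wint_t explicitly, matching what %lc expects. */
    return swprintf(out, cap, L"%ls%lc", wide, (wint_t)ch);
}
```

For example, `format_portable(buf, 32, "abc", L'!')` should leave `abc!` in `buf`. The cost is an extra conversion step and a dependence on the current locale for interpreting the narrow string.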