NXXX1
Integer Constant Expression-Initialized const Integer Declarations are Implicitly constexpr

Published Proposal,

Previous Revisions:
None
Authors:
Paper Source:
GitHub ThePhD/future_cxx
Issue Tracking:
GitHub
Project:
ISO/IEC 9899 Programming Languages — C, ISO/IEC JTC1/SC22/WG14
Proposal Category:
Feature Request
Target:
C2y/C3a

Abstract

const integer type declarations initialized with constant expressions have been a staple of C programs both large and small over the last 30 years. Attempts to make such declarations ineligible as constant expressions (e.g., forcing an array declared with them to be a Variable-Length Array (VLA) type) failed due to overwhelming existing practice in the opposite direction. Now that we have constexpr, this proposal moves in that opposite direction. It asks that const integer type declarations that are both declared and initialized in the same statement be implicitly made constexpr, thereby fulfilling the expectations of users.

1. Changelog

1.1. Revision 0 - August 21st, 2023

2. Introduction and Motivation

A common annoyance amongst C developers has been the ephemeral nature of the following code snippet:

int main () {
  const int n = 1 + 2;
  const int a[n];
  return sizeof(a);
}

Does this create a VLA type all the time, or is this a valid constant expression that produces a translation-time (AKA compile-time) sized array with an extent of 3? Will sizeof(a) be evaluated at compile-time, or will it be run at execution time (AKA run-time) and pull the value from somewhere in the binary? Furthermore, if an implementation defines __STDC_NO_VLA__, is this supposed to compile? All of these questions, and more revolving around this issue, were brought up in N2713. N2713 was accepted into C23, and subsequently forced the above code to resolve with the array a being a VLA, even if the implementation could ascertain that the size was a constant expression and treat it as such at compile-time. This allowed all implementations to have the same semantic frontend errors and/or warnings, while letting them optimize things as necessary during typical linking and code generation/lowering (e.g., never using alloca with a dynamic value and instead just sizing the stack appropriately to accommodate the array directly for a binary implementation).

However, during National Body (NB) comment processing, an NB comment pointed out that a lot of code relies on this being treated as a plain, compile-time C array, not just by the backend with its optimizations, but by the frontend of many compilers. This was formalized in N3138, which presented cases similar to the above. It also presented various other constant expressions to make it clear that there is a wide breadth of existing practice, beyond just MSVC, GCC, and Clang, that accepts many additional forms of constant expressions in many different situations. However, the array case remains the one with the most significant impact, affecting the most existing code. N3138 promised that a potential future version of C should look into the impact of changing constant expressions one way or another again.

This paper introduces a change for a portion of constant expressions in the opposite direction of N2713, by asking that const integer type declarations that are also immediately initialized with an integer constant expression are implicitly declared constexpr.

3. Prior Art

This is existing practice on a wide variety of compilers both large and small, ranging from SDCC all the way up to much more powerful compilers like ICC (Intel), Clang, and GCC. The snippet in § 2 Introduction and Motivation compiles and runs on many implementations with no run-time execution, even on their intentionally weakest optimization settings (where applicable for a compiler with such settings). It also runs on many implementations even where VLAs are not allowed (e.g., with __STDC_NO_VLA__ or where -Wvla is combined with -Werror).
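
As a rough illustration of the behavior described above, the following sketch wraps the snippet from § 2 in a function and reports the array's element count, next to a probe for __STDC_NO_VLA__. The function names are hypothetical and exist only for this example. The sketch assumes an implementation that either folds n into a constant or supports VLAs; on an implementation that does neither, it is not expected to compile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reports whether the implementation says it has no VLAs. */
int implementation_lacks_vla(void) {
#ifdef __STDC_NO_VLA__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: the motivating snippet, returning the element count. */
size_t motivating_array_length(void) {
  const int n = 1 + 2;
  int a[n]; /* constant-size array if n is folded, otherwise a VLA */
  size_t length = sizeof(a) / sizeof(a[0]);
  /* Either way the extent is 3; only the time at which it is computed differs. */
  assert(length == 3);
  return length;
}
```

Calling motivating_array_length() yields the same value whether a is a VLA or not, which is why the difference tends to surface only through diagnostics such as -Wvla or through contexts that require a constant expression.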

Furthermore, C++ has a similar feature for all const-declared integer types. However, rather than modeling this after the C++ wording and C++ feature, we instead focus on solidifying and cleaning up the existing practice of implementations' C modes (for implementations with shared C and C++ modes) and of existing C-only compilers. Most importantly, we do not apply the full "manifestly constant evaluated" or "constantly evaluated" powers that C++ has adopted, and instead focus exclusively on what follows from the existing practice of existing C codebases and C implementations.

4. Design

The design of this feature is such that it requires a declaration that is the first declaration of its kind, without external linkage, and is immediately initialized. It also only applies to declarations whose only qualifier is const and that, optionally, have static, auto, or register as their storage-class specifiers. (If the storage-class is already constexpr, then this proposal makes no change to the declaration at all.) This means that, under this proposal, of the following declarations:

int file_d0 = 1;
_Thread_local int file_d1 = 1;
extern int file_d2;
static int file_d3 = 1;
_Thread_local static int file_d4 = 1;
const int file_d5 = 1;
constexpr int file_d6 = 1;
static const int file_d7 = 1;

int file_d2 = 1;

int main (int argc, char* argv[]) {
  int block_d0 = 1;
  extern int block_d1;
  static int block_d2 = 1;
  _Thread_local static int block_d3 = 1;
  const int block_d4 = 1;
  const int block_d5 = file_d6;
  const int block_d6 = block_d4;
  static const int block_d7 = 1;
  static const int block_d8 = file_d5;
  static const int block_d9 = file_d6;
  constexpr int block_d10 = 1;
  static constexpr int block_d11 = 1;
  int block_d12 = argc;
  const int block_d13 = argc;
  const int block_d14 = block_d0;
  const volatile int block_d15 = 1;

  return 0;
}

int block_d1 = 1;

A handful of these declarations become constexpr, as indicated by the table below which explains the changes for the above code snippet:

Declaration | constexpr Before? | constexpr After? | Comment
file_d0 | no | no | no change; extern implicitly, non-const
file_d1 | no | no | no change; _Thread_local, extern implicitly, non-const
file_d2 | no | no | no change; extern explicitly, non-const
file_d3 | no | no | no change; non-const
file_d4 | no | no | no change; _Thread_local, non-const
file_d5 | no | no | no change; extern implicitly
file_d6 | yes | yes | no change; constexpr explicitly
file_d7 | no | yes | static and const, initialized by constant expression
block_d0 | no | no | no change; non-const
block_d1 | no | no | no change; extern explicitly, non-const
block_d2 | no | no | no change; non-const, static
block_d3 | no | no | no change; _Thread_local, static, non-const
block_d4 | no | yes | const; initialized with literal
block_d5 | no | yes | const; initialized with other constexpr variable
block_d6 | no | yes | const, initialized by other constant expression
block_d7 | no | yes | static and const, initialized with literal
block_d8 | no | no | no change; non-constant expression initializer
block_d9 | no | yes | static and const, initialized by constant expression
block_d10 | yes | yes | no change; constexpr explicitly
block_d11 | yes | yes | no change; constexpr explicitly
block_d12 | no | no | no change; non-const, non-constant expression initializer
block_d13 | no | no | no change; non-constant expression initializer
block_d14 | no | no | no change; non-constant expression initializer
block_d15 | no | no | no change; volatile

This matches the existing practice that occurs today.

4.1. Changes in Existing Code

Besides what is enumerated above for given declarations, some typical consequences on existing code are:

Otherwise, the effects of this proposal fall on newly written code, which can confidently rely on this behavior rather than leave it implementation-defined.

4.2. What if Someone Takes the Address of a const Declaration that has been Promoted to constexpr?

This is fine. Naked constexpr variables are already implicitly const, and taking the address of one produces an int const* consistent with having a pointer to a variable that cannot be modified. A compiler may be robbed of a constant expression optimization (e.g., doing literal computation replacement and removing the existence of the variable inside of the program) by such a move, but it is fine and behaves perfectly in line with the expected semantics of having a const integer. Modification of such an object by casting away its const-ness is, as it is throughout the C standard, Undefined Behavior and should not be done. If it is done, the same rules apply as ever: the behavior is undefined. This proposal does not change anything in the way these values were being used to date in either C or C++.
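
A minimal sketch of the situation described above, assuming nothing beyond standard C; the function names are hypothetical and used only for illustration. The address of a const integer is taken and read through a pointer-to-const, which is valid today and would remain valid if the declaration were implicitly constexpr.

```c
#include <assert.h>

/* Hypothetical helper: reads through a pointer-to-const; it cannot modify the object. */
static int read_through(const int *p) {
  return *p;
}

/* Hypothetical helper showing the address of a const integer being taken. */
int address_of_const_integer(void) {
  const int limit = 1 + 2; /* would be implicitly constexpr under this proposal */
  const int *p = &limit;   /* the address has type const int *, as before */
  assert(p == &limit);
  return read_through(p);
}
```

Because the address escapes into read_through, an implementation may have to keep limit as a real object rather than replacing every use with the literal value, but the observable result is unchanged.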

4.3. Why Not More Than Integer Types?

We limit this proposal to integer types (including enumerations) because that is the most widespread existing practice and the easiest to compute. constexpr serves as not just a marker, but as a way to let an implementation know that no matter how complex the initializer or its contained expressions become, it must be evaluated at compile-time. This represents a contract between the user and the compiler, and also serves as a courtesy so that the compiler can be appropriately prepared when processing the declaration.

Conversely, this is an implicit promotion. To ensure compilers are not unduly burdened, we capture what is already existing practice on the vast majority of existing compilers: integer types. If, in the future, implementations process many more declarations at compile-time, then such expansions can be made easily.
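
To make the boundary concrete, here is a small sketch (the type and function names are hypothetical) that places an integer, an enumeration, and a floating-point const declaration side by side. The comments state which ones this proposal would cover; the code itself is ordinary C and behaves the same with or without the proposal.

```c
#include <assert.h>

enum color { RED, GREEN, BLUE };

/* Hypothetical helper mixing const declarations inside and outside the proposal. */
double scaled_total(void) {
  const int count = 2 * 2;          /* integer type, integer constant expression: would be promoted */
  const enum color favorite = BLUE; /* enumeration type: would be promoted as well */
  const double scale = 0.5;         /* floating type: outside this proposal, stays a plain const */
  assert(favorite == 2);            /* BLUE is the third enumerator, so its value is 2 */
  return (count + favorite) * scale;
}
```

A floating-point constant like scale would still need an explicit constexpr to be usable where a named constant is required.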

5. Wording

The following wording is relative to the latest draft standard of C.

📝 Editor’s Note: The ✨ characters are intentional. They represent stand-ins to be replaced by the editor.

5.1. Add a new paragraph to 6.7 "Declarations", just after paragraph 12 and before "EXAMPLE 3"

A declaration such that the declaration specifiers contain no type specifier or that is declared with constexpr is said to be underspecified. If such a declaration is not a definition, if it declares no or more than one ordinary identifier, if the declared identifier already has a declaration in the same scope, if the declared entity is not an object, or if anywhere within the sequence of tokens making up the declaration identifiers that are not ordinary are declared, the behavior is implementation-defined.136)

If one of a declaration’s init declarators matches the second form (a declarator followed by an equal sign = and an initializer) and meets the following criteria:

— it is the first visible declaration of the identifier;

— it contains no other storage-class specifiers except static, auto, or register;

— it does not declare the identifier with external linkage;

— its type is an integer type or an enumeration type that is const-qualified but not otherwise qualified, and is non-atomic;

— and, its initializer is an integer constant expression (6.6);

then it behaves as if a constexpr storage-class specifier is implicitly added for that declarator specifically. The declared identifier is then a named constant and is valid in all contexts where a named constant of the corresponding type is valid to form a constant expression of that specific kind (6.6).