C++ Style Guide

Introduction

Goals of the Guide

Settle trivial disagreements in code review
Enable trivial onboarding between projects
Enable trivial checking in code

Note
It was a deliberate choice to repeat the word trivial that many times. To pull from the broader engineering style guide, "[building] is cheap, [good] ideas aren't." We don't want someone else's code to get in the way of your ideation, nor the opposite.

Scope & Applicability

All PRs should reference the latest style guide.
Our goal is to release updated tooling and configs on the same date as a new style guide.
PRs that exist explicitly to update to new style guides are likely a waste of time.

Relationship to Clang-21 & C++23

Currently we target C++23 features as implemented by clang-21.
- This should be installed on your development machine.
- It is the only clang installed on the distcc and sccache servers.

Enforcement

We recommend the following hierarchy for rule enforcement in review, with higher numbers taking precedence: 1. CI; 2. reviewer; 3. lint tooling.

Tip
It is worth noting that if reviewers vehemently disagree with a lint output, they should open an issue for the tool and the style guide.

Language Version & Compiler Targets

Target Standard (C++23)

As mentioned earlier, we target C++23 as it is implemented by clang-21.

Therefore, the following preprocessor check must be added to every translation unit.

#if !defined(__clang__)
#error "Build Error: Clang is required"
#elif __clang_major__ < 21
#error "Build Error: Clang 21+ is required"
#elif __cplusplus < 202302L
#error "Build Error: C++23 is required (-std=c++23)"
#endif

There is no strong expectation to retrofit existing code with new standard library features, since memorizing every new addition is unproductive. It is, however, worthwhile to use them in new code, since doing so lessens the testing burden and the amount of "boilerplate" code.

Required Compiler Flags

Compiler flags are typically handled by tooling and will look as such:

# Release
CXXFLAGS = -std=c++23 -O3 -Wall -Wextra -Wpedantic -Wshadow -DNDEBUG -march=native -flto=full -ffunction-sections -fdata-sections
# DEBUG
    # note: the set of sanitizers can be configured but these are the defaults for debugs
CXXFLAGS = -std=c++23 -O0 -ggdb -Wall -Wextra -Wpedantic -Wshadow -DDEBUG -fsanitize=address,undefined

Prohibited Compiler Flags

-Ofast, since it enables -ffast-math.
-ffast-math.
- NOTE: this is acceptable via #pragma push/pop in non correctness-critical blocks.

This list is eligible for expansion over time.

Linking and Linker Flags

Linker flags are again handled by tooling but the underlying goal is to maximize link-time optimization while minimizing binary size (the goals are often complementary); as such the following is prescribed:

-fuse-ld=mold
- Mandated for link speed and avoiding link-time sequential execution.
-flto=full:
- To optimize across translation units, the optimizer needs the full context of the program; full link-time optimization therefore merges everything into a single large module and runs the optimization passes over it.
- Since this makes linking heavily sequential, use -flto=thin in DEBUG builds.
-Wl,-O3: Instructs the LTO plugin to apply maximum optimization passes across the entire merged IR.
- To allow for the linker to fully optimize the final object, we run as many LTO passes as possible.
- This can be omitted in debug builds.
-Wl,--gc-sections: Strips dead code and unused symbols to avoid binary size bloat.
-Wl,--icf=safe:
- Given our heavy use of templates and constexpr, a lot of code is duplicated.
- This folds identical generated binary sequences into a single copy, massively reducing binary bloat without breaking function pointer equality.
-Wl,--as-needed: Prevents DT_NEEDED bloat by only linking shared libraries if we actually use a symbol from them.

Other Flags

Flag	Purpose	Approval in DEBUG/RELEASE
`-Wl,--no-undefined`	Forces the linker to report unresolved symbols in shared libraries at link-time rather than deferring to the runtime. Catches missing dependencies immediately instead of segfaulting in production.	Both
`-Wl,--hash-style=gnu`	Uses the faster GNU hash table format for dynamic symbol resolution. Massively reduces start up time for dynamically linked executables compared to the default SysV hash.	Both
`-Wl,--build-id=sha1`	Embeds a unique cryptographic identifier in the binary. Absolutely crucial for matching production binaries to external debug symbols or accurately tracking crash mini dumps.	Both (use `=fast` in Debug for fast builds)
`-Wl,-z,relro,-z,now`	Hardens the binary by marking the Global Offset Table (GOT) as read-only and resolving all dynamic symbols at startup (Full RELRO (relocation read-only)). Mitigates GOT overwrite exploits.	Release Only
`-Wl,--strip-all`	Strips all symbol and relocation information from the final executable. Used to aggressively minimize binary bloat before deployment.	Release Only

Final `LDFLAGS` Configurations

Debug and Release LDFLAGS:

# Debug builds
LDFLAGS = -fuse-ld=mold -flto=thin -Wl,--as-needed -Wl,--no-undefined -Wl,--hash-style=gnu -Wl,--build-id=fast
# Release Builds
LDFLAGS = -fuse-ld=mold -flto=full -Wl,-O3 -Wl,--gc-sections -Wl,--icf=safe -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro,-z,now -Wl,--hash-style=gnu -Wl,--build-id=sha1 -Wl,--strip-all

Project Structure

Monorepo Layout

/ (Monorepo Root)
├── WORKSPACE.yaml
└── Projects/
    ├── ProjectA/
    │   ├── catalyst.yaml
    │   ├── build/                          <-- Build output for ProjectA
    │   ├── include/
    │   │   ├── ProjectA/                   <-- Standard namespaced includes
    │   │   │   ├── header1.hpp
    │   │   │   └── header2.hpp
    │   │   └── extra_v1/                   <-- Optional extra one-offs
    │   │       └── legacy_compat.hpp
    │   ├── src/                            <-- src files
    │   │   ├── ProjectA.cpp                <-- Entry point. Must bear the same name as the project.
    │   │   ├── utils.cpp
    │   │   └── legacy_compat.hpp
    │   ├── test/                          <-- tests
    │   └── bench/                         <-- benchmark
    └── ProjectB/

This monorepo structure makes it abundantly clear what the scope of a project is and allows instantiation of one project for distribution, testing, compilation, etc.

Header Standards

Header Ownership

In general, every subcomponent of the project has an associated header file.

Header Closure

Header files should be self-contained (compile on their own) and end in .hpp.
Non-header files that are meant for inclusion should end in .inc and be used sparingly.
- The use of .inc is reserved for xxd or other code generation tooling.
- Such includes should have a comment referring to where they come from.
Self-contained also means order-independent: including header_X with or without some header_Y must not have a hidden effect.

Templates & Inline Definitions

When a header declares inline functions or templates that clients of the header will instantiate, the inline functions and templates must also have definitions in the header, either directly or in files it includes.

Include Order Rules

The order is as follows:
1. Subsystem header
2. System headers
3. Standard Library headers
4. External library headers
5. Internal headers

Error
The #include_next directive is banned. It leads to non-deterministic builds across systems, and it's typically a sign of improper header naming.

Header Guards

Conditional includes are allowed.
Headers that are conditionally included should follow the order as if the condition does not exist.
When possible, the condition should be moved into the header itself, so that users can include it without worrying about the condition.
- Broad conditions, e.g. #ifdef _WIN32, should be reflected in the header name, e.g. win32_xyz.

Forward Declarations Vs. Includes

Forward declarations are not allowed, since they allow tools like ninja or catalyst to skip rebuilds that a header change should have forced. In certain cases, forward declarations also worsen IDE error messages and autocompletion.

We maintain that forward declarations are banned because Catalyst preempts header precompilation and our build server aggressively prebuilds. As such, we have sufficient build speed to make the slight hit bearable in exchange for build determinism. Furthermore, the remainder of the header policy naturally dictates small "atomic" headers, where the cost of including a large struct is marginal.

Interface Segregation & Type Erasure

When circular dependencies arise that cannot be resolved by header decomposition, use type erasure (e.g., std::any, std::function) or abstract interfaces to break the cycle. Banning forward declarations forces a cleaner dependency graph; if you find yourself needing one, your components are likely too tightly coupled.

Pragma Once Policy

Macro-based include guards (#ifndef / #define / #endif) are disallowed. Use #pragma once to avoid the off chance of a guard-name collision.

Tooling

Most of this will be flagged prior to push by pinc-eye.

If you observe a disparity between pinc-eye, iwyu, and clang-tidy, use the strictest of the 3 and file a bug report for pinc-eye.

Naming Conventions

The goal of naming is to provide understanding of what everything in a statement is just from casing. For example, below are two blocks that achieve the same functionality, while using different casing.

The different cases enable easy "textural" differentiation.
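
A minimal sketch of that contrast, assuming a hypothetical ring buffer helper (all names here are illustrative and not taken from any project). The first block uses one casing for everything; the second applies the conventions described in the rest of this section, so a type, a constant, a function, and a variable can be told apart at a glance.

```cpp
#include <cstddef>

// Block 1: everything in snake_case, so casing carries no information.
namespace uniform {
struct ring_buffer {
    std::size_t capacity;
};
constexpr std::size_t growth_factor = 2;
constexpr ring_buffer grow_buffer(ring_buffer buffer) noexcept {
    return ring_buffer{buffer.capacity * growth_factor};
}
}  // namespace uniform

// Block 2: PascalCase type, SCREAMING_CASE constant, camelCase function,
// snake_case variable.
namespace styled {
struct RingBuffer {
    std::size_t capacity;
};
constexpr std::size_t GROWTH_FACTOR = 2;
constexpr RingBuffer growBuffer(RingBuffer buffer) noexcept {
    return RingBuffer{buffer.capacity * GROWTH_FACTOR};
}
}  // namespace styled
```

Both blocks compile to the same behavior; only the second tells the reader what kind of entity each identifier is.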

Files

File systems differ in case sensitivity: macOS and Windows are case-insensitive by default, while Linux is case-sensitive. To avoid compilation bugs between platforms, we use snake_case. File names should also be as concise and precise as possible.

Tip
If a filename is too long, it could possibly be nested deeper as it's likely part of a broader niche within the system.

Example:

src/MathMatrixOperations.cpp                    # BAD: this file is not in snake_case
src/math_la_matrix_operations.cpp               # BAD: this file is too long
src/math/linear_algebra/matrix_operations.cpp   # GOOD: this file is descriptively named and nested

Types (Classes, Structs, Enums)

We use PascalCase for naming of types. This distinguishes "User Types" from "Standard Library Types" (which are snake_case) and variables. It signals that this identifier creates a new object layout.

PascalCase vs snake_case should immediately signal high scrutiny towards user defined types.

// BAD
class pulse_schedule; // class defined in snake_case
struct QUBIT_MAP; // struct defined in SCREAMING_CASE
// GOOD
class PulseSchedule;
struct QubitMap;
using ImageBuffer = std::vector<std::byte>;

Suffix _t is disallowed.

Exceptions

There are a few exceptions to this rule. Often, we will roll our own types that are meant to be used interchangeably with standard library types, e.g. std::priority_queue vs pq::priority_queue or pq::unstable_unordered_map. Here it makes sense to name them interchangeably too.

Functions & Methods

Functions and methods follow camelCase.

Named lambdas in namespace scope should follow the function naming convention, while named lambdas in inner scopes follow variable naming convention.

Variables

Variables follow snake_case to distinguish them from classes and functions.

Constants & `constexpr`

Constants should follow SCREAMING_CASE, whether they are defined via constexpr, const, or a macro. This makes it immediately obvious that something cannot be changed and that, possibly, it doesn't have a strong type to refer to. For variables that need to be tuned as a build parameter, prefix the name with TUNABLE_. Catalyst will automatically pull these out into the config file.

Magic Numbers & Literal Constants

For all arbitrarily defined magic numbers and literal constants, one should use constexpr as such:

constexpr float RESIZE_FACTOR = 2.0f;

For magic numbers that arrive from a formula, one should seek to document it at compile time via the definition, e.g.

constexpr double cexprSqrt(double x) {
    // ... constexpr safe impl.
}

constexpr double HYPOTENUSE = cexprSqrt(2);
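
One possible body for the elided implementation, as a sketch only: it assumes a Newton-Raphson iteration and a finite, non-negative input, neither of which the guide prescribes. The point is that the constant's definition records the formula it comes from, and a static_assert documents the expected value at compile time.

```cpp
// Assumption: Newton-Raphson is used purely for illustration.
// Precondition (not checked): x is finite and x >= 0.
constexpr double cexprSqrt(double x) noexcept {
    if (x == 0.0) {
        return 0.0;
    }
    // Start at or above the true root so the estimates only decrease.
    double estimate = x > 1.0 ? x : 1.0;
    double next = 0.5 * (estimate + x / estimate);
    while (next < estimate) {
        estimate = next;
        next = 0.5 * (estimate + x / estimate);
    }
    return estimate;
}

// Hypotenuse of a right triangle whose legs both have length 1.
constexpr double HYPOTENUSE = cexprSqrt(1.0 * 1.0 + 1.0 * 1.0);

// The compiler verifies the derivation instead of a comment claiming it.
static_assert(HYPOTENUSE * HYPOTENUSE > 1.999999 && HYPOTENUSE * HYPOTENUSE < 2.000001);
```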

Template Parameters

Template type parameters follow PascalCase_T to denote that something is a template parameter. For non-type parameters, use regular parameter syntax i.e. snake_case.

Tip
Most of the time, you should provide a using declaration that binds to the template parameter and use that. This makes code introspection for stuff like template meta programming easier.

Namespaces

Namespaces should be all lowercase, with no delimiters. If you find the need for one, you should consider nesting namespaces.

Note
This should fairly closely resemble the directory naming convention

Namespaces & Scoping

Namespace Usage Rules

All code must exist within the project's top-level namespace (e.g., catalyst:: or pq::).
The global namespace is strictly reserved for main() and system calls that require it (e.g., extern "C").

Directory Correspondence

Namespaces should roughly correspond to the directory structure, but do not be slavish about it.

Good: src/compiler/ast -> catalyst::compiler::ast
Bad: src/compiler/backend/llvm/utils/strings -> catalyst::compiler::backend::llvm::utils::strings (Too deep; flatter is better).

`using namespace`

Never use using namespace in a header file.
- This forces your namespace choices onto every file that includes that header, creating invisible conflicts.
Exception: Inside a source file, after all includes, you may use using namespace but explicit qualification is still preferred.

Anonymous Namespaces

Use anonymous namespaces (namespace { ... }) to define file-local functions, variables, and types in .cpp files.
Prefer anonymous namespaces to static since it enables better LTO.
Never put an anonymous namespace inside a header.
- This causes every translation unit that includes the header to define its own copy of the symbols.

Inline Namespaces

Inline namespaces (inline namespace v2 { ... }) are reserved strictly for ABI Versioning.
Context: They allow the library to present a default interface while keeping older binary symbols available.
Rule: Do not use them for organizational structure.

Namespace Aliases

We maintain a strictly approved list of namespace aliases (See Section 21) that are safe to use project-wide.

Local Aliases: You may define local aliases inside a .cpp file or inline header function/class definition. Even in these contexts, you should use the common names defined in the appendix.

// Good (inside function, using the approved alias from the appendix)
void process() {
    namespace rv = std::views;
    auto view = rv::iota(0, 10);
}

// BAD (inside of a header file)
namespace rv = std::views; // Pollutes everyone's build and breaks interpretability

Argument Dependent Lookup (ADL)

ADL allows the compiler to find functions in namespaces based on the arguments passed to them. This breaks determinism.
Explicitly qualify function calls unless ADL is strictly required (e.g., for swap or operator overloads).
- Good: std::sort(...)
- Bad: sort(...) (Might pick std::sort, might pick catalyst::sort, might fail).

Exceptions

Operators (operator<<, operator+) rely on ADL to function. This is acceptable.

Symbol Visibility

Default: We build with "hidden" visibility by default (-fvisibility=hidden).
Public API: Explicitly mark classes and functions intended for external consumption (outside the shared library/DLL) with the project's export macro (e.g., CATALYST_API).

Reasoning: This creates a smaller binary size, faster load times, and enforces a strict boundary between "Public API" and "Internal Implementation."

Classes & Structs

Class Layout Order

Class layout should have the goal of optimizing performance. The primary layout related causes of bad performance are:

False Sharing
Poor Cache Locality from layout
Poor Cache Locality from padding

Since different structs have different applications, and the goals above conflict, the only reliable remedy is micro-benchmarking. To make this type of benchmark trivial, you can select between candidate layouts with a template parameter:

template <int k>
struct S {
#ifdef DEBUG
    static_assert(false, "you should use specialized member order");
#else
    // final chosen layout
#endif
};

#ifdef DEBUG
template<> struct S<0> {
    Type1 a;
    Type2 b;
    Type3 c;
};

template<> struct S<1> {
    Type1 a;
    Type3 c;
    Type2 b;
};

// ...
#endif

Rule of Zero/Five

Object semantics should be strict until otherwise needed. Therefore, the order is as such:

Delete everything first.
Re-enable the default when semantically valid.
Specialize when needed for extra behavior.
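
The three steps can be sketched on a hypothetical move-only handle (DeviceHandle and its members are illustrative names, not an existing type). Copies stay deleted because duplicating a handle has no sensible meaning here, moves are re-enabled because transferring it does, and nothing needs custom behavior yet.

```cpp
#include <type_traits>

class DeviceHandle {
public:
    explicit DeviceHandle(int id) noexcept : device_id(id) {}

    // Step 1: start strict. Copying stays deleted.
    DeviceHandle(const DeviceHandle&) = delete;
    DeviceHandle& operator=(const DeviceHandle&) = delete;

    // Step 2: re-enable the defaults that are semantically valid.
    DeviceHandle(DeviceHandle&&) noexcept = default;
    DeviceHandle& operator=(DeviceHandle&&) noexcept = default;

    // Step 3: specialize only when extra behavior is needed; not needed here.
    ~DeviceHandle() = default;

    [[nodiscard]] int deviceId() const noexcept { return device_id; }

private:
    int device_id;
};

// The chosen semantics are checked by the compiler rather than by convention.
static_assert(!std::is_copy_constructible_v<DeviceHandle>);
static_assert(std::is_nothrow_move_constructible_v<DeviceHandle>);
```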

Constructors & Explicitness

Constructors are allowed to be implicit when all the following are true:
- The incoming type is semantically equivalent to the constructed type
- The conversion is obvious at the call site
- No ownership, allocation, lifetime, or narrowing ambiguity is introduced

Examples where implicit is acceptable:
- view/wrapper types over the same underlying representation
- strong semantic aliases of existing types
- cheap value-preserving transformations

Examples where implicit is forbidden:
- ownership transfer
- allocation
- lossy conversion
- policy-changing wrappers
- anything that alters threading or synchronization guarantees

Tip
If a reader cannot immediately infer that a constructor is being invoked, it must be explicit.
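
A small sketch of both sides of the rule, using two hypothetical types (Label and SampleBuffer are invented for illustration). The view wrapper may convert implicitly; the allocating buffer may not.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <vector>

// Implicit is acceptable: a cheap, non-owning wrapper over the same representation.
struct Label {
    std::string_view text;
    constexpr Label(std::string_view source) noexcept : text(source) {}
};

// Explicit is required: construction allocates, and a bare integer at the
// call site would not tell the reader that a buffer is being created.
class SampleBuffer {
public:
    explicit SampleBuffer(std::size_t sample_count) : samples(sample_count) {}
    [[nodiscard]] std::size_t size() const noexcept { return samples.size(); }

private:
    std::vector<float> samples;
};

static_assert(std::is_convertible_v<std::string_view, Label>);
static_assert(!std::is_convertible_v<std::size_t, SampleBuffer>);
```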

Inheritance Policy

Prefer compile-time polymorphism via enum-dispatch templates over runtime polymorphism.

enum class Backend { CPU, CUDA, METAL };

template <Backend backend>
struct Engine {
    /// definition
};

This yields:
- static specialization
- branch elimination
- layout visibility
- optimizer-friendly code paths
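
Building on the Engine declaration above, a minimal sketch of how the dispatch could look once specializations exist. The laneCount values and the totalLanes helper are hypothetical and chosen only to make the example checkable.

```cpp
enum class Backend { CPU, CUDA, METAL };

// The primary template is declared but never defined, so an unsupported
// backend fails at compile time instead of falling back silently.
template <Backend backend>
struct Engine;

template <>
struct Engine<Backend::CPU> {
    static constexpr int laneCount() noexcept { return 1; }
};

template <>
struct Engine<Backend::CUDA> {
    static constexpr int laneCount() noexcept { return 32; }
};

// The backend is resolved during compilation: no branch and no vtable lookup.
template <Backend backend>
constexpr int totalLanes(int block_count) noexcept {
    return block_count * Engine<backend>::laneCount();
}

static_assert(totalLanes<Backend::CUDA>(4) == 128);
```

Requesting totalLanes<Backend::METAL> in this sketch would not compile, because no Engine specialization exists for it.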

Virtual Functions

Virtual functions are acceptable when:
runtime polymorphism is required
specialization explosion is worse than dispatch cost
call frequency is low or amortized

Vtable overhead is widely overstated on modern CPUs. Indirect calls are typically cached and well predicted. However, inheritance hierarchies must remain shallow and semantically clean, since "abstraction hell" is harder to reason about than "specialization hell".

Multiple Inheritance

Multiple inheritance is strongly discouraged outside interface-only layering.

Pointer to Implementation (PIMPL Pattern)

PIMPL (pointer to implementation) is allowed for ABI stability across releases and for isolation of heavy dependencies. However, it should not lead to performance regressions. When PIMPL is used:

The owning object must aggressively inline hot-paths
Construction must pre-touch or warm referenced memory when latency-sensitive
Dereference chains must be minimized inside tight loops
Heap allocation must be justified by benchmark

Templates & Concepts

All templates must be constrained using C++20 Concepts. Prefer standard concepts (e.g., std::derived_from, std::integral) where applicable. Constraints should be applied to the template declaration to provide clear error messages and enable better IDE support.

SFINAE via std::enable_if is banned.

Functions

We model functions as a premise (things that should be true at the call site) and a promise (things that will be true of the "side-effects" if the premise is met).
- Side effects comprise I/O, mutating global state, and returning values. The first two are discussed in more detail later.

Function Size & Responsibility

We define responsibility as how many promises a function makes, and how many side effects it produces.
We try to restrict a function to 1 direct promise and 1 direct side-effect, with infinitely many indirect promises/side-effects from function calls within the body.

Return Types

For functions that return a value, it's useful to wrap the result in a pq::expected.
- This is basically std::expected with the error type standardized to string and with small object optimization implemented.

Error Handling Policy

Differentiate between 'Contract Violations' and 'Expected Failures'.
- Contract Violations: Bugs that should never happen in a correct program. Use std::terminate or pq_assert.
- Expected Failures: Anticipated runtime issues (I/O, network). Use pq::expected<T> to force the caller to handle the error.

Attribute-Driven Intent

Use standard attributes to communicate intent to the compiler and reviewers:
- [[nodiscard]]: Mandatory for all functions returning pq::expected, std::optional, or resources.
- [[maybe_unused]]: For parameters that are only used in certain build configurations (e.g., DEBUG).
- [[deprecated]]: Use when phasing out old APIs to provide a clear migration path.

Exceptions

For performance critical code, even with SSO, the indirection overhead is unacceptable. Mark the function noexcept and throw (triggering std::terminate).

Parameter Correctness

Standard View Types

Prefer non-owning view types for function parameters to maximize flexibility and performance:
- Use std::string_view for read-only string access.
- Use std::span<T> for read-only or mutable contiguous memory access.

Avoid passing const std::vector<T>& or const std::string& unless ownership or specific container properties are required.

This is the order of preference for parameter types:
- Const Reference
  - Exception: when the size of the parameter is less than 8 bytes, or less than 16 bytes when optimization is enabled. In these cases, we just pass by value.
  - NOTE: the 8 byte rule is typical on 64-bit architectures, but the 16 byte rule is specific to us, since compilers will try to use the SIMD registers to "smuggle" in parameters instead of normal registers.
- Reference
  - This is for when the side effect of the function is mutating the "out parameter".
  - Your function should typically choose between out parameters and returns, not both.
- RValue
  - This is for when the parameter belongs to a move-only type and the function assumes ownership of the object.
- Value
  - This is typically never useful, except for the exception to const references mentioned above.
- Pointers
  - This isn't recommended since it litters the code with &, *, and -> and doesn't provide good clarity on ownership semantics, but we might still need it for C interop.

Lambda Usage

Lambdas are almost always preferable to functions, since their capture lists make every dependency on surrounding state explicit, but this is not a hard requirement.
When using lambdas as functions:
- adhere to function naming conventions,
- mark the lambda constexpr,
- and explicitly define the return type.

Examples

// BAD: no constexpr declaration
auto lambda = [] () {
};

// BAD: constexpr applies to the call operator, not to the lambda object
auto lambda = [] () constexpr {
};

// BAD: no explicit return type
constexpr auto lambda = [] () {
};

// BAD: implicit capture of everything by reference
constexpr auto lambda = [&] () {
};

// GOOD
constexpr auto lambda = [&x, &y, z] () -> void {
};

Memory Management

Ownership Rules

We use RAII for cleanup of objects. As such, we rely on wrapper types to both clarify ownership semantics and handle RAII properly for those ownership semantics. Mainly, we rely on smart pointers: std::unique_ptr, pq::nonatomic_ptr, and std::shared_ptr.

Pointer Policy

Always use smart pointers when an object or scope retains ownership of the pointed-to object.
- Use std::unique_ptr for unique ownership.
- Use std::shared_ptr for shared ownership in multithreaded code.
- Use pq::nonatomic_ptr for shared ownership in single-threaded code, e.g. graph-like data structures.
Use std::weak_ptr for references that do not signify ownership, such as back-edges in cyclic data structures.
Use pq::deferred_allocator for cleanup in hot paths.
Never use or accept raw pointers, except for C interop.
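
A minimal sketch of owning versus non-owning edges in a cyclic structure. TreeNode and addChild are hypothetical, and std::shared_ptr stands in for whichever shared pointer the threading model calls for (pq::nonatomic_ptr is internal, so it is not used here).

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct TreeNode {
    int value = 0;
    std::weak_ptr<TreeNode> parent;                   // back-edge: observes, never owns
    std::vector<std::shared_ptr<TreeNode>> children;  // forward edges: own the children
};

// Attaches a new child and returns a handle to it. The weak back-edge means
// parent and child do not keep each other alive in a cycle.
inline std::shared_ptr<TreeNode> addChild(const std::shared_ptr<TreeNode>& parent, int value) {
    assert(parent != nullptr);  // contract: the caller must pass a live node
    auto child = std::make_shared<TreeNode>();
    child->value = value;
    child->parent = parent;
    parent->children.push_back(child);
    return child;
}
```

Dropping the last shared handle to the root releases the whole tree; with an owning back-edge, the same structure would leak.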

Stack Vs. Heap

Always prefer the stack to the heap as it better optimizes cache locality and cleanup.

Concurrency

We've broken this section out into a broader concurrency guide, and the specific C++ section of that guide.

Coroutines

Coroutines are allowed for complex asynchronous state machines and generators. However, be mindful of the hidden heap allocations for the coroutine frame. Use pq::task<T> (our optimized coroutine handle) to ensure frame elision (HALO) where possible.

Formatting & Layout

Use the following .clang-format

BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 100
AccessModifierOffset: -4
AllowShortFunctionsOnASingleLine: None
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
BinPackArguments: false
BinPackParameters: false
FixNamespaceComments: true
IncludeBlocks: Regroup
IndentCaseLabels: true
SortIncludes: true

Documentation

Source code documentation is generated by Jocasta, our attribute-driven documentation generator.

We use Jocasta primarily because Doxygen is hard to parse and painfully slow to deploy. By making documentation an emitted artifact of compilation, we significantly speed up documentation generation. In order to achieve documentation as an artifact of compilation, we need some way of hooking the docs framework to compiler-native data structures, i.e. the AST. The most extensible method at our disposal is attributes and as such, that's what Jocasta mandates.

Going beyond compiler artifacts, attributes enable defining "scope" in that it's obvious exactly what is being documented since docs are attributable to the documented structure.

Jocasta Example

[[doc::brief("Allocates memory on the heap")]]
[[doc::param(bytes, "The number of bytes to allocate")]]
[[doc::return("A pointer to the heap or nullptr in case of an exception")]]
void *malloc(size_t bytes);

This will generate a Markdown file in a docs directory, to be served by mkdocs.

Appendix

Namespace Aliases

Alias	Actual Namespace	Reason
`clk`	`std::chrono`	"clock"
`rv`	`std::views`	originally an alias for `std::ranges::views`
`fs`	`std::filesystem`	standard abbreviation
