C++ Style Guide

Introduction

Goals of the Guide

Note
It was a deliberate choice to repeat the word trivial that many times. To pull from the broader engineering style guide, "[building] is cheap, [good] ideas aren't." We don't want someone else's code to get in the way of your ideation, nor the opposite.

Scope & Applicability

Relationship to Clang-21 & C++23

Enforcement

We recommend the following hierarchy for rule enforcement in review: 1. CI; 2. Reviewer; 3. Lint tooling. Higher numbers take precedence over lower ones.

Tip
It is worth noting that if reviewers vehemently disagree with a lint output, they should open an issue for the tool and the style guide.

Language Version & Compiler Targets

Target Standard (C++23)

As mentioned earlier, we target C++23 as it is implemented by clang-21.

Therefore, the following preprocessor check must be added to every translation unit.

#if !defined(__clang__)
#error "Build Error: Clang is required"
#elif __clang_major__ < 21
#error "Build Error: Clang 21+ is required"
#elif __cplusplus < 202302L
#error "Build Error: C++23 is required (-std=c++23)"
#endif

There is no strong push to adopt new standard library features in existing code, since memorizing every new addition is unproductive. It is, however, worthwhile to use them in new features, since doing so lessens the testing burden and reduces "boilerplate" code.

Required Compiler Flags

Compiler flags are typically handled by tooling and look like this:

# Release
CXXFLAGS = -std=c++23 -O3 -Wall -Wextra -Wpedantic -Wshadow -DNDEBUG -march=native -flto=full -ffunction-sections -fdata-sections
# Debug
# Note: the set of sanitizers can be configured, but these are the defaults for debug builds
CXXFLAGS = -std=c++23 -O0 -ggdb -Wall -Wextra -Wpedantic -Wshadow -DDEBUG -fsanitize=address,undefined

Prohibited Compiler Flags

This list is eligible for expansion over time.

Linking and Linker Flags

Linker flags are again handled by tooling but the underlying goal is to maximize link-time optimization while minimizing binary size (the goals are often complementary); as such the following is prescribed:

Other Flags

| Flag | Purpose | Approved in (Debug/Release) |
|---|---|---|
| -Wl,--no-undefined | Forces the linker to report unresolved symbols in shared libraries at link time rather than deferring to runtime. Catches missing dependencies immediately instead of segfaulting in production. | Both |
| -Wl,--hash-style=gnu | Uses the faster GNU hash table format for dynamic symbol resolution. Massively reduces startup time for dynamically linked executables compared to the default SysV hash. | Both |
| -Wl,--build-id=sha1 | Embeds a unique cryptographic identifier in the binary. Absolutely crucial for matching production binaries to external debug symbols or accurately tracking crash minidumps. | Both (use =fast in Debug for fast builds) |
| -Wl,-z,relro,-z,now | Hardens the binary by marking the Global Offset Table (GOT) as read-only and resolving all dynamic symbols at startup (Full RELRO, i.e. relocation read-only). Mitigates GOT overwrite exploits. | Release only |
| -Wl,--strip-all | Strips all symbol and relocation information from the final executable. Used to aggressively minimize binary bloat before deployment. | Release only |

Final LDFLAGS Configurations

Debug and release LDFLAGS:

# Debug builds
LDFLAGS = -fuse-ld=mold -flto=thin -Wl,--as-needed -Wl,--no-undefined -Wl,--hash-style=gnu -Wl,--build-id=fast
# Release Builds
LDFLAGS = -fuse-ld=mold -flto=full -Wl,-O3 -Wl,--gc-sections -Wl,--icf=safe -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro,-z,now -Wl,--hash-style=gnu -Wl,--build-id=sha1 -Wl,--strip-all

Project Structure

Monorepo Layout

/ (Monorepo Root)
├── WORKSPACE.yaml
└── Projects/
    ├── ProjectA/
    │   ├── catalyst.yaml
    │   ├── build/                          <-- Build output for ProjectA
    │   ├── include/
    │   │   ├── ProjectA/                   <-- Standard namespaced includes
    │   │   │   ├── header1.hpp
    │   │   │   └── header2.hpp
    │   │   └── extra_v1/                   <-- Optional extra one-offs
    │   │       └── legacy_compat.hpp
    │   ├── src/                            <-- src files
    │   │   ├── ProjectA.cpp                <-- Entry point. Must bear the same name as the project.
    │   │   ├── utils.cpp
    │   │   └── legacy_compat.cpp
    │   ├── test/                           <-- tests
    │   └── bench/                          <-- benchmarks
    └── ProjectB/

This monorepo structure makes it abundantly clear what the scope of a project is and allows instantiation of one project for distribution, testing, compilation, etc.

Header Standards

Header Ownership

Header Closure

Templates & Inline Definitions

Include Order Rules

The order is as follows:

  1. Subsystem header
  2. System headers
  3. Standard Library headers
  4. External library headers
  5. Internal headers

Error
The #include_next directive is banned. It leads to non-deterministic builds across systems, and it is typically a sign of improper header naming.

Header Guards

Forward Declarations Vs. Includes

Forward declarations are not allowed, since they allow tools like ninja or catalyst to skip over rebuilds that header changes should force. In certain cases, forward declarations also worsen IDE error messages and autocompletion.

We maintain that forward declarations are banned because Catalyst preempts header precompilation and our build server aggressively prebuilds. As such, we have sufficient build speed to make the slight hit bearable in exchange for build determinism. Furthermore, the remainder of the header policy naturally dictates small "atomic" headers, where the cost of including a large struct is marginal.

Interface Segregation & Type Erasure

When circular dependencies arise that cannot be resolved by header decomposition, use type erasure (e.g., std::any, std::function) or abstract interfaces to break the cycle. Banning forward declarations forces a cleaner dependency graph; if you find yourself needing one, your components are likely too tightly coupled.
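As a minimal sketch of breaking such a cycle with std::function: the Worker and Scheduler classes below are hypothetical, invented for illustration. Worker would otherwise need to know about Scheduler in order to report back; with a type-erased callback, only Scheduler depends on Worker, so no forward declaration is needed.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical component: reports completion without knowing who is listening.
class Worker {
public:
    using CompletionHandler = std::function<void(const std::string&)>;

    explicit Worker(CompletionHandler handler) : on_done(std::move(handler)) {}

    void run(const std::string& job_name) const {
        // ... do the work, then notify through the type-erased callback.
        on_done(job_name);
    }

private:
    CompletionHandler on_done;
};

// Hypothetical component: the only side that needs the other's full definition.
class Scheduler {
public:
    Scheduler()
        : worker([this](const std::string& job_name) { finished.push_back(job_name); }) {}

    // The callback captures this, so copying or moving would leave it dangling.
    Scheduler(const Scheduler&) = delete;
    Scheduler& operator=(const Scheduler&) = delete;

    void submit(const std::string& job_name) { worker.run(job_name); }

    std::size_t finishedCount() const { return finished.size(); }

private:
    std::vector<std::string> finished;
    Worker worker;
};
```

The trade-off is one indirect call per notification; an abstract interface would serve equally well where several callbacks belong together.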

Pragma Once Policy

Header guards are disallowed. Use #pragma once to avoid the off chance of header guard collision.

Tooling

Most of this will be flagged prior to push by pinc-eye.

If you observe a disparity between pinc-eye, iwyu, and clang-tidy, use the strictest of the three and file a bug report for pinc-eye.

Naming Conventions

The goal of naming is to provide understanding of what everything in a statement is just from casing. For example, below are two blocks that achieve the same functionality, while using different casing.
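A minimal sketch of such a pair, using hypothetical names (PulseSchedule is borrowed from the Types section; the rest are invented for illustration). The two namespaces exist only so that both variants compile side by side.

```cpp
namespace uniform {
// Everything in snake_case: types, functions, constants, and variables blur together.
struct pulse_schedule {
    int pulse_count;
};
constexpr int max_pulses = 8;
constexpr bool can_append(pulse_schedule schedule) {
    return schedule.pulse_count < max_pulses;
}
} // namespace uniform

namespace guide {
// Guide casing: PascalCase type, camelCase function, SCREAMING_CASE constant,
// snake_case variable.
struct PulseSchedule {
    int pulse_count;
};
constexpr int MAX_PULSES = 8;
constexpr bool canAppend(PulseSchedule schedule) {
    return schedule.pulse_count < MAX_PULSES;
}
} // namespace guide
```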

The different cases enable easy "textural" differentiation.

Files

File systems differ in case sensitivity: macOS and Windows are case-insensitive by default, while Linux is case-sensitive. To avoid compilation bugs between platforms, we use snake_case. File names should also be as concise and precise as possible.

Tip
If a filename is too long, it could possibly be nested deeper as it's likely part of a broader niche within the system.

Example:

src/MathMatrixOperations.cpp                    # BAD: this file is not in snake_case
src/math_la_matrix_operations.cpp               # BAD: this file is too long
src/math/linear_algebra/matrix_operations.cpp   # GOOD: this file is descriptively named and nested

Types (Classes, Structs, Enums)

We use PascalCase for naming of types. This distinguishes "User Types" from "Standard Library Types" (which are snake_case) and variables. It signals that this identifier creates a new object layout.

PascalCase vs snake_case should immediately signal high scrutiny towards user defined types.

// BAD
class pulse_schedule; // class defined in snake_case
struct QUBIT_MAP; // struct defined in SCREAMING_CASE
// GOOD
class PulseSchedule;
struct QubitMap;
using ImageBuffer = std::vector<std::byte>;

Suffix _t is disallowed.

Exceptions

There are a few exceptions to this rule. We often roll our own types that are meant to be used interchangeably with standard library types, e.g. std::priority_queue vs pq::priority_queue or pq::unstable_unordered_map. Here it makes sense to name them interchangeably too.

Functions & Methods

Functions and methods follow camelCase.

Named lambdas in namespace scope should follow the function naming convention, while named lambdas in inner scopes follow variable naming convention.

Variables

Variables follow snake_case to distinguish them from classes and functions.

Constants & constexpr

Constants should follow SCREAMING_CASE, whether they are defined with constexpr, with const, or as macros. This makes it immediately obvious that something cannot be changed and that, possibly, it doesn't have a strong type to refer to. For variables that need to be tuned as a build parameter, prefix the name with TUNABLE_. Catalyst will automatically pull these out into the config file.

Magic Numbers & Literal Constants

For all arbitrarily defined magic numbers and literal constants, one should use constexpr as such:

constexpr float RESIZE_FACTOR = 2.0f;

For magic numbers that arrive from a formula, one should seek to document it at compile time via the definition, e.g.

constexpr double cexprSqrt(double x) {
    // ... constexpr safe impl.
}

constexpr double HYPOTENUSE = cexprSqrt(2);
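The body of cexprSqrt is elided above. One possible constexpr-safe implementation, offered only as a sketch, is a Newton-Raphson iteration; it assumes a finite, non-negative input and is not claimed to be the project's actual implementation.

```cpp
// Sketch: Newton-Raphson square root usable in constant expressions.
// Assumes x is finite and non-negative; not a drop-in replacement for std::sqrt.
constexpr double cexprSqrt(double x) {
    if (x <= 0.0) {
        return 0.0;
    }
    // Starting at or above the root makes the iteration decrease monotonically.
    double estimate = x > 1.0 ? x : 1.0;
    double previous = 0.0;
    do {
        previous = estimate;
        estimate = 0.5 * (previous + x / previous);
    } while (estimate < previous);
    return estimate;
}

constexpr double HYPOTENUSE = cexprSqrt(2.0);

// The defining property is checked at compile time, within a small tolerance.
static_assert(HYPOTENUSE * HYPOTENUSE > 2.0 - 1e-12);
static_assert(HYPOTENUSE * HYPOTENUSE < 2.0 + 1e-12);
```

Because the constant is computed from its formula, a reader sees where the number comes from, and the static_assert documents the expected relationship.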

Template Parameters

Template type parameters follow PascalCase_T to denote that something is a template parameter. For non-type parameters, use regular parameter syntax i.e. snake_case.

Tip
Most of the time, you should provide an alias (a using declaration) that binds to the template parameter and use that. This makes code introspection for things like template metaprogramming easier.
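A minimal sketch of both conventions together. RingBuffer, ValueType, and ElementOf are hypothetical names chosen for illustration; the guide does not prescribe the alias name.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical container: Value_T is a type parameter, capacity is a non-type parameter.
template <typename Value_T, std::size_t capacity>
struct RingBuffer {
    // Alias bound to the template parameter, so other code can introspect it.
    using ValueType = Value_T;
    static constexpr std::size_t CAPACITY = capacity;

    Value_T slots[capacity]{};
};

// Metaprogramming code can query the alias without re-deducing the parameter.
template <typename Container_T>
using ElementOf = typename Container_T::ValueType;

static_assert(std::is_same_v<ElementOf<RingBuffer<float, 16>>, float>);
```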

Namespaces

Namespaces should be lowercase with no delimiters. If you find the need for a delimiter, consider nesting namespaces instead.

Note
This should fairly closely resemble the directory naming convention.

Namespaces & Scoping

Namespace Usage Rules

Directory Correspondence

Good: src/compiler/ast -> catalyst::compiler::ast
Bad: src/compiler/backend/llvm/utils/strings -> catalyst::compiler::backend::llvm::utils::strings (Too deep; flatter is better).

using namespace

Anonymous Namespaces

Inline Namespaces

Namespace Aliases

We maintain a strictly approved list of namespace aliases (See Section 21) that are safe to use project-wide.

Local Aliases: You may define local aliases inside a .cpp file or inline header function/class definition. Even in these contexts, you should use the common names defined in the appendix.

// GOOD (inside function)
void process() {
    namespace rv = std::views;
    auto view = rv::iota(0, 10);
}

// BAD (inside of a header file)
namespace rv = std::views; // Pollutes everyone's build and breaks interpretability

Argument Dependent Lookup (ADL)

Exceptions

Symbol Visibility

Default: We build with "hidden" visibility by default (-fvisibility=hidden).

Public API: Explicitly mark classes and functions intended for external consumption (outside the shared library/DLL) with the project's export macro (e.g., CATALYST_API).

Reasoning: This creates a smaller binary size, faster load times, and enforces a strict boundary between "Public API" and "Internal Implementation."
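A minimal sketch of what such an export macro could look like under Clang. The definition of CATALYST_API below is an assumption made for illustration (the real macro may differ, for example to handle Windows DLLs), and apiVersion and internalHelper are hypothetical functions.

```cpp
// Hypothetical definition; the project's real CATALYST_API macro may differ.
#define CATALYST_API __attribute__((visibility("default")))

namespace catalyst {

// Public API: stays visible to consumers even under -fvisibility=hidden.
CATALYST_API int apiVersion();

// Internal implementation: hidden when the library is built with -fvisibility=hidden.
int internalHelper();

} // namespace catalyst

int catalyst::internalHelper() {
    return 1;
}

int catalyst::apiVersion() {
    return internalHelper();
}
```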

Classes & Structs

Class Layout Order

Class layout should have the goal of optimizing performance. The primary layout related causes of bad performance are:

Since different structs have different applications, and the goals above conflict, the only possible remedy is micro benchmarks. To make this type of benchmark trivial, specialize the struct on a layout index so that candidate member orders can be swapped in and measured in Debug builds:

template <int k>
struct S {
#ifdef DEBUG
    static_assert(false, "you should use specialized member order");
#else
    // final chosen layout
#endif
};

#ifdef DEBUG
template<> struct S<0> {
    Type1 a;
    Type2 b;
    Type3 c;
};

template<> struct S<1> {
    Type1 a;
    Type3 c;
    Type2 b;
};

// ...
#endif
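Padding is one layout effect that member order changes, and it is easy to observe before running any benchmark. The sketch below uses hypothetical members and shows two orderings of the same fields, in the spirit of S<0> and S<1> above.

```cpp
#include <cstdint>

// Hypothetical members; only the declaration order differs between the two structs.
struct OrderA {
    std::uint8_t flag;
    double value;
    std::uint8_t tag;
};

struct OrderB {
    double value;
    std::uint8_t flag;
    std::uint8_t tag;
};

// Grouping the small members lets them share one padded tail instead of two padded slots.
static_assert(sizeof(OrderB) <= sizeof(OrderA));
```

A smaller struct is not automatically the faster one, which is why the micro benchmark remains the deciding measurement.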

Rule of Zero/Five

Object semantics should be strict until otherwise needed. Therefore, the order is as such:

  1. Delete everything first.
  2. Re-enable the default when semantically valid.
  3. Specialize when needed for extra behavior.
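The three steps can be sketched on a hypothetical PulseBuffer type (the name and members are invented for illustration):

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

class PulseBuffer {
public:
    explicit PulseBuffer(std::size_t sample_count) : samples(sample_count, 0.0) {}

    // Step 1: start strict, copies are deleted.
    PulseBuffer(const PulseBuffer&) = delete;
    PulseBuffer& operator=(const PulseBuffer&) = delete;

    // Step 2: moving a vector-backed buffer is semantically valid, so re-enable the default.
    PulseBuffer(PulseBuffer&&) noexcept = default;
    PulseBuffer& operator=(PulseBuffer&&) noexcept = default;

    // Step 3 would replace a default with a hand-written body only if extra behavior
    // (for example, releasing an external resource) were needed. Not needed here.
    ~PulseBuffer() = default;

    std::size_t sampleCount() const { return samples.size(); }

private:
    std::vector<double> samples;
};

static_assert(!std::is_copy_constructible_v<PulseBuffer>);
static_assert(std::is_nothrow_move_constructible_v<PulseBuffer>);
```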

Constructors & Explicitness

Constructors are allowed to be implicit when all of the following are true:

- The incoming type is semantically equivalent to the constructed type
- The conversion is obvious at the call site
- No ownership, allocation, lifetime, or narrowing ambiguity is introduced

Examples where implicit is acceptable:

- view/wrapper types over the same underlying representation
- strong semantic aliases of existing types
- cheap value-preserving transformations

Examples where implicit is forbidden:

- ownership transfer
- allocation
- lossy conversion
- policy-changing wrappers
- anything that alters threading or synchronization guarantees

Tip
If a reader cannot immediately infer that a constructor is being invoked, it must be explicit.
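A minimal sketch contrasting the two cases, with hypothetical Label and SampleBuffer types: a non-owning view may convert implicitly, while an allocating constructor must be explicit.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <vector>

// Implicit is acceptable: a cheap, non-owning view over the same representation.
struct Label {
    constexpr Label(std::string_view source) : text(source) {}
    std::string_view text;
};

// Explicit is required: construction allocates, and an integer is not "a buffer".
struct SampleBuffer {
    explicit SampleBuffer(std::size_t sample_count) : samples(sample_count) {}
    std::vector<double> samples;
};

static_assert(std::is_convertible_v<std::string_view, Label>);
static_assert(!std::is_convertible_v<std::size_t, SampleBuffer>);
static_assert(std::is_constructible_v<SampleBuffer, std::size_t>);
```

With the explicit constructor, a call such as process(128) cannot silently allocate a 128-element buffer; the caller has to write SampleBuffer{128}.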

Inheritance Policy

Prefer compile-time polymorphism via enum-dispatch templates over runtime polymorphism.

enum class Backend { CPU, CUDA, METAL };

template <Backend backend>
struct Engine {
    // definition
};

This yields:

- static specialization
- branch elimination
- layout visibility
- optimizer-friendly code paths
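A minimal sketch of how the dispatch plays out, assuming a hypothetical name() member and nameLength() helper. The non-type parameter is written in snake_case, following the Template Parameters section.

```cpp
#include <cstddef>
#include <string_view>

enum class Backend { CPU, CUDA, METAL };

// The backend is a compile-time value, so each instantiation is a distinct type.
template <Backend backend>
struct Engine {
    static constexpr std::string_view name() {
        // The untaken branches are discarded at compile time.
        if constexpr (backend == Backend::CPU) {
            return "cpu";
        } else if constexpr (backend == Backend::CUDA) {
            return "cuda";
        } else {
            return "metal";
        }
    }
};

// Callers stay generic over the backend without any virtual dispatch.
template <Backend backend>
constexpr std::size_t nameLength(const Engine<backend>&) {
    return Engine<backend>::name().size();
}
```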

Virtual Functions

vtable overhead is widely overstated on modern CPUs. Indirect calls are typically cached and predictable. However, inheritance hierarchies must remain shallow and semantically clean, since "abstraction hell" is harder to reason about than "specialization hell".

Multiple Inheritance

Multiple inheritance is strongly discouraged outside interface-only layering.

Pointer to Implementation (PIMPL Pattern)

PIMPL (pointer to implementation) is allowed for ABI stability across releases, and isolation of heavy dependencies. However, it should not lead to performance regression. When PIMPL is used

Templates & Concepts

All templates must be constrained using C++20 Concepts. Prefer standard concepts (e.g., std::derived_from, std::integral) where applicable. Constraints should be applied to the template declaration to provide clear error messages and enable better IDE support.

SFINAE via std::enable_if is banned.

Functions

Function Size & Responsibility

Return Types

Error Handling Policy

Differentiate between 'Contract Violations' and 'Expected Failures'.

- Contract Violations: Bugs that should never happen in a correct program. Use std::terminate or pq_assert.
- Expected Failures: Anticipated runtime issues (I/O, network). Use pq::expected<T> to force the caller to handle the error.

Attribute-Driven Intent

Use standard attributes to communicate intent to the compiler and reviewers:

- [[nodiscard]]: Mandatory for all functions returning pq::expected, std::optional, or resources.
- [[maybe_unused]]: For parameters that are only used in certain build configurations (e.g., DEBUG).
- [[deprecated]]: Use when phasing out old APIs to provide a clear migration path.
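A minimal sketch of the three attributes in use. The functions are hypothetical, and std::optional stands in for pq::expected, which is project-internal and not shown here.

```cpp
#include <iostream>
#include <optional>
#include <string_view>

// [[nodiscard]]: silently dropping the returned optional triggers a compiler warning.
[[nodiscard]] constexpr std::optional<int> parseDigit(char symbol) {
    if (symbol < '0' || symbol > '9') {
        return std::nullopt;
    }
    return symbol - '0';
}

// [[maybe_unused]]: call_site is only read in DEBUG builds.
inline int halveBudget(int budget, [[maybe_unused]] std::string_view call_site) {
#ifdef DEBUG
    std::clog << "halveBudget called from " << call_site << '\n';
#endif
    return budget / 2;
}

// [[deprecated]]: the message names the replacement, giving callers a migration path.
[[deprecated("use parseDigit and handle the empty case")]]
constexpr int digitOrZero(char symbol) {
    return parseDigit(symbol).value_or(0);
}
```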

Exceptions

Parameter Correctness

Standard View Types

Prefer non-owning view types for function parameters to maximize flexibility and performance:

- Use std::string_view for read-only string access.
- Use std::span<T> for read-only or mutable contiguous memory access.

Avoid passing const std::vector<T>& or const std::string& unless ownership or specific container properties are required.

This is the order of preference for parameter types:

  1. Const Reference
     - Exception: when the size of the parameter is less than 8 bytes, or less than 16 bytes when optimization is enabled. In these cases, we just pass by value.
     - NOTE: the 8 byte rule is typical on 64-bit architectures, but the 16 byte rule is specific to us, since compilers will try to use the SIMD registers to "smuggle" in parameters instead of normal registers.
  2. Reference
     - This is for when the side effect of the function is mutating the "out parameter".
     - Your function should typically choose between out parameters and returns, not both.
  3. RValue
     - This is for when the parameter belongs to a move-only type and the function assumes ownership of the object.
  4. Value
     - This is typically never useful, except for the exception to const references mentioned above.
  5. Pointers
     - This isn't recommended, since it litters the code with &, *, and -> and doesn't provide clear ownership semantics, but we might still need it for C interop.
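A minimal sketch of the first four categories side by side. Waveform and the functions below are hypothetical, invented for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type, large enough that copying is not free.
struct Waveform {
    std::vector<double> samples;
};

// Const reference: read-only access to a large object.
inline std::size_t sampleCount(const Waveform& waveform) {
    return waveform.samples.size();
}

// Value: the stated exception for small parameters (here a single double).
inline double halve(double amplitude) {
    return amplitude / 2.0;
}

// Reference: the side effect is mutating the out parameter, so nothing is returned.
inline void appendSample(Waveform& waveform, double sample) {
    waveform.samples.push_back(sample);
}

// Rvalue reference: the queue assumes ownership of a move-only object.
inline void enqueue(std::vector<std::unique_ptr<Waveform>>& queue,
                    std::unique_ptr<Waveform>&& waveform) {
    queue.push_back(std::move(waveform));
}
```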

Lambda Usage

Examples

// BAD: the lambda variable is not declared constexpr
auto lambda = [] () {
};

// BAD: constexpr here applies to the call operator, not to the lambda variable
auto lambda = [] () constexpr {
};

// BAD: no explicit return type
constexpr auto lambda = [] () {
};

// BAD: implicit capture default
constexpr auto lambda = [&] () {
};

// GOOD
constexpr auto lambda = [&x, &y, z] () -> void {
};

Memory Management

Ownership Rules

We use RAII for cleanup of objects. As such, we rely on wrapper types to both clarify ownership semantics and handle RAII properly for those ownership semantics. Mainly we rely on smart pointers, std::unique_ptr, pq::nonatomic_ptr, and std::shared_ptr.
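A minimal sketch of how the standard wrappers express ownership. Calibration and the helper functions are hypothetical; pq::nonatomic_ptr is project-internal and is not shown.

```cpp
#include <memory>
#include <utility>

// Hypothetical payload type.
struct Calibration {
    int qubit_count = 0;
};

// Unique ownership: exactly one owner at a time, transferred with std::move.
inline std::unique_ptr<Calibration> makeCalibration(int qubit_count) {
    auto calibration = std::make_unique<Calibration>();
    calibration->qubit_count = qubit_count;
    return calibration;
}

// Shared ownership: the object lives until the last owner releases it.
inline std::shared_ptr<const Calibration> share(std::unique_ptr<Calibration>&& calibration) {
    return std::shared_ptr<const Calibration>(std::move(calibration));
}
```

In both cases the destructor of the last owner releases the object, so no call site needs an explicit delete.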

Pointer Policy

Stack Vs. Heap

Concurrency

We've broken this section out into a broader concurrency guide, and the specific C++ section of that guide.

Coroutines

Coroutines are allowed for complex asynchronous state machines and generators. However, be mindful of the hidden heap allocations for the coroutine frame. Use pq::task<T> (our optimized coroutine handle) to ensure frame elision (HALO) where possible.

Formatting & Layout

Use the following .clang-format

BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 100
AccessModifierOffset: -4
AllowShortFunctionsOnASingleLine: None
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
BinPackArguments: false
BinPackParameters: false
FixNamespaceComments: true
IncludeBlocks: Regroup
IndentCaseLabels: true
SortIncludes: true

Documentation

Source code documentation is generated by Jocasta, our attribute-driven documentation generator.

We use Jocasta primarily because Doxygen is hard to parse and painfully slow to deploy. By making documentation an emitted artifact of compilation, we significantly speed up documentation generation. In order to achieve documentation as an artifact of compilation, we need some way of hooking the docs framework to compiler-native data structures, i.e. the AST. The most extensible method at our disposal is attributes and as such, that's what Jocasta mandates.

Going beyond compiler artifacts, attributes enable defining "scope" in that it's obvious exactly what is being documented since docs are attributable to the documented structure.

Jocasta Example

[[doc::brief("Allocates memory on the heap")]]
[[doc::param(bytes, "The number of bytes to allocate")]]
[[doc::return("A pointer to the heap or nullptr in case of an exception")]]
void *malloc(size_t bytes);

This will generate a Markdown file in a docs directory, to be served by mkdocs.

Appendix

Namespace Aliases

| Alias | Actual Namespace | Reason |
|---|---|---|
| clk | std::chrono | "clock" |
| rv | std::views | originally an alias for std::ranges::views |
| fs | std::filesystem | standard abbreviation |