- Goals and philosophy
- Overview
- Details
- Caveats
- Open questions
- Alternatives considered
- References
Important Carbon goals for code and name organization are:
- Language tools and ecosystem:
  - Tooling support is important for Carbon, including the possibility of a package manager.
  - Developer tooling, including both IDEs and refactoring tools, is expected to exist and be well-supported.
- Software and language evolution:
  - We should support libraries adding new structs, functions or other identifiers without those new identifiers being able to shadow or break existing users that already have identifiers with conflicting names.
  - We should make it easy to refactor code, including moving code between files. This includes refactoring both by humans and by developer tooling.
- Fast and scalable development:
  - It should be easy for developer tooling to parse code, without needing to parse imports for context.
  - Structure should be provided for large projects to opt into features which will help maintain scaling of their codebase, while not adding burdens to small projects that don't need it.
Carbon source files have a .carbon extension, such as geometry.carbon. These files are the basic unit of compilation.
For programs that fit into a single source file, no syntax is required to introduce the file. A very simple Carbon program can consist of a single file containing only a Run function:
fn Run() -> i32 {
  return 6 * 9;
}
However, as the program grows larger, it is desirable to split it across multiple source files. To support this, libraries can be written containing pieces of the program:
library "Colors";
choice Color { Red, Green, Blue }
fn ColorName(c: Color) -> String;

impl library "Colors";
fn ColorName(c: Color) -> String {
  match (c) {
    case .Red => { return "Red"; }
    case .Green => { return "Green"; }
    case .Blue => { return "Blue"; }
  }
}

// Make the "Colors" library visible here.
import library "Colors";
fn Run() {
  Print(ColorName(Color.Red));
}
A library is the basic unit of dependency. Separating code into multiple libraries can speed up the overall build while also making it clear which code is being reused.
A library has a single API file which defines its interface, plus zero or more implementation files that can provide any implementation details that were omitted from the API file. These files are distinguished by whether the library declaration starts with the impl modifier. By convention, implementation files also use a file extension of .impl.carbon.
Separating a library into interface and implementation may help organize code as a library grows, or let the build system distinguish between the dependencies of the API itself and those of its underlying implementation. Implementation files allow code to be extracted from the API file while remaining callable only from other files within the library, including both API and implementation files.
A source file that does not specify a library is implicitly within the default library. This is the library in which a Carbon Run function should reside.
A program usually doesn't consist of only a single collection of source files developed by a team of people working together. In order to make it easy for different components of a program to be independently authored, Carbon supports packages of code.
Each source file in a package begins with a declaration of which package it belongs in. The package is the unit of distribution. The package name is a single identifier, such as Geometry. An example API file in the Geometry package would start with:
package Geometry;
A tiny package may consist of a single library with a single API file. As with libraries, additional implementation files can be added to the package by using the impl keyword in the package declaration:
impl package Geometry;
However, as a package adds more files, it will probably want to separate out into multiple libraries. These work exactly like the libraries described above. In fact, if no package is specified for a source file, it is implicitly part of the Main package. For example, an API file adding the library Shapes to the Geometry package, or Geometry//Shapes in shorthand, would start with:
package Geometry library "Shapes";
This library can be imported within the same package by using:
import library "Shapes";
This will result in the public names declared in the "Shapes" library becoming visible in the importer, for example Circle might refer to a public type Circle declared in library "Shapes".
From a different package, this library can be imported by using:
import Geometry library "Shapes";
Unlike in a same-package import, this import only introduces the name Geometry, as a private name for the importer to use to refer to the Geometry package. Names declared within this package can be found as members of the name Geometry, for example Geometry.Circle. The library portion is optional in this syntax, and if omitted, the default (unnamed) library is imported.
// Imports the source file beginning `package Geometry;`
import Geometry;
The default library of a named package can be imported within that same package by using:
import library default;
The Main package can only be imported from other parts of the Main package, never other packages. Importing Main//default is invalid, regardless of which package is used.
As code becomes more complex, and users pull in more code, it may also be helpful to add namespaces to give related entities consistently structured names. A namespace affects the name path used when calling code. For example, with no namespace, if a Geometry package defines Circle then the name path will be Geometry.Circle. However, it can be named Geometry.TwoDimensional.Circle with a namespace; for example:
package Geometry library "Shapes";
namespace TwoDimensional;
struct TwoDimensional.Circle { ... };
This scaling of programs into packages, libraries, and namespaces is how Carbon supports both small and large codebases.
A different way to think of the sizing of packages and libraries is:
- A package is a GitHub repository.
  - Small and medium projects that fit in a single repository will typically have a single package. For example, a medium-sized project like Abseil could still use a single Abseil package.
  - Large projects will have multiple packages. For example, Mozilla may have multiple packages for Firefox and other efforts.
- A library is a few files that provide an interface and implementation, and should remain small.
  - Small projects will have a single library when it's easy to maintain all code in a few files.
  - Medium and large projects will have multiple libraries. For example, Boost Geometry's Distance interface and implementation might be its own library within Boost, with dependencies on other libraries in Boost and potentially other packages from Boost.
    - Library names could be named after the feature, such as library "Algorithms", or include part of the path to reduce the chance of name collisions, such as library "Geometry/Algorithms".
Packages may choose to expose libraries that expose unions of interfaces from other libraries within the package. However, doing so would also provide the transitive closure of build-time dependencies, and is likely to be discouraged in many cases.
Every source file will consist of, in order:
- Either a package directive, a library directive, or no introduction.
- A section of zero or more import directives.
- Source file body, with other code.
Comments and blank lines may be intermingled with these sections. Metaprogramming code may also be intermingled, so long as the outputted code is consistent with the enforced ordering. Other types of code must be in the source file body.
Name paths are defined above as sequences of identifiers separated by dots. This syntax may be loosely expressed as a regular expression:
IDENTIFIER(\.IDENTIFIER)*
Name conflicts are addressed by name lookup.
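As an illustration for tooling authors, the name path pattern above can be checked in a few lines of Python. This is a minimal sketch and not part of Carbon: the IDENTIFIER pattern (ASCII letters, digits, and underscores, not starting with a digit) is an assumption, since this document does not define it, and is_name_path and split_name_path are hypothetical helper names.

```python
import re

# Assumption: identifiers are ASCII letters, digits, and underscores, and do
# not start with a digit. This document does not define IDENTIFIER itself.
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
NAME_PATH = re.compile(rf"{IDENTIFIER}(\.{IDENTIFIER})*")


def is_name_path(text: str) -> bool:
    """Return True if `text` is a dot-separated sequence of identifiers."""
    return NAME_PATH.fullmatch(text) is not None


def split_name_path(text: str) -> list[str]:
    """Split a valid name path into its identifiers (hypothetical helper)."""
    if not is_name_path(text):
        raise ValueError(f"not a name path: {text!r}")
    return text.split(".")
```

For example, split_name_path("Geometry.TwoDimensional.Circle") yields the package entity name first, then the namespace, then the entity.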
The package directive's syntax may be loosely expressed as a regular expression:
(impl)? package IDENTIFIER (library STRING)?;
For example:
impl package Geometry library "Objects/FourSides";
Breaking this apart:
- The use of the impl keyword indicates this is an implementation file as described under libraries. If it were omitted, this would instead be an API file.
- The identifier after the package keyword, Geometry, is the package name and will prefix both library and namespace paths.
  - The package keyword also declares a package entity matching the package name. A package entity is almost identical to a namespace entity, except with some package/import-specific handling. For example, if the file declares namespace TwoDimensional; and struct TwoDimensional.Line, the struct may be used from files in other packages that import the library as Geometry.TwoDimensional.Line, using the Geometry package entity created by the package keyword.
  - Main is invalid for use as the package name. Main libraries must be defined by either the library directive or the Main//default rule.
- The string after the library keyword sets the name of the library within the package. In this example, the Geometry//Objects/FourSides library will be used.
  - If the library portion were omitted, the file would implicitly be part of the default library, which does not have a string name.
The syntax for library directives is the same, without the package portion:
(impl)? library STRING;
For example:
impl library "PrimeGenerator";
If the library directive is used, the file is implicitly part of the Main package, whose name cannot be written explicitly.
If neither a package directive nor a library directive is provided, the file is an API file for Main//default. An impl cannot be provided for Main//default.
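To illustrate how the package directive, the library directive, and the Main//default rule fit together, the following Python sketch classifies a file from its first line. It is a simplification under stated assumptions: identifiers are ASCII, library names are double-quoted strings without escapes, the directive fits on one line, and any line that is not a directive is treated as the start of the file body. FileIntro and parse_intro are hypothetical names.

```python
import re
from typing import NamedTuple

# Assumptions: ASCII identifiers, double-quoted library names without escapes,
# and a directive that fits on a single line.
_DIRECTIVE = re.compile(
    r'^\s*(?P<impl>impl\s+)?(?:'
    r'package\s+(?P<package>[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<plib>[^"]*)")?'
    r'|library\s+"(?P<lib>[^"]*)"'
    r')\s*;\s*$'
)


class FileIntro(NamedTuple):
    is_impl: bool
    package: str
    library: str  # "default" stands in for the unnamed default library

    def shorthand(self) -> str:
        """Return the PACKAGE//LIBRARY shorthand used in this document."""
        return f"{self.package}//{self.library}"


def parse_intro(first_line: str) -> FileIntro:
    """Classify a file from its first non-comment line (hypothetical helper)."""
    match = _DIRECTIVE.match(first_line)
    if match is None:
        # No directive: the file is the API file of Main//default.
        return FileIntro(False, "Main", "default")
    is_impl = match["impl"] is not None
    if match["package"] is not None:
        if match["package"] == "Main":
            raise ValueError("Main cannot be written as a package name")
        return FileIntro(is_impl, match["package"], match["plib"] or "default")
    # A library directive alone places the file in the Main package.
    return FileIntro(is_impl, "Main", match["lib"])
```

With this sketch, the example directive impl package Geometry library "Objects/FourSides"; is reported as an implementation file of Geometry//Objects/FourSides, while a file that starts directly with fn Run() is reported as the API file of Main//default.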
Every file is in exactly one library, which is always part of a package.
Because every file is within a package, and packages act as top-level namespaces, every entity in Carbon will be in a namespace, even if its namespace path consists of only the package name. There is no "global" namespace.
- Every entity in a file will be defined within the namespace described in the package directive.
- Entities within a file may be defined in child namespaces.
Files contributing to the Geometry//Objects/FourSides library must all start with [impl] package Geometry library "Objects/FourSides";.
Library names may also be referred to as PACKAGE//LIBRARY as shorthand in text. PACKAGE//default will refer to the name of the library used when no library argument is specified, although PACKAGE may also be used in situations where it is unambiguous that it still refers to the default library.
The package name Main is always used implicitly and cannot appear as a package name within source code, but still appears in shorthand notation. For example, Main//default is the library that is expected to contain a Carbon Run function.
It's recommended that libraries use a single / for separators where desired, in order to distinguish between the // of the package and / separating library segments. For example, Geometry//Objects/FourSides uses a single / to separate the Objects/FourSides library name.
Because an import of a package declares a namespace entity with the same name, conflicts with the package name are possible.
For example, this is a conflict for DateTime:
import DateTime;
struct DateTime { ... }
Note that imported name conflicts are handled differently.
Every Carbon library consists of one or more files. Each Carbon library has a primary file that defines its API, and may optionally contain additional files that are implementation.
- An API file's package directive does not include the impl modifier. For example, package Geometry library "Shapes";
  - API filenames must have the .carbon extension. They must not have a .impl.carbon extension.
  - API file paths will correspond to the library name.
    - The precise form of this correspondence is undetermined, but should be expected to be similar to a "Math/Algebra" library being in a "Math/Algebra.carbon" file path.
    - The package will not be used when considering the file path.
- An implementation file's package directive includes an impl modifier. For example, impl package Geometry library "Shapes";.
  - Implementation filenames must have the .impl.carbon extension.
  - Implementation file paths need not correspond to the library name.
  - Implementation files implicitly import the library's API. Implementation files cannot import each other. There is no facility for file or non-API imports.
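The filename rules above lend themselves to a simple consistency check in tooling. The Python sketch below is illustrative only: expected_kind, check_file, and api_path are hypothetical helpers, and api_path assumes the "Math/Algebra" to "Math/Algebra.carbon" correspondence, which this document explicitly leaves undetermined.

```python
def expected_kind(filename: str) -> str:
    """Infer "api" or "impl" from a filename using the extension rules."""
    # Check the longer suffix first, since .impl.carbon also ends in .carbon.
    if filename.endswith(".impl.carbon"):
        return "impl"
    if filename.endswith(".carbon"):
        return "api"
    raise ValueError(f"not a Carbon source file: {filename!r}")


def check_file(filename: str, directive: str) -> bool:
    """Return True if the filename and the directive agree on API vs. impl."""
    declared = "impl" if directive.split()[:1] == ["impl"] else "api"
    return expected_kind(filename) == declared


def api_path(library_name: str) -> str:
    """Map a library name to an API file path.

    Assumption: the precise mapping is undetermined; this mirrors the
    "Math/Algebra" example and ignores the package name.
    """
    return f"{library_name}.carbon"
```

Because the filename and the impl modifier duplicate the same information, a mismatch between them can be reported without reading the rest of the file.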
The difference between API and implementation will act as a form of access control. API files must compile independently of implementation, only importing from APIs from other libraries. API files are also visible to all files and libraries for import. Implementation files only see API files for import, not other implementation files.
When any file imports a library's API, it should be expected that the transitive closure of imported files from the primary API file will be a compilation dependency. The size of that transitive closure affects compilation time, so libraries with complex implementations should endeavor to minimize their API imports.
Libraries also serve as a critical unit of compilation. Dependencies between libraries must be clearly marked, and the resulting dependency graph will allow for separate compilation.
Entities in the API file are part of the library's public API by default. They may be marked as private to indicate they should only be visible to other parts of the library.
package Geometry library "Shapes";
// Circle is part of the public API of the library, and will be available to
// other libraries as Geometry.Circle.
struct Circle { ... }
// CircleHelper is private, and so will not be available to other libraries.
private fn CircleHelper(circle: Circle) { ... }
namespace Operations;
// Operations.GetCircumference is part of the public API of the library, and
// will be available to other libraries as Geometry.Operations.GetCircumference.
fn Operations.GetCircumference(circle: Circle) { ... }
This means that an API file can contain all implementation code for a library. However, separate implementation files are still desirable for a few reasons:
- It will be easier for readers to quickly scan an API-only file for API documentation.
- Reducing the amount of code in an API file can speed up compilation, especially if fewer imports are needed. This can result in transitive compilation performance improvements for files using the library.
- From a code maintenance perspective, having smaller files can make a library more maintainable.
Entities in an implementation file should never have visibility keywords. If they are forward declared in the API file, they use the declaration's visibility; if they are only present in an implementation file, they are implicitly private.
The compilation graph of Carbon will generally consist of API files depending on each other, and implementation files depending only on API files. Compiling a given file requires compiling the transitive closure of API files first. Parallelization of compilation is then limited by how large that transitive closure is, in terms of total volume of code rather than quantity. This also affects build cache invalidation.
In order to maximize opportunities to improve compilation performance, we will encourage granular libraries. Conceptually, we want libraries to be very small, possibly containing only a single class. The choice of only allowing a single API file per library should help encourage developers to write small libraries.
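The cost described above can be made concrete by computing the transitive closure of API imports for a library. This is a minimal sketch, assuming a build tool has already collected each library's direct API imports into a dictionary keyed by PACKAGE//LIBRARY shorthand; api_closure and the example library names are hypothetical.

```python
def api_closure(library: str, api_imports: dict[str, list[str]]) -> set[str]:
    """Return every library whose API must be compiled before `library`."""
    closure: set[str] = set()
    pending = list(api_imports.get(library, []))
    while pending:
        dependency = pending.pop()
        if dependency in closure:
            continue  # Already visited through another import chain.
        closure.add(dependency)
        pending.extend(api_imports.get(dependency, []))
    return closure


# Hypothetical dependency data: direct API imports per library.
example_imports = {
    "Caller//default": ["Geometry//Shapes"],
    "Geometry//Shapes": ["Math//default"],
    "Math//default": ["Math//Trigonometry"],
    "Math//Trigonometry": [],
}
```

Here, compiling Caller//default depends on all three other APIs, even though it imports only Geometry//Shapes directly. Moving an import out of an API file and into an implementation file shrinks this set for every importer.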
A namespace declared in an API file is only exported if it contains at least one public non-namespace name. For example, given this code:
package Checksums library "Sha";
namespace Sha256;
namespace ImplementationDetails;
private fn ImplementationDetails.ShaHelper(data: Bytes) -> Bytes;
fn Sha256.HexDigest(data: Bytes) -> String { ... }
Calling code may look like:
package Caller;
import Checksums library "Sha";
fn Process(data: Bytes) {
  ...
  var digest: String = Checksums.Sha256.HexDigest(data);
  ...
}
In this example, the Sha256 namespace is exported as part of the API implicitly, but the name Checksums.ImplementationDetails is not available in the caller.
import directives support reusing code from other files and libraries. The import directive's syntax may be loosely expressed as a regular expression:
import IDENTIFIER (library NAME_PATH)?;
import library NAME_PATH;
import library default;
An import with a package name IDENTIFIER declares a package entity named after the imported package, and makes API entities from the imported library available through it. Main cannot be imported from other packages; in other words, only import library NAME_PATH syntax can be used to import from Main. Imports of Main//default are invalid.
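As a sketch of how tooling might resolve an import directive to the library it names, the Python below maps each of the three import forms to a PACKAGE//LIBRARY shorthand and rejects the invalid Main cases. It is not a Carbon implementation: resolve_import is a hypothetical helper, and library names are assumed to be written as double-quoted strings on a single line, following the examples in this document.

```python
import re

# Assumption: library names are double-quoted strings, as in this document's
# examples, and each import directive fits on one line.
_IMPORT = re.compile(
    r'^\s*import\s+(?:'
    r'library\s+(?:"(?P<lib>[^"]*)"|(?P<default>default))'
    r'|(?P<package>(?!library\b)[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<plib>[^"]*)")?'
    r')\s*;\s*$'
)


def resolve_import(line: str, current_package: str) -> str:
    """Return the PACKAGE//LIBRARY shorthand that an import refers to."""
    match = _IMPORT.match(line)
    if match is None:
        raise ValueError(f"not an import directive: {line!r}")
    package = match["package"]
    if package is not None:
        # Cross-package form: import IDENTIFIER (library ...)?;
        if package == "Main":
            raise ValueError("Main cannot be named in an import")
        library = match["plib"] or "default"
        return f"{package}//{library}"
    # Same-package form: import library ...;
    library = "default" if match["default"] else match["lib"]
    target = f"{current_package}//{library}"
    if target == "Main//default":
        raise ValueError("Main//default cannot be imported")
    return target
```

The current package must be supplied by the caller because same-package imports never name it.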
The full name path is a concatenation of the names of the package entity, any namespace entities applied, and the final entity addressed. Child namespaces or entities may be aliased if desired.
For example, given a library:
package Math;
namespace Trigonometry;
fn Trigonometry.Sin(...);
Calling code would import it and use it like:
package Geometry;
import Math;
fn DoSomething() {
  ...
  Math.Trigonometry.Sin(...);
  ...
}
Repeat imports from the same package reuse the same package entity. For example, this produces only one Math package entity:
import Math;
import Math library "Trigonometry";
NOTE: A library must never import itself. Any implementation files in a library automatically import the API, so a self-import should never be required.
An import without a package name imports the public names from the given library of the same package.
Entities defined in the API of the current library and in imported libraries in the current package may be used without mentioning the package prefix. However, symbols from other packages must be imported and accessed through the package namespace.
For example:
package Geometry;
// This is required even though it's still in the Geometry package.
import library "Shapes";
// Circle is visible here. The name Geometry is not declared, so
// Geometry.Circle is invalid.
fn GetArea(c: Circle) { ... }
Namespaces offer named paths for entities. Namespaces must be declared at file scope, and may be nested. Multiple libraries may contribute to the same namespace. In practice, packages may have namespaces such as Testing containing entities that benefit from an isolated space but are present in many libraries.
The namespace keyword's syntax may loosely be expressed as a regular expression:
namespace NAME_PATH;
The namespace keyword declares a namespace entity. The namespace is applied to other entities by including it as a prefix when declaring a name. For example:
package Time;
namespace Timezones.Internal;
struct Timezones.Internal.RawData { ... }
fn ParseData(data: Timezones.Internal.RawData);
A namespace declaration adds the first identifier in the name path as a name in the file's namespace. In the above example, after declaring namespace Timezones.Internal;, Timezones is available as an identifier and Internal is reached through Timezones.
Namespaces may exist in imported package entities, in addition to being declared in the current file. However, even if the namespace already exists in an imported library from the current package, the namespace must still be declared locally in order to add symbols to it.
For example, if the Geometry//Shapes/ThreeSides library provides the Geometry.Shapes namespace, this code is still valid:
package Geometry library "Shapes/FourSides";
import library "Shapes/ThreeSides";
// This does not conflict with the existence of `Geometry.Shapes` from
// `Geometry//Shapes/ThreeSides`, even though the name path is identical.
namespace Shapes;
// This requires the above 'namespace Shapes' declaration. It cannot use
// `Geometry.Shapes` from `Geometry//Shapes/ThreeSides`.
struct Shapes.Square { ... };
Namespace members may only be declared in the same name scope which was used to declare the namespace. For example:
namespace NS;
// ✅ Allowed: declaration is in file scope, which also declared `NS`.
class NS.ClassT {
  // ❌ Error: A class body has its own name scope.
  var NS.a: i32 = 0;
}
fn Function() {
  // ❌ Error: A function body has its own name scope.
  var NS.b: i32 = 1;
}
// ✅ Allowed: declaration is in file scope, which also declared `NS`.
namespace NS.MemberNS;
// ✅ Allowed: declaration is in file scope, which also declared `NS.MemberNS`.
class NS.MemberNS.MemberClassT {}
When multiple names are declared by binding patterns in the same pattern, all names must be in the same namespace. Because namespace members can only be declared in the same scope as the namespace, a namespace-qualified pattern binding can only be used in the pattern of a var or let declaration. For example:
namespace NS;
// ✅ Allowed: `a` and `b` use the default namespace.
var (a: i32, b: i32) = (1, 2);
// ✅ Allowed: `c` and `d` are in the same namespace.
var (NS.c: i32, NS.d: i32) = (3, 4);
// ❌ Error: `e` and `f` are not in the same namespace.
var (e: i32, NS.f: i32) = (5, 6);
This restriction only applies when declaring names in binding patterns, not other name uses in patterns.
Carbon's alias keyword will support aliasing namespaces. For example, this would be valid code:
namespace Timezones.Internal;
alias TI = Timezones.Internal;
struct TI.RawData { ... }
fn ParseData(data: TI.RawData);
Library name conflicts should be avoidable, because it's expected that a given package is maintained by a single organization. It's the responsibility of that organization to maintain unique library names within their package.
A package name conflict occurs when two different packages use the same name, such as two packages named Stats. Compared with library name conflicts, package name conflicts are more likely because two organizations may independently choose identical names. We will encourage a unique package naming scheme, such as maintaining a name server for open source packages. Conflicts can also be addressed by renaming one of the packages, either at the source, or as a local modification.
We do need to address the case of package names conflicting with other entity names. It's possible that a preexisting entity will conflict with a new import, and that renaming the entity is infeasible due to existing callers. Alternatively, the entity may be using an idiomatic name, and renaming it would contradict naming conventions. In either case, this conflict may exist in a single file without otherwise affecting users of the API. This will be addressed by name lookup.
These are potential refactorings that we consider important to make it easy to automate.
Imports will frequently need to be updated as part of refactorings.
When code is deleted, it should be possible to parse the remaining code, parse the imports, and determine which entities in imports are referred to. Unused imports can then be removed.
When code is moved, it's similar to deletion in the originating file. For the destination file, the moved code should be parsed to determine which entities it referred to from the originating file's imports, and these will need to be included in the destination file: either reused if already present, or added.
When new code is added, existing imports can be checked to see if they provide the symbol in question. There may also be heuristics which can be implemented to check build dependencies for where imports should be added from, such as a database of possible entities and their libraries. However, adding references may require manually adding imports.
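The deletion case can be approximated with a small script, because cross-package names are always reached through the package entity. The Python below is a deliberately crude sketch rather than a real refactoring tool: it does not parse Carbon, ignores comments, string literals, and aliases, and cannot tell which of several libraries imported from the same package is unused. unused_package_imports is a hypothetical helper.

```python
import re


def unused_package_imports(source: str) -> list[str]:
    """List imported packages that are never referenced as `Package.` in the body."""
    # Collect package names from cross-package imports, skipping the
    # same-package `import library ...;` form.
    imported = re.findall(
        r'^\s*import\s+(?!library\b)([A-Za-z_][A-Za-z0-9_]*)\b[^;]*;',
        source,
        flags=re.MULTILINE,
    )
    # Drop the import directives so they do not count as uses.
    body = re.sub(r'^\s*import\b[^;]*;', '', source, flags=re.MULTILINE)
    unused = []
    for package in dict.fromkeys(imported):  # De-duplicate, keep order.
        if not re.search(rf'\b{re.escape(package)}\.', body):
            unused.append(package)
    return unused
```

Same-package imports need more work, since their names are used without a prefix; resolving them requires knowing which names each imported library exports.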
- Move the definition of an entity from an API file to an implementation file, while leaving a declaration behind.
  - This should be a local change that will not affect any calling code.
  - Inlining will be affected because the implementation won't be visible to callers.
  - Update imports.
- Split an API and implementation file.
  - This is a repeated operation of individual API moves, as noted above.
- Move the definition of an entity from an implementation file to the API file.
  - This should be a local change that will not affect any calling code.
  - Inlining will be affected because the implementation becomes visible to callers.
  - Update imports.
- Combine an API and implementation file.
  - This is a repeated operation of individual API moves, as noted above.
- Add the private modifier to a declaration.
  - Search for library-external callers, and fix them first.
- Remove the private modifier from a declaration.
  - This should be a local change that will not affect any calling code.
- Move a private declaration from the API file to an implementation file.
  - The declaration must be moved to the same file as the definition of the entity.
  - The declaration can only be used by the implementation file that now contains it. Search for other callers within the library, and fix them first.
  - Update imports.
- Move a private declaration from an implementation file to the API file.
  - This should be a local change that will not affect any calling code.
  - Update imports.
- Move a declaration and definition from one implementation file to another.
  - Search for any callers within the source implementation file, and either move them too, or fix them first.
  - Update imports.
- Rename a package.
  - The imports of all calling files must be updated accordingly.
  - All call sites must be changed, as the package name changes.
  - Update imports.
- Move a public declaration and definition between different packages.
  - The imports of all calling files must be updated accordingly.
  - All call sites must be changed, as the package name changes.
  - Update imports.
- Move a public declaration and definition between libraries in the same package.
  - The imports of all calling files must be updated accordingly.
  - As long as the namespaces remain the same, no call sites will need to be changed.
  - Update imports.
- Rename a library.
  - This is equivalent to a repeated operation of moving a public declaration and definition between libraries in the same package.
- Move a declaration and definition from one namespace to another.
  - Ensure the new namespace is declared for the declaration and definition.
  - Update the namespace used by call sites.
  - The imports of all calling files may remain the same.
- Rename a namespace.
  - This is equivalent to a repeated operation of moving a declaration and definition from one namespace to another.
- Rename a file, or move a file between directories.
  - Build configuration will need to be updated.
  - This additionally requires the steps to rename a library, because library names must correspond to the renamed paths.
We expect that most code should use a package and library, but avoid specifying namespaces beneath the package. The package name itself should typically be sufficient distinction for names.
Child namespaces create longer names, which engineers will dislike typing. Based on experience, we expect to start seeing aliasing even at name lengths around six characters long. With longer names, we should expect more aliasing, which in turn will reduce code readability because more types will have local names.
We believe it's feasible for even large projects to collapse namespaces down to a top level, avoiding internal tiers of namespaces.
We understand that child namespaces are sometimes helpful, and will robustly support them for that. However, we will model code organization to encourage fewer namespaces.
We use a few possibly redundant markers for packages and libraries:
- The filename and the presence or absence of the impl keyword duplicate the API versus implementation choice.
- The filename and the library name portion of the package declaration duplicate the name of the library.
- The import keyword requires the full library.
These choices are made to assist human readability and tooling:
- Being explicit about imports creates the opportunity to generate build dependencies from files, rather than having them maintained separately.
- Being explicit about API versus implementation in the filename makes it easier for both humans and tooling to determine what to expect, and makes it possible to check the type without reading file content.
- Repeating the type and library name in the file content makes non-file-system-based builds possible.
These open questions are expected to be revisited by future proposals.
Currently, we're using .carbon and .impl.carbon. In the future, we may want to change the extension, particularly because Carbon may be renamed.
There are several other possible extensions / commands that we've considered in coming to the current extension:
- .carbon: This is an obvious and unsurprising choice, but also quite long for a file extension.
- .6c: This sounds a little like 'sexy' when read aloud.
- .c6: This seems a weird incorrect ordering of the atomic number and has a bad, if obscure, Internet slang association.
- .cb or .cbn: These collide with several acronyms and may not be especially memorable as referring to Carbon.
- .crb: This has a bad Internet slang association.
Currently, we do not support cross-language imports. In the future, we will likely want to support imports from other languages, particularly for C++ interoperability.
To fit into the proposed import syntax, we are provisionally using a special Cpp package to import headers from C++ code, as in:
import Cpp library "<map>";
import Cpp library "myproject/myclass.h";
fn MyCarbonCall(x: Cpp.std.map(Cpp.MyProject.MyClass));
Currently, we don't support any kind of package management with imports. In the future, we may want to support tagging imports with a URL that identifies the repository where that package can be found. This can be used to help drive package management tooling and to support providing a non-name identity for a package that is used to enable handling conflicted package names.
Although we're not designing this right now, it could fit into the proposed syntax. For example:
import Carbon library "Utilities"
url("https://github.com/carbon-language/carbon-libraries");
Similar to API and implementation files, we may eventually want test-specific files. This should be part of a larger testing plan.
- Packages
  - Name paths for package names
  - Referring to the package as package
  - Remove the library keyword from package and import
  - Rename package concept
  - No association between the file system path and library/namespace
  - Require the use of the identifier Main in package declarations
  - Permit the use of the identifier Main in package declarations
  - Make the main package be unnamed
  - Use a different name for the main package
  - Use a different name for the entry point
  - Distinguish file scope from package scope
  - Default to an implementation file for Main//default instead of an API file
- Libraries
  - Allow exporting namespaces
  - Allow importing implementation files from within the same library
  - Alternative library separators and shorthand
  - Collapse API and implementation file concepts
  - Collapse file and library concepts
  - Collapse the library concept into packages
  - Collapse the package concept into libraries
  - Default API to private
  - Default impl to public
  - Different file type labels
  - Don't default to introducing an API file
  - Function-like syntax
  - Inlining from implementation files
  - Library-private access controls
  - Make keywords either optional or required in separate definitions
  - Managing API versus implementation in libraries
  - Multiple API files
  - Name paths as library names
  - Put the impl modifier at the end
  - Put the impl modifier before library
- Imports
- Namespaces