GETTING_STARTED.md

Installation

Note

Ocen is tested on Linux and macOS. Windows (and MSVC) are not supported at the moment. You can use WSL on Windows to use Ocen, or possibly try your luck with MinGW / etc. The standard library heavily relies on Unix-like C functions and system calls.

Ocen requires a C compiler to be available on the system. It uses gcc available on your PATH to compile the resulting C code. Alternatively, you can use the CC environment variable to specify a different compiler.

Building from source

$ git clone https://github.com/ocen-lang/ocen ~/ocen
$ cd ~/ocen
$ ./meta/bootstrap.sh
$ ./bootstrap/ocen --help

Environment Setup

In order to use Ocen to compile and run other programs, you'll need to set up your environment so the compiler can find the standard library and other necessary files. Add these lines to your shell profile (e.g. ~/.bashrc or ~/.zshrc):

$ export OCEN_ROOT=$HOME/ocen             # For standard library and other files
$ export PATH=$OCEN_ROOT/bootstrap:$PATH  # Add the compiler to your PATH

You should now be able to invoke the Ocen compiler from anywhere on your system:

$ ocen --help

VSCode Extension

Ocen has a VSCode extension that provides syntax highlighting and some basic LSP features. For the extension to work, you need to have the Ocen compiler available on your PATH. The extension can be found on the VSCode Marketplace.

Note

The extension is still in development and you are likely to encounter bugs. Please report any that you find. It currently works on a per-document basis, so changes made in one open file are not automatically reflected in other open files.

Some tips for using the extension:

  • The extension provides an Ocen: Rescan document command to re-run the LSP server on the current document. This is useful if you've modified another file and want to update the current document with the changes.
  • The extension provides some convenient snippets for common constructs in Ocen.

Annotated Hello World

// Comments

// Entry point is `main`. Arguments for main are optional, and
// the return type is implicitly `i32`
def main(argc: i32, argv: &str) {
   // `print` and `println` are builtin functions that take C-style format strings
   println("Hello world from %s", "ocen")
}

Builtins

The builtin types in Ocen include:

  • u8, i8, u16, i16, u32, i32, u64, i64: Unsigned (u) and signed (i) integer types
  • f32, f64: Floating point types
  • char: Character type - one byte. (Equivalent to char in C)
  • str: Similar to char* in C, and can be used interchangeably with &char
  • untyped_ptr: Similar to void* in C. It can be implicitly cast to/from any other pointer.
  • bool: A boolean type

Type Modifiers

All types can be modified with some builtin operations. Using u32 as an example:

  • &u32, &&u32: A pointer to u32, and a pointer to a pointer to u32
  • [u32; 5]: An array of 5 u32s.
  • fn(u32, u32): u32: A function-type that takes in 2 u32s and returns a u32

Strings

String Literals

String literals are all C-style. They are null-terminated, and stored as global constants.

let s = "Hello, I'm a constant string"

Format Strings

Additionally, format strings are available, written with backticks (`...`) or Python-style f"..." (no difference, just preference). Unless they are passed directly to a variadic function, they dynamically allocate memory, which must be freed by the user. They are null-terminated.

// NOTE: no `$` is needed before the curly brackets.
let a = `X is {X} and 3+4 = {3 + 4}`
let b = `I allocate memory. \{`        // Escape curly
let c = f"0.1 + 0.2 = {0.1+0.2:.1f}"   // Explicit format specifier

// `print` and `println` functions are variadic - no allocation happens here,
// and this expands to format specifiers + arguments in generated C.
println(`Some math {1+2+3}`)

Variables, and literals

Variables are declared with the let keyword, and they are lexically scoped. Shadowing is permitted, except in the same scope. Unlike in C, annotating a type is optional.

let a = 5       // Literal integers are u32 by default
let b = -6      // Negative literals are i32 by default
let c: u8 = 7   // Adding a type annotation makes literals the correct type
let d = 7u8     // Literal integers can have an integer-type suffix.
let e = "hi"    // str type
let f = 3.14    // Implicitly an f32, but can use annotations

let x: str      // Uninitialized variable (MUST provide a type)
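
The scoping rules above can be sketched with a small program. This is only an illustration built from the syntax shown in this guide; the variable names are arbitrary, and the commented-out line marks the case the text describes as an error.

```ocen
def main() {
   let a = 5                // u32 by default
   {
      // A nested block is a new scope, so shadowing `a` here is permitted
      let a = "inner"
      println("%s", a)
   }
   // let a = 6             // ERROR: shadowing within the same scope
   println("%u", a)         // Refers to the outer `a` again
}
```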

Global variables and constants

Ocen supports global variables and constants. Global variables are similar to local ones, with the exception that type annotations need to be specified, and you can only use literals / other globals to initialize them, not arbitrary expressions such as function calls.

let g_x: i32 = 0          // OK!
let g_z: i32 = g_x + 1    // OK!
let g_y = g_x             // ERROR: No type-annotation.
let g_w: i32 = foo()      // ERROR: Function call not allowed
let g_a: i32              // OK! Zero-initialized by default.

def main() => 0

Constants are defined with const, and are only allowed in the global scope. They are compile-time constant values, and have similar restrictions as globals. Constants must be initialized when they are defined.

Note

Constants compile down to #define ... at the C level. This is so you can use them to define the size of arrays / etc., which is not possible with global variables. However, this does limit what you can do with them, and many of the restrictions are because of this.

let g_A: i32 = 0

const X: i32 = 0        // OK!
const Y: i32 = X * 3    // OK! Can use other constants
const Z: i32 = g_A * 2  // ERROR: Can't use variable in constant expression

def main() => 0

Arithmetic, Comparisons, other operators

All numerical operations in Ocen are strictly typed, and both operands must have the same type.

let x = 5 + 10          // OK: Implicitly both u32
let y = 5u64 + 10       // OK: `10` here picks up the type from the LHS
let z = 7u8 + 9i8       // ERROR: Both LHS and RHS have different types.

All available operators:

// Arithmetic
a + b    a += b
a - b    a -= b
a * b    a *= b
a / b    a /= b
a % b
-a


// Logical
a and b
a or b
not a

// Comparison
a >  b      a >= b
a <  b      a <= b
a == b     a != b
a < b <= c < d       // Allowed

// Increment / Decrement
++x
x++
--x
x--

// Bitwise operators
~x      // Complement
x ^ y   // XOR
x | y   // OR
x & y   // AND

// Bitshift operators
x << y      x <<= y
x >> y      x >>= y

// Misc:
x?      // Question mark checks if pointer is non-null. Only valid for pointer types.
x in y  // No default implementations, but can override

Warning

Don't use expensive expressions, or expressions with side effects, in chained comparisons such as a < b <= c. They will be evaluated multiple times. It's recommended to only use these with variables.
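
A minimal sketch of the workaround this warning suggests: evaluate the expression once, store it in a variable, and chain the comparison on the variable. `read_sensor` and `g_calls` are hypothetical names made up for this illustration.

```ocen
let g_calls: u32 = 0

// Hypothetical function with a side effect: it counts how often it is called
def read_sensor(): u32 {
   g_calls += 1
   return 7
}

def main() {
   // Risky: the call sits inside a chained comparison,
   // so it would be evaluated more than once.
   // if 0 < read_sensor() <= 10 { ... }

   // Safer: evaluate once, then chain the comparison on the variable
   let v = read_sensor()
   if 0 < v <= 10 {
      println("in range after %u call(s)", g_calls)
   }
}
```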

Pointer Arithmetic

Pointer arithmetic is treated specially. You can add/subtract any pointer type P with any integer-like type I to get back a value with type P

let x = "Hello" + 2     // OK: A pointer to the first `l` in the string
let y: str = x - 2i8    // OK: Get back a pointer to the `H`

Arrays, Pointers, and Indexing

Arrays always decay to a pointer when referred to / passed around. Indexing into an array/pointer is supported with any integer type.

let x = [1, 2, 3]           // x: [u32; 3]
let y: [str; 5]             // Zero-initialized by default
let z: &u32 = x             // `x` decays to a pointer
let w = z[2i8]              // Indexing OK with any integer type
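
Because arrays decay to pointers, a function that wants to work on an array typically takes a pointer plus a length. The sketch below assumes nothing beyond the syntax shown in this guide; `sum` is a hypothetical helper, not a standard library function.

```ocen
// Hypothetical helper: adds up `n` values starting at `values`
def sum(values: &u32, n: u32): u32 {
   let total = 0
   for let i = 0; i < n; i++ {
      total += values[i]
   }
   return total
}

def main() {
   let xs = [1, 2, 3]          // xs: [u32; 3]
   // `xs` decays to `&u32` at the call site; the length is passed separately
   println("%u", sum(xs, 3))
}
```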

Control flow

If statements

Ocen has 2 types of if statements: the regular kind, and what is internally referred to as multi-if. They are semantically equivalent, and can be used based on syntactical preference. No parentheses are needed around the conditions.

An if-statement accepts an optional then keyword after the condition, which can be useful when writing single line if statements.

if some_cond { println("true") }    // One liner - with curlies, technically allowed
if some_cond println("true")        // Also allowed, but sometimes ambiguous
if some_cond then println("true")   // Preferred way of writing one liners


// Regular if statement
if foo {
   do_foo()
} else if bar {
   do_bar1()
   do_bar2()
} else {
   do_baz()
}

// Equivalent Multi-if
if {
   foo => do_foo()
   bar => {
      do_bar1()
      do_bar2()
   }
   else => do_baz()
}

While Loops

Pretty much what you expect.

while a < 5 {
   do_something()
}

For loops

Standard C-style for loops are available. None of the blocks can be empty (for now). Recommended to use a placeholder in case this is needed.

for let i = 0; i < 10; i++ {
   do_something()
}

For-each loops

On supported iterator types, there is some special syntax you can use to perform for-each loops:

for x in y {
   do_something(x)
   ...
}

The above syntax gets expanded to the following regular for loop:

for let iter = y; iter.has_value(); iter.next() {
   let x = iter.get()
   {
      do_something(x)
      ...  // Rest of the body
   }
}

and thus expects the following methods to exist on the type of y:

  • has_value(): bool : Does the iterator currently have a value? If not, done.
  • get(): T : Get the current value. Can be any type.
  • next() : Increment the iterator to the next value.

Most of the builtin container-types have iterators defined on them to be able to loop over the values.
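
To make the protocol concrete, here is a minimal sketch of a custom iterator. `Range` is a hypothetical type invented for this example (it is not claimed to match anything in the standard library); it only defines the three methods listed above, using the method syntax described later in this guide.

```ocen
// Hypothetical counting iterator over the values cur, cur+1, ..., end-1
struct Range {
   cur: u32
   end: u32
}

// Does the iterator currently have a value?
def Range::has_value(&this): bool => .cur < .end

// Get the current value
def Range::get(&this): u32 => .cur

// Advance to the next value
def Range::next(&this) {
   .cur += 1
}

def main() {
   let r = Range(0, 3)
   // Expands to a regular for loop that calls has_value / get / next
   for x in r {
      println("x = %u", x)
   }
}
```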

Match statement

In Ocen, you can match on integer-like, bool, str, char, enums, and any other type that supports == to compare elements. For all types except enums / bool, an else case is required in the match statement.

Only one case is ever executed; there is no implicit fall-through like in C. For the integer-like types / enums, a match statement gets converted to an efficient C-style switch statement. For other types (str / custom types), this falls back to an if .. else if .. chain.

let x: u8 = ...

match x {
   // One liners don't need a block
   1 => ...
   // Use block for multi-line statements
   2 => {
      let y = 10
      foo(y)
   }
   // Match on multiple values for this block
   3 | 4 | 0x05 | 6 => {
      ...
   }
   7 | 8 | 9 => y = 10
   // catch all
   else => something()
}

Expression Statements

In Ocen, if, match statements and blocks can be treated as expressions that return a value. They can be used in any context where an expression is expected.

  • For if and match statements, all of the bodies need to be valid expressions to be used as an expression statement. The types of all these expressions should be the same.
  • For a block to be used as an expression, it needs to use the yield keyword to "return" a value from it (result of the expression). A block can only have a single yield in it.

Some examples:

// Regular if:
let a = if foo then 5 else 10

// Multi-if
let b = if {
   foo => 5
   bar => 10
   else => 20
}

// Block
let c = {
   let x = 1
   let y = 2
   yield x + y
}

// Match
let z = match x {
   1 => 10
   2 | 3 | 4 => 20
   // Uses a nested block-expression
   5 => {
      yield 7
   }
   // Nested match expression, which uses a nested block expression
   6 => match y {
      4 => 20
      else => {
         let q = foo()
         yield q
      }
   }

   // If the compiler can say this branch exits, it won't complain
   // about types since we will never actually assign to `z`
   else => std::exit(1)
}

Casting

You can cast values between types using the as keyword.

Warning

Casting is not checked by the Ocen compiler - it assumes that it's valid. This can be useful for bypassing certain quirks when interfacing with C types, but can also lead to breaking code. Be careful when doing this. It's possible to get invalid C code from this that may not compile.

let x = 5u8 as u32      // OK
let y = -1i64 as u8     // OK: We don't check for underflow/overflow
let z = "hi" as u8      // OK(!!): This makes no sense, but we don't check.

Functions

Functions are defined with def. All parameters must be typed. A return type is optional. If a function has a return type, it must return a value explicitly.

No function declarations are needed, and you are allowed to use functions declared later in the file.

def foo(a: u32, b: u32): u32 {
   let c = a + b
   return c
}

Parameter labels

When calling a function, the parameters always need to be passed in the same order they are specified. You can optionally specify the name of the parameter when calling it, to make the intention clearer at the call site. If an incorrect label is used, this will trigger an error.

def verify(check_a: bool, check_b: bool, check_c: bool): bool => ...

// You can do this
verify(true, false, true)

// But clearer to do this
verify(check_a: true, check_b: false, check_c: true)

// Can mix and match, depending on preference
verify(true, check_b: false, true)

Default arguments

Ocen allows having default arguments for functions. All the default arguments need to come at the end of the parameter list.

def foo(a: u32, b: u32 = 10): u32 => a + b

let x = foo(1, 2)
let y = foo(3)

Note

It is not possible to provide a value for a default argument B that comes after a default argument A without also providing a value for A. This may be fixed in the future.

def bar(a: u32, b: u32 = 0, c: u32 = 1): u32 => a + b + c

let x = bar(1, 2, 3)    // OK!
let y = bar(1)          // OK!
let z = bar(1, b: 2)    // OK!
let w = bar(1, c: 2)    // ERROR: Need to provide `b` if you want to provide `c`

Arrow functions

If a function returns a single expression, it can be written with arrow-syntax. Note that you still need to annotate the return type explicitly.

def foo(a: u32, b: u32): u32 => a + b

Additionally, this can also be used for one-liner functions that don't return anything:

def foo(a: &u32, b: u32) => *a = b

Variadic functions

These work similar to how variadics work in C. A variadic function is denoted by ... as the last argument, and cannot have default values for any of the arguments.

Warning

Due to their nature, variadic function calls are not type-checked and can cause type issues. It's generally recommended to avoid using them.

Note

Ocen does not currently have support for properly implementing variadic functions. They are currently here to serve as a way of writing Ocen wrappers for external variadic functions only.

import std::variadic::{ VarArgs }

def foo(n: u32, ...) {
   let va: VarArgs
   va.start(n)
   // Can't do much else except call other variadic functions,
   // usually implemented in some C library. Look below for how
   // to interop with C code.
   bar_variadic(va)
   va.end()
}

Structs / Unions

Ocen provides struct and union compound data types (as in C). Anonymous struct/union definitions are not allowed, and each must be declared separately. The defined type is referred to by its name (without any qualifiers). No forward declaration of structs is needed.

Fields (and later methods) for a struct can be accessed using the . syntax

struct Foo {
   x: u32
   y: str
   a, b, c: u32   // Multi-field syntax is OK
   u: Bar         // Can use a type defined later
}

// Setting and accessing fields in a struct
let f: Foo = ...
f.x = 5
f.a = f.x

// A union takes as much space as its largest member
union Bar {
   f: &Foo
   y: u32
   z: [u32; 10]
}

Constructors

Structures can be constructed by using their name as a function. If doing so, every single field of the structure needs to be specified.

Note

Using constructors is not possible for unions, or when your structure contains any unions. This may be fixed in the future. In the meantime, the way to do this is to create an uninitialized variable and init the fields you want.

struct Vec2 {
   x, y: f32
}

let a = Vec2(1.0, 2.0)
let b = Vec2(x: 1.0, y: 2.0)  // Can use labels

// Manually
let c: Vec2
c.x = 1.0
c.y = 2.0
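
The same manual approach is what the note above recommends for structures that contain a union. A small sketch, with `Payload` and `Tagged` as hypothetical types made up for the illustration:

```ocen
union Payload {
   as_int: u32
   as_float: f32
}

struct Tagged {
   is_float: bool
   data: Payload
}

def main() {
   // A `Tagged(...)` constructor call is not available here because the
   // struct contains a union, so declare the variable and set the fields.
   let t: Tagged
   t.is_float = false
   t.data.as_int = 42
   println("%u", t.data.as_int)
}
```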

Enums

Enums are defined with the enum keyword. All enum variants are namespaced to the parent enum. Enums can only be compared with enums of the same type.

enum Size {
   Small
   Medium
   Big
}

enum WordsWithB {
   Big     // Doesn't clash with Size::Big
   Bug
   Bog
}

let x = Size::Big
let y: Size = Small   // Can be inferred if we have a hint
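
Enums combine naturally with match. As described in the Match statement section, enums are one of the types for which an else case is not required. The sketch below assumes this applies when every variant is listed, and that fully qualified variant names are accepted as match cases; `label` is a hypothetical function.

```ocen
enum Size {
   Small
   Medium
   Big
}

// Assumption: no `else` branch is needed because every variant is covered
def label(s: Size): str => match s {
   Size::Small => "S"
   Size::Medium => "M"
   Size::Big => "L"
}

def main() {
   println("%s", label(Size::Medium))
}
```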

Methods

All builtins, structs, enums and unions in Ocen can have methods associated with them. These methods can either be static, or based on the instance of the object.

Static Methods

A static method is one that doesn't take in this or &this as the first argument. It can only be called when qualified by the type name.

// Static method

def u32::from_str(s: str): u32 => ...

let x: u32 = u32::from_str("123")

Instance Methods

An instance method must take in this or &this as its first argument, depending on whether it wants a copy of, or a reference to, the object the method is being called on. Generally, if a method modifies the internal state of the object, or if the object is dynamically allocated, it should take in &this by reference.

struct Foo {
   x: u32
}

def Foo::get_x(this): u32 => this.x             // OK to take in by value here
def Foo::set_x(&this, y: u32) => this.x = y     // Need to take in by reference

let f: Foo = ...
// Caller doesn't have to care about value/reference capture
let z = f.get_x()
f.set_x(z + 1)

Instance methods can also be treated as static methods if needed. In this case, the this argument needs to be passed in explicitly by the caller.

let f: Foo = ...
let getter = Foo::get_x  // Use as static method, assign to func ptr
let setter = Foo::set_x
let z = getter(f)
setter(&f, z+1)  // Need to manually take reference here

Dot Shorthand

In instance methods, you can use the .foo shorthand to refer to this.foo, to save some typing. This can be used to access fields and methods from the this object.

// Rewritten with dot-shorthand
def Foo::get_x(this): u32 => .x
def Foo::set_x(&this, y: u32) => .x = y

Templates

Ocen supports some basic templating (similar to C++). In particular, it does not support interfaces/traits, and type-checks each different instantiation for the template separately.

This is an intentional choice to keep the language simpler - and allow the programmer to do whatever they wish to do without having to convince the compiler something is valid through complicated trait definitions.

Currently, there is no inference of template parameters, so the full templated name needs to be specified when needed.

Template functions

Simple example:

def swap<T>(a: &T, b: &T) {
   let tmp = *a
   *a = *b
   *b = tmp
}

let x = 5
let y = 10
swap<u32>(&x, &y)    // No inference, specify <u32> explicitly

A more nuanced example, showing possible errors:

def u32::hash(this): u32 => ...  // returns some hash

// Note how this assumes `v` has a `hash` method
def hasher<T>(v: T): u32 => v.hash() + 31415

// OK! `u32` has a `hash` method, which we defined above
let a = hasher<u32>(5)

// ERROR: Invalid
// This will error at `v.hash()` and say `str` has no member named `hash`
let b = hasher<str>("hi")

Template Structs

An example of template structs:

struct Vector2D<T> {
   x, y: T
}

let a = Vector2D<u32>(1, 2)
let b = Vector2D<f32>(1.0, 2.0)

// Can have multiple template arguments
struct Item<K, V> {
   point: Vector2D<K>  // Can use nested templates
   item: FooBar<V>
}

All methods defined for templated structs implicitly get access to the template variables. They do not need to be redefined.

struct Vector2D<T> {
   x, y: T
}

// Note that for `other`, we need to use the full templated type
def Vector2D::add(this, other: Vector2D<T>): Vector2D<T> {
   return Vector2D<T>(.x + other.x, .y + other.y)
}

Explicit Namespaces

Each file in Ocen has its own namespace for global-level declarations. Usually, we want to organize code in different files to avoid polluting namespaces, but in some cases it can be useful to have an explicit namespace in a file. We can do that with the namespace keyword. Namespaces can be nested arbitrarily.

You can access elements from inside a namespace using the :: syntax, similar to how we access static methods on objects.

namespace foo {
   let a: i32 = 0

   namespace bar {
      def b(): u32 => 40
   }
}

def main() {
   let x = foo::a
   foo::bar::b()
}

Modules, Importing, and Libraries

You can spread your code across several files for organization, and then import the things you need from other files. Each file has its own namespace, and there are no name collisions across different files.

Outside of the code in your project, Ocen has the concept of libraries, which are files / groups of files that are located in a different place in your system, and can be loaded in.

Symbols can be imported using the import statement from different files / libraries. Import statements are generally to be used at the global level (or at the namespace level), but can be used within a function to limit the scope of the imports.

Note

Ocen looks at global imports to figure out which files it needs to find and load in, not those defined in the function context. If you wish to use a function-local import, then you need to make sure that you have a global import for the relevant symbol(s) somewhere in your project at the global level to ensure it gets properly found and compiled in.

Project vs Single File

The Ocen compiler internally operates in 2 modes: Project mode and Single File mode. Depending on which mode you are in, there are different ways of importing available to you.

When you compile a file, Ocen will look at the directory the file is in, and all of its parent directories. If any of these directories contains a file called main.oc, the compiler will assume it is in Project mode, and consider the directory where it found main.oc to be the root of the project. Otherwise, it will be in Single File mode.

This means that every project must contain a main.oc file at the top-level directory.

Warning

Since the heuristic from the compiler is so simple, it is recommended to not have random files with the name main.oc lying around in your filesystem. If you are writing one-off files, name them something else.

The Import Statement

Each import statement is divided into parts separated by ::. When traversing the file system, each of these parts corresponds to a directory of the same name, or a .oc file with the same name before the extension.

Every single file / directory that makes up the path of an import gets its own namespace, where several definitions can live.

import std::foo::bar::baz
// This is going to import either:
//  - /path/to/std/foo/bar/baz.oc      (file)
//  - /path/to/std/foo/bar/baz         (package containing more files)
//  - /path/to/std/foo/bar.oc  baz     (a symbol `baz` defined in `bar.oc`)

Multiple imports

You can import multiple symbols from some part in the import statement, recursively. For instance, all the following groups of imports are equivalent:

// All Manual
import std::foo::bar::uno
import std::foo::bar::dos
import std::foo::qux::one
import std::foo::qux::two
import std::foo        // Also import whole `foo` namespace

// Multi-import from the last part
import std::foo::bar::{ uno, dos }
import std::foo::qux::{ one, two }
import std::foo   // still do this manually

// Recursive multi-import. Can use `this` to import the namespace itself.
import std::foo::{ this, bar::{ uno, dos }, qux::{ one, two } }

Aliasing

When importing some definitions, it's possible we want to rename them in the current scope (to perhaps not collide with any definitions). This can be done using the as keyword inside imports.

import std::foo::{ bar as not_bar, baz }

not_bar()
baz()

What can be Imported?

By default, only definitions created in the current file can be imported from it. Anything that the file has imported for itself stays hidden from everyone outside the file.

However, it is possible to re-export symbols.

Directories as Modules

Whenever a directory is looked at as a part of an import statement, the Ocen compiler will automatically look at that directory to see if a mod.oc file exists in this directory. If it does, then all the definitions in mod.oc are loaded into the namespace that corresponds with the directory itself automatically.

This is useful when writing libraries - it allows you to have multiple files inside the library but still define useful features for a user from what they see as the top-level of the library, without having to import an extra file. For instance, in the standard library, functions such as exit(), panic(), and other useful builtin methods are loaded in through std/mod.oc.

Types of Imports

Depending on which mode you are in, you have a few different methods to import items available to you.

Library Imports

import foo::bar::baz

These are always available. This will search for a library called foo in the library paths (specified by -l flag in the compiler or through OCEN_LIB environment variable), and then search for bar within that library, and baz within that and so on.

Project-Relative Imports

import @foo::bar::baz

Available in Project mode only. This searches for foo in the project root directory, and then bar and baz and so on.

File-Relative Imports

import .foo::bar::baz      // foo is in same directory/namespace
import ..foo::bar::baz     // foo is in parent directory/above namespace
import ...foo::bar::baz    // foo is in ../../

Available in Project mode only: you are not expected to access other files around you in Single File mode. This searches for foo relative to whichever parent namespace is specified by the number of . characters used. It then searches for bar/baz within it as usual.

Current Scope Imports

import ::foo::bar::baz

These are always available. This will search for a symbol called foo in the local scope, and then attempt to import symbols from this. foo here can be a namespace that was explicitly defined in the current file, or it might be something that was imported from somewhere else.
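
As a rough sketch of the first case (a namespace defined explicitly in the current file), the example below imports a function out of a local namespace so it can be called unqualified. `shapes` and `area` are hypothetical names, and placing the import after the namespace is a stylistic choice here, not a documented requirement.

```ocen
namespace shapes {
   def area(w: u32, h: u32): u32 => w * h
}

// Bring `area` from the namespace above into the file's own scope
import ::shapes::area

def main() {
   // Without the import, this would be written as `shapes::area(3, 4)`
   println("%u", area(3, 4))
}
```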

The Standard Library

Ocen comes with a rich standard library that is made available to you in the project-wide global scope under std::. It includes data structures such as dynamic lists, hash-maps, hash-sets, deques, parsers for file formats such as json, png, midi, and a whole host of other functionality. Look here for a list of all available APIs.

Attributes

Top-level declarations (functions/structs/etc) in Ocen can be tagged with different attributes. Attributes are defined at the compiler-level, and it's not possible to create custom attributes without changing the compiler.

The available attributes differ for each type of declaration, but are generally of the form:

[made_up_attr_0]
let X: i32

// All arguments **must** be string literals
[made_up_attr_1 "arg1"]
struct Bar { ... }

// Can use multiple attributes
[made_up_attr_0 "arg1"]
[made_up_attr_1 "arg1"]
[made_up_attr_2 "arg1" "arg2"]
def foo(): u32 => 0

extern attribute

This attribute can be used with structs, enums, functions, variables and constants. If used with no arguments, it assumes the name of the symbol matches the C one. If used with methods, you should always provide the extern name.

For more information on binding external functions, look at Binding C Functions.

exits attribute, Non-returning functions

The exits attribute can only be used for functions. It does not take in any arguments. It is used to indicate that a function does not ever return. It is used by the compiler when doing return analysis.

[exits]
def foo() {
   std::exit(1)
}

def bar(): u32 {
   foo()
   // If foo was not marked as `exits`, the compiler would complain
   // about `bar()` not always returning a `u32`.
}

export attribute, Re-exporting symbols

The export attribute is only available for global import statements. It takes in no arguments. It tells the compiler to re-export the imported symbol(s) from the current namespace.

This is most useful when writing library code, to expose functions defined in nested modules from the top-level file.

///////////// file: src/bar/bar_impl.oc
// The actual function
def do_bar(): u32 => 30

///////////// file: src/bar/mod.oc
// Re-export it
[export] import .bar_impl::{ do_bar }


///////////// file: src/main.oc
// Can import `do_bar` from `bar`
import .bar::{ do_bar }

operator attribute, Operator Overloading

The operator attribute is used for operator overloading, and only applies to functions. It takes in exactly one argument, representing the operator we want to overload with the current function. One function can be used to overload multiple operators by specifying separate attributes if needed.

struct Vector2D {
   x, y: u32
}

[operator "+"]  // Operator **must** be in a string literal
def Vector2D::add(this, other: Vector2D): Vector2D {
   return Vector2D(.x + other.x, .y + other.y)
}

let z = Vector2D(0,0) + Vector2D(1,2)  // Now allowed

Every overload defined for an operator needs to have a unique input signature. For instance, it is not allowed to have two overloads for Foo + u32 (even if they result in different types). It is allowed to have Foo + u32 and Foo + i32, etc.

For a function to overload a certain operator, it must satisfy the requirements for that operator. These are listed below for all the currently overridable operators. For each of the operations below, x, y and z are the first, second, and third arguments respectively (where applicable).

  • + : 2 arguments to the function (x + y)
  • - : 1/2 arguments to the function (-x / x - y)
  • * : 2 arguments to the function (x * y)
  • / : 2 arguments to the function (x / y)
  • << : 2 arguments to the function (x << y)
  • >> : 2 arguments to the function (x >> y)
  • & : 1/2 arguments to the function (&x / x & y)
  • | : 2 arguments to the function (x | y)
  • += : 2 arguments to the function, first is pointer (x += y)
  • -= : 2 arguments to the function, first is pointer (x -= y)
  • *= : 2 arguments to the function, first is pointer (x *= y)
  • /= : 2 arguments to the function, first is pointer (x /= y)
  • <<= : 2 arguments to the function, first is pointer (x <<= y)
  • >>= : 2 arguments to the function, first is pointer (x >>= y)
  • [] : 2 arguments to the function (x[y])
  • % : 2 arguments to the function (x % y)
  • in : 2 arguments to the function, and returns a bool (y in x) [Look at note below]
  • not : 1 argument to the function, and returns a bool (not x)
  • == : 2 arguments to the function, and returns a bool (x == y)
  • != : 2 arguments to the function, and returns a bool (x != y)
  • ? : 1 argument to the function, and returns a bool (x?)
  • []= : 3 arguments to the function (x[y] = z)
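
A short sketch of two entries from the list above: `==` (two arguments, returns a bool) and `+=` (two arguments, the first being a pointer, here provided by `&this`). `Point` and its method names are hypothetical and chosen only for this illustration.

```ocen
struct Point {
   x, y: u32
}

// `==` takes 2 arguments and returns a bool
[operator "=="]
def Point::eq(this, other: Point): bool => .x == other.x and .y == other.y

// `+=` takes 2 arguments, and the first must be a pointer (`&this`)
[operator "+="]
def Point::add_assign(&this, other: Point) {
   .x += other.x
   .y += other.y
}

def main() {
   let a = Point(1, 2)
   a += Point(1, 1)                       // Calls Point::add_assign
   if a == Point(2, 3) then println("equal")   // Calls Point::eq
}
```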

Note

For the in operator, the order of arguments is swapped. This is because often, the corresponding method we want to bind takes the value we are searching for as the second argument (and the instance variable this as the first). Example usage:

[operator "in"]
def StringHashMap::contains(&this, s: str): bool => ...

// Usage
let h: StringHashMap
// Note how the string here naturally wants to be the second argument
if "foo" in h { ... }

atomic attribute, Atomic Variables

The atomic attribute applies to global variables / struct fields. It takes in no arguments. It indicates that the variable is atomic, and prepends the declaration in C with _Atomic.

[atomic] let counter: i32 = 0

struct Foo {
   [atomic]
   x: i32
}

variadic_format attribute, Format Strings as arguments

The variadic_format attribute only applies to variadic functions. It takes in no arguments. It tells the compiler that the function being tagged expects a variadic format-like string as its last argument (similar to printf and fprintf).

When such a function is called with a format-string as the last argument, instead of allocating a formatted string on the heap, the compiler converts it to a C format string plus the variadic arguments.

Note

The last non-variadic argument for a function being tagged must be of type str.

[variadic_format]
def foo(fmt: str, ...): u32 => 0

let oc = "ocen"
// This call:
foo(`Hello {1+2:.1f} from {oc}`)
// Automatically gets expanded to:
foo("Hello %.1f from %s", 1+2, oc)  // No allocation!
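
For context, here is a minimal C sketch of a function that could sit behind such a binding. The name `formatted_len` is hypothetical and not part of Ocen; it only shows how a C function consumes a format string plus variadic arguments without allocating:

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical printf-like function: returns the length of the formatted text */
uint32_t formatted_len(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    /* A NULL buffer with size 0 makes vsnprintf measure without writing */
    int len = vsnprintf(NULL, 0, fmt, args);
    va_end(args);
    return len < 0 ? 0u : (uint32_t)len;
}
```

Calling `formatted_len("Hello %.1f from %s", 3.0, "ocen")` measures the text `Hello 3.0 from ocen`. Note that in C a `%f` specifier expects a `double` argument, which is why this sketch passes `3.0`.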

formatting attribute, Basic Formatting of custom structs

The formatting attribute only applies to structs. It takes in 2 arguments, and is used by format-strings to provide a (very minimal) way of formatting some basic structs. The arguments are:

  1. A string representing the format specifier(s) to add to the format string
  2. A string representing what the arguments to the format-string should be. In this string, all uses of the character $ will be replaced by the original expression in the format-string.

This is better shown by example:

[formatting "Foo(%u)" "$.x"]
struct Foo {
   x: u32
}

let f: Foo
// This line:
println(`f has the value {f}`)
// Automatically gets expanded to:
println("f has the value Foo(%u)", (f).x)

Here's a more complex example:

Note

In complex cases, the expression passed to the format string is evaluated multiple times. To avoid potential bugs due to unwanted side-effects, the compiler prohibits you from using arbitrary expressions that result in the StringView (or other custom) type. The recommendation is to save the value to a variable, and pass just the variable to the format string as an expression.

// 1. Note how the first argument is an arbitrary string, with multiple specifiers
// 2. We can use `$` multiple times, and can comma-separate multiple arguments
//    In this case, note that `%.*s` takes in 2 arguments
[formatting "SV(size=%u, data='%.*s')" "$.size, $.size, $.data"]
struct StringView {
   data: str
   size: u32
}
let s: StringView
// This line:
println(`s = {s}`)
// Automatically gets expanded to:
println("s = SV(size=%u, data='%.*s')", (s).size, (s).size, (s).data)
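
The expanded call relies on how C's `%.*s` specifier works: it consumes an `int` precision first, then the string pointer. A small C sketch of the same expansion follows; `format_sv` is a hypothetical helper, and the casts are added here because `%u` and `%.*s` expect `unsigned int` and `int` respectively:

```c
#include <stdint.h>
#include <stdio.h>

typedef struct StringView {
    const char *data;
    uint32_t size;
} StringView;

/* Writes the expanded form of `s = {s}` into `out`; returns snprintf's result */
int format_sv(char *out, size_t cap, StringView s) {
    /* `$` appears three times in the attribute, so `s` is used three times */
    return snprintf(out, cap, "s = SV(size=%u, data='%.*s')",
                    (unsigned)(s).size, (int)(s).size, (s).data);
}
```

Because the precision limits how many bytes are printed, `data` does not need to be NUL-terminated at `size`. This is also why the expression is substituted more than once, as the note above warns.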

Interfacing with C code

Ocen allows you to easily interact with C code. You can bind C libraries to Ocen with minimal work, and can also output just the generated C code to build within your own environment (such as through emcc to compile to WASM).

Compiler Directives

Compiler directives in Ocen are a way to configure how the generated C code should be built by the compiler. They are generally of the form:

@compiler directive_name "some argument"

Including C headers

Since Ocen generates C code, when using libraries we want to include the relevant headers in the generated code for proper compilation. You can tell the Ocen compiler what headers to include in the final code using a compiler directive like:

@compiler c_include "SDL2/SDL.h"

Embedding C files

Sometimes, you may want to implement some functionality in your code in pure C, and have Ocen embed all this code directly into the final .c file. This can help simplify tracking different versions of .c files separately from the compiled ocen.

This directive simply copies all the text in the linked files into the generated C.

Note

The compiler expects the path provided in this directive to be relative to the parent directory of the file where the directive is found.

@compiler c_embed "native_utils.c"
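
As an illustration, an embedded file could contain nothing more than a small helper. The contents below are hypothetical, including the function name `native_clamp`:

```c
/* native_utils.c (hypothetical contents) */
#include <stdint.h>

/* Clamp `value` into the inclusive range [lo, hi] */
int32_t native_clamp(int32_t value, int32_t lo, int32_t hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}
```

A function like this would still need an extern declaration on the Ocen side (see "Binding C Functions") before Ocen code can call it.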

Specifying Compiler Flags

Ocen allows each file to specify the C flags it needs (e.g. to link with a library). This means that anyone importing a package doesn't have to configure a build system, as long as the packages/libs are available in their path. (If not, you should output the C code and compile it yourself with the build system of your choice.)

@compiler c_flag "-lSDL -lm"  // Link against SDL and the math library

In addition to the compiler directive specified in the code, the compiler also provides an optional --cflags argument which can be used to add extra flags. For instance:

ocen src/main.oc --cflags "-I/foo/bar/ -DOPT=1" -o foo
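
A flag such as `-DOPT=1` only has an effect if some C code reads the macro. A minimal sketch of C (for example in an embedded file) that consumes it, with a fallback when the flag is absent; `opt_value` is a hypothetical name:

```c
/* Fall back to 0 when the build does not pass -DOPT=... */
#ifndef OPT
#define OPT 0
#endif

/* Reports the value the C compiler saw for OPT */
int opt_value(void) {
    return OPT;
}
```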

Binding C Functions

Declarations tagged with the extern attribute cannot have definitions. This includes functions and constants. These declarations do not generate any code; they simply tell the compiler that certain symbols exist, and how to type-check them.

[extern] let errno: i32          // No definition, C variable is called `errno` too
[extern "errno"] let ERR: i32    // Use `ERR` in Ocen, but `errno` in generated C
[extern "FILE"] struct File {}   // Don't need to specify any fields

[extern] def strcpy(a: str, b: str)  // Can lie about return type if we don't care
[extern] def malloc(sz: u16)         // Can lie about input types if C can cast implicitly
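
The two "lies" above are sound because of ordinary C rules: a narrower unsigned integer converts implicitly to `size_t`, and a return value may be discarded. A small C sketch of the same situation, where `copy_string` is a hypothetical helper:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the bindings above: a narrow size type and an ignored return value */
char *copy_string(const char *src) {
    /* Assumes `src` is shorter than 65535 bytes, so the size fits in 16 bits */
    uint16_t sz = (uint16_t)(strlen(src) + 1);
    char *dst = malloc(sz);   /* uint16_t converts implicitly to size_t */
    if (dst == NULL) return NULL;
    strcpy(dst, src);         /* the returned char* is simply discarded */
    return dst;
}
```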

Complex Structure Bindings

If we wish to use C structs as more than just an opaque type, we need to tell Ocen what the fields are and what their types are. We only need to specify the ones we actually care about - Ocen does not check whether these fields actually exist, but just takes your word for it.

Note

Remember, this will not generate any code. It's simply for the type-checker.

[extern "Vector2D"]
struct Vec {
   x: f32   // `Vector2D` struct in C must have a field `x`
   // Only need to specify the fields you want to use
}

let v: Vec = ...
v.x   // Can use it in ocen now...
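
On the C side, the binding above assumes something like the following already exists in an included header. This definition is hypothetical, and it assumes the extern name is used as written in the generated C, so `Vector2D` must be a valid C type name (here a typedef):

```c
/* Hypothetical C definition that `[extern "Vector2D"]` would refer to */
typedef struct Vector2D {
    float x;
    float y;   /* not listed in the Ocen binding, so unknown to its type-checker */
} Vector2D;

/* C equivalent of reading `v.x` from the Ocen side */
float vector_x(Vector2D v) {
    return v.x;
}
```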

Method Bindings

Methods in Ocen are just normal functions that implicitly pass in the object as the first argument. We can use this to bind external C functions as methods to external C types, creating a nicer interface at the ocen level.

[extern "FILE"] struct File {}
// Bind extern as static function, with a default argument
[extern "fopen"] def File::open(fname: str, mode: str = "r"): &File
// Bind extern as instance function
[extern "fclose"] def File::close(&this)    // `&this` because `fclose(FILE*)`

// But we don't need to bind it as a method
[extern] def fread(a: untyped_ptr, x: u64, n: u64, f: &File)


let f = File::open("foo.txt")  // Uses default mode
fread(dummy, 1, 2, f)
f.close()
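
For comparison, a rough C equivalent of the calls above is sketched below. `read_prefix` is a hypothetical wrapper; it writes the default `mode` out explicitly and passes the file handle as the first argument where Ocen uses method syntax:

```c
#include <stdio.h>

/* Rough C equivalent of: File::open(path), fread(...), f.close() */
size_t read_prefix(const char *path, void *buf, size_t n) {
    FILE *f = fopen(path, "r");        /* default `mode` written out explicitly */
    if (f == NULL) return 0;           /* nothing read if the file cannot be opened */
    size_t got = fread(buf, 1, n, f);  /* same argument order as the binding */
    fclose(f);                         /* `f.close()` passes `f` as `this` */
    return got;
}
```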

Enum Bindings

For enums, you need to bind each of the enum variants to an external symbol. Here's an example:

[extern "SDL_EventType"]
enum EventType {
   Quit = extern("SDL_QUIT")
   KeyDown = extern("SDL_KEYDOWN")
   KeyUp = extern("SDL_KEYUP")
   ...
}

Miscellaneous Binding Tips

As you might have noticed, we can bind whatever we want to Ocen, as long as we know it's sound at the C level. It won't care as long as you don't. The names you provide in the extern attribute are arbitrary strings - and this can be (ab)used in certain scenarios to improve the usability when interfacing with C code.

One example is binding commonly used (non-enum) values as an enum and using that enum in function signatures, which lets you rely on Ocen's type inference and makes the code more readable. For instance:

// Raylib Bindings

[extern "int"]  // Not an enum, but we don't care
enum Key {
   A = extern("KEY_A")
   B = extern("(KEY_B * 1)")   // Can technically use any valid C expression here...
   ...
}
// Mark the input here as `Key`, since we know it's an int
[extern] def IsKeyPressed(key: Key): bool
[extern] def GetKeyPressed(): Key

def main() {
   IsKeyPressed(Key::A)
   IsKeyPressed(B)         // Inferred, without a global variable `B`

   // Can also print out for free...
   println(f"Key Pressed: {GetKeyPressed()}")
}
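
The reason this works can be sketched in C, assuming the extern strings end up in the generated code as written. All names and values below are illustrative stand-ins, not the real Raylib declarations:

```c
#include <stdbool.h>

/* Stand-ins for a C library's key constants (values are illustrative) */
enum { KEY_A = 65, KEY_B = 66 };

/* Stand-in for a library function that takes a plain int key code */
static int last_key = KEY_B;
bool is_key_pressed(int key) {
    return key == last_key;
}

/* What the two calls would reduce to if the extern strings are pasted as-is */
bool check_a(void) { return is_key_pressed(KEY_A); }
bool check_b(void) { return is_key_pressed((KEY_B * 1)); }
```

Since the C function only ever sees an `int`, it does not matter that the Ocen side describes the parameter as an enum.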

Undocumented

These features exist in the Ocen compiler, but are not documented here yet due to lack of time. These sections should be updated in the future, but in the meantime you can look in the tests/ folder for examples of how to use these features.