Introduction
The kommando programming language pairs a Rust-like syntax and the comfort of generics with the simplicity of C.
The first part of the docs is a sequential introduction to the language and the command line tools.
The later parts are itemized lists of features and are not meant for sequential reading. Use them as a reference instead.
Resources
- online sandbox
- this book
- the kommando GitHub repository, containing:
  - many examples
  - the standard library
Consider starring ⭐ the kommando repository if you like the language!
How to
While reading this book you will frequently encounter code blocks.
Click on the eye symbol in the top right corner of a block to reveal more context.
You can then click the clipboard symbol and paste the code into the sandbox.
This allows easy experimentation.
hello, world
Let's start by writing a hello world program:
use std::*;
fn main() {
println("Hello, world");
}
Save this into a file called hello.kdo
Although this does not seem like much, it already uses several language features and, with them, several libraries:
- core library: every project needs this library, as it defines the core language primitives
- c_api library: used indirectly through std, it defines many C FFI functions
- std library: defines println and a bunch of other useful functions and structures
  - std requires the C header stdarg.h
To compile this example we need to provide the paths to these libraries, located in the kdolib/ directory.
Note: Use make help or ./kommando --help to get a complete list of options.
Here we assume we are in the root directory of the kommando compiler, otherwise you need to adjust the paths in the command.
# compile + run
./kommando hello.kdo hello -cr \
::core=kdolib/core ::core=kdolib/c_api ::core=kdolib/std \
--include=stdarg.h
Note: Command line arguments are generally order-independent.
Since this is rather verbose and inclusion of these libraries is common, there is an automatic inclusion script:
./kommando hello.kdo hello -cr $(./kdolib/link)
For even more convenience, use the included Makefile instead:
make run file=hello.kdo
Note: Use make compile or -c to compile without running.
No-std
To get more of an understanding for this process, let's try building the same program, but without the use of the standard library.
First we start by including all the core primitive types:
use core::types::*;
Then we define the extern c function that we want to call.
The C standard library function int puts(const char *s); from stdio.h takes a char* (here c_str) as input and returns an int, which is a signed 32 bit integer on common platforms (here i32):
use core::types::*;
#[extern]
fn puts(s: c_str) -> i32;
Now we are ready to call the function in main:
use core::types::*;
#[extern]
fn puts(s: c_str) -> i32;
fn main() {
puts("Hello, world!\n");
}
Save this file as no_std.kdo and compile it:
Note: We only need to include the core library this time.
./kommando no_std.kdo no_std ::core=kdolib/core -cr
Or, with the use of the link file:
./kommando no_std.kdo no_std $(kdolib/link_core) -cr
Variables
Variables can be created using the following syntax:
let <name>: <Type> = <value>;
Here we create the variable count as a signed 32 bit int and initialize it with the value 5:
use std::*;
fn main() {
let count: i32 = 5;
}
Type inference can be used to obtain the type from the surrounding context:
use std::*;
fn main() {
let count: _ = 5;
}
The _ is a placeholder and can stand for any type.
If such type inference is successful, we can also omit the type entirely:
use std::*;
fn main() {
let count = 5;
}
Integer constants may also have different types, which can again be inferred from both sides:
use std::*;
fn main() {
let a = 5usize;
let b: usize = 5;
let c: u8 = 5i8; // This does not work as the two types conflict!
}
Note: See the Numbers chapter for more information about the use cases of the different types.
Expressions
Special care has been taken so that anything can be used inside an expression. While the following would not work in languages like C
fn main() {
let x = return;
}
this is perfectly valid kommando syntax, and x (although unreachable) would get the type unit.
An expression followed by a semicolon is called a statement. Every expression in a block must be terminated by a semicolon, except optionally the very last one, whose value the block yields:
fn xor(a: u8, b: u8) -> u8 {
a ^ b
}
is equivalent to:
fn xor(a: u8, b: u8) -> u8 {
return a ^ b;
}
This allows you to yield values from arbitrary blocks:
let r = { // r is now 25
let a = 5;
a * a
};
Conditionals
The simplest form of a condition is an if expression with a body:
if <cond> {
<then>
};
Note: Unlike in other languages, blocks need to be terminated by semicolons as well if they are to be treated as statements.
We can also specify what should be executed otherwise:
if <cond> {
<then>
} else {
<otherwise>
};
If expressions can also be chained:
if <cond> {
<then>
} else if <cond2> {
<then2>
} else {
<otherwise>
};
Values can be yielded from conditional blocks, same as regular blocks:
let num = 17;
let r = if num % 2 == 0 {
num / 2
} else {
(num + 1) / 2
};
Note: The types yielded by all branches must match.
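For readers coming from C, the value-yielding if expression above behaves roughly like the conditional operator. The following is a minimal sketch of that comparison, not kommando code. The function name half_rounded_up is a hypothetical name chosen for illustration.

```c
/* Rough C counterpart of the value-yielding if expression:
   the conditional operator also produces a value, and both
   branches must have compatible types.
   half_rounded_up is a hypothetical name used for illustration. */
int half_rounded_up(int num) {
    return (num % 2 == 0) ? num / 2 : (num + 1) / 2;
}
```

Calling half_rounded_up(17) takes the second branch, just like the kommando example with num = 17.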
Loops
While
while <cond> {
<body>
};
Example:
use std::*;
fn main() {
let i = 0usize;
while i < 10 {
c_api::printf("%zu\n", i);
i += 1;
};
}
You can also break out of a loop:
use std::*;
fn main() {
let i = 0usize;
while i < 10 {
c_api::printf("%zu\n", i);
if i >= 5 {
break;
};
i += 1;
};
}
Or skip the rest of the loop body this iteration:
use std::*;
fn main() {
let i = 0usize;
while i < 10 {
i += 1;
if i >= 5 {
c_api::printf("skipping!\n");
continue;
};
c_api::printf("%zu\n", i);
};
}
Note: Same as return, break and continue yield a value of type unit, and any code after them is unreachable:
let r: unit = return;
let r: unit = break;
let r: unit = continue;
Numbers | Docs [ints, floats, bools]
There are many different numeric types, but usually a small subset suffices for most applications:
- u8 or char (alias of the same type) when dealing with strings and raw byte streams
- i32 for most standard numeric uses
- usize for counters, indices and pointer arithmetic
- f32 or f64, depending on precision needs
- u8, u32 or u64 for bit masks or enumerations, depending on variant count
- bool for flags and conditions
You should try to stay within one numeric type per use case, but sometimes you will need to convert between two different types.
For that, use std::numcast<T, U>(t: T) -> U to do a C-style cast from a numeric type T to a numeric type U:
let a: u8 = 3;
let b: f32 = 4.2;
let r: f32 = numcast::<u8, f32>(a) * b;
This is equivalent to the following C code:
unsigned char a = 3;
float b = 4.2;
float r = (float)a * b;
// r = 12.6
If you instead want to reinterpret the bytes of a value literally, use std::typecast<T, U>(t: T) -> U, which performs a pointer cast:
let f: f32 = 3.1415;
let f_bytes: u32 = typecast::<f32, u32>(f);
// f_bytes = 0b01000000010010010000111001010110 = 0x40490e56 = 1078529622
This is equivalent to the following C code:
float f = 3.1415;
unsigned int f_bytes = *(unsigned int*)&f;
// f_bytes = 0b01000000010010010000111001010110 = 0x40490e56 = 1078529622
Note: typecast only works for types of the same size.
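When writing the same reinterpretation by hand in C, the pointer cast shown above conflicts with C's strict aliasing rules. A commonly used alternative is to copy the bytes with memcpy. The sketch below assumes float and uint32_t have the same size, and float_bits is a hypothetical helper name, not part of any kommando library.

```c
#include <stdint.h>
#include <string.h>

/* The reinterpretation only makes sense for types of the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reinterpret the bytes of a float as a 32 bit unsigned integer.
   float_bits is a hypothetical helper name used for illustration. */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* copy the raw bytes, no value conversion */
    return bits;
}
```

The static assertion mirrors the same-size requirement of typecast.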
Printing I
There are two main ways of formatting and printing, depending on your needs.
The C-style way
Using printf from the C library, we can do C-style formats:
use std::*;
fn main() {
c_api::printf("int: %d, string: %s, pointer: %p\n",
4, "foo", &unit {}
);
}
Text can also be formatted into a string instead of being printed directly:
use std::*;
fn main() {
let text: c_str = c_api::formatf("Hello, %s\n", "world");
c_api::printf("%s", text);
}
The formatter way
Using traits, we can format a lot more than just primitives:
use std::*;
fn main() {
println("int: $, string: $, Option: $",
42.dyn_fmt(),
"foo".dyn_fmt(),
Option::<u8>::some(3).dyn_fmt()
);
}
You can also implement formatting for your own types.
Note: Since our string no longer contains the necessary information to deduce the type, we need to package the value together with its formatting function. Because of that, every format argument must be turned into a DynFmt as seen above. Failing to do so will most likely crash the program, the same way a mismatched argument for a C-style format string causes a segfault:
use std::*;
fn main() {
// segfaults, as an int cannot be interpreted as a string
c_api::printf("%s", 4);
}
Text can also be formatted into a string instead of being printed directly:
use std::*;
fn main() {
// use `formatln` to get a newline appended or just `format` without this feature
let text: c_str = formatln("Hello, $", "world".dyn_fmt());
println(text);
}
Note: In part II we will take a look at how we can make a custom struct printable
Pointers
All objects are passed by value and as such copied. That means once you pass a value somewhere, it exists as a separate entity from the original.
use std::*;
fn increment(v: u32) {
v += 1;
c_api::printf("incremented: %d", v);
}
fn main() {
let value = 4;
increment(value);
c_api::printf("not incremented: %d", value);
}
increment gets a copy of value, and as such the original is not incremented.
This can however be done by explicitly taking a reference:
use std::*;
fn increment(v: &i32) { // v points to value
*v += 1; // increment the value pointed to by v
c_api::printf("incremented: %d\n", *v); // read the value pointed to by v
}
fn main() {
let value = 4;
let points_to_value = &value; // point to value
increment(points_to_value);
c_api::printf("now also incremented: %d\n", value);
}
Here we take a reference to value by using the & prefix operator.
To access the value behind the pointer, we need to dereference it using the * prefix operator.
Note: The canonical name of a reference &T is ptr<T>. The compiler will give you a hint so that you may replace it with &T. To see this in the online editor, enable the compiler output checkbox. To disable those warnings entirely, set the --no-lint flag. If you specifically want to use the canonical name, use the fully qualified path ::core::types::ptr<T> to disable the warning.
A pointer is only valid as long as the original object is valid:
use std::*;
fn create_pointer() -> &i32 {
let value = 4;
let p = &value; // create pointer to value
p // value still exists...
} // value gets removed at end of function, p is now invalid!
fn main() {
// p is invalid here since value was already removed
let p = create_pointer();
}
p may now point to arbitrary data, and using it may even crash the program.
Even if the original object is still around but was moved, p is still invalid, since in both cases the pointed-to value is no longer where p points to.
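The same pitfall exists in C, where a common way around it is to let the caller own the storage and pass a pointer in, instead of returning a pointer to a local. The following is only a sketch of that pattern under this assumption. init_value is a hypothetical name.

```c
/* The caller owns the storage, so the pointer it passes in
   stays valid after the call returns.
   init_value is a hypothetical name used for illustration. */
void init_value(int *out) {
    *out = 4; /* write through the pointer into the caller's variable */
}
```

A caller would declare int value; and then call init_value(&value);, after which value holds 4 and no pointer outlives the object it points to.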
Unlike in C, pointers to literals are supported:
use std::*;
fn increment(v: &i32) { // v points to value
*v += 1; // increment the value pointed to by v
c_api::printf("incremented: %d\n", *v); // read the value pointed to by v
}
fn main() {
increment(&1);
}
This is equivalent to the following C code:
#include <stdio.h>
void increment(int* v) { // v points to value
*v += 1; // increment the value pointed to by v
printf("incremented: %d\n", *v); // read the value pointed to by v
}
int main() {
int _temp = 1;
increment(&_temp);
}
The pointer is only valid for the duration of the function call:
use std::*;
fn identity<T>(x: T) -> T {
// x is still valid here
x // passthrough
}
fn main() {
let now_invalid_ptr = identity(&1);
}
Structs
A struct combines smaller types into a bigger type. Here four u8s represent an RGBA color:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
You can then instantiate the struct using literal syntax:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
}
You can also modify the fields:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
c.r /= 2;
c_api::printf("Color { r: %u, g: %u, b: %u, a: %u }\n", c.r, c.g, c.b, c.a);
}
Note: Later we will use the Fmt trait to automatically format the color for printing.
Pointers to structs automatically dereference on field access:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
let c_ref: &Color = &c; // points to c
(*c_ref).r /= 2; // manually dereference to access the inner value of c_ref
// ...or let the compiler do so automatically:
c_api::printf("Color { r: %u, g: %u, b: %u, a: %u }\n", c_ref.r, c_ref.g, c_ref.b, c_ref.a);
}
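For comparison, C distinguishes the two forms of field access through a pointer syntactically: an explicit dereference followed by a dot, or the arrow operator. The sketch below mirrors the Color example in C. The helper names halve_red and red_of are hypothetical.

```c
#include <stdint.h>

/* C mirror of the Color struct used above. */
typedef struct {
    uint8_t r, g, b, a;
} Color;

/* Explicit dereference, like (*c_ref).r in kommando.
   halve_red is a hypothetical name used for illustration. */
void halve_red(Color *c) {
    (*c).r /= 2;
}

/* Arrow operator: the C spelling of field access through a pointer. */
uint8_t red_of(const Color *c) {
    return c->r;
}
```

In kommando both cases are written with a plain dot, since the compiler inserts the dereference automatically.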
Methods
Methods for the most part function like normal functions:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color {
fn ansi_fg(c: &Color) -> c_str {
c_api::formatf("\x1b[38;2;%u;%u;%um", c.r, c.g, c.b)
}
fn ansi_reset() -> c_str {
"\x1b[0m"
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
c_api::printf("Look, %scolorful text%s!\n", Color::<>::ansi_fg(&c), Color::<>::ansi_reset());
}
Note: ansi_fg allocates memory! Later we will use an allocation-free method via the Fmt trait and the modern print API, which can be adapted to this example.
To invoke the method we insert <> to disambiguate it from a function in another module:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color {
fn ansi_fg(c: &Color) -> c_str {
c_api::formatf("\x1b[38;2;%u;%u;%um", c.r, c.g, c.b)
}
fn ansi_reset() -> c_str {
"\x1b[0m"
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
c_api::printf("Look, %scolorful text%s!\n",
Color::<>::ansi_fg(&c), Color::<>::ansi_reset()
);
}
If the first parameter of a method is Color or &Color, we can also use the direct method syntax:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color {
fn ansi_fg(c: &Color) -> c_str {
c_api::formatf("\x1b[38;2;%u;%u;%um", c.r, c.g, c.b)
}
fn ansi_reset() -> c_str {
"\x1b[0m"
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 180, a: 255 };
c_api::printf("Look, %scolorful text%s!\n",
c.ansi_fg(), Color::<>::ansi_reset()
);
}
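Since ansi_fg allocates its result, it may help to see what an allocation-free variant could look like on the C side. The sketch below writes the escape sequence into a buffer supplied by the caller, using snprintf. It is an illustration in plain C, not part of the kommando standard library, and color_ansi_fg is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* C mirror of the Color struct used above. */
typedef struct {
    uint8_t r, g, b, a;
} Color;

/* Write the ANSI 24 bit foreground escape sequence for c into buf.
   Returns what snprintf returns: the number of characters the full
   sequence needs, excluding the terminating null byte.
   color_ansi_fg is a hypothetical name used for illustration. */
int color_ansi_fg(const Color *c, char *buf, size_t len) {
    return snprintf(buf, len, "\x1b[38;2;%u;%u;%um",
                    (unsigned)c->r, (unsigned)c->g, (unsigned)c->b);
}
```

The caller decides where the text lives, for example a char buf[32] on the stack, so nothing has to be freed afterwards.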
Printing II: Printable
Now it is time to make our Color struct printable.
Here it is again:
use std::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
To advertise its printability, we need to implement the Fmt trait:
use std::*;
use std::fmt::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color: Fmt {
fn dyn_fmt(self: &Color) -> DynFmt {
todo()
}
fn fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
todo()
}
}
The Fmt trait has two methods: fmt does the formatting, and dyn_fmt packs the format function together with the object itself. This is necessary since all type information is lost when passing the object to the println function, and otherwise we would not know how to format it:
use std::*;
use std::fmt::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color: Fmt {
fn dyn_fmt(self: &Color) -> DynFmt {
_ { object: typecast(self), fmt: Color::<>::fmt }
}
fn fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
stream.write_str("Color { r: ").write(fmt, &self.r)
.write_str(", g: ").write(fmt, &self.g)
.write_str(", b: ").write(fmt, &self.b)
.write_str(", a: ").write(fmt, &self.a)
.write_str(" }");
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 100, a: 255 };
println("We have the color $", c.dyn_fmt());
}
Note: We try to avoid using format or formatln in fmt, since that would allocate a string. All the FormatStream::write* methods are allocation-free.
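The idea behind dyn_fmt can be sketched in C as a pair of an untyped object pointer and a function pointer that still knows the concrete type. This is only an illustration of the concept and not the kommando implementation: the C struct DynFmt and the helpers color_fmt, color_dyn_fmt and render are hypothetical, and the sketch writes into a caller-provided buffer instead of a FormatStream.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t r, g, b, a;
} Color;

/* Type-erased pair: the object plus the function that knows its type.
   All names in this sketch are hypothetical. */
typedef struct {
    const void *object;
    int (*fmt)(const void *object, char *buf, size_t len);
} DynFmt;

/* Formatting function for Color; recovers the concrete type first. */
static int color_fmt(const void *object, char *buf, size_t len) {
    const Color *c = object;
    return snprintf(buf, len, "Color { r: %u, g: %u, b: %u, a: %u }",
                    (unsigned)c->r, (unsigned)c->g,
                    (unsigned)c->b, (unsigned)c->a);
}

/* Package a Color together with its formatting function. */
DynFmt color_dyn_fmt(const Color *c) {
    DynFmt d = { c, color_fmt };
    return d;
}

/* A generic printing routine only sees DynFmt and never needs
   to know that the object is a Color. */
int render(DynFmt d, char *buf, size_t len) {
    return d.fmt(d.object, buf, len);
}
```

Passing a bare pointer where a DynFmt is expected would leave the function pointer undefined, which matches the earlier warning that every format argument must be wrapped.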
Now we can comfortably print our Color anywhere:
use std::*;
use std::fmt::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color: Fmt {
fn dyn_fmt(self: &Color) -> DynFmt {
_ { object: typecast(self), fmt: Color::<>::fmt }
}
fn fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
stream.write_str("Color { r: ").write(fmt, &self.r)
.write_str(", g: ").write(fmt, &self.g)
.write_str(", b: ").write(fmt, &self.b)
.write_str(", a: ").write(fmt, &self.a)
.write_str(" }");
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 100, a: 255 };
println("We have the color $", c.dyn_fmt());
}
It is also possible to have more print methods on the same struct, albeit no longer under the Fmt trait, which only houses the canonical print.
Here we convert the example from the methods section to the allocation-free print API:
use std::*;
use std::fmt::*;
struct Color {
r: u8,
g: u8,
b: u8,
a: u8
}
impl Color: Fmt {
fn dyn_fmt(self: &Color) -> DynFmt {
_ { object: typecast(self), fmt: Color::<>::fmt }
}
fn fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
stream.write_str("Color { r: ").write(fmt, &self.r)
.write_str(", g: ").write(fmt, &self.g)
.write_str(", b: ").write(fmt, &self.b)
.write_str(", a: ").write(fmt, &self.a)
.write_str(" }");
}
}
impl Color {
fn dyn_ansi_fg(self: &Color) -> DynFmt {
_ { object: typecast(self), fmt: Color::<>::ansi_fg_fmt }
}
fn ansi_fg_fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
stream.write_str("\x1b[38;2;").write(fmt, &self.r)
.write_str(";").write(fmt, &self.g)
.write_str(";").write(fmt, &self.b)
.write_str("m");
}
fn dyn_ansi_reset() -> DynFmt {
// dummy object since this is a static method
_ { object: c_api::null, fmt: Color::<>::ansi_reset_fmt }
}
fn ansi_reset_fmt(self: &Color, fmt: &Formatter, stream: &FormatStream) {
// Note: do not access `self` since it is a null dummy value
stream.write_str("\x1b[0m");
}
}
fn main() {
let c = Color { r: 255, g: 150, b: 100, a: 255 };
println("We have the color $", c.dyn_fmt());
println("Look, $colorful text$!\n", c.dyn_ansi_fg(), Color::<>::dyn_ansi_reset());
}
Generics
Sometimes structures and functions are supposed to work for multiple data types:
use std::*;
struct I32Container {
item: i32
}
struct BoolContainer {
item: bool
}
// and so on...
fn main() {
let i32b = I32Container { item: 4 };
let bb = BoolContainer { item: false };
}
As an alternative to creating a variant for each possible type, we can replace the concrete item type with a generic placeholder type T:
use std::*;
struct Container<T> {
item: T
}
fn main() {
let i32b = Container::<i32> { item: 4 };
// `_` wildcard is inferred to be bool
let bb = Container::<_> { item: false };
// omitted type is inferred to be c_str
let strb = Container { item: "hello" };
}
We can also make methods generic to work on any T:
use std::*;
struct Container<T> {
item: T
}
// "for any T we want a Container of T with the following methods"
impl<T> Container<T> {
fn wrap(item: T) -> Container<T> {
// wildcard inferred to be Container::<T>
_ { item: item }
}
// `self` name is arbitrary
fn unwrap(self: Container<T>) -> T {
self.item
}
}
fn main() {
let i32b = Container::<i32>::wrap(4);
let bb = Container::<_>::wrap(false);
// we still need wildcard to disambiguate
let strb = Container::<_>::wrap("hello");
let i = i32b.unwrap();
let b = bb.unwrap();
let s = strb.unwrap();
}
Note: Technically we already know one generic: ptr<T>
Traits
Unlike with concrete types, we do not have a lot of information to work with for our T.
This is where traits come to shine! Let's say we want to print some information about a Container.
We just need to tell the compiler that T should itself implement Fmt, by introducing a so-called trait bound.
This establishes a contract that T is guaranteed to implement certain methods, in this case Fmt::fmt and Fmt::dyn_fmt:
use std::*;
use std::fmt::Fmt;
use core::intrinsics::short_typename;
struct Container<T> {
item: T
}
impl<T> Container<T> {
fn wrap(item: T) -> Container<T> {
_ { item: item }
}
fn unwrap(self: Container<T>) -> T {
self.item
}
}
impl<T: Fmt> Container<T> {
fn print_info(self: &Container<T>) {
println("Container contains $ [$]", short_typename::<T>().dyn_fmt(), self.item.dyn_fmt());
}
}
fn main() {
let i32b = Container::<i32>::wrap(4);
let bb = Container::<_>::wrap(false);
// we still need wildcard to disambiguate
let strb = Container::<_>::wrap("hello");
i32b.print_info();
bb.print_info();
strb.print_info();
}
To Copy or not to Copy (or to Clone)
Take a look at the following code:
use std::*;
fn main() {
let a = 4;
let b = a;
}
As we learned in the chapter about pointers, a and b are now independent objects (although with the same value).
However, this is not always so clear-cut:
use std::*;
struct Foo {
x: i32,
r: &i32
}
fn main() {
let v = 4;
let a = Foo { x: 5, r: &v };
let b = a;
}
While Foo itself gets copied along with its fields x and r, the value behind the reference stays the same. As such, a.r and b.r point to the same value, and a and b are in some way linked!
However, a.x and b.x are totally unlinked!
Having such a partially linked struct, while maybe desirable, would in most circumstances be unwanted or confusing.
Therefore, in this instance a is not actually copied into b but moved into it, and accessing a afterwards results in a compiler error:
use std::*;
struct Foo {
x: i32,
r: &i32
}
fn main() {
let v = 4;
let a = Foo { x: 5, r: &v };
let b = a;
let x = a.x; // Compiler error: value of a was already moved
}
There are three ways of resolving this issue, depending on whether you want linkage or not:
1. Taking a reference (full link)
This is recommended when you actually want to work on the same value, e.g. when passing it to a method.
use std::*;
struct Foo {
x: i32,
r: &i32
}
fn main() {
let v = 4;
let a = Foo { x: 5, r: &v };
let b = &a; // taking a reference and as such explicitly sharing the value
let x = a.x;
}
Now b explicitly points to a, so they are fully linked.
2. Implementing core::copy::Copy (partial link)
We can implement the marker trait Copy to explicitly tell the compiler that a primitive copy is desired.
This is recommended when creating primitive data types like Color or an Id of some kind. It is probably not appropriate here and is only done for the sake of the example.
use std::*;
use core::copy::Copy;
struct Foo {
x: i32,
r: &i32
}
impl Foo: Copy {}
fn main() {
let v = 4;
let a = Foo { x: 5, r: &v };
let b = a; // gets copied like any other primitive value
let x = a.x; // is now allowed
}
Now a and b are unlinked, but a.r and b.r point to the same value.
3. Implementing core::clone::Clone
This is recommended when you want to have two fully unlinked copies of a complex type, e.g. one with allocated fields like a vector.
use std::*;
use std::mem::new;
use core::clone::Clone;
struct Foo {
x: i32,
r: &i32
}
impl Foo: Clone {
fn clone(self: &Foo) -> Foo {
_ { x: self.x, r: new(*self.r) }
}
}
fn main() {
let v = 4;
let a = Foo { x: 5, r: &v };
let b = a.clone(); // explicitly create unlinked instance
let x = a.x; // is now allowed
}
Note: A type which is not Copy also cannot be dereferenced, since the reference could be dereferenced multiple times, creating multiple semi-linked instances.
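The difference between the partial link of Copy and the full separation of Clone corresponds to a shallow and a deep copy in C. The sketch below mirrors the Foo struct in C under that assumption. foo_copy and foo_clone are hypothetical names, and unlike the kommando version the caller has to free the cloned pointer manually.

```c
#include <stdlib.h>

/* C mirror of the Foo struct used above. */
typedef struct {
    int x;
    int *r;
} Foo;

/* Shallow copy: x is duplicated, r still points to the same int.
   This corresponds to the "partial link" of a Copy type. */
Foo foo_copy(const Foo *src) {
    return *src;
}

/* Deep copy: r gets its own heap allocation, so the result shares
   nothing with src. Returns 0 on success and -1 if the allocation
   fails. The caller is responsible for calling free(out->r). */
int foo_clone(const Foo *src, Foo *out) {
    int *r = malloc(sizeof *r);
    if (r == NULL) {
        return -1;
    }
    *r = *src->r;
    out->x = src->x;
    out->r = r;
    return 0;
}
```

After foo_copy, writing through the copied r changes the value seen by the original. After foo_clone, it does not.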
Drop
Some resources require cleanup after use, e.g. freeing of heap allocations or closing of file handles.
use std::*;
use std::mem::new;
use std::mem::free;
use c_api::printf;
fn main() {
let x: &i32 = new(1); // allocating memory
printf("*%p = %d\n", x, *x);
*x = 2;
printf("*%p = %d\n", x, *x);
free(x); // freeing memory
}
However, this can easily be forgotten and result in a resource leak, or otherwise cause problems later when someone tries to access the same resource again.
The core::drop::Drop trait, mutually exclusive with core::copy::Copy, solves this: its drop method is run recursively for a struct and each of its fields once the value goes out of scope.
use std::*;
use c_api::printf;
use core::drop::Drop;
struct Foo {}
impl Foo: Drop {
fn drop(self: &Foo) {
printf("Dropped Foo");
}
}
fn main() {
let f = Foo {};
} // f is dropped!
This even automatically works when f is a field in a struct:
use std::*;
use c_api::printf;
use core::drop::Drop;
struct Foo {}
impl Foo: Drop {
fn drop(self: &Foo) {
printf("Dropped Foo");
}
}
struct Wrapper {
f: Foo
}
fn main() {
let w = Wrapper { f: Foo {} };
} // w is dropped (trivial) and w.f is also dropped!
Note: This does not take lifetimes of references into account and using a reference after the value goes out of scope is undefined behavior, using the same rules as discussed in the chapter about pointers. This holds true for all pointers and is not a drop specific behavior.
use std::*;
use c_api::printf;
use core::drop::Drop;
struct Foo {}
impl Foo: Drop {
fn drop(self: &Foo) {
printf("Dropped Foo");
}
}
fn main() {
let invalid_ref = {
let f = Foo {};
let r = &f;
r
}; // f is dropped but the reference survives
}
Now back to our allocation example: we will build a type which frees its allocated memory once it goes out of scope.
We call this type a Box:
use std::*;
use core::clone::Clone;
use core::drop::Drop;
use std::mem::new;
pub struct Box<T> {
item: &T
}
impl<T> Box<T> {
pub fn new(t: T) -> Box<T> {
_ { item: new(t) }
}
pub fn ref(self: &Box<T>) -> &T {
self.item
}
}
Now we implement the drop trait and try it out:
use std::*;
use core::clone::Clone;
use core::drop::Drop;
use core::intrinsics::short_typename;
use std::mem::new;
use std::mem::free;
use c_api::printf;
pub struct Box<T> {
item: &T
}
impl<T> Box<T> {
pub fn new(t: T) -> Box<T> {
_ { item: new(t) }
}
pub fn ref(self: &Box<T>) -> &T {
self.item
}
}
impl<T> Box<T>: Drop {
pub fn drop(self: &Box<T>) {
printf("%s freed\n", short_typename::<Box<T>>());
free(self.item);
}
}
fn main() {
let b = Box::<_>::new(42); // allocated
*b.ref() = 7;
printf("%d\n", *b.ref());
} // destructor is run and the allocation is freed
Output:
7
Box<i32> freed
However, we are not quite done yet, as can be seen by the following scenario:
use std::*;
use core::clone::Clone;
use core::drop::Drop;
use core::intrinsics::short_typename;
use std::mem::new;
use std::mem::free;
use c_api::printf;
pub struct Box<T> {
item: &T
}
impl<T> Box<T> {
pub fn new(t: T) -> Box<T> {
_ { item: new(t) }
}
pub fn ref(self: &Box<T>) -> &T {
self.item
}
}
impl<T> Box<T>: Drop {
pub fn drop(self: &Box<T>) {
printf("%s freed\n", short_typename::<Box<T>>());
free(self.item);
}
}
fn main() {
let b = Box::<_>::new(42);
let bb = Box::<_>::new(b);
}
Output:
Box<Box<i32>> freed
The inner box is not dropped!
As stated previously, all fields are recursively dropped for any non-Copy struct, without any extra effort needed.
This, however, is a special case, since the type of item is &T and not T. Therefore only the reference is dropped and not the underlying value itself, requiring us to do a bit of manual work:
use std::*;
use core::clone::Clone;
use core::drop::Drop;
use core::intrinsics::short_typename;
use std::mem::new;
use std::mem::free;
use std::mem::take_unchecked;
use c_api::printf;
pub struct Box<T> {
item: &T
}
impl<T> Box<T> {
pub fn new(t: T) -> Box<T> {
_ { item: new(t) }
}
pub fn ref(self: &Box<T>) -> &T {
self.item
}
}
impl<T> Box<T>: Drop {
pub fn drop(self: &Box<T>) {
take_unchecked(self.item); // takes the value of the item reference and returns it, so that it can be dropped immediately
printf("%s freed\n", short_typename::<Box<T>>());
free(self.item);
}
}
fn main() {
let b = Box::<_>::new(42);
let bb = Box::<_>::new(b);
}
Now the Box finally works:
Box<i32> freed
Box<Box<i32>> freed
Note: This implementation can be found at std::mem::box::Box
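The cleanup order that the final drop implementation establishes can be sketched in plain C, where nothing is dropped automatically: the value behind the pointer is cleaned up first, then the allocation that held it is released. All names below (IntBox, BoxOfIntBox, the two drop functions and the allocations_freed counter) are hypothetical and only serve to make the order observable.

```c
#include <stdlib.h>

/* Counts released allocations so that the cleanup can be observed. */
int allocations_freed = 0;

typedef struct { int *item; } IntBox;          /* plays the role of Box<i32> */
typedef struct { IntBox *item; } BoxOfIntBox;  /* plays the role of Box<Box<i32>> */

/* Release the allocation owned by an IntBox. */
void int_box_drop(IntBox *b) {
    free(b->item);
    b->item = NULL;
    allocations_freed += 1;
}

/* Release a box that owns another box. */
void box_of_int_box_drop(BoxOfIntBox *b) {
    int_box_drop(b->item);  /* first clean up the value behind the pointer */
    free(b->item);          /* then release the outer allocation */
    b->item = NULL;
    allocations_freed += 1;
}
```

Skipping the call to int_box_drop would leak the inner allocation, which is the C counterpart of the inner box not being dropped in the earlier output.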
Language concepts
The following sections list the language constructs.
Primitives
Primitive datatypes are defined in core::types and are often used within the compiler.
Unit
The unit type is a zero sized type and the implicit return type for functions without an explicit return type.
fn main() -> unit { // implicit unit type as return
unit { } // unit implicitly returned
}
Note: unlike void in languages like C or Java, it can be assigned to, passed around and used in generics.
Functionally, it is no different than a struct with no fields, and that is in fact how it is defined.
Note: Use unit in ffi where void should be returned. As the type is zero-sized, a value created in such a way is not uninitialized.
Ints
An \( n \) bit integer is stored in memory with \( n \) bits or \( \frac{n}{8} \) bytes.
An integer literal is assumed to be of type i32 but can be suffixed with an integer type to override this behavior: 4u8
Note: Use std::numcast::<T, U>(t: T) -> U to convert between numeric types (both float and int)
Endianness
Depending on the platform, multi-byte integers can be represented in memory in order, with the highest-order byte first (big endian), or in reverse order, with the highest-order byte last (little endian). This only matters when interpreting raw data as numbers or performing similar operations; it does not cause issues when using bitwise ops.
Note: Most modern platforms, such as x86-64 and ARM in its usual configuration, are little endian. It is unlikely that you will encounter big endian systems in the wild, as these are often highly specialized or legacy technologies, like MIPS or PowerPC. The main place where this becomes important is networking, as numbers in network protocols, like IP addresses, are represented as big endian.
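To make the byte order concrete, here is a minimal sketch in Python (used purely as a neutral illustration, not as kommando code). It lays out the same 32-bit value in both orders with the standard int.to_bytes method. The helper name byte_layout is hypothetical and exists only for this example.

```python
def byte_layout(value: int, width_bytes: int = 4) -> dict:
    """Return the in-memory byte sequence of an unsigned integer for both byte orders."""
    return {
        "big": value.to_bytes(width_bytes, "big"),        # highest-order byte first
        "little": value.to_bytes(width_bytes, "little"),  # highest-order byte last
    }


layout = byte_layout(0x0A0B0C0D)
print(layout["big"].hex())     # 0a0b0c0d
print(layout["little"].hex())  # 0d0c0b0a

# Reading raw bytes with the wrong byte order yields a different number.
wrong = int.from_bytes(layout["big"], "little")
print(hex(wrong))              # 0xd0c0b0a

# Bitwise operations work on the value, not on its memory layout,
# so the highest-order byte is always obtained the same way.
print(hex(0x0A0B0C0D >> 24))   # 0xa
```

The last line shows why bitwise ops are unaffected: shifts and masks are defined on the numeric value, regardless of how the platform stores it.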
Signed
Fixed size signed integers take the form \( \text{i}n \) and have a range from \( -2^{n-1} \) to \( 2^{n-1}-1 \) and are stored in memory with \( n \) bits or \( \frac{n}{8} \) bytes.
Signed ints use the two's complement representation
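The range formula and the two's complement encoding can be checked with a short Python sketch (an illustration only, not kommando code). The function names signed_range, to_twos_complement and from_twos_complement are hypothetical helpers introduced for this example.

```python
def signed_range(n: int) -> tuple:
    """Smallest and largest value of an n-bit two's complement integer."""
    return -(1 << (n - 1)), (1 << (n - 1)) - 1


def to_twos_complement(value: int, n: int) -> int:
    """Bit pattern (read as an unsigned number) that stores `value` in n bits."""
    lo, hi = signed_range(n)
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit into i{n}")
    return value & ((1 << n) - 1)


def from_twos_complement(bits: int, n: int) -> int:
    """Decode an n-bit pattern back into a signed value."""
    sign_bit = 1 << (n - 1)
    # The top bit carries the weight -2^(n-1); all other bits are positive.
    return (bits & (sign_bit - 1)) - (bits & sign_bit)


print(signed_range(8))                      # (-128, 127)
print(bin(to_twos_complement(-1, 8)))       # 0b11111111
print(from_twos_complement(0b10000000, 8))  # -128
```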
isize
This type has the same size as usize and can be used to compute index deltas.
i8
i16
i32
Note: This is the type most commonly found in c ffi and also the default type for a number literal. You should use i32 in most cases that do not require a specific width.
i64
i128
Unsigned
Fixed size unsigned integers take the form \( \text{u}n \) and have a range from \( 0 \) to \( 2^n-1 \) and are stored in memory with \( n \) bits or \( \frac{n}{8} \) bytes.
usize
This type has the size of a pointer and is used to index array-like structures.
u8
Note: char is an alias to u8 and can be used interchangeably
u16
u32
u64
u128
Floats
An \( n \) bit float is stored in memory with \( n \) bits or \( \frac{n}{8} \) bytes. Floats conform to the IEEE 754 floating point standard.
A float literal aliases to either f32 or f64 as required.
Note: Use std::numcast::<T, U>(t: T) -> U to convert between numeric types (both float and int)
f16
Note: This type is only supported as a storage type and will be interpreted as an f32 during math operations. This conversion might make things slower, so usage of this type is discouraged unless memory is a concern.
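The precision cost of 16-bit storage can be illustrated with Python's struct module, whose "e" format is the IEEE 754 half-precision type. This is only a sketch of the storage trade-off, not kommando code: Python widens the stored value to its own 64-bit float, whereas the note above describes widening to f32. The helper name through_f16 is hypothetical.

```python
import struct


def through_f16(x: float) -> float:
    """Round-trip a value through 16-bit (half precision) storage."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


print(struct.calcsize("<e"))  # 2 (bytes of storage per value)
print(through_f16(0.5))       # 0.5 (exactly representable)
print(through_f16(0.1))       # 0.0999755859375 (nearest half-precision value)
```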
f32
f64
f128
Note: Due to lack of hardware support this type might be very slow to use. Consider alternative approaches and reflect on whether such precision is really required.
Booleans
Booleans have a size of one byte.
Strings
c_str
This is the type of string literals and is used to interface with c functions.
Note: If you want to access individual chars in the string, consider working with slices
Pointers
Multiple pointer types exist and are each used in different circumstances
Note: Use std::typecast<T, U>(t: T) -> U to convert between a pointer type T and a pointer type U
ptr<T>
The most common pointer type
- The unary & prefix operator turns a value of type T into a ptr<T> (read: pointer of type T)
- The unary * prefix operator dereferences a ptr<T> into the pointed-to data of type T
use std::*;
fn main() {
let int: i32 = 4;
let int_ptr: ptr<i32> = &int;
let int: i32 = *int_ptr;
}
raw_ptr
An opaque pointer type. Used for c ffi.
c_str
A pointer equal to ptr<u8> which points to a string literal.
Note: If you want to access individual chars in the string, consider working with slices
function_ptr<T>
Pointer to a function which returns a value of type T.
The function pointer needs to be wrapped in parentheses to be called.
Note: The types of the arguments are not conveyed in the type information; you are assumed to know the right types via api convention. As such, function pointers should be wrapped in a safer, use-case-specific api.
use std::*;
use c_api::math::sin;
fn main() {
let sin_ptr: function_ptr<f32> = sin;
let sin_of_one: f32 = (sin_ptr)(1.0);
}
Structs
use std::*;
struct Car {
wheels: u32,
max_speed: f32
}
fn main() {
// creating struct using struct literal syntax
let car = Car { wheels: 4, max_speed: 50.0 };
// accessing fields of a struct
let speed_per_wheel = car.max_speed / numcast(car.wheels);
let car_ptr = &car;
// A struct automatically dereferences when accessing a field for convenience
let speed = car_ptr.max_speed;
// Equivalent to
let speed = (*car_ptr).max_speed;
}
Size and alignment
\( \textrm{let } \{ T_0, ..., T_n \} = \mathbb{T} \textrm{ be a compound struct type, then} \)
\( \text{alignof}(\mathbb{T}) = \left\{ \begin{array}{ c l } 0 & \quad \textrm{if } \mathbb{T} = \emptyset \\ \max_{T_i \in \mathbb{T}}{\text{alignof}(T_i)} & \quad \textrm{otherwise} \end{array} \right. \)
\( \text{offsetof}(T_i) = \left\{ \begin{array}{ c l } 0 & \quad \textrm{if } i = 0 \\ \lceil \frac{\text{offsetof}(T_{i-1})+\text{sizeof}(T_{i-1})}{\text{alignof}(T_i)} \rceil \text{alignof}(T_i) & \quad \textrm{otherwise} \end{array} \right. \)
\( \text{sizeof}(\mathbb{T}) = \left\{ \begin{array}{ c l } 0 & \quad \textrm{if } \mathbb{T} = \emptyset \\ \lceil \frac{\text{offsetof}(T_n)+\text{sizeof}(T_n)}{\text{alignof}(\mathbb{T})} \rceil \text{alignof}(\mathbb{T}) & \quad \textrm{otherwise} \end{array} \right. \)
Size and alignment for primitives are hardcoded.
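As a worked example of the layout rules, the following Python sketch (an illustration, not the compiler's code) computes field offsets, struct size and struct alignment. It assumes that each field is placed at the next multiple of its own alignment, as in C, and that primitives have their typical sizes and alignments (u8: 1, u32 and f32: 4, u64: 8). The names align_up and layout are hypothetical.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (ceiling division)."""
    if alignment == 0:  # zero-sized fields such as an empty struct need no padding
        return offset
    return -(-offset // alignment) * alignment


def layout(fields):
    """fields: list of (size, alignment) pairs in declaration order.

    Returns (offsets, sizeof, alignof) of the struct.
    """
    if not fields:
        return [], 0, 0
    offsets = []
    end = 0  # first byte after the previously placed field
    for size, alignment in fields:
        offset = align_up(end, alignment)
        offsets.append(offset)
        end = offset + size
    struct_align = max(alignment for _, alignment in fields)
    # Trailing padding keeps every element of an array of structs aligned.
    return offsets, align_up(end, struct_align), struct_align


# Assumed primitive (size, alignment) pairs for this example
U8, U32, F32, U64 = (1, 1), (4, 4), (4, 4), (8, 8)

print(layout([U32, F32]))     # ([0, 4], 8, 4)   like Car { wheels, max_speed }
print(layout([U8, U32, U8]))  # ([0, 4, 8], 12, 4)
print(layout([U8, U64]))      # ([0, 8], 16, 8)
```

The second case shows both kinds of padding: three bytes before the u32 field and three bytes at the end of the struct.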
Note: A struct is a product type in type theory: \( |\mathbb{T}| = \prod_{T_i \in \mathbb{T}}{ |T_i| } \)
where \( |\mathbb{T}| \) is the number of possible states of \(\mathbb{T}\).
This must not be confused with physical size, as for example \( |\text{bool}| = 2 \)
but \(\text{sizeof}(\text{bool}) = 1 \text{ byte} = 8 \text{ bit} \). As such, \( 2^{8 \cdot \text{sizeof}(\mathbb{T})} \ge |\mathbb{T}| \) with \( \text{sizeof} \) measured in bytes.
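The difference between the number of states and the physical size can be worked through with a few lines of Python (an illustration only). The struct used here, with two bool fields and one u8 field, is a made-up example, and state_count is a hypothetical helper.

```python
from math import prod


def state_count(field_states) -> int:
    """Number of states of a struct: the product of the states of its fields."""
    return prod(field_states)


BOOL_STATES = 2
U8_STATES = 2 ** 8

# Hypothetical struct { a: bool, b: bool, c: u8 }, three bytes without padding
states = state_count([BOOL_STATES, BOOL_STATES, U8_STATES])
size_bytes = 3

print(states)                 # 1024
print(2 ** (8 * size_bytes))  # 16777216 bit patterns fit into three bytes
print(state_count([]))        # 1 (a struct without fields has exactly one state)
```

The last line matches the unit type: it has a single state and therefore needs no storage at all.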
Functions
Functions have zero or more arguments and an optional return type. If the return type is omitted, it is inferred to be unit. Any other type is required to be explicit.
fn average(a: f32, b: f32) -> f32 {
return (a + b)/ 2.0;
}
A block yields its last value if the semicolon is omitted. As such this function achieves the same thing:
fn average(a: f32, b: f32) -> f32 {
(a + b) / 2.0
}
This is the preferred syntax. return is only used to return conditionally:
fn choose(condition: bool, a: i32, b: i32) -> i32 {
if condition {
return a;
};
b
}
However, this can be written more nicely:
fn choose(condition: bool, a: i32, b: i32) -> i32 {
if condition {
a
} else {
b
}
}
RAII
Move
Implementation of analysis
NOTE: This section is mostly compiler documentation and not that useful for language use. Better explanation will follow
Values not implementing core::copy::Copy are dropped at the end of the declaring scope, unless they are moved.
There are rules dictating when a value is valid to move and when it is considered moved. These rules primarily apply to variables since immediate values and function return values can only be accessed directly once.
- By default a variable is free to be moved
- A variable declared inside a block does not get affected by application of rules on outer blocks
- After an if condition, the variable is considered moved if it is moved in at least one of the branches
- A variable defined outside a loop may not be moved in the loop as it could be moved during the next iteration
- A variable may be moved if the current block breaks or returns unconditionally
  - a return inside a block that breaks or returns still allows movement as it does not weaken exit behavior
  - a break inside a block that breaks still allows movement as it does not weaken exit behavior
  - a break inside a block that returns prohibits movement as the return might be circumvented
  - a continue inside a block that breaks or returns prohibits movement as the exit might be circumvented
Variables are stored in a stack and each knows its position in the stack.
Stack flags and counters
- loop index/threshold
  - signals which variables were created inside the loop vs outside
- returning flag
  - signals that we are returning in this block or this block is returned
- breaking flag
  - signals that we are breaking in this block or this block is returned
- returning move
  - move done due to the returning flag; ideally a list of all such moves, but a single one is fine since we only report one error at a time anyway
- breaking move
  - move done due to the breaking flag; ideally a list of all such moves, but a single one is fine since we only report one error at a time anyway
Loops
- store the current size of the stack as a threshold
- unset the unconditional break flag
- clear the old breaking move
- a variable with an index smaller than the threshold was created before the loop and may not be moved if
  - the unconditional return flag is not set
  - the unconditional break flag is not set
- after resolving the loop
  - restore the old threshold
  - restore the old break flag
  - restore the old breaking move
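The loop rules can be sketched roughly as follows. This is a simplified Python model for illustration, not the compiler's actual implementation: the class MoveStack and its methods are hypothetical, and the bookkeeping of the breaking move is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MoveStack:
    variables: list = field(default_factory=list)  # names in declaration order
    loop_threshold: int = 0   # variables below this index predate the current loop
    returning: bool = False   # unconditional return flag
    breaking: bool = False    # unconditional break flag

    def declare(self, name: str) -> None:
        self.variables.append(name)

    def enter_loop(self) -> tuple:
        """Remember the old state, then set the threshold and unset the break flag."""
        saved = (self.loop_threshold, self.breaking)
        self.loop_threshold = len(self.variables)
        self.breaking = False
        return saved

    def exit_loop(self, saved: tuple) -> None:
        """Restore the threshold and break flag from before the loop."""
        self.loop_threshold, self.breaking = saved

    def may_move(self, name: str) -> bool:
        declared_before_loop = self.variables.index(name) < self.loop_threshold
        # Outer variables may only be moved if the loop is certain to exit here.
        return not declared_before_loop or self.returning or self.breaking


stack = MoveStack()
stack.declare("outer")
saved = stack.enter_loop()
stack.declare("inner")
print(stack.may_move("outer"))  # False: could be moved again in the next iteration
print(stack.may_move("inner"))  # True: created fresh in every iteration
stack.breaking = True           # the current block breaks unconditionally
print(stack.may_move("outer"))  # True
stack.exit_loop(saved)
```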
Conditionals
- create a copy of the stack for each branch
- after resolving each branch, join the stacks back together
  - a variable moved in at least one stack is also moved in the main stack
  - a variable nowhere moved remains valid and moveable
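The join step amounts to a set union over the branches. A minimal Python sketch, with the hypothetical function resolve_if standing in for the real analysis and plain sets of variable names standing in for the stack copies:

```python
def resolve_if(moved_before: set, branch_moves: list) -> set:
    """Return the variables considered moved after an if/else.

    moved_before: names already moved before the conditional
    branch_moves: one set per branch with the names moved inside that branch
    """
    # Each branch works on its own copy of the state.
    branch_states = [set(moved_before) | moves for moves in branch_moves]
    # Join: moved in at least one branch means moved afterwards.
    joined = set(moved_before)
    for state in branch_states:
        joined |= state
    return joined


# Variables a, b, c: the `if` branch moves a, the `else` branch moves b
moved = resolve_if(set(), [{"a"}, {"b"}])
print(sorted(moved))  # ['a', 'b']
print("c" in moved)   # False: c remains valid and moveable
```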
Return
Returned blocks return { ... };
- the unconditional return flag is set
- the return move is reset
- the stack is copied
- we resolve the expression/block
- at the end
  - if the unconditional return flag is unset
    - we replace the stack variables with the copied stack variables
  - the unconditional return flag is restored to its previous state
  - the old return move is restored
Blocks containing unconditional (direct) return
- we need to look ahead to see whether we find a return or not; the next steps are analogous to returned blocks
- any variables after the return follow the same rules as if the block didn't contain return
- we continue evaluation as if we exited a returned block
Break
analogous to return
- the returning invalid flag is set since a return might have been circumvented by this
- any previous move possible due to returning flag is invalid and creates an error
Continue (blocks containing continue)
- the returning flag is unset since a return might have been circumvented by this
- the breaking flag is unset since a break might have been circumvented by this
- any previous move possible due to returning flag or breaking flag is invalid and creates an error
Language concepts
The following section lists various std api features.
std::vec::Vec<T>
Example examples/api/vectors.kdo
TODO: These docs need to be expanded
std::slice::Slice<T>
TODO: These docs need to be expanded
std::hashmap::HashMap<T>
Example examples/api/map.kdo
TODO: These docs need to be expanded
std::option::Option<T>
Example examples/api/optional.kdo
TODO: These docs need to be expanded
std::result::Result<T, E>
TODO: These docs need to be expanded