Zig: The Good Parts
Table of contents
- What I wrote
- Variable-width integers
- Overflow
- Comptime
- Packed structs
- Strict pointer types
- Modules
- Switch statement
- Error handling
- Testing
- C compatibility
- Assembly support
- Build system
- Builtin docs
- Multiline strings
- Only-public struct declarations
- No multiline comments
- Unicode identifiers
- Non-profit Zig Software Foundation (ZSF)
- Summary
What I wrote §
- An Apache Arrow library
- Flatbuffers library and code generator
- LZ4
- A TLS1.3 client and server
- An ASN.1 parser
- A generic DateTime library
Now, onto what I liked about Zig while writing them!
Variable-width integers §
Variable-width integers help the programmer and the compiler. Suppose you need
to represent a value between 0 and 128 and choose Zig’s u7
:
- The programmer enjoys overflow checking to verify assertions
- The compiler enjoys the tightest constraint possible to generate fast code
I never really thought of math in terms of bounded ranges until using Zig. Now it seems natural to me because arbitrary precision doesn’t physically exist.
Overflow §
Zig gets arithmetic operators right by exposing the ugly truth: your math can overflow. It assumes that integer types won’t overflow and will panic unless you specify saturating or overflowing arithmetic:
@as(u8, 0xff) + 1; // panic: overflow@as(u8, 0xff) +% 1; // overflow allowed: 0@as(u8, 0xff) +| 1; // saturating allowed: 0xff
It even works with the variable-width types (albeit with ugly generated code):
@as(u7, 0xef) + 1; // panic: overflow
Comptime §
I have found myself in so many situations in C++ or Rust where the templating
language doesn’t allow me to dynamically generate the code I want.
Well, Zig does so via
comptime
and
@Type
.
It’s simple and beautiful:
fn List(comptime T: type) type { // You can do (almost) anything with T. return struct { items: []T, len: usize, };}
var buffer: [10]i32 = undefined;var list = List(i32){ .items = &buffer, .len = 0,};
I really enjoyed using it and its cousins inline for
and inline else
. Also,
arbitrary precision does exist with the comptime_int
type which prevents
compiler bugs and permits the user to do arbitrary integer path.
Sometimes to write a program you really do need to write another program. Might as well do it in the same language!
Packed structs §
In the current memory-constrained world packing structs can boost performance. Thanks to variable-width integers, packed structs have a really nice syntax:
const Divided = packed struct(u16) { half1: u8, quarter3: u4, // Accessed like any other member. quarter4: u4, // Generates code that uses bit masks.};
Strict pointer types §
Arrays are slices by default. As they should be. Slices allow bounds checking, which prevent a whole host of errors.
I like the extra safety of sentinel pointer types. Putting alignment in pointer types is also good for safety. Opaque pointer types are a good compromise for compatibility with C.
These pointer types came in handy when implementing Apache Arrow since its in-memory layout requires aligned pointers pointers. They were also useful for C compatibility.
The official docs have good examples of the syntax and usage.
Modules §
Module systems are the best way for programmers to share code.
They can make or break language ecosystems.
Zig declarations are file-scoped unless you add a pub
prefix.
const foo = 3;pub const bar = 4;
Zig chooses the file system as the module tree and requires explicit imports:
const root = @import("./root.zig");const myBar = root.bar;
You can import other modules by specifying them in build.zig
. The special
std
module is always available:
const std = @import("std");
Dependencies are decentralized tarballs stored with the hashes of their
file contents in zig.zon
.
In a world where tarball hosting is mostly solved, this is a great system.
Switch statement §
Switch statements are exhaustive to catch logic errors. I enjoy the syntax:
switch (foo) { 2...4 => |v| { // love the range and capture }, else => unreachable, // nice!}
…and the special syntax for enums:
switch (my_enum) { .one => true, .two, .three => false, _ => false, // if we add a `.four` this will be a compile error};
Error handling §
Error return traces are the best crash experience you can have.
As for error handling, the ErrorSet!Rettype
and try
syntaxes are just
annoying enough to remind you that they’re there:
const std = @import("std");
pub fn main() !void { // == anyerror!void var file = try std.fs.cwd().openFile("foo.txt", .{}); defer file.close();}
Testing §
Inline unit tests are good for any complex code you want to verify:
const std = @import("std");const expect = std.testing.expect;
test "coercion to error unions" { const x: anyerror!i32 = 1234; const y: anyerror!i32 = error.Failure;
try expect((try x) == 1234); try std.testing.expectError(error.Failure, y);}
You can run them with zig test
, and they end up in the docs!
You also get Valgrind-like memory checking for free at runtime:
const std = @import("std");
fn leaky(allocator: std.mem.Allocator) !void { _ = try allocator.alloc(u8, 100); // defer allocator.free();}
pub fn main() !void { var gpa = std.heap.GeneralPurposeAllocator(.{}){}; defer _ = gpa.deinit(); try leaky(gpa.allocator());}
error(gpa): memory address 0x7a68cc9c0000 leaked:/home/thesm/src/test.zig:4:28: 0x10e08f2 in leaky (test) _ = try allocator.alloc(u8, 100); ^/home/thesm/src/test.zig:11:14: 0x10e0973 in main (test) try leaky(gpa.allocator()); ^/home/thesm/.local/share/zvm/0.14.0/lib/std/start.zig:656:37: 0x10e07fa in posixCallMainAndExit (test) const result = root.main() catch |err| { ^/home/thesm/.local/share/zvm/0.14.0/lib/std/start.zig:271:5: 0x10e03ad in _start (test) asm volatile (switch (native_arch) {
C compatibility §
@cImport
and friends work surprisingly well, even on hacky macros.
Exporting C libraries naturally works well.
This is a win because of the large existing C ABI ecosystem.
Assembly support §
Often required for systems programmers:
pub fn syscall1(number: usize, arg1: usize) usize { return asm volatile ("syscall" : [ret] "={rax}" (-> usize), : [number] "{rax}" (number), [arg1] "{rdi}" (arg1), : "rcx", "r11" );}
Build system §
Zig is the best build system for binaries that I’ve ever used. You write Zig to build Zig — no need for an extra declarative language. You can hook into whatever you want and have an entire systems language at your disposal.
Zig supports tons of target triplets, including GPUs. You can build any target from any host. The task runner builds a dependency graph and parallelizes tasks across your cores.
It’s a breath of fresh air. Why don’t all languages have first-class build tools?
Builtin docs §
Zig has different syntax for documenting files and declarations. It generates reasonable HTML documentation.
//! Date types for my cool app. This will be a module comment.
/// Seconds and nanoseconds.const Timestamp = struct { /// The number of seconds since the epoch seconds: i64, /// The number of nanoseconds past the second nanos: u32, /// Returns a `Timestamp` struct representing the Unix epoch; that is, the /// moment of 1970 Jan 1 00:00:00 UTC pub fn unixEpoch() Timestamp { return Timestamp{ .seconds = 0, .nanos = 0, }; }};
You can’t learn the language or use other people’s code without good docs!
Multiline strings §
No other language I know lets me keep my tabulation with multiline strings.
function foo() { const string = `thisis NOT tabbed over and isdifficult to read and write`;}
void foo() { char* string = "this\n" + "is tabbed over, BUT\n" + "requires trailing newlines which are readable,\n" + "but annoying to write";}
fn foo() { const string = \\this \\is tabbed over and is \\easy to read and write \\...except for that pesky trailing semicolon ;}
Only-public struct declarations §
You cannot hide struct fields. You must trust your users. I LOVE this because your users can always workaround you by forking your code.
const Foo = struct { bar: i32, // There is no way to hide this field. fn baz(self: @This()): void {} // However, you can hide functions. pub fn bing(self: @This()): void {} // ...or expose functions.};
No multiline comments §
This makes tokenization simpler for both humans and machines.
/** * This multiline comment is valid in most C-style languages, but invalid in * Zig. I personally dislike these comments because while they're perfectly * readable, they're not perfectly writeable without a smart editor inserting * extra text for you. Yuck. */
Unicode identifiers §
These can come in handy with codegen:
const @"~@#$%^😊" = 32;
Or with math:
const @"π" = 3.14159265;
…although the @"
and "
syntax can get in your way.
Non-profit Zig Software Foundation (ZSF) §
This is a boon once your project enters maintenance mode. You don’t want Zig disappearing or only a single person maintaining it with sketchy funding.
A US non-profit is a great way to transparently handle funds for a mission statement that aligns with Zig users.
Summary §
Zig is currently the best systems programming language to write. I’m not sure it will overtake C, but it sure will try! Keep your eye on it as it works towards its 1.0.0 release.