pancake

I did (half of) Advent of Code 2023 in Hare

My semi-informed review of the Hare programming language.


Hare is a new programming language with manual memory management and C-like syntax.

Here is an example of Hare source code, pretty much what I've been using for Advent of Code:

use bufio;
use fmt;
use fs;
use io;
use os;
use types;


export fn main() void = {
    if (len(os::args) != 2) {
        fmt::fatalf("usage: {} [FILE]", os::args[0]);
    };

    const input = match (os::open(os::args[1])) {
    case let file: io::file  => yield file;
    case let err:  fs::error =>
        fmt::fatalf(
            "Error opening {}: {}",
            os::args[1], fs::strerror(err),
        );
    };
    defer io::close(input)!;

    const scan = bufio::newscanner(input, types::SIZE_MAX);
    for (let y: i64 = 0; true; y += 1) {
        const line = match (bufio::scan_line(&scan)!) {
        case           io::EOF   => break;
        case let line: const str => yield line;
        };

        fmt::printfln("#{}:\t{}", y, line)!;
    };
};

I started AoC with basically nothing; the only thing I had prepared before starting day one was the language itself. As of December 2023, you cannot install it through Fedora repositories; you have to clone several repositories and install it locally.

To their credit, it is really straightforward.

sudo dnf install glibc-static
D="/tmp/hare-installation"
mkdir -p $D
cd $D && git clone git://c9x.me/qbe.git && \
  cd qbe && \
  make && make check && sudo make install
cd $D && git clone https://git.sr.ht/~sircmpwn/scdoc && \
  cd scdoc && \
  make && make check && sudo make install
cd $D && git clone https://git.sr.ht/~sircmpwn/harec && \
  cd harec && \
  ./configure && make && make check && sudo make install
cd $D && git clone https://git.sr.ht/~sircmpwn/hare && \
  cd hare && \
  cp config.example.mk config.mk && \
  make && make check && sudo make install
rm -rf $D

Having nearly zero experience with non-garbage-collected languages, I got fluent-ish quite fast, on day six I think. That translates to something around 30-40 hours of actively writing in that language. That's less than I expected it to be.

Don't get me wrong, I am still not able to write more complicated code (what's defer free()?), but I can read and understand most of what is happening in someone else's code. I am not sure that would be a thing if I started using Rust this December.

Disclaimer

At the time of writing, the language does not have any versioning scheme in place. So when I refer to Hare, I mean the HEAD (c067d698) commit from Dec 9th 2023. (I know something is cooking up, but that RFC has not been accepted yet.)

The following chapters contain sections, but they are not in order of significance.

What I did like about Hare

Readability

I like that the code is expressive. Even if you did not read the language tutorial, you understand what each line does. Sure, you don't know what happens in the background, what exactly is returned and whether you need to copy the data or you can keep and free it yourself. But you know what the function does in general and you can follow the flow of the code, knowing nearly nothing.

Connected to that is Python-like syntax for types. There are multiple ways programming languages declare the type of the variable or function input. To me, the colon separator makes perfect sense.

let result: []u32 = [];
let iter: strings::iterator = strings::iter(line);
type Point = struct {coordinates: (i64, i64)};
for (let i = 0z; i < len(numbers); i += 1) { ... };
for (let i: size = 0; i < len(numbers); i += 1) { ... };

Plus, the annotation is not required when the type can be inferred from the initializer, such as a function call, just like in Python. In that case, you only add it when you feel like it.

let iter = strings::iter(line);
const tokens = parse_line(line);

Tuple access

Maybe a small thing, but I like how tuples are different from "lists" (unlike in Python). You access the values inside of a tuple using an attribute syntax:

type Point = struct {
	coordinates: (i64, i64),
	color: (u8, u8, u8),
};
fn Point_print_coordinates(p: Point) void = {
	fmt::printfln("x={}, y={}",
		p.coordinates.0, p.coordinates.1)!;
};

...and not as when you access an array.

It makes tuples feel more like an anonymous struct than an immutable array.

Tagged unions

let re: (regex::regex|regex::error) = regex::compile(`[0-9]+`);
const line: (const str|io::EOF) = bufio::scan_line(&scan)!;

Hare's match compares variable types, not their values. You can see that in the first snippet at the beginning of the blog post.

Unions are quite frequent in Python, but they are much more explicit in Hare. In Python, you usually return something|None, the non-ok states are usually error states and they are handled through exceptions. Some Python projects return multiple non-error types through unions, but that's usually an antipattern and terrible to work with (unless they are duck typed and essentially transparent to the user).
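To make the "match compares types" point concrete, here is a minimal sketch. It assumes strconv::stou32 returns a tagged union of u32, strconv::invalid and strconv::overflow; describe is a hypothetical helper made up for this illustration, not something from the standard library.

```hare
use fmt;
use strconv;

// Hypothetical helper: returns a short label describing how parsing `s`
// as a u32 went.
fn describe(s: str) str = {
	// match dispatches on which type is stored in the tagged union,
	// not on the value itself.
	return match (strconv::stou32(s)) {
	case u32 =>
		yield "number";
	case strconv::invalid =>
		yield "not a number";
	case strconv::overflow =>
		yield "too large for u32";
	};
};

export fn main() void = {
	fmt::printfln("42: {}", describe("42"))!;
	fmt::printfln("abc: {}", describe("abc"))!;
	fmt::printfln("99999999999: {}", describe("99999999999"))!;
};
```

Every member of the union has to be covered by a case (or by a default case), so a forgotten state is a compile error rather than a surprise at runtime.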

Errors

In Hare, errors are fancy strings.

Most of the standard library packages define errors like so:

type error = !str;

Putting an exclamation mark after an expression that may return an error means that you are aware of that: you assert that the error will not happen, and the program aborts if it does.
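A minimal sketch of that error assertion, assuming strconv::stou32 and fmt::printfln behave as in the rest of this post; must_parse is a hypothetical helper name.

```hare
use fmt;
use strconv;

// Hypothetical helper: parses a decimal number or aborts the program.
fn must_parse(s: str) u32 = {
	// stou32 returns a tagged union of u32 and error types. The trailing
	// `!` asserts that no error came back and unwraps the u32.
	return strconv::stou32(s)!;
};

export fn main() void = {
	const year = must_parse("2023");
	// Printing can fail with an I/O error, so its result cannot be
	// silently dropped either; `!` acknowledges that.
	fmt::printfln("year {}", year)!;
};
```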

I think that is a good compromise.

I am torn on whether I prefer tagged union errors or Go's errors in return tuples, but you use them in a similar way.

The following snippets in Hare and Go do the same thing in a similar number of lines.

const result: (void|fs::error) = os::remove("/tmp/cache.json");
match (result) {
case let err: fs::error =>
    fmt::printf("unable to remove file: {}\n", fs::strerror(err))!;
case void =>
    fmt::print("file was removed\n")!;
};

var err error = os.Remove("/tmp/cache.json")
if err != nil {
    _, _ = fmt.Printf("unable to remove file: %s\n", err)
} else {
    _, _ = fmt.Print("file was removed\n")
}

For sure there could be a better example for this (e.g. where you can get different types of errors and want to handle them differently), but the point should be clear.

Considering all my experience with Python, Go and Java, I think I enjoy explicit error handling more than raising exceptions. You are forced to resolve them at every level, instead of trusting some generic try/except at the top of the program tree.

UTF-8 by default

All Hare strings are UTF-8. When you iterate over them, you get runes, not bytes.

That's something old languages struggle with. Like C, which Hare kind of wants to replace. Having multi-byte characters abstracted away is a really great feature of a language, once you start doing non-English-only human-readable I/O.
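A small sketch of the difference between bytes and runes, assuming strings::iter and strings::next as used elsewhere in this post. count_runes is a name made up for the example. The default case is used for the end of the string because the type of that end marker may differ between Hare versions.

```hare
use fmt;
use strings;

// Hypothetical helper: counts runes (Unicode code points) by walking the
// string with the iterator from the strings module.
fn count_runes(s: str) size = {
	let iter = strings::iter(s);
	let n = 0z;
	for (true) {
		match (strings::next(&iter)) {
		case rune =>
			n += 1;
		case =>
			// Anything that is not a rune means the string ended.
			break;
		};
	};
	return n;
};

export fn main() void = {
	const s = "€uro";
	// len() on a str counts bytes, while the iterator yields runes.
	fmt::printfln("{} bytes, {} runes", len(s), count_runes(s))!;
};
```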

Standard library is actually pretty good

It has quality-of-life string manipulation functions. It has regex matching.

I haven't used the following parts of the standard library (yet), but they are great to have available by default: getopt, mime, uuid.

Use after free protection

Most of the functions in the documentation say:

The caller must free the return value.

Some do not. On day twelve, I was converting a number to its binary representation: 11 -> "1011". And I was getting empty strings back from the function this was happening in. I finally figured out that strconv::u64tosb's description says

The return value is statically allocated and will be overwritten on subsequent calls; see strings::dup to duplicate the result.

I like that I can't read raw memory by accident.
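For illustration, a sketch of the fix that the documentation hints at. It assumes the signatures from the December 2023 standard library: strconv::u64tosb taking a strconv::base and strings::dup returning a plain str. Later versions may have renamed or changed these. to_binary is a hypothetical helper.

```hare
use fmt;
use strconv;
use strings;

// Hypothetical helper: returns the binary representation of n as a string
// owned by the caller.
fn to_binary(n: u64) str = {
	// u64tosb writes into a static buffer that the next call overwrites,
	// so the result is duplicated before it is handed out.
	return strings::dup(strconv::u64tosb(n, strconv::base::BIN));
};

export fn main() void = {
	const a = to_binary(11);
	defer free(a);
	const b = to_binary(5);
	defer free(b);
	// Both strings are still intact because each one is its own copy.
	fmt::printfln("{} {}", a, b)!;
};
```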

Helpful compilation error messages

I am not sure how helpful C, C++ or Rust compilation errors are, but I found Hare's good enough to immediately understand what I am doing wrong.

Missing return type:

/aoc/15/part1.ha:47:26: syntax error: expected type (found '=')
47 |	fn hash_line(input: str) = {
   |	                         ^

A missing semicolon (at the end of the line above the reported one):

/aoc/15/part1.ha:53:5: syntax error: unexpected 'return', expected ';'
53 |	    return result;
   |	    ^

Incorrect return type:

/aoc/15/part1.ha:51:32: error: rvalue type (u64) is not assignable to lvalue (u32)
51 |	        result += hash(raws[r]);
   |	                               ^

Unhandled error:

/aoc/15/part1.ha:44:18: error: Cannot ignore error here
44 |	    fmt::printfln("part 1: {}", sum);
   |	                 ^

Incorrect type hint that does not match the function return type:

/aoc/12/part1.ha:80:10: error: Initializer is not assignable to binding type
80 |	    const re: regex::regex = regex::compile(`[?]`);
   |	         ^

/aoc/12/part1.ha:81:14: error: Expected a tagged union type or a nullable pointer
81 |	    if (re is regex::error) {
   |	             ^

Well-known files for architectures, systems and tests

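File and directory names act as build tags: a file or directory whose name carries a +tag suffix is only compiled when that tag is enabled.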

+test, +linux, +freebsd, +aarch64, +x86_64 and so on.

What I found confusing about Hare

This chapter is rather small, because many things that confused me I ended up not liking, so they were pushed to the chapter below.

The things mentioned here I got used to.

The size type

When doing len() over an array or slice, you get a size type back, not something like u64:

const length: size = len(numbers);

For example, types::SIZE_MAX is an alias to types::U64_MAX. I don't see much value in having a separate size alias, but I'd like to be educated on this; there must be a reason.

Type conversions: as vs :

There is the as keyword that extracts a specific type out of a tagged union:

const index: (size|void) = strings::index(haystack, needle);
const index = index as size;

The second line will pass if you really got a size, and abort if not.

Then there is the : expression that performs a (possibly lossy) conversion:

const size = len(numbers): u64;

In hindsight it makes sense, and both are quite elegant, but they are (as of today) not properly explained in the language tutorial (imo), and they felt confusing before I understood the difference from the compilation output.
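Putting the operators side by side may help. This is a sketch, assuming strings::index returns (size | void) as in the snippet above; must_find is a hypothetical helper.

```hare
use fmt;
use strings;

// Hypothetical helper: returns the position of needle in haystack as a
// u64, and aborts the program if the needle is not there.
fn must_find(haystack: str, needle: str) u64 = {
	const found: (size | void) = strings::index(haystack, needle);
	// `as` picks the size out of the tagged union and aborts on void.
	const at: size = found as size;
	// `:` is a cast between types, here from size to u64.
	return at: u64;
};

export fn main() void = {
	const found = strings::index("advent of code", "elf");
	// `is` only tests which type is stored, without extracting it.
	if (found is void) {
		fmt::println("no elf here")!;
	};
	fmt::printfln("{}", must_find("advent of code", "of"))!;
};
```

In short: is asks which type a tagged union holds, as extracts it (or aborts), and : converts a value of one type into another.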

And I'd love to have the ability to define this lossy conversion for any type I created.

// please?
const here: Point = get_current_location();
fmt::printfln("currently at {}", here: str);

The styleguide

I hadn't read it before doing the Advent of Code, so I have broken many of the rules in there.

Like any developer, I have my subjective preferences about formatting. I will not describe them here; it would inflate the size of this text too much.

I'd like to see some recommendations on how to name functions that return a string representation of a struct, for example. Is it struct_string, struct_str, structtostr, or something else?

What I did not like about Hare

The newness of the language

As far as I know, there is only one resource on the whole internet where you can learn anything about the language: https://harelang.org. Being used to reading blog posts, Stack Overflow, and other codebases, it was quite a big jump to only being able to use the official documentation (which is mostly without examples) and the source of the language itself.

There are some libraries and projects written in Hare already, but when you need to understand that one specific function, you are kind of screwed.

That happened to me when I wanted to learn how to use sort::sort. The documentation says:

type cmpfunc = fn(a: const *opaque, b: const *opaque) int;

...

fn sort

fn sort(items: []opaque, itemsz: size, cmp: *cmpfunc) void;

Sorts a slice of items in place. This function provides a stable sort - relative order of equal elements is preserved.

Note that this function may (temporarily) allocate, and will abort on allocation failure.

I do not know how to use *opaque to compare my two structs. I would really like to see an example on something that's not numbers or strings.

(I ended up implementing bubblesort for this AoC solution.)
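For what it is worth, here is a sketch of how the quoted signature could be used with structs. It is pieced together from the signature alone, so treat it as an assumption rather than the blessed way. point and cmp_point are made-up names. The idea is that the callback receives pointers to two elements and casts them back from *opaque to the real element type.

```hare
use fmt;
use sort;

type point = struct {
	x: i64,
	y: i64,
};

// Comparison callback matching sort::cmpfunc. Orders points by x, then by y.
// Returns a negative number, zero or a positive number, like C's qsort.
fn cmp_point(a: const *opaque, b: const *opaque) int = {
	// The sort hands over untyped pointers to two elements of the slice;
	// cast them back to the element type before comparing.
	const a = a: const *point;
	const b = b: const *point;
	if (a.x != b.x) {
		return if (a.x < b.x) -1 else 1;
	};
	if (a.y != b.y) {
		return if (a.y < b.y) -1 else 1;
	};
	return 0;
};

export fn main() void = {
	let points: []point = [
		point { x = 3, y = 1 },
		point { x = 1, y = 2 },
		point { x = 1, y = 0 },
	];
	// The signature quoted above returns void. If your version of the
	// standard library returns an error here, the call needs handling.
	sort::sort(points, size(point), &cmp_point);
	for (let i = 0z; i < len(points); i += 1) {
		fmt::printfln("({}, {})", points[i].x, points[i].y)!;
	};
};
```

The second argument is the size of a single element, which is how the function can walk a slice whose element type it does not know.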

Undocumented known issues

The regex::compile does not cover 100 % of the regex grammar.

(I ended up splitting it into smaller capture groups and using manual string manipulation a bit more.)

Allocating big arrays can cause the compiler to get stuck for minutes, which is currently something that you just have to know. I asked on the mailing list and got told it is a known issue.

(I ended up creating the 140*140-items-long list dynamically.)

And this is all fine, the authors are busy people with priorities both elsewhere in the language and elsewhere in life, but I'd like to see these issues documented somewhere.

Standard library naming inconsistencies

This is a follow-up to the styleguide subsection above.

There is bufio::newscanner but bufio::scan_line.

There is io::EOF but strings::end.

And to break the promise of not commenting on the style guide: I dislike that structs are lowercase.

Standard library objects not being printable

I am a debug-printing type of person. And I would like to be able to just fmt::printfln() my way through.

As I mentioned above, why not create something that would convert the object to string via : str? It is possible for numbers...

No sets and no maps

In the Hare blog, Drew says:

Hare does not support generics, and our approach to data structures is much like C: DIY.

...

If the absence of a particular data structure truly is your application’s bottleneck, then writing it yourself may ultimately be the better approach. You’ll have to familiarize yourself with the data structures and algorithms that manipulate them, so you can have an intimate understanding of the processes most important to your application. You can also tune and tweak them to keep it lean and mean within your use-case, only making them as complex as your application calls for.

I get it.

But come on, providing a generic map wouldn't hurt you, and it would prevent (potentially) hundreds of home-grown map implementations from existing. Doing something from scratch is an option even when there is generic code in the standard library.

For AoC context, writing the map implementation does not make much sense, I'd rather pay by using O(n) or even O(n^2) lookups, or by hardcoding the input size directly into the program, making it more tied to my specific puzzle.
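The O(n) trade-off can be sketched as a linear scan over a slice of key/value pairs. This is an illustration, not code from my solutions; entry and lookup are hypothetical names.

```hare
use fmt;

type entry = struct {
	key: str,
	value: int,
};

// Hypothetical poor man's map lookup: walks the whole slice and returns the
// value stored under key, or void if the key is missing. O(n) per lookup.
fn lookup(entries: []entry, key: str) (int | void) = {
	for (let i = 0z; i < len(entries); i += 1) {
		if (entries[i].key == key) {
			return entries[i].value;
		};
	};
};

export fn main() void = {
	const entries: []entry = [
		entry { key = "red", value = 12 },
		entry { key = "green", value = 13 },
		entry { key = "blue", value = 14 },
	];
	match (lookup(entries, "green")) {
	case let v: int =>
		fmt::printfln("green = {}", v)!;
	case void =>
		fmt::println("green is missing")!;
	};
};
```

For puzzle inputs with a few hundred keys this is usually fast enough, which is why I did not bother with hashing.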

Only half of Advent of Code?

I didn't have time to do the daily challenge on 13th and 14th, and I didn't feel like going back. It frequently took me two to six hours to solve both parts, and I figured there are better ways to spend my limited time.

My goal for this December was to learn the basics of Hare, which I successfully managed. I know the principles, the goals and the ways of navigating through existing code.

And then there is the AoC itself. I know what I am good at as a software engineer, and coming up with algorithms is not really my leading ability (or a priority). Personally I find long-term maintenance of a larger codebase and managing its architecture much more interesting than writing parsers and maze navigators.

And one more thing; this year's AoC specialized in giving underspecified documentation and examples. I just hate that.