# RFC: Overrideable allocation functions for H3

* **Authors**: Isaac Brodsky (@isaacbrodsky)
* **Date**: February 7, 2020
* **Status**: Draft

## Overview

This is a proposal to add a mechanism that lets users of the H3 library provide their own heap allocator in place of
the default `malloc` implementation.

## Motivation

This will address the following use cases:

* H3 is used inside of another application which has its own heap management scheme. For example, the application may
  need H3 to use the allocation functions provided by Postgres or the Java Virtual Machine.
* Testing of failure cases of H3, by simulating allocation failures.

Most H3 functions accept memory from the caller, which avoids heap allocation inside the library. This will remain the
preferred way to handle memory management in H3. Stack allocation is avoided because H3 cannot know whether there is
sufficient stack memory available. (Note that `_kRingInternal`/`kRingDistances` implicitly use stack memory because
they implement DFS recursively.)

A few functions in H3 do heap allocate memory because it is not feasible to do otherwise, or as a convenience. The
functions that heap allocate are:

| Function | Reason
| --- | ---
| `kRing`| Convenience wrapper around `kRingDistances`
| `polyfill` | Convenience (could be passed in, requires internal knowledge)
| `compact` | Convenience (could be passed in, requires internal knowledge)
| `h3SetToLinkedGeo` | Requires knowledge of how to initialize the internal struct
| `destroyLinkedPolygon` | Required for `h3SetToLinkedGeo`

## Prior Art

Reading materials to reference:

* [C++ `vector`](http://www.cplusplus.com/reference/vector/vector/) (via templates)
* [SDL](https://discourse.libsdl.org/t/sdl-2-0-7-prerelease/23232) (via `SDL_SetMemoryFunctions`)
* [PostgreSQL](https://www.postgresql.org/docs/10/xfunc-c.html) (via `palloc`)
* [SQLite](https://sqlite.org/malloc.html)

## Approaches

All approaches assume the user has defined the following functions:

```
void* my_malloc(size_t size);
void* my_calloc(size_t count, size_t size);
void my_free(void* pointer);

// TODO: Do we want my_realloc?
```
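
As a minimal sketch of what such user-defined functions could look like for the failure-testing use case, the following wraps the C standard allocator and fails on demand. The `allocationsUntilFailure` counter and `shouldFail` helper are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical test knob: number of allocations allowed to succeed before
 * every later one fails. A negative value disables failure injection. */
static int allocationsUntilFailure = -1;

/* Returns 1 if the current allocation should be made to fail. */
static int shouldFail(void) {
    if (allocationsUntilFailure < 0) return 0;
    if (allocationsUntilFailure == 0) return 1;
    allocationsUntilFailure--;
    return 0;
}

void* my_malloc(size_t size) {
    return shouldFail() ? NULL : malloc(size);
}

void* my_calloc(size_t count, size_t size) {
    return shouldFail() ? NULL : calloc(count, size);
}

/* Freeing never fails, so it always delegates to the standard library. */
void my_free(void* pointer) {
    free(pointer);
}
```

A test would set `allocationsUntilFailure` before calling into H3 and then check that the H3 function reports the failure instead of crashing.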

### Global statics

In this approach, H3 stores the allocation functions in a set of static variables.

```
h3SetAllocator(&my_malloc, &my_calloc, &my_free);

// call into H3 as before
polyfill(geoPolygon, res, out);
```

Pro:
* Allows the user to replace allocators at run time.

Con:
* Not thread safe, unless an additional, complicated dependency is added to ensure thread safety.
* Global state.
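
One way the library side of this approach might look is sketched below. This is an assumption about a possible implementation, not existing H3 code: the function pointer typedefs, the `h3Malloc`/`h3Calloc`/`h3Free` statics, `exampleInternalWork`, and `countingCalloc` are all hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void* (*H3MallocFn)(size_t size);
typedef void* (*H3CallocFn)(size_t count, size_t size);
typedef void (*H3FreeFn)(void* pointer);

/* Library-wide state, defaulting to the C standard library allocator. */
static H3MallocFn h3Malloc = malloc;
static H3CallocFn h3Calloc = calloc;
static H3FreeFn h3Free = free;

/* Hypothetical setter. Nothing here synchronizes with other threads, so a
 * concurrent call could see a mix of old and new function pointers. */
void h3SetAllocator(H3MallocFn mallocFn, H3CallocFn callocFn, H3FreeFn freeFn) {
    h3Malloc = mallocFn;
    h3Calloc = callocFn;
    h3Free = freeFn;
}

/* Stand-in for a library function that needs scratch memory. */
int exampleInternalWork(size_t count) {
    int* scratch = h3Calloc(count, sizeof(int));
    if (scratch == NULL) return -1;
    h3Free(scratch);
    return 0;
}

/* Example replacement that counts calls before delegating to calloc. */
size_t callocCalls = 0;
void* countingCalloc(size_t count, size_t size) {
    callocCalls++;
    return calloc(count, size);
}
```

The unsynchronized statics are where the thread-safety concern comes from: memory obtained through one allocator could be released through another if the setter runs between the two calls.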

### Templates

This approach is similar to how C++ handles allocator replacement in its standard library, by accepting the allocator
as a template argument. However, H3 is written in C and must implement templates using macros.

```
POLYFILL_WITH_ALLOCATORS(my_polyfill, my_malloc, my_calloc, my_free);

// Call the function created by the template
my_polyfill(geoPolygon, res, out);
```

Pro:
* Allows the user to have multiple allocator replacements in use at once.

Con:
* Exposes a complicated build process to the user in the form of macros.
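
To illustrate the macro technique on something small, the sketch below defines a toy "template" rather than `polyfill` itself. The macro `DEFINE_SUM_OF_SQUARES`, the generated functions, and `failingCalloc` are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical template: defines a function `name` that sums the squares of
 * 0..n-1 using a scratch buffer obtained only through the given allocators. */
#define DEFINE_SUM_OF_SQUARES(name, callocFn, freeFn)              \
    int name(int n, long* out) {                                   \
        if (n <= 0) {                                              \
            *out = 0;                                              \
            return 0;                                              \
        }                                                          \
        long* scratch = callocFn((size_t)n, sizeof(long));         \
        if (scratch == NULL) return -1;                            \
        long total = 0;                                            \
        for (int i = 0; i < n; i++) {                              \
            scratch[i] = (long)i * i;                              \
            total += scratch[i];                                   \
        }                                                          \
        freeFn(scratch);                                           \
        *out = total;                                              \
        return 0;                                                  \
    }

/* An allocator that always fails, to show a second instantiation. */
void* failingCalloc(size_t count, size_t size) {
    (void)count;
    (void)size;
    return NULL;
}

/* Two independent instantiations can coexist in one program. */
DEFINE_SUM_OF_SQUARES(sumSquaresDefault, calloc, free)
DEFINE_SUM_OF_SQUARES(sumSquaresFailing, failingCalloc, free)
```

Each instantiation is a separate function, which is what allows multiple allocators at once, but the whole function body has to live inside a macro that the user expands in their own code.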

### Allocator argument approach

In this approach, every function call includes allocators.

```
H3MemoryManager allocFunctions = {
  .malloc = &my_malloc,
  .calloc = &my_calloc,
  .free = &my_free
};

polyfill(geoPolygon, res, out, &allocFunctions);
```

Pro:
* Replacement on a per-call basis gives the user maximum control.

Con:
* The user must always specify allocators, even though most users are unlikely to need custom ones.
* Alternatively, H3 would need additional overloads of every function that heap-allocates.
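
A rough sketch of this approach, including the extra convenience variant mentioned in the last point, might look like the following. The layout of `H3MemoryManager` is assumed from the field names in the snippet above; `exampleWork`, `exampleWorkWithAllocator`, and `failingCalloc` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout, based on the field names used in the RFC snippet. */
typedef struct {
    void* (*malloc)(size_t size);
    void* (*calloc)(size_t count, size_t size);
    void (*free)(void* pointer);
} H3MemoryManager;

/* Defaults used when the caller passes NULL. */
static const H3MemoryManager defaultMemoryManager = {
    .malloc = malloc, .calloc = calloc, .free = free};

/* Stand-in for an H3 function that needs scratch memory. */
int exampleWorkWithAllocator(size_t count, const H3MemoryManager* mm) {
    if (mm == NULL) mm = &defaultMemoryManager;
    /* Parenthesized so the calls still work if the C library happens to
     * define calloc or free as function-like macros. */
    int* scratch = (mm->calloc)(count, sizeof(int));
    if (scratch == NULL) return -1;
    (mm->free)(scratch);
    return 0;
}

/* The extra variant for users who do not care about allocators. */
int exampleWork(size_t count) {
    return exampleWorkWithAllocator(count, NULL);
}

/* An allocator that always fails, useful for exercising the error path. */
void* failingCalloc(size_t count, size_t size) {
    (void)count;
    (void)size;
    return NULL;
}
```

Every heap-allocating function would need both forms, which is the API growth the second point describes.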

### `#define` approach

In this approach, the allocators are specified at build time.

```
# In build process:
cmake -DH3_ALLOC_PREFIX=my_ ...

// in source file, functions are used as before.
```

Alternatively, instead of setting a prefix, the build could accept individual options
for functions, such as `-DH3_MALLOC=my_malloc -DH3_CALLOC=my_calloc`. (This could,
however, allow a user to accidentally override `malloc` but not `free`, so that memory
from one allocator is released by another.)

Pro:
* Minimal overhead for users and developers when allocator replacement is not needed.

Con:
* Complex allocator replacement (e.g. different allocators for different calls) is possible but must be implemented
  by the user.
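
One possible way to wire the prefix into the library source is token pasting, sketched below. The `H3_MEMORY` and `H3_JOIN` macros and `exampleInternalWork` are hypothetical names for this illustration; only `H3_ALLOC_PREFIX` comes from the build option above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Normally supplied by the build (-DH3_ALLOC_PREFIX=my_); defined here only
 * so that this sketch is self-contained. */
#ifndef H3_ALLOC_PREFIX
#define H3_ALLOC_PREFIX my_
#endif

#ifdef H3_ALLOC_PREFIX
/* Two levels of macros so that H3_ALLOC_PREFIX is expanded before pasting. */
#define H3_JOIN_(a, b) a##b
#define H3_JOIN(a, b) H3_JOIN_(a, b)
#define H3_MEMORY(name) H3_JOIN(H3_ALLOC_PREFIX, name)
/* The user must provide these at link time. */
void* H3_MEMORY(malloc)(size_t size);
void* H3_MEMORY(calloc)(size_t count, size_t size);
void H3_MEMORY(free)(void* pointer);
#else
/* Without a prefix, the standard library functions are used directly. */
#define H3_MEMORY(name) name
#endif

/* Stand-in for a library function that needs scratch memory. */
int exampleInternalWork(size_t count) {
    int* scratch = H3_MEMORY(calloc)(count, sizeof(int));
    if (scratch == NULL) return -1;
    H3_MEMORY(free)(scratch);
    return 0;
}

/* User-side definitions; my_calloc counts calls to show it is being used. */
size_t myCallocCalls = 0;
void* my_malloc(size_t size) { return malloc(size); }
void* my_calloc(size_t count, size_t size) {
    myCallocCalls++;
    return calloc(count, size);
}
void my_free(void* pointer) { free(pointer); }
```

Because the prefix applies to all three names at once, this form avoids the mismatched `malloc`/`free` risk of the per-function options.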

## Proposal

`#define`-based allocator replacement seems like the clearest and lowest-overhead approach to implement, while still
supporting the full range of use cases. If needed, a user could implement a more complicated replacement scheme inside
their custom allocator functions.
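
As one illustration of such a user-side scheme, the sketch below lets each thread select a backend before calling into H3 and records the backend in a small header in front of every block, so that `my_free` always releases memory through the backend that produced it. Everything here (`Backend`, `BlockHeader`, `currentBackend`, the counting backend) is hypothetical, and the sketch assumes a C11 compiler for `_Thread_local` and `max_align_t`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void* (*allocate)(size_t size);
    void (*release)(void* pointer);
} Backend;

/* Stored in front of every block; the union pads the header so the memory
 * after it keeps the strictest fundamental alignment. */
typedef union {
    const Backend* backend;
    max_align_t align;
} BlockHeader;

static const Backend systemBackend = {malloc, free};

/* Second backend that counts its calls, to make the dispatch observable. */
static size_t countingAllocations = 0, countingReleases = 0;
static void* countingAllocate(size_t size) { countingAllocations++; return malloc(size); }
static void countingRelease(void* pointer) { countingReleases++; free(pointer); }
static const Backend countingBackend = {countingAllocate, countingRelease};

/* Per-thread selection; set this before calling into H3. */
static _Thread_local const Backend* currentBackend = &systemBackend;

void* my_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(BlockHeader)) return NULL;
    BlockHeader* header = currentBackend->allocate(sizeof(BlockHeader) + size);
    if (header == NULL) return NULL;
    header->backend = currentBackend;
    return header + 1; /* memory handed to the caller starts after the header */
}

void* my_calloc(size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size) return NULL;
    void* pointer = my_malloc(count * size);
    if (pointer != NULL) memset(pointer, 0, count * size);
    return pointer;
}

void my_free(void* pointer) {
    if (pointer == NULL) return;
    BlockHeader* header = (BlockHeader*)pointer - 1;
    header->backend->release(header); /* backend recorded at allocation time */
}
```

The per-block header costs some memory, but it keeps allocation and release paired even when the selected backend changes between `h3SetToLinkedGeo` and `destroyLinkedPolygon`.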