Using Unsafe for Fun and Profit

Given Rust's popularity and its position as a systems programming language, you'll probably reach a point where you want to integrate a Rust module into an existing application. This guide was created to fill the gap in knowledge around FFI tasks that go deeper than simply calling one or two functions from a C library.

This guide is written from the perspective of someone implementing a simple REST client. The client lets you craft custom HTTP messages and send them to some server, allowing you to inspect the response. It is composed of a Qt GUI which calls out to a Rust library for all of the business logic.

We'll be using cmake as the host build system, deferring to cargo to manage and compile the Rust components. The guide was originally written on a Linux machine, but there's no reason it shouldn't work on Windows or macOS, possibly with a couple of small platform-specific tweaks (filenames, etc.).

TODO: Insert final screenshot here

Useful Links and References

Here are a couple links and resources which you may find useful along the way.

Objectives

The end objectives of this guide are:

  • Integrating cargo into a wider build system
  • Calling Rust functions from C++ (or any other language)
  • Passing strings, structs, and arrays between Rust and C++
  • Robust error handling and exception safety
  • Creating a C interface for a Rust library
  • Multithreading and asynchronous programming (because we'll need to wait for the server's response without blocking the UI)
  • Creating flexible abstractions which encapsulate common patterns used when writing foreign function interfaces

The ffi-helpers crate was written in parallel with this guide. It takes advantage of the patterns and abstractions we'll come up with and allows you to reuse them for your own application.

Setting Up

Before we can start doing any coding we need to get a build environment set up and run a hello world program to check everything works.

This chapter will cover:

  • Setting up a C++ build system
  • Integrating cargo into the build system transparently
  • A "hello world" to test that C++ can call Rust functions

Setting up Qt and the Build System

First, create a new cmake project in a directory of your choosing.

$ mkdir rest_client && cd rest_client
$ mkdir gui
$ touch gui/main.cpp
$ touch CMakeLists.txt

You'll then want to make sure your CMakeLists.txt file (the file specifying the project and build settings) looks something like this.

# CMakeLists.txt

cmake_minimum_required(VERSION 3.7)
project(rest-client)

enable_testing()
add_subdirectory(client)
add_subdirectory(gui)

This says we're building a project called rest-client that requires at least cmake version 3.7. We've also enabled testing and added two subdirectories to the project (client and gui).

Our main.cpp is still empty; let's rectify that by adding a button.

// gui/main.cpp

#include <QtWidgets/QPushButton>
#include <QtWidgets/QApplication>

int main(int argc, char **argv) {
  QApplication app(argc, argv);

  QPushButton button("Hello World");
  button.show();

  return app.exec();
}

We need to add a CMakeLists.txt to the gui/ directory to let cmake know how to build our GUI.

# gui/CMakeLists.txt

set(CMAKE_CXX_STANDARD 14)

set(CMAKE_AUTOMOC ON)
set(CMAKE_AUTOUIC ON)
set(CMAKE_AUTORCC ON)
set(CMAKE_INCLUDE_CURRENT_DIR ON)
find_package(Qt5Widgets)

set(SOURCE main.cpp)
add_executable(gui ${SOURCE})
target_link_libraries(gui Qt5::Widgets)
add_dependencies(gui client)

This is mostly concerned with adding the correct options so that Qt's meta-object compiler can do its thing and the correct Qt libraries can be located. However, right down the bottom you'll notice that we create a new executable with add_executable(). This says our gui target currently has a single source file, main.cpp. It also needs to link against Qt5::Widgets and depends on our client (the Rust library), which hasn't been configured yet.

Building Rust with CMake

Next we need to create the Rust project.

$ cargo new --lib client

To make it accessible from C++ we need to make sure cargo generates a dynamically linked library. This is just a case of tweaking our Cargo.toml to tell cargo we're creating a cdylib instead of the usual library format.

# client/Cargo.toml

[package]
name = "client"
version = "0.1.0"
authors = ["Michael Bryan <michaelfbryan@gmail.com>"]
description = "The business logic for a REST client"
repository = "https://github.com/Michael-F-Bryan/rust-ffi-guide"

[dependencies]

[lib]
crate-type = ["cdylib"]

If you then compile the project you'll see cargo build a shared object (libclient.so) instead of the normal *.rlib file.

$ cargo build
$ ls target/debug/
build  deps  examples  incremental  libclient.d  libclient.so  native

Note: You don't technically need to make a dynamic library (cdylib) for your Rust code to be callable from other languages. You can always use static linking with a staticlib; however, that can be a bit more annoying to set up because you need to remember to link in a bunch of other things that the Rust standard library uses (mainly libc and the C runtime).

With a dynamic library, all the dependency resolution is handled by the loader when your program gets loaded into memory on startup, meaning things should Just Work.

Now that we know the Rust code compiles with cargo, we need to hook it up to cmake. We do this by writing a CMakeLists.txt in the client/ directory. As a general rule, you'll have one CMakeLists.txt for every "area" of your code. This usually ends up being one per directory, but not always.

# client/CMakeLists.txt

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(CARGO_CMD cargo build)
    set(TARGET_DIR "debug")
else ()
    set(CARGO_CMD cargo build --release)
    set(TARGET_DIR "release")
endif ()

set(CLIENT_SO "${CMAKE_CURRENT_BINARY_DIR}/${TARGET_DIR}/libclient.so")

add_custom_target(client ALL
    COMMENT "Compiling client module"
    COMMAND CARGO_TARGET_DIR=${CMAKE_CURRENT_BINARY_DIR} ${CARGO_CMD} 
    COMMAND cp ${CLIENT_SO} ${CMAKE_CURRENT_BINARY_DIR}
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
set_target_properties(client PROPERTIES LOCATION ${CMAKE_CURRENT_BINARY_DIR})

add_test(NAME client_test 
    COMMAND cargo test
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})

This is our first introduction to the difference between a debug and a release build. So that we know whether to compile our program with different optimisation levels and debug symbols, cmake sets a CMAKE_BUILD_TYPE variable containing either Debug or Release.

Here we're just using an if statement to set the cargo build command and the target directory, then using those to add a custom target which will first build the library, then copy the generated binary to the CMAKE_BINARY_DIR.

For good measure, let's add a test (client_test) which lets cmake know how to test our Rust module.

To make sure cargo puts all compiled artefacts in the correct spot within build/, we set the CARGO_TARGET_DIR environment variable while invoking the CARGO_CMD. The compiled library is then copied from cargo's output directory into CMAKE_CURRENT_BINARY_DIR, and we set the LOCATION property on the overall target to be CMAKE_CURRENT_BINARY_DIR.

The purpose of that little dance is so that no matter what type of build (release or debug) we do, the compiled library will be in the same spot. We then set the client target's LOCATION property so that anyone else who needs to use client's outputs knows which directory they'll be in.

Now that we know where the compiled client module will be, we can tell our gui to link to it.

# gui/CMakeLists.txt

...

set(SOURCE main.cpp)
add_executable(gui ${SOURCE})
+ get_target_property(CLIENT_DIR client LOCATION)
target_link_libraries(gui Qt5::Widgets)
+ target_link_libraries(gui ${CLIENT_DIR}/libclient.so)
add_dependencies(gui client)

Now we can compile and run this basic program to make sure everything is working. You'll probably want to create a separate build/ directory so you don't pollute the rest of the project with random build artefacts.

$ mkdir build && cd build
$ cmake ..
$ make
$ ./gui/gui

Calling Rust from C++

So far we've just made sure everything compiles, however the C++ and Rust code are still completely independent. The next task is to check the Rust library is linked to properly by calling a function from C++.

First we add a dummy function to the lib.rs.

#[no_mangle]
pub extern "C" fn hello_world() {
    println!("Hello World!");
}

There's a lot going on here, so let's step through it bit by bit.

The #[no_mangle] attribute indicates to the compiler that it shouldn't mangle the function's name during compilation. According to Wikipedia, name mangling:

In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

It provides a way of encoding additional information in the name of a function, structure, class or another datatype in order to pass more semantic information from the compilers to linkers.

The need arises where the language allows different entities to be named with the same identifier as long as they occupy a different namespace (where a namespace is typically defined by a module, class, or explicit namespace directive) or have different signatures (such as function overloading).

TL;DR: it's a way for compilers to generate unique symbol names when the same identifier is used for functions accepting different types or parameters. Without it we wouldn't be able to have things like generics or function overloading without name clashes.
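
To make that concrete, here's a tiny pair of functions (purely illustrative, not part of our client). The compiler is free to mangle the first one's symbol, while the second is guaranteed to appear in the binary under its own name.

// Illustration only. Without the attribute, the exported symbol for `plain()`
// typically ends up looking something like `_ZN6client5plain17h...E`.
pub fn plain() {}

// With `#[no_mangle]`, the symbol is exported as exactly `exported`.
#[no_mangle]
pub extern "C" fn exported() {}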

If this function is going to be called from C++ we need to specify the calling convention (the extern "C" bit). This tells the compiler low level things like how arguments are passed between functions. By far the most common convention is to "just do what C does".

The rest of the function declaration should be fairly intuitive.

After recompiling (cd build && cmake .. && make) you can inspect the generated binary using nm to make sure the hello_world() function is there.

$ nm libclient.so | grep ' T '
0000000000003330 T hello_world          <-- the function we created
00000000000096c0 T __rdl_alloc
00000000000098d0 T __rdl_alloc_excess
0000000000009840 T __rdl_alloc_zeroed
0000000000009760 T __rdl_dealloc
0000000000009a20 T __rdl_grow_in_place
0000000000009730 T __rdl_oom
0000000000009780 T __rdl_realloc
0000000000009950 T __rdl_realloc_excess
0000000000009a30 T __rdl_shrink_in_place
0000000000009770 T __rdl_usable_size
0000000000015ad0 T rust_eh_personality

The nm tool lists all the symbols in a binary as well as their addresses (the hex bit in the first column) and what type of symbol they are. All functions are in the Text section of the binary, so you can use grep to view only the exported functions.

Now that we have a working library, why don't we make the GUI program less like a contrived example and more like a real-life application?

The first thing is to pull our main window out into its own source files.

$ touch gui/main_window.hpp gui/main_window.cpp

# gui/CMakeLists.txt

...

- set(SOURCE main.cpp)
+ set(SOURCE main_window.cpp main_window.hpp main.cpp)
add_executable(gui ${SOURCE})

// gui/main_window.hpp

#include <QtWidgets/QMainWindow>
#include <QtWidgets/QPushButton>

class MainWindow : public QMainWindow {
  Q_OBJECT

public:
  MainWindow(QWidget *parent = nullptr);
private slots:
  void onClick();

private:
  QPushButton *button;
};

Here we've declared a MainWindow class which contains our trusty QPushButton and has a single constructor and click handler.

We also need to fill out the MainWindow methods and hook up the button's released signal to our onClick() click handler.

// gui/main_window.cpp

#include "main_window.hpp"

extern "C" {
void hello_world();
}

void MainWindow::onClick() { 
    // Call the `hello_world` function to print a message to stdout
    hello_world(); 
}

MainWindow::MainWindow(QWidget *parent) : QMainWindow(parent) {
  button = new QPushButton("Click Me", this);
  
  // Connect the button's `released` signal to `this->onClick()`
  connect(button, SIGNAL(released()), this, SLOT(onClick()));
}

Don't forget to update main.cpp to use the new MainWindow.

// gui/main.cpp

#include "main_window.hpp"
#include <QtWidgets/QApplication>

int main(int argc, char **argv) {
  QApplication app(argc, argv);

  MainWindow mainWindow;
  mainWindow.show();

  return app.exec();
}

Now when you compile and run ./gui/gui, "Hello World!" will be printed to the console every time you click the button.

If you got to this point then congratulations, you've just finished the most difficult part - getting everything to build!

The Core Client Library

Before we can do anything else we'll need to create the core client library that the GUI calls into. To reduce the amount of state being maintained, each request will create a new reqwest::Client and accept a Request object, returning some generic Response.

This isn't overly specific to FFI; in fact, we probably won't write any FFI bindings or C++ in this chapter. That said, it's still a very important stage because poor architecture decisions here can make life hard for you down the road. In general, making the interface as small and high-level as possible will vastly reduce the implementation complexity.

The first thing to do is set up error handling using error-chain. I have cargo-edit installed (cargo install cargo-edit), so adding it to my Cargo.toml is as simple as running

$ cargo add error-chain

You'll then need to add the corresponding extern crate statement to lib.rs. While you're at it, also add the reqwest, cookie, chrono, fern, log, and libc crates to both Cargo.toml and lib.rs, as we'll be using them later on.

// client/src/lib.rs

extern crate chrono;
extern crate cookie;
#[macro_use]
extern crate error_chain;
extern crate fern;
extern crate libc;
#[macro_use]
extern crate log;
extern crate reqwest;

Now create an errors.rs module.

// client/src/errors.rs

error_chain!{
    foreign_links {
        Reqwest(::reqwest::Error);
    }
}

First, let's create a Request object:

// client/src/request.rs

use cookie::CookieJar;
use reqwest::{self, Method, Url};
use reqwest::header::{Cookie, Headers};


/// A HTTP request.
#[derive(Debug, Clone)]
pub struct Request {
    pub destination: Url,
    pub method: Method,
    pub headers: Headers,
    pub cookies: CookieJar,
    pub body: Option<Vec<u8>>,
}

Add a constructor method; it will later be called by the FFI's request_create() function.

impl Request {
    pub fn new(destination: Url, method: Method) -> Request {
        let headers = Headers::default();
        let cookies = CookieJar::default();
        let body = None;

        Request {
            destination,
            method,
            headers,
            cookies,
            body,
        }
    }
}

We'll also need to be able to convert our Request into a reqwest::Request before we can send it, so let's add a helper method for that.

impl Request {
    pub(crate) fn to_reqwest(&self) -> reqwest::Request {
        let mut r = reqwest::Request::new(self.method.clone(), self.destination.clone());

        r.headers_mut().extend(self.headers.iter());

        let mut cookie_header = Cookie::new();

        for cookie in self.cookies.iter() {
            cookie_header.set(cookie.name().to_owned(), cookie.value().to_owned());
        }
        r.headers_mut().set(cookie_header);

        r
    }
}

We also want to create our own vastly simplified Response so it can be accessed by the C++ GUI; it gets a helper method too.

// client/src/response.rs

use std::io::Read;
use reqwest::{self, StatusCode};
use reqwest::header::Headers;

use errors::*;


#[derive(Debug, Clone)]
pub struct Response {
    pub headers: Headers,
    pub body: Vec<u8>,
    pub status: StatusCode,
}

impl Response {
    pub(crate) fn from_reqwest(original: reqwest::Response) -> Result<Response> {
        let mut original = original.error_for_status()?;
        let headers = original.headers().clone();
        let status = original.status();

        let mut body = Vec::new();
        original
            .read_to_end(&mut body)
            .chain_err(|| "Unable to read the response body")?;

        Ok(Response {
            status,
            body,
            headers,
        })
    }
}

Note: everything in Request and Response has been marked public because they're designed to be dumb containers of everything necessary to build a request and inspect its response.

To help out with debugging the FFI bindings later on we'll add logging via the log and fern crates. In a GUI program it's often not feasible to add println!() statements, so logging is a great substitute. Having a log file is also quite useful if you want to look back over a session to see what requests were sent and what the server responded with.

// client/src/utils.rs

use std::sync::{Once, ONCE_INIT};
use fern;
use log::LogLevelFilter;
use chrono::Local;

use errors::*;

/// Initialize the global logger and log to `rest_client.log`.
///
/// Note that this is an idempotent function, so you can call it as many
/// times as you want and logging will only be initialized the first time.
#[no_mangle]
pub extern "C" fn initialize_logging() {
    static INITIALIZE: Once = ONCE_INIT;
    INITIALIZE.call_once(|| {
        fern::Dispatch::new()
            .format(|out, message, record| {
                let loc = record.location();

                out.finish(format_args!(
                    "{} {:7} ({}#{}): {}{}",
                    Local::now().format("[%Y-%m-%d][%H:%M:%S]"),
                    record.level(),
                    loc.module_path(),
                    loc.line(),
                    message,
                    if cfg!(windows) { "\r" } else { "" }
                ))
            })
            .level(LogLevelFilter::Debug)
            .chain(fern::log_file("rest_client.log").unwrap())
            .apply()
            .unwrap();
    });
}

Initializing the logger will usually panic if you do it multiple times, so we're using std::sync::Once to make sure initialize_logging() only ever sets up fern once.

The logging initialization itself looks pretty gnarly, although that's mainly because of the large format_args!() call and having to make sure we add line endings appropriately.

We'll also add a backtrace() helper to the utils module. This just takes an Error and iterates through it, logging a nice stack trace.

// client/src/utils.rs

/// Log an error and each successive thing which caused it.
pub fn backtrace(e: &Error) {
    error!("Error: {}", e);

    for cause in e.iter().skip(1) {
        warn!("\tCaused By: {}", cause);
    }
}

We'll also create a generic send_request() function which takes a Request object and sends it, retrieving the resulting Response. Thanks to our two helper functions the implementation is essentially trivial (modulo some logging stuff).

// client/src/lib.rs

use reqwest::Client;
pub use request::Request;
pub use response::Response;
use errors::*;


/// Send a `Request`.
pub fn send_request(req: &Request) -> Result<Response> {
    info!("Sending a GET request to {}", req.destination);
    if log_enabled!(::log::LogLevel::Debug) {
        debug!("Sending {} Headers", req.headers.len());
        for header in req.headers.iter() {
            debug!("\t{}: {}", header.name(), header.value_string());
        }
        for cookie in req.cookies.iter() {
            debug!("\t{} = {}", cookie.name(), cookie.value());
        }

        trace!("{:#?}", req);
    }

    let client = Client::builder()
        .build()
        .chain_err(|| "The native TLS backend couldn't be initialized")?;

    client
        .execute(req.to_reqwest())
        .chain_err(|| "The request failed")
        .and_then(|r| Response::from_reqwest(r))
}

You'll notice that chain_err() has been used whenever anything may fail. This allows us to give the user some sort of stack trace of errors and what caused them, providing a single high level error message (i.e. "The native TLS backend couldn't be initialized"), while still retaining the low level context if they want to drill down and find out exactly what went wrong.

This method of error handling ties in quite nicely with the backtrace() helper defined earlier. As you'll see later on, they can prove invaluable for debugging issues when passing things between languages.
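
To see how these pieces fit together, here's a small sketch of a hypothetical Rust caller (not part of the final code) which sends a request and uses backtrace() to log the whole error chain if anything goes wrong.

// Hypothetical usage sketch: exercise send_request() and log any failures.
use utils::backtrace;

fn fetch_and_log(req: &Request) {
    match send_request(req) {
        Ok(response) => info!("Received {} bytes", response.body.len()),
        // `e` is an error-chain `Error`, so backtrace() can walk its causes.
        Err(ref e) => backtrace(e),
    }
}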

Register the four new modules in lib.rs.

// client/src/lib.rs

pub mod errors;
pub mod utils;
mod request;
mod response;

Now that we've got something to work with, we can start writing some FFI bindings.

Constructing a Basic Request

In this step we want to construct a very simple Request which we can later use to tell the client module to fetch http://google.com/. This requires roughly three steps:

  • Create a C interface which exposes our Rust Request in a way that can be used and manipulated from our C++ application,
  • Write a thin C++ wrapper class which gives us an abstraction over the raw C-style interface, and
  • Update the form so it can accept user inputs and create our Request.

We'll also touch on the following topics:

  • Exposing a FFI interface in Rust
  • Calling Rust functions from C++
  • Passing strings back and forth across the FFI barrier
  • Passing an opaque Rust struct to C++ and ensuring it gets free'd at the correct time

Creating the C Interface

First we need to add a couple small extern "C" functions to the Rust client module. The easiest way to do this is by creating a separate ffi.rs module to isolate all unsafe code to one place.

The bare minimum we need to do at this point is create a constructor and destructor for Request. The constructor can take in the target URL (as a char * string) and then fill in all the other fields with their defaults.

Because our Request contains Rust-specific things like generics we need to hide it behind a raw pointer. This is actually pretty easy to do; you move the Request to the heap with Box::new(), then call Box::into_raw() to get a raw pointer to the Request. The dangerous part here is that the compiler will no longer make sure the Request is destroyed once it goes out of scope, so we need to drop it manually.
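
The Box::into_raw()/Box::from_raw() round trip is the heart of this pattern, so here's a minimal sketch of it in isolation (using a plain integer rather than a Request) before we apply it for real.

// Minimal sketch of handing ownership over to a raw pointer and taking it back.
fn box_round_trip() {
    let boxed = Box::new(42_u32);
    let raw: *mut u32 = Box::into_raw(boxed); // Rust will no longer free this automatically

    // ... the pointer can now be handed across the FFI boundary ...

    // Later, reclaim ownership so the value gets freed properly.
    let reclaimed: Box<u32> = unsafe { Box::from_raw(raw) };
    drop(reclaimed);
}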

By far the most annoying bit in the constructor will be converting a raw C string into a valid Url. This requires a couple transformations along the way, all of which may fail, and we need to make sure this is dealt with correctly so the program doesn't blow up at runtime.

// client/src/ffi.rs

//! The foreign function interface which exposes this library to non-Rust 
//! languages.

use std::ffi::CStr;
use std::ptr;
use libc::c_char;
use reqwest::{Url, Method};

use Request;


/// Construct a new `Request` which will target the provided URL and fill out 
/// all other fields with their defaults.
/// 
/// # Note
/// 
/// If the string passed in isn't a valid URL this will return a null pointer.
/// 
/// # Safety
/// 
/// Make sure you destroy the request with [`request_destroy()`] once you are
/// done with it.
/// 
/// [`request_destroy()`]: fn.request_destroy.html
#[no_mangle]
pub unsafe extern "C" fn request_create(url: *const c_char) -> *mut Request {
    if url.is_null() {
        return ptr::null_mut();
    }

    let raw = CStr::from_ptr(url);

    let url_as_str = match raw.to_str() {
        Ok(s) => s,
        Err(_) => return ptr::null_mut(),
    };

    let parsed_url = match Url::parse(url_as_str) {
        Ok(u) => u,
        Err(_) => return ptr::null_mut(),
    };

    let req = Request::new(parsed_url, Method::Get);
    Box::into_raw(Box::new(req))
}

That looks like a large chunk of code, but the vast majority is either documentation spelling out the constraints which need to be maintained, or error handling. You can see that we use the CStr type from the std::ffi module, which acts as a safe wrapper around a C string. We then convert the CStr to a normal str, which may fail if the string isn't UTF-8, returning a null pointer (using the ptr::null_mut() helper) to indicate failure.

Converting from a str to a Url is almost identical.

Finally we can create the Request using Request::new(), then box it and return a raw pointer to the Request to the caller.

We also inserted a check for null pointers at the top as a bit of a sanity check.

The destructor is significantly easier to write. All we need to do is accept a raw pointer to some Request, convert it back to a Box with Box::from_raw(), then the Box<Request> can either be explicitly dropped or allowed to fall out of scope to destroy it like normal.

// client/src/ffi.rs

/// Destroy a `Request` once you are done with it.
#[no_mangle]
pub unsafe extern "C" fn request_destroy(req: *mut Request) {
    if !req.is_null() {
        drop(Box::from_raw(req));
    }
}

You will notice that both functions were prefixed with request_. This is a common convention used to indicate that the function "belongs" to some type, conceptually the equivalent of a normal method.

Now register the new module as a public one in lib.rs.

// client/src/lib.rs

pub mod ffi;

The C++ Wrapper

Although we could use the raw C-style FFI bindings throughout this application, that usually ends up with non-idiomatic and more error-prone code. Instead, it'd be really nice if we could use C++'s destructors to ensure memory gets free'd appropriately, as well as the ability to use methods to group functions logically.

We'll put the definition for these wrappers in their own wrappers.hpp header file so the main application only uses the public interface. For now we'll only create a constructor and destructor.

// gui/wrappers.hpp

#include <string>

class Request {
public:
  Request(const std::string);
  ~Request();

private:
  void *raw;
};

The implementation is equally trivial. It just declares that there are a couple of external functions somewhere that we want to use, and lets the linker resolve them for us at link time.

// gui/wrappers.cpp

#include "wrappers.hpp"

extern "C" {
void *request_create(const char *);
void request_destroy(void *);
}

Request::Request(const std::string url) {
  raw = request_create(url.c_str());
  if (raw == nullptr) {
    throw "Invalid URL";
  }
}

Request::~Request() { request_destroy(raw); }

Note: You may have noticed that even though request_create() accepts a raw C-style string (char *), the Request wrapper's constructor takes in a normal std::string.

This is what we meant earlier about wrappers being more idiomatic and easier to use. It may sound like a trivial thing now, but in real projects, where the application is much more complex and has many moving parts, an idiomatic class is much less likely to introduce bugs because users won't need to read through a load of source code to see how to use it. Everything will Just Work.

We will also need to update the CMakeLists.txt file for our gui/ directory so that these new files are compiled in.

# gui/CMakeLists.txt

set(CMAKE_CXX_STANDARD 14)

set(CMAKE_AUTOMOC ON)
set(CMAKE_AUTOUIC ON)
set(CMAKE_AUTORCC ON)
set(CMAKE_INCLUDE_CURRENT_DIR ON)
find_package(Qt5Widgets)

set(SOURCE main_window.cpp main_window.hpp wrappers.cpp wrappers.hpp main.cpp)
add_executable(gui ${SOURCE})
get_target_property(CLIENT_DIR client LOCATION)
target_link_libraries(gui Qt5::Widgets)
target_link_libraries(gui ${CLIENT_DIR}/libclient.so)
add_dependencies(gui client)

Let's do a quick sanity check to make sure everything is working and that memory is being free'd properly. By far the easiest way to do this is to update the GUI's click handler to create a new C++ Request and add a bunch of print statements to ffi.rs to see what actually gets called.

The updated main_window.cpp:

// gui/main_window.cpp

#include "main_window.hpp"
#include "wrappers.hpp"
#include <iostream>

void MainWindow::onClick() {
  std::cout << "Creating the request" << std::endl;
  Request req("https://google.com/");
  std::cout << "Request created in C++" << std::endl;
}

...

And ffi.rs:

#[no_mangle]
pub unsafe extern "C" fn request_create(url: *const c_char) -> *mut Request {
    ...

    println!("Request created in Rust: {}", url_as_str);
    Box::into_raw(Box::new(req))
}

...

#[no_mangle]
pub unsafe extern "C" fn request_destroy(req: *mut Request) {
    if !req.is_null() {
        println!("Request was destroyed");
        drop(Box::from_raw(req));
    }
}

If you compile and run the GUI program then click our button you should see something like the following printed to stdout.

$ cmake .. && make
$ ./gui/gui
Creating the request
Request created in Rust: https://google.com/
Request created in C++
Request was destroyed
Creating the request
Request created in Rust: https://google.com/
Request created in C++
Request was destroyed

This tells us that the request is being constructed, that the URL is passed to Rust correctly, and that the request is destroyed when the C++ Request falls out of scope.

This little test also shows how easy it is to interoperate between C++ and Rust. Sure, it may be a little annoying to create wrappers and FFI bindings, but looked at differently, this lets us draw a very definite line separating the GUI code from the HTTP client module.

Sending the Request

Now that we can create a Request, it'd be nice if we could actually send it and get back a response that can be displayed to the user. This will require bindings to the send_request() function in our Rust client module. While we're at it, we also need a wrapper which lets us access the response body (as a bunch of bytes) and destroy it when we're done.

This chapter will cover:

  • Passing arrays between languages (our response is a byte buffer)
  • MOAR wrappers
  • Fleshing out the Qt GUI

Rust FFI Bindings

The FFI bindings for send_request() are dead simple. We do a null pointer sanity check, pass the Request to our send_request() function, then box up the response so it can be returned to the caller.

// client/src/ffi.rs

use {Response, Request, send_request};

...

/// Take a reference to a `Request` and execute it, getting back the server's 
/// response.
/// 
/// If something goes wrong, this will return a null pointer. Don't forget to 
/// destroy the `Response` once you are done with it!
#[no_mangle]
pub unsafe extern "C" fn request_send(req: *const Request) -> *mut Response {
    if req.is_null() {
        return ptr::null_mut();
    }

    let response = match send_request(&*req){
        Ok(r) => r,
        Err(_) => return ptr::null_mut(),
    };

    Box::into_raw(Box::new(response))
}

You'll notice the funny &*req when calling send_request(). This converts a raw pointer into a normal borrow by dereferencing and immediately reborrowing. The only reason this function is unsafe is because this dereferencing has the possibility of blowing up if the pointer passed in points to invalid memory.
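
If the &*req trick looks odd, here's the same dereference-and-reborrow in isolation (a throwaway example, not part of ffi.rs):

/// Double the value behind a raw pointer.
///
/// Unsafe because `&*ptr` is only valid if `ptr` is non-null, properly
/// aligned, and points to a live `u32`.
unsafe fn double(ptr: *const u32) -> u32 {
    let value: &u32 = &*ptr; // raw pointer -> ordinary borrow
    *value * 2
}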

The destructor for a Response is equally trivial - in fact, it's pretty much the same as our Request destructor.

// client/src/ffi.rs

/// Destroy a `Response` once you are done with it.
#[no_mangle]
pub unsafe extern "C" fn response_destroy(res: *mut Response) {
    if !res.is_null() {
        drop(Box::from_raw(res));
    }
}

Getting the response body is a little trickier. We could give C++ a pointer to the body and tell it how long the body is; however, that introduces the possibility that C++ will keep a reference to it after the Response is destroyed. Any further attempt to read the body would then be a use-after-free and could crash the entire application.

Instead, it'd be better to give C++ its own copy of the response body so it can be destroyed whenever it wants to. This involves a two-stage process where we first ask how long the body is so we can allocate a large enough buffer, then we'll give Rust a pointer to that buffer (and its length) so the body can be copied across.

The length function is easiest, so let's create that one first.

// client/src/ffi.rs

use libc::{c_char, size_t};

...

/// Get the length of a `Response`'s body.
#[no_mangle]
pub unsafe extern "C" fn response_body_length(res: *const Response) -> size_t {
    if res.is_null() {
        return 0;
    }

    (&*res).body.len() as size_t
}

To copy the response body into a buffer supplied by C++, we first want to turn that pointer and length into a more Rust-ic &mut [u8]. Luckily, slice::from_raw_parts_mut() exists for just this purpose. We can then do the usual length checks before using ptr::copy_nonoverlapping() to copy the body across.

// client/src/ffi.rs

use libc::{c_char, c_int, size_t};
use std::slice;

...

/// Copy the response body into a user-provided buffer, returning the number of
/// bytes copied.
///
/// If an error is encountered, this returns `-1`.
#[no_mangle]
pub unsafe extern "C" fn response_body(
    res: *const Response,
    buffer: *mut c_char,
    length: size_t,
) -> c_int {
    if res.is_null() || buffer.is_null() {
        return -1;
    }

    let res = &*res;
    let buffer: &mut [u8] = slice::from_raw_parts_mut(buffer as *mut u8, 
                                                      length as usize);

    if buffer.len() < res.body.len() {
        return -1;
    }

    ptr::copy_nonoverlapping(res.body.as_ptr(), 
                             buffer.as_mut_ptr(), 
                             res.body.len());

    res.body.len() as c_int
}

In general, whenever you want to pass data in the form of arrays from one language to another, it's easiest to ask the caller to provide a buffer the data can be written into. If you were to instead return a Vec<u8> or a similar dynamically allocated type native to a particular language, the caller would have to hand that object back to the originating language so it can be free'd appropriately. This gets pretty error-prone and annoying after a while.

A good rule of thumb is that if a language allocates something on the heap, the object should be returned to that language once you're done with it so it can be free'd properly. Failing to do this could either confuse the allocator's internal bookkeeping or even result in segfaults, because one allocator (e.g. libc's malloc) would be trying to free memory belonging to a completely different allocator (e.g. Rust's jemalloc).
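
For contrast, here's a rough sketch of the "return an owned object" approach we're avoiding (the response_body_owned() and body_destroy() names are made up for illustration). Notice how every caller now has to remember a second, Rust-specific destructor.

// Hypothetical alternative: hand the caller an owned copy of the body...
#[no_mangle]
pub unsafe extern "C" fn response_body_owned(res: *const Response) -> *mut Vec<u8> {
    if res.is_null() {
        return ptr::null_mut();
    }

    Box::into_raw(Box::new((&*res).body.clone()))
}

// ...which they must later hand back to Rust so the right allocator frees it.
#[no_mangle]
pub unsafe extern "C" fn body_destroy(body: *mut Vec<u8>) {
    if !body.is_null() {
        drop(Box::from_raw(body));
    }
}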

C++ Wrapper

The next thing we'll need to do is create a wrapper around a Response. This will be almost identical to the Request wrapper, although we'll need to add a read_body() method so people can access the response body.

The Response class definition isn't overly interesting:

// gui/wrappers.hpp

...

#include <vector>

...

class Response {
public:
  Response(void *raw) : raw(raw){}
  ~Response();
  std::vector<char> read_body();

private:
  void *raw;
};

In the implementation, we need to update the extern block to include the new Rust functions.

// gui/wrappers.cpp

extern "C" {
...
void response_destroy(void *);
int response_body_length(void *);
int response_body(void *, char *, int);
}

As was mentioned earlier when writing the Rust bindings, in order to read the response body callers will need to create their own buffer and pass it to Rust. We've chosen to use a std::vector<char> as the buffer, throwing an exception with a semi-useful message if something fails (don't worry, we'll be doing proper error handling later).

// gui/wrappers.cpp

#include <cassert>

...

Response::~Response() { response_destroy(raw); }

std::vector<char> Response::read_body() {
  int length = response_body_length(raw);
  if (length < 0) {
    throw "Response body's length was less than zero";
  }

  std::vector<char> buffer(length);

  int bytes_written = response_body(raw, buffer.data(), buffer.size());
  if (bytes_written != length) {
    throw "Response body was a different size than what we expected";
  }

  return buffer;
}

The next step is to add a send() method to Request. This is just a case of adding a new public method to Request and deferring to the request_send() function exposed by our client module. You'll probably need to move Response above Request at this point so Request can use it.

// gui/wrappers.hpp

...

class Request {
public:
  Request(const std::string);
  ~Request();
  Response send();

private:
  void *raw;
};

Next we'll need to actually implement this send() method.

// gui/wrappers.cpp

...

extern "C" {
...
void *request_send(void *);
}

...

Response Request::send() {
  void *raw_response = request_send(raw);

  if (raw_response == nullptr) {
    throw "Request failed";
  }

  return Response(raw_response);
}

Here we simply called the request_send() function, checked whether the result was a null pointer (indicating an error), then created a new Response and returned it.

Testing the Process

We've finally got all the infrastructure set up to send a single GET request to a server and then read back the response. To make sure it actually works, let's hook it up to our GUI's button.

// gui/main_window.cpp

void MainWindow::onClick() {
  std::cout << "Creating the request" << std::endl;
  Request req("https://www.rust-lang.org/");
  std::cout << "Sending Request" << std::endl;
  Response res = req.send();
  std::cout << "Received Response" << std::endl;

  std::vector<char> raw_body = res.read_body();
  std::string body(raw_body.begin(), raw_body.end());
  std::cout << body << std::endl;
}

If you compile and run this then click the button you should see something similar to this printed to the terminal.

$ cmake .. && make
$ ./gui/gui
Creating the request
Sending Request
Received Response
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>The Rust Programming Language</title>
    ...
  </head>
  <body>
    <p><a href="/en-US/">Click here</a> to be redirected.</p>
  </body>
</html>

If you've gotten this far, take a second to give yourself a pat on the back. You deserve it.

Generating a Header File

Instead of having to constantly keep ffi.rs and the various extern blocks scattered throughout our C++ code in sync, it'd be really nice if we could generate a header file that corresponds to ffi.rs and just #include that. Fortunately, there's a tool which does exactly this, called cbindgen!

Adding Cbindgen

You can use cbindgen to generate header files in a couple of ways. The first is to cargo install it and run the binding generator by hand.

$ cargo install cbindgen
$ cd /path/to/my/project && cbindgen . -o target/my_project.h

However, running this after every change gets quite repetitive, so the README includes a minimal build script which will automatically generate the header every time you compile.

First add cbindgen as a build dependency (cargo-edit makes this quite easy).

$ cargo add --build cbindgen

You also need to make sure you have a build script entry in your Cargo.toml.

...
description = "The business logic for a REST client"
name = "client"
repository = "https://github.com/Michael-F-Bryan/rust-ffi-guide"
version = "0.1.0"
+ build = "build.rs"
+
+ [build-dependencies]
+ cbindgen = "0.1.29"

[dependencies]
chrono = "0.4.0"
...

Finally you can flesh out the build script itself. This is fairly straightforward, although because we want to put the generated header file in the target/ directory we need to take special care to detect when cmake overrides the default.

// client/build.rs

extern crate cbindgen;

use std::env;
use std::path::PathBuf;
use cbindgen::Config;


fn main() {
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();

    let package_name = env::var("CARGO_PKG_NAME").unwrap();
    let output_file = target_dir()
        .join(format!("{}.hpp", package_name))
        .display()
        .to_string();

    let config = Config {
        namespace: Some(String::from("ffi")),
        ..Default::default()
    };

    cbindgen::generate_with_config(&crate_dir, config)
      .unwrap()
      .write_to_file(&output_file);
}

/// Find the location of the `target/` directory. Note that this may be 
/// overridden by `cmake`, so we also need to check the `CARGO_TARGET_DIR` 
/// variable.
fn target_dir() -> PathBuf {
    if let Ok(target) = env::var("CARGO_TARGET_DIR") {
        PathBuf::from(target)
    } else {
        PathBuf::from(env::var("CARGO_MANIFEST_DIR").unwrap()).join("target")
    }
}

Note that the build.rs build script also creates a custom Config that specifies everything in the generated header file should be under the ffi namespace. This means we won't get name clashes between the opaque Request and Response types generated by cbindgen and our own wrapper classes.

If you go back to the build/ directory and recompile, you should now see the client.hpp header file.

$ cd build/
$ cmake -DCMAKE_BUILD_TYPE=Debug ..
$ make
$ ls client
client.hpp  cmake_install.cmake  CMakeFiles  CTestTestfile.cmake  debug  libclient.so  Makefile

$ cat client/client.hpp
#include <cstdint>
#include <cstdlib>

extern "C" {

namespace ffi {

// A HTTP request.
struct Request;

struct Response;

// Initialize the global logger and log to `rest_client.log`.
//
// Note that this is an idempotent function, so you can call it as many
// times as you want and logging will only be initialized the first time.
void initialize_logging();

// Construct a new `Request` which will target the provided URL and fill out
// all other fields with their defaults.
//
// # Note
...

This header file is also #include-able from all your C++ code, meaning you no longer need to write all those manual extern "C" declarations. It also lets you rely on the compiler to do proper type checking instead of getting ugly linker errors if your forward declarations become out of sync (or crashes and data corruption if function arguments change).

To actually #include the generated header file we need to make a couple adjustments to the CMakeLists.txt file to let cmake know to add the build/client/ output directory to the include path.

# gui/CMakeLists.txt

set(CMAKE_INCLUDE_CURRENT_DIR ON)
find_package(Qt5Widgets)

+ set(CLIENT_BUILD_DIR ${CMAKE_BINARY_DIR}/client)
+ include_directories(${CLIENT_BUILD_DIR})
+
set(SOURCE main_window.cpp main_window.hpp wrappers.cpp wrappers.hpp main.cpp)

add_executable(gui ${SOURCE})

Now you just need to update the wrappers.cpp and wrappers.hpp files to #include this new client.hpp, delete the extern "C" block, and update the Rust function call sites to be prefixed with the ffi:: namespace. As a bonus we can also replace a bunch of void * pointers with proper strongly-typed pointers.

This step may take a couple iterations to make sure all the types match up and everything compiles again. Make sure to test it all works by running the gui program and hitting our GUI's dummy button. If everything is okay then you should see HTML for the Rust website printed to the console.

Better Error Handling

So far whenever something goes wrong we've just returned a null pointer to indicate failure... This isn't overly ideal. Instead, it'd be nice if we could get some context behind an error and possibly present a nice friendly message to the user.

To improve our application's error handling story we're going to use several techniques, all of which nicely complement each other.

We'll add in logging with the log crate, and the ability to initialize the Rust logger from C/C++.

Next we'll add a mechanism which lets C callers detect when an error has occurred by inspecting return values and then access the most recent error message.

We also need to make sure our FFI bindings are Exception Safe. This means that any Rust panics are wholly contained to Rust and we can't accidentally unwind across the FFI boundary (which is UB).

Return Types

A very powerful error handling mechanism in C-style programs (and technically ours is one, because our FFI bindings export a C interface) is modelled on errno.

This employs a thread-local variable which holds the most recent error, as well as some convenience functions for getting/clearing it. The theory is that if a function fails it should return an "obviously invalid" value (typically -1 or 0 when returning integers, or null for pointers). The user can then check for this and consult the most recent error for more information. Of course, that means all fallible operations must update the most recent error if they fail, and that you must check the returned value of every fallible operation.

While it isn't as elegant as Rust's monad-style Result<T, E> with ? and the various combinators, it actually turns out to be a pretty solid error handling technique in practice.

Note: It is highly recommended to have a skim through libgit2's error handling docs. The error handling mechanism we'll be using takes a lot of inspiration from libgit2.

Working With Errors

We'll start off by defining a thread-local static variable with the thread_local!() macro and put it in the ffi module.

// client/src/ffi.rs

thread_local!{
    static LAST_ERROR: RefCell<Option<Box<Error>>> = RefCell::new(None);
}

Notice that we haven't made LAST_ERROR public; this is so people are forced to access the error via getter and setter functions.

// client/src/ffi.rs

/// Update the most recent error, clearing whatever may have been there before.
pub fn update_last_error<E: Error + 'static>(err: E) {
    error!("Setting LAST_ERROR: {}", err);

    {
        // Print a pseudo-backtrace for this error, following back each error's
        // cause until we reach the root error.
        let mut cause = err.cause();
        while let Some(parent_err) = cause {
            warn!("Caused by: {}", parent_err);
            cause = parent_err.cause();
        }
    }

    LAST_ERROR.with(|prev| {
        *prev.borrow_mut() = Some(Box::new(err));
    });
}

/// Retrieve the most recent error, clearing it in the process.
pub fn take_last_error() -> Option<Box<Error>> {
    LAST_ERROR.with(|prev| prev.borrow_mut().take())
}

Neither of these is terribly interesting once you look at the thread_local!() macro's documentation. Notice that the actual type we're using needs to be RefCell<Option<_>> so we get both interior mutability and a way to represent the fact that there may not have been any recent errors. It's annoying, but luckily, thanks to the API's design, the complexity won't leak into client code.

While the getters and setters we currently have are quite powerful, it's still not possible to use them outside of Rust. To remedy this we're going to add a function that is callable from C and will give the caller the most recent error message.

The idea is the caller will give us a buffer to write the string into. This part can be a little tricky because we have a couple edge cases and ergonomics issues to deal with.

For example, if the caller passes a buffer which isn't big enough to hold the error message we should obviously return an error. But how does the caller know what a reasonable buffer size is to begin with? Making them guess isn't exactly a practical solution, so it'd be nice if we included a mechanism for calculating the error message's length without consuming the error message itself.

To deal with this we add an extra last_error_length() function.

// client/src/ffi.rs

/// Calculate the number of bytes in the last error's error message, **including**
/// the trailing `null` character (so the caller knows how big a buffer to allocate).
#[no_mangle]
pub extern "C" fn last_error_length() -> c_int {
    LAST_ERROR.with(|prev| match *prev.borrow() {
        Some(ref err) => err.to_string().len() as c_int + 1,
        None => 0,
    })
}

The second issue is more problematic. On unix-based systems the string encoding used pretty much ubiquitously is UTF-8, meaning we should be able to copy a Rust String's contents directly into the provided buffer without any issues. However, the "unicode" string type most commonly used on Windows is not UTF-8. Instead, Windows uses UTF-16 (well... technically it's not even guaranteed to be valid UTF-16), which is not directly compatible with UTF-8.

Therefore on Windows if we want to be correct we should convert the String representation of an error message into a native Windows string with encode_wide() from std::os::windows::ffi::OsStrExt and copy that into a &mut [u16] buffer (not &mut [u8]!) that the user gives to us... Eww.

There's no easy way to get around this without a bunch of conditional compilation (#[cfg]) and adding a lot of complexity to the implementation, therefore we're going to cheat and say it's the caller's responsibility to deal with any UTF-8/UTF-16 conversions.
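
For the curious, here's a rough sketch of the Windows-side conversion we're choosing not to do. The to_utf16() helper is hypothetical and would only be compiled on Windows; the bindings below stick to UTF-8.

// Sketch only: convert a UTF-8 error message into a null-terminated UTF-16 buffer.
#[cfg(windows)]
fn to_utf16(message: &str) -> Vec<u16> {
    use std::ffi::OsStr;
    use std::os::windows::ffi::OsStrExt;

    // encode_wide() yields UTF-16 code units; append a trailing null for C callers.
    OsStr::new(message).encode_wide().chain(Some(0)).collect()
}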

// client/src/ffi.rs

/// Write the most recent error message into a caller-provided buffer as a UTF-8
/// string, returning the number of bytes written.
///
/// # Note
///
/// This writes a **UTF-8** string into the buffer. Windows users may need to
/// convert it to a UTF-16 "unicode" afterwards.
///
/// If there are no recent errors then this returns `0` (because we wrote 0
/// bytes). `-1` is returned if there are any errors, for example when passed a
/// null pointer or a buffer of insufficient size.
#[no_mangle]
pub unsafe extern "C" fn last_error_message(buffer: *mut c_char, length: c_int) -> c_int {
    if buffer.is_null() {
        warn!("Null pointer passed into last_error_message() as the buffer");
        return -1;
    }

    let last_error = match take_last_error() {
        Some(err) => err,
        None => return 0,
    };

    let error_message = last_error.to_string();

    let buffer = slice::from_raw_parts_mut(buffer as *mut u8, length as usize);

    if error_message.len() >= buffer.len() {
        warn!("Buffer provided for writing the last error message is too small.");
        warn!(
            "Expected at least {} bytes but got {}",
            error_message.len() + 1,
            buffer.len()
        );
        return -1;
    }

    ptr::copy_nonoverlapping(
        error_message.as_ptr(),
        buffer.as_mut_ptr(),
        error_message.len(),
    );

    // Add a trailing null so people using the string as a `char *` don't
    // accidentally read into garbage.
    buffer[error_message.len()] = 0;

    error_message.len() as c_int
}

Our last_error_message() function turned out to be rather long, although most of it is taken up by checking for errors and edge cases.

Note: Notice that we're writing into a buffer provided by the caller instead of returning a Rust String. This makes memory management a lot easier because the caller can clean up the buffer like normal instead of needing to remember to call some Rust destructor afterwards.

Writing into borrowed buffers instead of returning an owned object is a common pattern when doing FFI. It helps simplify things and avoid errors due to object lifetimes (in general, not just in the Rust sense of the word) and forgetting to call destructors.

Adding update_last_error() To The FFI Bindings

Now that we've got a shiny new error handling mechanism, we need to go back over the FFI bindings in client/src/ffi.rs and make sure every fallible operation calls update_last_error() when it fails.

Because we are using error-chain for error handling in Rust, we've got the ability to add extra context to errors before calling update_last_error(), automatically updating the error's cause.

For example, you may refactor the request_create() function to look something like this:

// client/src/ffi.rs

#[no_mangle]
 pub unsafe extern "C" fn request_create(url: *const c_char) -> *mut Request {
     if url.is_null() {
+        let err = Error::from("No URL provided");
+        update_last_error(err);
         return ptr::null_mut();
     }
 
     let raw = CStr::from_ptr(url);
 
      let url_as_str = match raw.to_str() {
          Ok(s) => s,
-        Err(_) => return ptr::null_mut(),
+        Err(e) => {
+            let err = Error::with_chain(e, "Unable to convert URL to a UTF-8 string");
+            update_last_error(err);
+            return ptr::null_mut();
+        }
      };
 
      let parsed_url = match Url::parse(url_as_str) {
          Ok(u) => u,
-        Err(_) => return ptr::null_mut(),
+        Err(e) => {
+            let err = Error::with_chain(e, "Unable to parse the URL");
+            update_last_error(err);
+            return ptr::null_mut();
+        }
      };

    ...

For brevity's sake, we won't show how all the FFI bindings have been updated because it's largely tedious refactoring. However, feel free to inspect the source code for this guide if you are curious to see the final version.

C++ Error Bindings

We're going to expose this error handling mechanism to C++ in two ways. First, there will be a low-level C++ equivalent to last_error_message() which simply calls the Rust function and does the necessary work to convert the error message into a std::string.

There will also be a more high-level WrapperException class which can be thrown whenever an operation fails. This should then be caught higher up by the Qt application and an appropriate error message will be displayed to the user.

First we need to add last_error_message() to our wrappers.hpp header file.

// gui/wrappers.hpp

std::string last_error_message();

Then we need to implement it.

// gui/wrappers.cpp

std::string last_error_message() {
  int error_length = ffi::last_error_length();

  if (error_length == 0) {
    return std::string();
  }

  std::string msg(error_length, '\0');
  int ret = ffi::last_error_message(&msg[0], msg.length());
  if (ret <= 0) {
    // If we ever get here it's a bug
    throw WrapperException("Fetching error message failed");
  }

  // Trim the string to the number of bytes actually written so the trailing
  // null isn't counted in msg.length()
  msg.resize(ret);
  return msg;
}

Notice that if there was no error we return an empty std::string instead of blowing up.

We also want to define a WrapperException class in the wrappers.hpp header file. To make things easier, we're defining a public static helper function which will create a new WrapperException from the most recent error.

// gui/wrappers.hpp

class WrapperException : public std::exception {
public:
  WrapperException(const std::string& msg) : msg(msg){};
  static WrapperException last_error();
  const char *what() const throw() {
    return msg.c_str();
  }

private:
  std::string msg;
};

The last_error() method has a fairly simple definition: it fetches the last error message and creates a new WrapperException with it. If the error message was empty then we use a default message.

// gui/wrappers.cpp

WrapperException WrapperException::last_error() {
  std::string msg = last_error_message();

  if (msg.length() == 0) {
    return WrapperException("(no error available)");
  } else {
    return WrapperException(msg);
  }
}

Integrating In The Error Handling Mechanism

Now we've got a proper error handling mechanism, we need to go back and make sure everything uses it. This is just a case of finding all throw statements in wrappers.cpp (using grep or your editor's "find" function) and converting them to use throw WrapperException::last_error().

The easiest way to check our error handling mechanism is to edit the click handler we've been using for testing and make it try to send a request to some invalid URL.

// gui/main_window.cpp

void MainWindow::onClick() {
  Request req("this is an invalid URL");
}

Now run the program and click the button.

$ ./gui/gui
...
terminate called after throwing an instance of 'WrapperException'
[1]    1016 abort (core dumped)  ./gui/gui

It... aborts?

This is because Qt, by default, won't try to catch any thrown exceptions, meaning they'll just bubble up to the top of the program and crash.

It'd be a much better user experience if the GUI would catch all thrown exceptions and pop up a nice dialog box saying what went wrong.

According to Qt's documentation on exceptions, it is undefined behaviour when a handler throws an exception. In this case it looks like the exception bubbled up the stack, unhandled, until it hit the program's entry point and triggered an abort. This isn't exactly ideal, so how about we wrap the click handler's contents in a big try/catch block?

// gui/main_window.cpp

void MainWindow::onClick() {
  try
  {
    Request req("this is an invalid URL");
  }
  catch (const WrapperException& e)
  {
    QMessageBox::warning(this, "Error", e.what());
  }
}

That's much better. Now our application can deal with errors in a sane way, and is much more robust.

TODO: Add the error handling patterns developed in this chapter to the ffi-helpers crate.

Logging

TODO: Talk about using logging to help determine the state of a program

Exception Safety

Asynchronous Operations

At the moment sending our request will block until it returns, meaning the entire GUI will lock up. This is bad from a user experience point of view, and because the window is no longer responding to events the operating system will think it has hung (popping up the standard "This program is not responding" dialog).

A much better way of doing things would be to spin the request onto a background thread, periodically polling it and getting the result if the job is completed.

TODO: create a Task abstraction using futures and futures_cpupool to spawn an arbitrary closure.

TODO: Add the Task abstraction to ffi-helpers, as well as maybe a couple macros for generating the extern "C" functions for things like polling, creation, and destruction to deal with the fact that generics aren't FFI-safe.
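Until that abstraction is written, here is a minimal sketch of the "spawn, then poll" idea. It uses a plain thread and a channel rather than the futures-based Task described in the TODOs above, and the module path is purely illustrative:

// client/src/task.rs (illustrative sketch only)

use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;

/// A job running on a background thread which can be polled for completion.
pub struct Task<T> {
    receiver: Receiver<T>,
}

impl<T: Send + 'static> Task<T> {
    /// Spawn the closure onto a background thread.
    pub fn spawn<F>(job: F) -> Task<T>
    where
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();

        thread::spawn(move || {
            // If the Task has already been dropped there's nobody to notify,
            // so ignore the send error.
            let _ = tx.send(job());
        });

        Task { receiver: rx }
    }

    /// Non-blocking check for the result, so the GUI thread can keep pumping
    /// events while the request is in flight.
    pub fn poll(&self) -> Option<T> {
        match self.receiver.try_recv() {
            Ok(value) => Some(value),
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => None,
        }
    }
}

The Rust side would kick off the request with something like Task::spawn(...) and expose extern "C" functions for creating, polling, and destroying the task; the GUI would then call the poll function from a timer until the response is ready.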

More Complex Requests

Now that we've got the basics working and things like error handling and asynchrony are dealt with, let's flesh out this application.

At the moment the entire application is composed of a button where the handler will fire off HTTP requests in the background. The GUI doesn't even display the response body, instead printing it to the parent console. This is fine while developing the backend, but not overly useful for end users.

This chapter will deal with:

  • Mutating heap-allocated Rust objects (the request headers)
  • The separation between UI state and business logic state
  • Best practices when building a larger GUI program

TODO: Complete this chapter, making a basic UI which looks something like this and successfully pings httpbin.

Testing

You'd typically want to be testing the application from the very beginning, but because of the complexity of this tutorial we've left it until a later chapter when you are (hopefully) more familiar with the C++/Rust interop workflow.

This is that chapter.

As well as the usual unit tests which you will be accustomed to writing in your Rust code, we want to be able to test the entire backend from end-to-end. This would require using the C++ wrappers to send off requests under various conditions and making sure we get the expected behaviour.

We will cover:

  • Integrating cargo test into cmake's built-in testing facilities (see the sketch below)
  • Creating C++ integration tests to exercise the entire backend under various conditions, including
    • The "happy path" (e.g. getting a valid web page like https://google.com/)
    • Sending requests to non-existent locations (e.g. "http://imprettysurethiswebsitedoesntexist.com/")
    • Invalid URLs (i.e. bang on your keyboard)
    • Making sure cookies and headers are actually set
    • Streaming and timeouts

TODO: Flesh out this chapter.
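For the first point, cmake's add_test() command can register an arbitrary command with the test runner, so something along these lines should be enough (the test name is illustrative):

# client/CMakeLists.txt

add_test(
    NAME client_cargo_tests
    COMMAND cargo test
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
)

Running ctest (or make test) from the build directory will then invoke cargo test alongside any other registered tests.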

Dynamic Loading & Plugins

What application would be complete without the ability to add user-defined plugins? In this chapter we take a small detour to visit the concept of dynamically loading a library at runtime and registering it with our parent application.

The end goal is to allow users to provide a shared library (DLL, *.so, etc) which contains a set of pre-defined functions. These functions will then allow us to manipulate a request before it is sent and then manipulate/inspect the response before displaying it to the user.

From the Rust side of things, by far the easiest way to accomplish this is to define a Plugin trait covering the various manipulations, then add a macro users can invoke which will generate all the unsafe function declarations for them.

Our Plugin trait may look something like this:


# #![allow(unused_variables)]
#fn main() {
pub trait Plugin {
    fn name(&self) -> &'static str;
    fn on_plugin_load(&self) {}
    fn pre_send(&self, _request: &mut Request) {}
    fn post_receive(&self, _response: &mut Response) {}
}
#}

The macro would then declare an extern "C" constructor which exports a trait object (Box<Plugin>) with some pre-defined symbol (e.g. _plugin_create()).

Note: This is actually the exact pattern used by the Linux kernel for loading device drivers. Each driver must expose a function which returns a vtable (a struct of function pointers) defining the various commands necessary for talking with a device (read, write, etc).

Before diving into the complexity of real code, it's probably going to be easier if we figure out how dynamic loading works using a contrived example.

Contrived Example

For this, the function being exported doesn't need to be very interesting, seeing as it's just an example.


# #![allow(unused_variables)]
#fn main() {
#[no_mangle]
pub extern "C" fn add(a: isize, b: isize) -> isize {
    a + b
}
#}

This can then be compiled into a cdylib.

Note: Up until now it hasn't mattered whether you compile as a dynamic library or a static one. However, for loading a library on the fly you must compile it as a cdylib.

$ rustc --crate-type cdylib adder.rs

The symbols exported by this dynamic library can now be inspected using the nm tool from GNU binutils.

$ nm libadder.so | grep 'add'
00000000000005f0 T add

As you can see, the add function is exposed and fully accessible to other programs.

Loading the Contrived Example

Loading a function from this library and calling it is surprisingly easy. The key is to use something like the libloading crate, which abstracts over the various mechanisms provided by the operating system for dynamically loading a library.


# #![allow(unused_variables)]
#fn main() {
extern crate libloading;

use std::env;
use libloading::{Library, Symbol};
#}

It's also a good idea to add a type alias for the add() function's signature. This isn't required, but when signatures start getting more complex and take more interesting arguments, the extra readability really pays off.


# #![allow(unused_variables)]
#fn main() {
type AddFunc = unsafe extern "C" fn(isize, isize) -> isize;
#}

The main() function takes the DLL as its first command line argument:

fn main() {
    let library_path = env::args().nth(1).expect("USAGE: loading <LIB>");
    println!("Loading add() from {}", library_path);

Next we load the library and get a symbol, casting the function pointer so it has the desired signature.


# #![allow(unused_variables)]
#fn main() {
    let lib = Library::new(library_path).unwrap();

    unsafe {
        let func: Symbol<AddFunc> = lib.get(b"add").unwrap();
#}

Then you can finally call the imported function.


# #![allow(unused_variables)]
#fn main() {
        let answer = func(1, 2);
        println!("1 + 2 = {}", answer);
    }
}
#}

Now compiling and running with cargo gives exactly what we'd expect:

$ cargo run -- ../libadder.so
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/loading ../libadder.so`
Loading add() from ../libadder.so
1 + 2 = 3

The entire main.rs looks like this:

extern crate libloading;

use std::env;
use libloading::{Library, Symbol};

type AddFunc = unsafe extern "C" fn(isize, isize) -> isize;

fn main() {
    let library_path = env::args().nth(1).expect("USAGE: loading <LIB>");
    println!("Loading add() from {}", library_path);

    let lib = Library::new(library_path).unwrap();

    unsafe {
        let func: Symbol<AddFunc> = lib.get(b"add").unwrap();

        let answer = func(1, 2);
        println!("1 + 2 = {}", answer);
    }
}

Setting Up Plugins

Now that we have a better understanding of how loading a library on the fly works, we can start adding plugins to our application.

First we'll define a Plugin trait which all plugins must implement. This has been copied pretty much verbatim from the beginning of the chapter.


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

use std::ffi::OsStr;
use std::any::Any;
use libloading::{Library, Symbol};

use errors::*;
use {Request, Response};


/// A plugin which allows you to add extra functionality to the REST client.
pub trait Plugin: Any + Send + Sync {
    /// Get a name describing the `Plugin`.
    fn name(&self) -> &'static str;
    /// A callback fired immediately after the plugin is loaded. Usually used 
    /// for initialization.
    fn on_plugin_load(&self) {}
    /// A callback fired immediately before the plugin is unloaded. Use this if
    /// you need to do any cleanup.
    fn on_plugin_unload(&self) {}
    /// Inspect (and possibly mutate) the request before it is sent.
    fn pre_send(&self, _request: &mut Request) {}
    /// Inspect and/or mutate the received response before it is displayed to
    /// the user.
    fn post_receive(&self, _response: &mut Response) {}
}
#}

This is all pretty standard. Notice that the Plugin must be sendable between threads and that all callbacks take &self instead of &mut self, meaning any mutation must be done using interior mutability. The Send + Sync bound also means you need to use the appropriate synchronisation mechanisms (e.g. a Mutex).
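As a quick, hypothetical illustration, a plugin that counts outgoing requests would need to hide its counter behind a Mutex, because pre_send() only hands it a shared reference:

// Hypothetical example -- not part of the client crate.
use std::sync::Mutex;

#[derive(Default)]
struct RequestCounter {
    count: Mutex<usize>,
}

impl Plugin for RequestCounter {
    fn name(&self) -> &'static str {
        "Request Counter"
    }

    fn pre_send(&self, _request: &mut Request) {
        // `pre_send()` only gives us `&self`, so the counter is updated
        // through interior mutability.
        let mut count = self.count.lock().unwrap();
        *count += 1;
        println!("sending request #{}", *count);
    }
}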

We also define a convenience macro that users can call to export their Plugin in a safe manner. This just declares a new extern "C" function called _plugin_create() which will call the constructor and return a new boxed Plugin.


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

/// Declare a plugin type and its constructor.
///
/// # Notes
///
/// This works by automatically generating an `extern "C"` function with a
/// pre-defined signature and symbol name. Therefore you will only be able to
/// declare one plugin per library.
#[macro_export]
macro_rules! declare_plugin {
    ($plugin_type:ty, $constructor:path) => {
        #[no_mangle]
        pub extern "C" fn _plugin_create() -> *mut $crate::Plugin {
            // make sure the constructor is the correct type.
            let constructor: fn() -> $plugin_type = $constructor;

            let object = constructor();
            let boxed: Box<$crate::Plugin> = Box::new(object);
            Box::into_raw(boxed)
        }
    };
}
#}

Another thing we're going to need is a way to manage plugins and make sure they are called at the appropriate time. This is usually done with a PluginManager.

Something we need to keep in mind is that any Library we load will need to outlive our plugins. This is because the library contains the code backing the various Plugin methods, so if the Library is dropped too early our plugins' vtables could end up pointing at garbage... which would be bad.

First, let's add the struct definition and a constructor:


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

pub struct PluginManager {
    plugins: Vec<Box<Plugin>>,
    loaded_libraries: Vec<Library>,
}

impl PluginManager {
    pub fn new() -> PluginManager {
        PluginManager {
            plugins: Vec::new(),
            loaded_libraries: Vec::new(),
        }
    }
#}

Next comes the actual plugin loading part. Make sure to add libloading as a dependency in client's Cargo.toml; then we can use it to dynamically load the plugin and call the _plugin_create() function. We also need to make sure the on_plugin_load() callback is fired so the plugin has a chance to do any necessary initialization.
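The dependency line looks something like this (the version number shown is only illustrative; use whatever is current):

# client/Cargo.toml

[dependencies]
# ... existing dependencies ...
libloading = "0.4"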


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

    pub unsafe fn load_plugin<P: AsRef<OsStr>>(&mut self, filename: P) -> Result<()> {
        type PluginCreate = unsafe fn() -> *mut Plugin;

        let lib = Library::new(filename.as_ref()).chain_err(|| "Unable to load the plugin")?;

        // We need to keep the library around otherwise our plugin's vtable will
        // point to garbage. We do this little dance to make sure the library
        // doesn't end up getting moved.
        self.loaded_libraries.push(lib);

        let lib = self.loaded_libraries.last().unwrap();

        let constructor: Symbol<PluginCreate> = lib.get(b"_plugin_create")
            .chain_err(|| "The `_plugin_create` symbol wasn't found.")?;
        let boxed_raw = constructor();

        let plugin = Box::from_raw(boxed_raw);
        debug!("Loaded plugin: {}", plugin.name());
        plugin.on_plugin_load();
        self.plugins.push(plugin);


        Ok(())
    }
#}

Now that our PluginManager can load plugins, we need to make sure it has methods for firing the various plugin callbacks.


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

    /// Iterate over the plugins, running their `pre_send()` hook.
    pub fn pre_send(&mut self, request: &mut Request) {
        debug!("Firing pre_send hooks");

        for plugin in &mut self.plugins {
            trace!("Firing pre_send for {:?}", plugin.name());
            plugin.pre_send(request);
        }
    }

    /// Iterate over the plugins, running their `post_receive()` hook.
    pub fn post_receive(&mut self, response: &mut Response) {
        debug!("Firing post_receive hooks");

        for plugin in &mut self.plugins {
            trace!("Firing post_receive for {:?}", plugin.name());
            plugin.post_receive(response);
        }
    }

    /// Unload all plugins and loaded plugin libraries, making sure to fire 
    /// their `on_plugin_unload()` methods so they can do any necessary cleanup.
    pub fn unload(&mut self) {
        debug!("Unloading plugins");

        for plugin in self.plugins.drain(..) {
            trace!("Firing on_plugin_unload for {:?}", plugin.name());
            plugin.on_plugin_unload();
        }

        for lib in self.loaded_libraries.drain(..) {
            drop(lib);
        }
    }
}
#}

Those last three methods should be fairly self-explanatory.

Something else we may want to do is add a Drop impl so that our plugins are always unloaded when the PluginManager gets dropped. This gives them a chance to do any necessary cleanup.


# #![allow(unused_variables)]
#fn main() {
// client/src/plugins.rs

impl Drop for PluginManager {
    fn drop(&mut self) {
        if !self.plugins.is_empty() || !self.loaded_libraries.is_empty() {
            self.unload();
        }
    }
}
#}

A thing to keep in mind is something called panic-on-drop. If the program is panicking it'll unwind the stack, calling destructors as it goes. However, because our PluginManager tries to unload plugins if it hasn't already, a Plugin whose on_plugin_unload() method also panics will trigger a second panic while the first is still unwinding. This aborts the entire program, because at that point your program is most probably FUBAR.

To prevent this, we'll want to make sure the C++ code explicitly unloads the plugin manager before destroying it.

Writing C++ Bindings

As usual, once we've added a piece of functionality to the core Rust crate we'll need to expose it to C++ in our ffi module, then add the C++ bindings to wrappers.cpp.

Writing FFI bindings should be quite familiar by now. All you are doing is converting raw pointers into references, then calling a method.


# #![allow(unused_variables)]
#fn main() {
// client/src/ffi.rs

use PluginManager;

...

/// Create a new `PluginManager`.
#[no_mangle]
pub extern "C" fn plugin_manager_new() -> *mut PluginManager {
    Box::into_raw(Box::new(PluginManager::new()))
}

/// Destroy a `PluginManager` once you are done with it.
#[no_mangle]
pub unsafe extern "C" fn plugin_manager_destroy(pm: *mut PluginManager) {
    if !pm.is_null() {
        let pm = Box::from_raw(pm);
        drop(pm);
    }
}

/// Unload all loaded plugins.
#[no_mangle]
pub unsafe extern "C" fn plugin_manager_unload(pm: *mut PluginManager) {
    let pm = &mut *pm;
    pm.unload();
}

/// Fire the `pre_send` plugin hooks.
#[no_mangle]
pub unsafe extern "C" fn plugin_manager_pre_send(pm: *mut PluginManager, request: *mut Request) {
    let pm = &mut *pm;
    let request = &mut *request;
    pm.pre_send(request);
}

/// Fire the `post_receive` plugin hooks.
#[no_mangle]
pub unsafe extern "C" fn plugin_manager_post_receive(
    pm: *mut PluginManager,
    response: *mut Response,
) {
    let pm = &mut *pm;
    let response = &mut *response;
    pm.post_receive(response);
}

#}

Plugin loading is a bit more interesting because we need to convert a *const c_char into a &str, but other than that it's all pretty straightforward.


# #![allow(unused_variables)]
#fn main() {
// client/src/ffi.rs

#[no_mangle]
pub unsafe extern "C" fn plugin_manager_load_plugin(
    pm: *mut PluginManager,
    filename: *const c_char,
) -> c_int {
    let pm = &mut *pm;
    let filename = CStr::from_ptr(filename);
    let filename_as_str = match filename.to_str() {
        Ok(s) => s,
        Err(_) => {
            // TODO: proper error handling
            return -1;
        }
    };

    // TODO: proper error handling and catch_unwind
    match pm.load_plugin(filename_as_str) {
        Ok(_) => 0,
        Err(_) => -1,
    }
}
#}

Next we need to add a PluginManager wrapper class to our wrappers.hpp. We also need to declare PluginManager as a friend of Request and Response so it can access their raw pointers.

// gui/wrappers.hpp

class Request {
  friend class PluginManager;
  ...
};

class Response {
  friend class PluginManager;
  ...
};

class PluginManager {
public:
  PluginManager();
  ~PluginManager();
  void unload();
  void pre_send(Request& req);
  void post_receive(Response& res);

private:
  ffi::PluginManager *raw;
};

Similar to when we were writing the Rust FFI bindings, on the C++ side you just need to make sure the arguments are in the right shape before deferring to the corresponding functions.

// gui/wrappers.cpp

PluginManager::PluginManager() { raw = ffi::plugin_manager_new(); }

PluginManager::~PluginManager() { ffi::plugin_manager_destroy(raw); }

void PluginManager::unload() { ffi::plugin_manager_unload(raw); }

void PluginManager::pre_send(Request& req) {
  ffi::plugin_manager_pre_send(raw, req.raw);
}

void PluginManager::post_receive(Response& res) {
  ffi::plugin_manager_post_receive(raw, res.raw);
}

Hooking Up The Plugin Manager

Now that our PluginManager is finally accessible from the GUI we can thread it through the request sending process.

First we'll need to add the PluginManager to our main window.

// gui/main_window.hpp

#include "wrappers.hpp"

...

class MainWindow : public QMainWindow {
  ...

private:
  ...
  PluginManager pm;
};

Next we need to make sure that whenever we send a request we also pass it to the plugin manager so it can do the appropriate pre/post processing.

...

pm.pre_send(req);
Response res = req.send();
pm.post_receive(res);

...

We also want to make sure that plugins are unloaded when the window is closed; the easiest way to do this is to override MainWindow's closeEvent() method.

To do this we update the main_window.hpp header file:

// gui/main_window.hpp

class MainWindow : public QMainWindow {
  ...

protected:
  void closeEvent(QCloseEvent *event) override;

  ...
};

Then add the implementation to main_window.cpp.

// gui/main_window.cpp

void MainWindow::closeEvent(QCloseEvent *event) {
  pm.unload();
  QMainWindow::closeEvent(event);
}

Now that the plugin manager is plumbed into the existing request pipeline, we need a way of actually loading plugins at runtime. We'll use a simple file dialog and button for this.

TODO: Once the main UI is done, step through adding a "load plugin" button and hooking it up to the plugin manager.
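In the meantime, here is a rough sketch of what the button's click handler might look like. It assumes a load_plugin() method has been added to the C++ PluginManager wrapper (forwarding to ffi::plugin_manager_load_plugin()), and the slot name is made up:

// gui/main_window.cpp (illustrative; needs QFileDialog and QMessageBox includes)

void MainWindow::onLoadPluginClicked() {
  QString filename = QFileDialog::getOpenFileName(
      this, "Select a plugin", "", "Shared Libraries (*.so *.dll *.dylib)");

  if (filename.isEmpty()) {
    return;
  }

  try {
    pm.load_plugin(filename.toStdString());
  } catch (const WrapperException& e) {
    QMessageBox::warning(this, "Error", e.what());
  }
}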

Let's Make A Plugin

Now that we have all the plugin infrastructure set up, let's actually make (and load) a plugin! This plugin will inject a special header into each request; then, if that header is also present in the response, we'll remove it so it's not viewable by the end user.

First, let's create a new library.

$ cargo new injector-plugin

We also want to update the Cargo.toml to depend on the client library and generate a cdylib so it's loadable by our plugin manager. While we're at it, add the log crate so we can log what's happening.

# injector-plugin/Cargo.toml

[package]
name = "injector-plugin"
version = "0.1.0"
authors = ["Michael Bryan <michaelfbryan@gmail.com>"]
+ description = "A plugin which will stealthily inject a special header into your requests."

[dependencies]
+ log = "0.3.8"
+ client = { path = "../client"}
+
+ [lib]
+ crate-type = ["cdylib", "rlib"]

We also want to add a cmake build rule so the injector-plugin crate is built along with the rest of the project. The CMakeLists.txt file for this crate is almost identical to the one we wrote for client, so just copy it across and change the relevant names.

$ cp ./client/CMakeLists.txt ./injector-plugin/CMakeLists.txt

Don't forget to make sure cmake includes the injector-plugin directory!

# ./CMakeLists.txt

add_subdirectory(client)
+ add_subdirectory(injector-plugin)
add_subdirectory(gui)

Because the plugin links against the client library the Rust way, we also need to adjust client's Cargo.toml so the crate is built as an rlib as well.

# client/Cargo.toml

...

[lib]
crate-type = ["cdylib", "rlib"]

Then do a quick build as a sanity check to make sure everything still compiles.

$ mkdir build && cd build
$ cmake -DCMAKE_BUILD_TYPE=Debug ..
$ make

...

The plugin body itself isn't overly interesting.


# #![allow(unused_variables)]
#fn main() {
// injector-plugin/src/lib.rs

#[macro_use]
extern crate log;
#[macro_use]
extern crate client;

use std::str;
use client::{Request, Response, Plugin};


#[derive(Debug, Default)]
pub struct Injector;

impl Plugin for Injector {
    fn name(&self) -> &'static str  {
        "Header Injector"
    }

    fn on_plugin_load(&self) {
        info!("Injector loaded");
    }

    fn on_plugin_unload(&self) {
        info!("Injector unloaded");
    }

    fn pre_send(&self, req: &mut Request) {
        req.headers.set_raw("some-dodgy-header", "true");
        debug!("Injected header into Request, {:?}", req);
    }

    fn post_receive(&self, res: &mut Response) {
        debug!("Received Response");
        debug!("Headers: {:?}", res.headers);
        if res.body.len() < 100 && log_enabled!(log::LogLevel::Debug) {
            if let Ok(body) = str::from_utf8(&res.body) {
                debug!("Body: {:?}", body);
            }
        }
        res.headers.remove_raw("some-dodgy-header");
    }
}
#}

Finally, to make this plugin library actually work we need to call the declare_plugin!() macro.


# #![allow(unused_variables)]
#fn main() {
// injector-plugin/src/lib.rs

declare_plugin!(Injector, Injector::default);
#}

If you then compile this and inspect it with our trusty nm tool you'll see that the library contains our _plugin_create symbol.

$ cd build
$ make
$ nm injector-plugin/libinjector_plugin.so | grep ' T '

...
0000000000030820 T _plugin_create
...

Running The Plugin

Now that we've got a plugin and everything is hooked up to the GUI, we can try it out and benefit from all the hard work put in so far.

Make sure to do one last compile,

$ cd build
$ make

Then run the GUI and load the plugin from build/injector-plugin/libinjector_plugin.so. To see what headers are sent you can send a GET request to http://httpbin.org/get. With any luck you should see something like this:

$ RUST_LOG=client=debug,injector_plugin=debug ./gui/gui

DEBUG:client::ffi: Loading plugin, "/home/michael/Documents/ffi-guide/build/injector-plugin/libinjector_plugin.so"
DEBUG:client::plugins: Loaded plugin: Header Injector
INFO:injector_plugin: Injector loaded
Creating the request
Sending Request
DEBUG:client::plugins: Firing pre_send hooks
DEBUG:injector_plugin: Injected header into Request, Request { destination: "http://httpbin.org/get", method: Get, headers: {"some-dodgy-header": "true"}, cookies: CookieJar { original_cookies: {}, delta_cookies: {} }, body: None }
INFO:client: Sending a GET request to http://httpbin.org/get
DEBUG:client: Sending 1 Headers
DEBUG:client: 	some-dodgy-header: true
DEBUG:client::ffi: Received Response
DEBUG:client::plugins: Firing post_receive hooks
DEBUG:injector_plugin: Received Response
DEBUG:injector_plugin: Headers: {"Connection": "keep-alive", "Server": "meinheld/0.6.1", "Date": "Tue, 07 Nov 2017 14:29:39 GMT", "Content-Type": "application/json", "Access-Control-Allow-Origin": "*", "Access-Control-Allow-Credentials": "true", "X-Powered-By": "Flask", "X-Processed-Time": "0.000864028930664", "Content-Length": "303", "Via": "1.1 vegur"}
Received Response
Body:
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip", 
    "Connection": "close", 
    "Cookie": "", 
    "Host": "httpbin.org", 
    "Some-Dodgy-Header": "true", 
    "User-Agent": "reqwest/0.8.0"
  }, 
  "origin": "122.151.115.164", 
  "url": "http://httpbin.org/get"
}

DEBUG:client::plugins: Unloading plugins
INFO:injector_plugin: Injector unloaded

Now if you look very carefully you'll see that the plugin was indeed fired at the correct time, and httpbin's reply shows that Some-Dodgy-Header was included in our request headers. If you've stayed with us up to this point then give yourself a pat on the back; you've just accomplished one of the most difficult FFI tasks possible!

If dynamic loading is still confusing you, you may want to check out some of these links:

Break All The Things!!1!

What would a guide on unsafe Rust be without an exploration into how (not) to abuse it? This section takes a bit of a detour from the rest of the document in that we'll explicitly be trying to break things in as many ways as possible.

The exercise works as follows: each problem contains the source code for a small program which deliberately does something horribly wrong, incorrect, or dangerous (memory safety, data races, undefined behaviour, that sort of thing). It's then your job to figure out what the issue is and why it could end up hurting your application.

Problems

Problem 1

Here's an easy one to get you started. It contains a Rust library:


# #![allow(unused_variables)]
#fn main() {
// adder.rs

pub extern "C" fn add(a: u32, b: u32) -> u32 {
    a + b
}
#}

And a C/C++ program which uses it.

// main.cpp

#include <iostream>
#include <cstdint>

extern "C" {
    uint32_t add(uint32_t, uint32_t);
}

int main() {
    uint32_t a = 5, b = 10;

    uint32_t sum = add(a, b);
    std::cout << "The sum of " << a 
              << " and " << b 
              << " is " << sum 
              << std::endl;
}

Building and running:

$ rustc --crate-type cdylib adder.rs
$ clang++ -std=c++14 -c main.cpp
$ clang++ -std=c++14 -o main -L. -ladder main.o
$ ./main

Problem 2

This problem is similar to the previous one in that it has a Rust library called by a C++ program.


# #![allow(unused_variables)]
#fn main() {
// foo.rs

#[no_mangle]
pub extern "C" fn foo() {
    panic!("Oops...");
}
#}

The main program:

// main.cpp

extern "C" {
    void foo();
}

int main() {
    foo();
}

Compiling and running is also pretty similar:

$ rustc --crate-type cdylib foo.rs
$ clang++ -std=c++14 -c main.cpp
$ clang++ -std=c++14 -o main -L. -lfoo main.o
$ ./main

Problem 3


# #![allow(unused_variables)]
#fn main() {
// home.rs

use std::ffi::CString;
use std::env;
use std::ptr;
use std::os::raw::c_char;


#[no_mangle]
pub extern "C" fn home_directory() -> *const c_char {
    let home = match env::home_dir() {
        Some(p) => p,
        None => return ptr::null(),
    };

    let c_string = match CString::new(home.to_string_lossy().as_bytes()) {
        Ok(s) => s,
        Err(_) => return ptr::null(),
    };

    c_string.as_ptr()
}
#}

// main.cpp

#include <iostream>

extern "C" {
    char *home_directory();
}

int main() {
    char* home = home_directory();

    if (home == nullptr) {
        std::cout << "Unable to find the home directory" << std::endl;
    } else {
        std::cout << "Home directory is " << home << std::endl; 
    }
}

Compiling and running:

$ rustc --crate-type cdylib home.rs
$ clang++ -std=c++14 -c main.cpp
$ clang++ -std=c++14 -o main -L. -lhome main.o
$ ./main

Problem 4


# #![allow(unused_variables)]
#fn main() {
// logging.rs

use std::os::raw::c_char;
use std::ffi::CStr;


#[derive(Debug, Copy, Clone, PartialEq)]
#[repr(C)]
pub enum LogLevel {
    Off = 0x00,
    Error = 0x01,
    Warn = 0x02,
    Info = 0x04,
    Debug = 0x08,
    Trace = 0x0a,
}

#[no_mangle]
pub unsafe extern "C" fn log_message(level: LogLevel, message: *const c_char) {
    if level == LogLevel::Off {
        return;
    }

    let message = CStr::from_ptr(message);
    eprintln!("{:?}: {}", level, message.to_string_lossy());
}
#}

// main.cpp

#include <iostream>
#include <string>

extern "C" {
void log_message(int, const char *);
}

int main() {
  std::string message = "Hello World";
  log_message(0x04 | 0x01, message.c_str());
}

Solutions

Problem 1

TODO: Explore name mangling and the #[no_mangle] attribute

The Cherno has a video which explains the linker quite well.
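In the meantime, a sketch of the likely fix: without #[no_mangle], rustc mangles the symbol name, so the linker can't find a plain add to link against.

// adder.rs (fixed)

#[no_mangle]
pub extern "C" fn add(a: u32, b: u32) -> u32 {
    a + b
}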

Problem 2

TODO: mention exception safety and why unwinding across the FFI behaviour is a bad idea
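One common mitigation (sketch only) is to catch the panic before it can unwind across the FFI boundary, for example with std::panic::catch_unwind:

// foo.rs (sketch)

use std::panic;

#[no_mangle]
pub extern "C" fn foo() {
    // Stop the panic at the FFI boundary instead of letting it unwind into
    // C++, which is undefined behaviour.
    let result = panic::catch_unwind(|| {
        panic!("Oops...");
    });

    if result.is_err() {
        eprintln!("foo() panicked");
    }
}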

Problem 3

TODO: Talk about why Cstring.as_ptr() is unsafe because you are passing back a dangling pointer.
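One possible fix (sketch only) is to transfer ownership of the string to the caller with CString::into_raw(), together with a matching function for giving it back so Rust can free it:

// home.rs (sketch)

use std::env;
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

#[no_mangle]
pub extern "C" fn home_directory() -> *mut c_char {
    let home = match env::home_dir() {
        Some(p) => p,
        None => return ptr::null_mut(),
    };

    match CString::new(home.to_string_lossy().as_bytes()) {
        // into_raw() hands ownership to the caller instead of returning a
        // pointer into a temporary which is freed when this function returns.
        Ok(s) => s.into_raw(),
        Err(_) => ptr::null_mut(),
    }
}

/// The caller must pass the string back so Rust can free it.
#[no_mangle]
pub unsafe extern "C" fn home_directory_free(s: *mut c_char) {
    if !s.is_null() {
        drop(CString::from_raw(s));
    }
}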

Problem 4

TODO: Talk about how it's UB to create a Rust enum from an integer with an invalid variant
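A sketch of one way to avoid this: accept a plain integer at the FFI boundary and validate it before treating it as a log level, rather than letting C++ hand us an arbitrary bit pattern (0x04 | 0x01 is not a valid variant):

// logging.rs (sketch)

use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

#[no_mangle]
pub unsafe extern "C" fn log_message(level: c_int, message: *const c_char) {
    // Map the raw integer onto a known level; anything else is rejected
    // instead of being reinterpreted as an invalid enum value (which is UB).
    let level = match level {
        0x00 => return, // Off
        0x01 => "Error",
        0x02 => "Warn",
        0x04 => "Info",
        0x08 => "Debug",
        0x0a => "Trace",
        _ => {
            eprintln!("unknown log level: {}", level);
            return;
        }
    };

    let message = CStr::from_ptr(message);
    eprintln!("{}: {}", level, message.to_string_lossy());
}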