Development Approaches
When building Ruby extensions with Rust and rb-sys, you have two main approaches to choose from:
- Direct rb-sys usage: Working directly with Ruby's C API through the rb-sys bindings
- Higher-level wrappers: Using libraries like Magnus that build on top of rb-sys
This chapter will help you understand when to use each approach and how to mix them when needed.
Direct rb-sys Usage
The rb-sys crate provides low-level bindings to Ruby's C API. This approach gives you complete control over how your Rust code interacts with Ruby.
When to Use Direct rb-sys
- When you need maximum control over Ruby VM interaction
- For specialized extensions that need access to low-level Ruby internals
- When performance is absolutely critical and you need to eliminate any overhead
- When implementing functionality not yet covered by higher-level wrappers
Example: Simple Extension with Direct rb-sys
Here's a simple example of a Ruby extension using direct rb-sys:
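use rb_sys::{
    rb_define_module, rb_define_module_function, rb_str_new_cstr,
    rb_string_value_cstr, VALUE,
};
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Helper macro for creating NUL-terminated C string literals
macro_rules! cstr {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

// Reverse a string. Ruby passes the receiver first, then the argument.
unsafe extern "C" fn reverse(_: VALUE, s: VALUE) -> VALUE {
    // rb_string_value_cstr takes a pointer to the VALUE, so work on a mutable copy.
    // It raises a Ruby exception if `s` is not a String or contains a NUL byte.
    let mut s_copy = s;
    let c_str = rb_string_value_cstr(&mut s_copy);

    // Fall back to an empty Ruby string if the bytes are not valid UTF-8
    let rust_str = match CStr::from_ptr(c_str).to_str() {
        Ok(s) => s,
        Err(_) => return rb_str_new_cstr(c"".as_ptr()),
    };

    // Reverse by Unicode scalar value
    let reversed = rust_str.chars().rev().collect::<String>();

    // Convert back to a NUL-terminated string for the C API
    let c_string = match CString::new(reversed) {
        Ok(s) => s,
        Err(_) => return rb_str_new_cstr(c"".as_ptr()),
    };

    // rb_str_new_cstr copies the bytes, so c_string can be dropped afterwards
    rb_str_new_cstr(c_string.as_ptr())
}

// Module initialization function. Ruby looks up Init_<extension name> when
// the compiled library is required.
#[no_mangle]
pub extern "C" fn Init_string_utils() {
    unsafe {
        let module = rb_define_module(cstr!("StringUtils"));
        rb_define_module_function(
            module,
            cstr!("reverse"),
            // The C API takes an untyped function pointer; the last argument is the arity
            Some(std::mem::transmute::<unsafe extern "C" fn(VALUE, VALUE) -> VALUE, _>(reverse)),
            1,
        );
    }
}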
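The string handling inside reverse can be exercised without a Ruby VM. The following is a minimal sketch using only the standard library; reverse_c_string is a hypothetical helper introduced for illustration and is not part of rb-sys. It shows the same CStr to String to CString round trip, and that the cstr! macro yields a pointer to NUL-terminated bytes.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Same helper macro as in the extension above
macro_rules! cstr {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

/// Hypothetical helper (not an rb-sys function): reverse a NUL-terminated
/// C string, returning None if the bytes are not valid UTF-8.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string.
unsafe fn reverse_c_string(ptr: *const c_char) -> Option<CString> {
    // Borrow the C bytes and validate them as UTF-8
    let input = unsafe { CStr::from_ptr(ptr) }.to_str().ok()?;
    // Reverse by Unicode scalar value, as the extension does
    let reversed: String = input.chars().rev().collect();
    // CString::new fails only if the text contains an interior NUL byte
    CString::new(reversed).ok()
}
```

Keeping the conversion logic in a function like this, separate from the extern "C" entry point, is one way to unit test it with plain cargo test. Note that reversing by chars() can split multi-codepoint grapheme clusters such as emoji with modifiers.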
Using rb_thread_call_without_gvl for Performance
When performing computationally intensive operations, release Ruby's Global VM Lock (GVL) so that other threads can run. The rb_thread_call_without_gvl function provides this capability. The helper below wraps it and uses Magnus only for its Error type:
use magnus::{Error, Ruby};
use rb_sys::rb_thread_call_without_gvl;
use std::{ffi::c_void, panic::{self, AssertUnwindSafe}, ptr::null_mut};

/// Execute a function without holding the Global VM Lock (GVL).
/// This allows other Ruby threads to run while performing CPU-intensive tasks.
///
/// # Safety
///
/// The passed function must not interact with the Ruby VM or Ruby objects
/// as it runs without the GVL, which is required for safe Ruby operations.
///
/// # Returns
///
/// Returns the result of the function or a magnus::Error if the function panics.
pub fn nogvl<F, R>(func: F) -> Result<R, Error>
where
    F: FnOnce() -> R,
    R: Send + 'static,
{
    struct CallbackData<F, R> {
        func: Option<F>,
        result: Option<Result<R, String>>, // Store either the result or a panic message
    }

    extern "C" fn call_without_gvl<F, R>(data: *mut c_void) -> *mut c_void
    where
        F: FnOnce() -> R,
        R: Send + 'static,
    {
        // Safety: We know this pointer is valid because we just created it below
        let data = unsafe { &mut *(data as *mut CallbackData<F, R>) };

        // Use take() to move out of the Option, ensuring we don't try to run the function twice
        if let Some(func) = data.func.take() {
            // Use panic::catch_unwind to prevent Ruby process termination if the Rust code panics
            match panic::catch_unwind(AssertUnwindSafe(func)) {
                Ok(result) => data.result = Some(Ok(result)),
                Err(panic_info) => {
                    // Convert panic info to a string message
                    let panic_msg = if let Some(s) = panic_info.downcast_ref::<&'static str>() {
                        s.to_string()
                    } else if let Some(s) = panic_info.downcast_ref::<String>() {
                        s.clone()
                    } else {
                        "Unknown panic occurred in Rust code".to_string()
                    };
                    data.result = Some(Err(panic_msg));
                }
            }
        }

        null_mut()
    }

    // Create a data structure to pass the function and receive the result
    let mut data = CallbackData {
        func: Some(func),
        result: None,
    };

    unsafe {
        // Release the GVL and call our function
        rb_thread_call_without_gvl(
            Some(call_without_gvl::<F, R>),
            &mut data as *mut _ as *mut c_void,
            None, // No unblock function
            null_mut(),
        );
    }

    // Extract the result or create an error if the function failed
    match data.result {
        Some(Ok(result)) => Ok(result),
        Some(Err(panic_msg)) => {
            // Convert the panic message to a Ruby RuntimeError
            let ruby = unsafe { Ruby::get_unchecked() };
            Err(Error::new(
                ruby.exception_runtime_error(),
                format!("Rust panic in nogvl: {}", panic_msg),
            ))
        }
        None => {
            // This should never happen if the callback runs, but handle it anyway
            let ruby = unsafe { Ruby::get_unchecked() };
            Err(Error::new(
                ruby.exception_runtime_error(),
                "nogvl function was not executed",
            ))
        }
    }
}
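The core of nogvl is a trampoline: a generic extern "C" function that receives a type-erased pointer, runs the closure, and stores either the result or the panic message. The sketch below isolates that pattern using only the standard library. It assumes a hypothetical fake_call_without_gvl that simply invokes the callback in place of rb_thread_call_without_gvl, so no lock is actually released; run_captured is likewise an illustrative name, not an rb-sys or Magnus API.

```rust
use std::ffi::c_void;
use std::panic::{self, AssertUnwindSafe};
use std::ptr::null_mut;

// Carries the closure in and the outcome out through a single raw pointer
struct CallbackData<F, R> {
    func: Option<F>,
    result: Option<Result<R, String>>,
}

// C-compatible entry point; monomorphized per closure type
extern "C" fn trampoline<F, R>(data: *mut c_void) -> *mut c_void
where
    F: FnOnce() -> R,
{
    // Safety: the caller passes a pointer to a live CallbackData<F, R>
    let data = unsafe { &mut *(data as *mut CallbackData<F, R>) };
    if let Some(func) = data.func.take() {
        // Catch panics so they never unwind across the C boundary
        let outcome = panic::catch_unwind(AssertUnwindSafe(func)).map_err(|payload| {
            if let Some(s) = payload.downcast_ref::<&'static str>() {
                s.to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic".to_string()
            }
        });
        data.result = Some(outcome);
    }
    null_mut()
}

// Hypothetical stand-in for rb_thread_call_without_gvl: it only calls the callback
unsafe fn fake_call_without_gvl(
    callback: extern "C" fn(*mut c_void) -> *mut c_void,
    data: *mut c_void,
) -> *mut c_void {
    callback(data)
}

// Run a closure through the trampoline and report panics as Err(message)
fn run_captured<F, R>(func: F) -> Result<R, String>
where
    F: FnOnce() -> R,
{
    let mut data = CallbackData { func: Some(func), result: None };
    unsafe {
        fake_call_without_gvl(trampoline::<F, R>, &mut data as *mut _ as *mut c_void);
    }
    data.result
        .unwrap_or_else(|| Err("callback was not executed".to_string()))
}
```

With this in place, run_captured(|| 2 + 2) yields Ok(4), while a closure that panics yields an Err holding the panic message instead of aborting the process.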
How Direct rb-sys Works
When using rb-sys directly:
- You define C-compatible functions with the extern "C" calling convention
- You manually convert between Ruby's VALUE type and Rust types
- You're responsible for memory management and type safety
- You must use the #[no_mangle] attribute on the initialization function so Ruby can find it
- All interactions with Ruby data happen through raw pointers and unsafe code
Higher-level Wrappers (Magnus)
Magnus provides a more ergonomic, Rust-like API on top of rb-sys. It handles many of the unsafe aspects of Ruby integration for you.
When to Use Magnus
- For most standard Ruby extensions where ease of development is important
- When you want to avoid writing unsafe code
- When you want idiomatic Rust error handling
- For extensions with complex type conversions
- When working with Ruby classes and objects in an object-oriented way
Example: Simple Extension with Magnus
Let's look at a simple example using Magnus, based on real-world usage patterns:
use magnus::{function, prelude::*, Error, Ruby};

fn hello(subject: String) -> String {
    format!("Hello from Rust, {subject}!")
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("StringUtils")?;
    module.define_singleton_method("hello", function!(hello, 1))?;
    Ok(())
}
A more complex example, modeled on a real-world project (lz4-flex-rb), defines error classes, method aliases, and a nested module:
use magnus::{function, prelude::*, Error, RString, Ruby};

// Placeholder functions for the example
fn compress(input: RString) -> Result<RString, Error> {
    // Compression implementation would go here
    Ok(input)
}

fn decompress(input: RString) -> Result<RString, Error> {
    // Decompression implementation would go here
    Ok(input)
}

fn compress_varint(input: RString) -> Result<RString, Error> {
    // VarInt compression implementation would go here
    Ok(input)
}

fn decompress_varint(input: RString) -> Result<RString, Error> {
    // VarInt decompression implementation would go here
    Ok(input)
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Lz4Flex")?;

    // Define error classes: Lz4Flex::Error and two subclasses
    let base_error = module.define_error("Error", ruby.exception_standard_error())?;
    module.define_error("EncodeError", base_error)?;
    module.define_error("DecodeError", base_error)?;

    // Define methods
    module.define_singleton_method("compress", function!(compress, 1))?;
    module.define_singleton_method("decompress", function!(decompress, 1))?;

    // Define aliases on the singleton class, since the methods are module-level
    module.singleton_class()?.define_alias("deflate", "compress")?;
    module.singleton_class()?.define_alias("inflate", "decompress")?;

    // Define nested module
    let varint_module = module.define_module("VarInt")?;
    varint_module.define_singleton_method("compress", function!(compress_varint, 1))?;
    varint_module.define_singleton_method("decompress", function!(decompress_varint, 1))?;

    Ok(())
}
How Magnus Works
Magnus builds on top of rb-sys and provides:
- Automatic type conversions between Ruby and Rust
- Rust-like error handling with Result types
- Memory safety through RAII patterns
- More ergonomic APIs for defining modules, classes, and methods
- A more familiar development experience for Rust programmers
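To make "automatic type conversions" concrete, here is a toy model of the idea using only the standard library. It is not how Magnus is implemented: ToyValue, FromToy, and call1 are hypothetical names invented for this sketch (Magnus's own conversion traits are TryConvert and IntoValue). The point is that a wrapper can convert a dynamic argument into the typed parameter a Rust function asks for, call the function, and convert the return value back, reporting a mismatch as an Err rather than crashing.

```rust
/// Toy stand-in for a dynamically typed Ruby value (hypothetical)
#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    Int(i64),
    Str(String),
}

/// Hypothetical trait: convert a dynamic value into a concrete Rust type
trait FromToy: Sized {
    fn from_toy(value: &ToyValue) -> Result<Self, String>;
}

impl FromToy for String {
    fn from_toy(value: &ToyValue) -> Result<Self, String> {
        match value {
            ToyValue::Str(s) => Ok(s.clone()),
            other => Err(format!("expected a string, got {other:?}")),
        }
    }
}

impl FromToy for i64 {
    fn from_toy(value: &ToyValue) -> Result<Self, String> {
        match value {
            ToyValue::Int(n) => Ok(*n),
            other => Err(format!("expected an integer, got {other:?}")),
        }
    }
}

// Conversions back from Rust types to the dynamic value
impl From<String> for ToyValue {
    fn from(s: String) -> Self {
        ToyValue::Str(s)
    }
}

impl From<i64> for ToyValue {
    fn from(n: i64) -> Self {
        ToyValue::Int(n)
    }
}

/// Adapt a typed one-argument function so it accepts and returns dynamic values
fn call1<A, R, F>(func: F, arg: &ToyValue) -> Result<ToyValue, String>
where
    A: FromToy,
    R: Into<ToyValue>,
    F: Fn(A) -> R,
{
    let typed_arg = A::from_toy(arg)?; // a type mismatch becomes an Err
    Ok(func(typed_arg).into())
}

// The same function as in the Magnus example above, with no conversion code in it
fn hello(subject: String) -> String {
    format!("Hello from Rust, {subject}!")
}
```

Calling call1(hello, &ToyValue::Str("Ruby".to_string())) returns the greeting wrapped in ToyValue::Str, while passing ToyValue::Int(1) returns an Err describing the mismatch. In a real wrapper that error would surface as a Ruby exception.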
When to Choose Each Approach
```
Direct rb-sys:
✅ Maximum performance
✅ Low-level Ruby VM control
✅ Fine-grained GVL management
✅ Version-specific behavior
❌ Lots of unsafe code
❌ Manual memory management
❌ More verbose type conversions
❌ Steeper learning curve
```
```
Magnus Wrapper:
✅ Higher developer productivity
✅ Better memory safety
✅ Ergonomic Ruby class integration
✅ Idiomatic Rust error handling
❌ Small performance overhead
❌ Less control over Ruby internals
❌ Slightly higher learning curve for Ruby devs
❌ Fewer GVL optimization opportunities
```
Mixing Approaches
You can also mix the two approaches when appropriate. Magnus provides access to the underlying rb-sys functionality when needed:
// AsRawValue is available when Magnus is built with its rb-sys feature
use magnus::{function, prelude::*, rb_sys::AsRawValue, Error, Ruby};
use std::os::raw::c_char;

fn high_level() -> String {
    "High level".to_string()
}

// Helper macro for C strings
macro_rules! cstr {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

unsafe extern "C" fn low_level(_: rb_sys::VALUE) -> rb_sys::VALUE {
    // Direct rb-sys implementation; rb_str_new_cstr copies the literal's bytes
    rb_sys::rb_str_new_cstr(c"Low level".as_ptr())
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("MixedExample")?;

    // Use Magnus for most things
    module.define_singleton_method("high_level", function!(high_level, 0))?;

    // Use rb-sys directly for special cases
    unsafe {
        rb_sys::rb_define_module_function(
            // as_raw() exposes the underlying VALUE without a transmute
            module.as_raw(),
            cstr!("low_level"),
            Some(std::mem::transmute::<unsafe extern "C" fn(rb_sys::VALUE) -> rb_sys::VALUE, _>(low_level)),
            0,
        );
    }

    Ok(())
}
Enabling rb-sys Feature in Magnus
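To access rb-sys through Magnus, enable the rb-sys feature. To call rb_sys functions directly, as the example above does, your crate also needs rb-sys as a dependency of its own:
# Cargo.toml
[dependencies]
magnus = { version = "0.7", features = ["rb-sys"] }
rb-sys = "0.9" # assumed version; use the release compatible with your Magnus version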
Common Mixing Patterns
- Use Magnus for most functionality, rb-sys for specific optimizations:
  - Define your public API using Magnus for safety and ease
  - Drop down to rb-sys in critical performance paths, especially when using nogvl
- Use rb-sys for core functionality, Magnus for complex conversions:
  - Build core functionality with rb-sys for maximum control
  - Use Magnus for handling complex Ruby objects or collections
- Start with Magnus, optimize with rb-sys over time:
  - Begin development with Magnus for rapid progress
  - Profile your code and replace hot paths with direct rb-sys
Real-World Examples
Let's look at how real projects decide between these approaches:
Blake3-Ruby (Direct rb-sys)
Blake3-Ruby is a cryptographic hashing library that uses direct rb-sys to achieve maximum performance:
// Based on blake3-ruby - simplified example showing direct rb-sys usage
use rb_sys::{
    rb_define_module, rb_define_module_function,
    rb_str_new, VALUE, RSTRING_LEN, RSTRING_PTR,
};
use std::os::raw::{c_char, c_long};

// Helper macro for creating C strings
macro_rules! cstr {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

#[no_mangle]
pub extern "C" fn Init_digest_ext() {
    unsafe {
        // Create module hierarchy
        let digest_module = rb_define_module(cstr!("Digest"));

        // Define methods directly using rb-sys for maximum performance
        rb_define_module_function(
            digest_module,
            cstr!("simple_hash"),
            Some(std::mem::transmute::<unsafe extern "C" fn(VALUE, VALUE) -> VALUE, _>(rb_simple_hash)),
            1,
        );
    }
}

unsafe extern "C" fn rb_simple_hash(_klass: VALUE, string: VALUE) -> VALUE {
    // Extract data from Ruby VALUE. This simplified example assumes `string`
    // is a Ruby String; production code should check the type first.
    let data_ptr = RSTRING_PTR(string) as *const u8;
    let data_len = RSTRING_LEN(string) as usize;
    let data_slice = std::slice::from_raw_parts(data_ptr, data_len);

    // Simple hash calculation (just for demonstration)
    let mut hash: u32 = 0;
    for &byte in data_slice {
        hash = hash.wrapping_mul(31).wrapping_add(byte as u32);
    }

    // Convert hash to string
    let hash_str = format!("{:08x}", hash);
    let hash_bytes = hash_str.as_bytes();

    // Return result as Ruby string. The length parameter is a C long, which is
    // not i64 on every platform, so cast to c_long.
    rb_str_new(hash_bytes.as_ptr() as *const c_char, hash_bytes.len() as c_long)
}
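One way to keep a direct rb-sys extension testable is to separate the pure computation from the FFI wrapper. The sketch below extracts the demonstration hash from rb_simple_hash into a plain function that runs without a Ruby VM; simple_hash_hex is a hypothetical name chosen for this illustration and does not come from blake3-ruby.

```rust
/// Hypothetical pure-Rust core of the demonstration hash above:
/// a 32-bit multiply-by-31 rolling hash rendered as 8 hex digits.
fn simple_hash_hex(data: &[u8]) -> String {
    let mut hash: u32 = 0;
    for &byte in data {
        // wrapping_* keeps overflow well defined in both debug and release builds
        hash = hash.wrapping_mul(31).wrapping_add(u32::from(byte));
    }
    format!("{:08x}", hash)
}
```

For example, simple_hash_hex(b"ab") computes 97 * 31 + 98 = 3105, which is 0xc21, and returns "00000c21". The extern "C" function then only has to borrow the bytes from the Ruby string, call this function, and wrap the result with rb_str_new.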
LZ4-Flex-RB (Mixed Approach)
The LZ4-Flex-RB gem demonstrates a more sophisticated approach mixing Magnus with direct rb-sys calls:
// Based on lz4-flex-rb
// AsRawValue is available when Magnus is built with its rb-sys feature
use magnus::{function, prelude::*, rb_sys::AsRawValue, Error, RString, Ruby};
use rb_sys::{rb_str_locktmp, rb_str_unlocktmp, RSTRING_LEN, RSTRING_PTR};

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let module = ruby.define_module("Lz4Flex")?;

    // High-level API using Magnus
    module.define_singleton_method("compress", function!(compress, 1))?;
    module.define_singleton_method("decompress", function!(decompress, 1))?;

    Ok(())
}

// Functions that mix high-level Magnus with low-level rb-sys
fn compress(input: RString) -> Result<RString, Error> {
    let input_locked = LockedRString::new(input);
    let bufsize = lz4_flex::block::get_maximum_output_size(input_locked.as_slice().len());

    // Create output buffer
    let mut output_vec = vec![0u8; bufsize];

    // Compress the data
    let outsize = lz4_flex::block::compress_into(input_locked.as_slice(), &mut output_vec)
        .map_err(|e| Error::new(magnus::exception::standard_error(), e.to_string()))?;

    // Resize to actual output size
    output_vec.truncate(outsize);

    // Convert to Ruby string. Magnus only calls this function on a Ruby
    // thread, so obtaining the handle here is sound.
    let ruby = unsafe { Ruby::get_unchecked() };
    Ok(ruby.str_from_slice(&output_vec))
}

fn decompress(input: RString) -> Result<RString, Error> {
    let input_locked = LockedRString::new(input);

    // The block format does not record the uncompressed size, so an output
    // bound must be supplied. 20x is only a heuristic: inputs that expand
    // further fail with an error. Storing the size with the data (for example
    // via compress_prepend_size / decompress_size_prepended) avoids the guess.
    let max_size = input_locked.as_slice().len() * 20;
    let decompressed = lz4_flex::block::decompress(input_locked.as_slice(), max_size)
        .map_err(|e| Error::new(magnus::exception::standard_error(), e.to_string()))?;

    // Convert to Ruby string
    let ruby = unsafe { Ruby::get_unchecked() };
    Ok(ruby.str_from_slice(&decompressed))
}

// Helper for locked RString (uses rb-sys directly). While locked, Ruby code
// cannot modify the string, so the borrowed slice stays valid.
struct LockedRString(RString);

impl LockedRString {
    fn new(string: RString) -> Self {
        unsafe { rb_str_locktmp(string.as_raw()) };
        Self(string)
    }

    fn as_slice(&self) -> &[u8] {
        unsafe {
            let ptr = RSTRING_PTR(self.0.as_raw()) as *const u8;
            let len = RSTRING_LEN(self.0.as_raw()) as usize;
            std::slice::from_raw_parts(ptr, len)
        }
    }
}

impl Drop for LockedRString {
    // Unlock on every exit path, including early returns through `?`
    fn drop(&mut self) {
        unsafe { rb_str_unlocktmp(self.0.as_raw()) };
    }
}
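The reason to wrap rb_str_locktmp and rb_str_unlocktmp in a type with a Drop implementation is that the unlock then runs on every exit path, including early returns through the ? operator. The following is a minimal sketch of that guard pattern using only the standard library. FakeString, LockGuard, and first_byte are hypothetical names for illustration; the boolean flag stands in for Ruby's temporary string lock.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a Ruby string with a temporary-lock flag
struct FakeString {
    locked: Cell<bool>,
    bytes: Vec<u8>,
}

/// Guard that locks on creation and unlocks when dropped
struct LockGuard<'a>(&'a FakeString);

impl<'a> LockGuard<'a> {
    fn new(string: &'a FakeString) -> Result<Self, &'static str> {
        if string.locked.get() {
            return Err("string is already locked");
        }
        string.locked.set(true);
        Ok(Self(string))
    }

    fn as_slice(&self) -> &[u8] {
        &self.0.bytes
    }
}

impl Drop for LockGuard<'_> {
    fn drop(&mut self) {
        // Runs on normal return and on early return alike
        self.0.locked.set(false);
    }
}

/// Example operation with an early-return error path
fn first_byte(string: &FakeString) -> Result<u8, &'static str> {
    let guard = LockGuard::new(string)?;
    // If the string is empty this returns Err, and the guard still unlocks
    guard.as_slice().first().copied().ok_or("empty string")
}
```

After first_byte returns, whether with Ok or Err, the locked flag is false again. The same property is what lets compress and decompress above use ? on the lz4_flex result without leaving the Ruby string locked.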