Compacting Instruction Data with Codama

Mar 11, 2025

Solana

Engineering

As we shift away from using Anchor, Shank and Codama have emerged as key tools for us. Shank handles IDL generation, while Codama generates clients, such as Rust SDKs, for Solana programs. These tools are powerful, but they still have their limitations and, frankly, lack sufficient documentation.

One issue we encountered involves Shank's default behavior with types of unknown length. Shank restricts certain types, rejecting slices like [T] and requiring Vec<T> instead. By default, Codama uses Borsh for (de)serialization, which prefixes every Vec with a 4-byte (u32) length indicator. For applications that demand minimal instruction data size, where every byte counts, this 4-byte overhead is excessive, especially when vectors are guaranteed never to exceed 255 elements, so the length fits in a single u8.

Initial attempts to use slices with manually embedded length headers in structs proved cumbersome. A more effective solution emerged: override Codama’s default serialization by removing the derived BorshSerialize and BorshDeserialize traits and implementing a custom (de)serializer with a u8 length prefix.
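
To make the size difference concrete, here is a minimal, std-only sketch of the two layouts for a list of bytes. The function names are hypothetical and are not part of Borsh or Codama; the first simply mirrors Borsh's documented Vec layout (u32 little-endian length followed by the items).

```rust
/// Mirrors Borsh's default Vec<u8> layout: u32 little-endian length, then the items.
/// (Hypothetical helper for illustration only.)
fn encode_with_u32_prefix(items: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + items.len());
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    out.extend_from_slice(items);
    out
}

/// Compact layout: a single u8 length, then the items.
/// Returns None when the list has more than 255 items and cannot be described by a u8.
/// (Hypothetical helper for illustration only.)
fn encode_with_u8_prefix(items: &[u8]) -> Option<Vec<u8>> {
    if items.len() > u8::MAX as usize {
        return None;
    }
    let mut out = Vec::with_capacity(1 + items.len());
    out.push(items.len() as u8);
    out.extend_from_slice(items);
    Some(out)
}
```

For a three-item list, the first layout spends four bytes on the length and the second spends one, which is the three-byte saving per Vec discussed in the rest of this post.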

The Solution: Custom Serialization with traitOptions
Here’s the approach:

1. Define Structs Without Borsh Defaults: Use Vec<T> in Rust code as usual for instruction arguments.

2. Override traitOptions in Codama: Adjust traitOptions to prevent generation of default BorshSerialize and BorshDeserialize implementations.

Codama.accept(
  renderers.renderRustVisitor(path.join(dir, "src", "generated"), {
    formatCode: true,        
    crateFolder: rustClientsDir,        
    deleteFolderBeforeRendering: true,        
    traitOptions: {          
      // We skip adding Borsh to allow for a custom implementation          
      // so we can adjust the length prefix for Vecs          
      baseDefaults: ["Clone", "Debug", "Eq", "PartialEq"],            
      overrides: {          
        // specific overrides can be used when it's only required for certain types.          
        // However, this did not always generate correctly in practice.            
      }        
    },      
  })    
);

3. Implement Custom (De)Serializer: Create serialization logic that uses a u8 prefix. Example:

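// src/serialization.rs - make sure this is not in the same folder as generated code
// (deleteFolderBeforeRendering wipes that folder on every run)
use std::io;

use borsh::BorshSerialize;

// InstructionArgs is the Codama-generated struct; import it from wherever
// your generated module lives.
impl BorshSerialize for InstructionArgs {
  fn serialize<W: io::Write>(&self, writer: &mut W) -> io::Result<()> {
    // Fixed-size fields keep their standard Borsh encoding.
    self.field1.serialize(writer)?;
    self.field2.serialize(writer)?;
    // Reject anything that cannot be described by a single-byte length.
    let len = self.variable_data.len();
    if len > u8::MAX as usize {
      return Err(io::Error::new(io::ErrorKind::InvalidInput, "variable_data length exceeds u8 max (255)"));
    }
    (len as u8).serialize(writer)?; // u8 length prefix
    for data in self.variable_data.iter() {
      data.serialize(writer)?;
    }
    Ok(())
  }
}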

This method ensures instruction data remains compact, using just one byte for the length prefix.
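
The serializer needs a matching reader on the other side. The following is a std-only sketch of that reading logic, not the code from our project: the struct is a hypothetical stand-in for the generated InstructionArgs, and the field types (u64, u8, Vec<u16>) are assumptions chosen for illustration. In a real crate this body would live in a BorshDeserialize impl, and the exact trait method to implement depends on the borsh version in use.

```rust
use std::io::{self, Read};

// Hypothetical stand-in for the generated struct; field types are assumptions.
#[derive(Clone, Debug, Eq, PartialEq)]
struct InstructionArgs {
    field1: u64,
    field2: u8,
    variable_data: Vec<u16>,
}

/// Reads exactly one byte from the reader.
fn read_u8<R: Read>(reader: &mut R) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    reader.read_exact(&mut buf)?;
    Ok(buf[0])
}

/// Reads the fields in the same order the serializer wrote them,
/// using a single-byte length prefix for variable_data.
fn read_instruction_args<R: Read>(reader: &mut R) -> io::Result<InstructionArgs> {
    let mut u64_buf = [0u8; 8];
    reader.read_exact(&mut u64_buf)?;
    let field1 = u64::from_le_bytes(u64_buf);

    let field2 = read_u8(reader)?;

    // u8 length prefix instead of Borsh's default u32.
    let len = read_u8(reader)? as usize;
    let mut variable_data = Vec::with_capacity(len);
    for _ in 0..len {
        let mut item = [0u8; 2];
        reader.read_exact(&mut item)?;
        variable_data.push(u16::from_le_bytes(item));
    }

    Ok(InstructionArgs { field1, field2, variable_data })
}
```

Because the length is a u8, the reader never has to guard against an oversized allocation: the most it will reserve is 255 items. Truncated input surfaces as an io::Error from read_exact.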

Pros and Cons

This solution offers benefits and drawbacks:
Pros:
- Single Byte Efficiency: Reduces length prefix overhead by 75%, from 4 bytes to 1. For applications where transaction size is a limitation, keeping instruction data overhead low is very important!

- No Post-Processing: Eliminates the need to modify the generated IDL or code after generation.
Cons:
- Client Compatibility: Requires downstream client generators to adopt the same u8-based serialization, or mismatches with Borsh’s u32 default will occur.
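
To illustrate the compatibility risk, here is a small std-only sketch of what happens when a client that expects Borsh's u32 prefix reads data written with a u8 prefix. Both decoders are hypothetical helpers written for this example, not functions from Borsh or Codama.

```rust
/// Decodes a byte list assuming Borsh's default u32 little-endian length prefix.
/// (Hypothetical helper for illustration only.)
fn decode_with_u32_prefix(data: &[u8]) -> Option<Vec<u8>> {
    let head = data.get(..4)?;
    let rest = data.get(4..)?;
    let len = u32::from_le_bytes([head[0], head[1], head[2], head[3]]) as usize;
    rest.get(..len).map(|items| items.to_vec())
}

/// Decodes a byte list assuming the compact single-byte length prefix.
/// (Hypothetical helper for illustration only.)
fn decode_with_u8_prefix(data: &[u8]) -> Option<Vec<u8>> {
    let (&len, rest) = data.split_first()?;
    rest.get(..len as usize).map(|items| items.to_vec())
}
```

Given the compact bytes [3, 7, 8, 9], the u8 decoder returns the three items, while the u32 decoder interprets all four bytes as a huge length and fails. Worse, some inputs decode without error under both assumptions but yield different values, so a mismatched client may not fail loudly.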

Why It Matters

In Solana development, instruction data counts toward the 1,232-byte transaction size limit and costs compute units to deserialize, so minimizing byte usage matters. For high-throughput applications or those nearing that limit, saving three bytes per Vec frees room for more accounts or arguments in the same transaction.

Final Thoughts

Overriding traitOptions in Codama to bypass Borsh defaults and implement custom serialization provides an effective way to minimize instruction data size. While coordination with client developers poses a challenge, the approach proves valuable for projects prioritizing compactness. When facing oversized Vec prefixes, this method offers a practical solution worth considering.

Taylor Johnson
