Buffer initialization
The code described in this section can be run from the encase.rs example with
cargo run --features derive --example encase. The complete source code is also provided at the
end of this page.
wgcore exposes some utilities to ease the creation and initialization of wgpu::Buffer. All the structs mentioned
below are aliases for the more general GpuTensor, which represents an n-dimensional array of values:
- GpuScalar: for a storage buffer with a single element.
- GpuVector: for a 1-dimensional array of values.
- GpuMatrix: for a 2-dimensional array of values.
Initializing WebGPU buffers can be tricky because the buffer layout must match the layout of the corresponding struct
defined in the WGSL shader. To facilitate the conversion between Rust structures and raw bytes, wgcore can leverage
either bytemuck or encase.
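To make the layout constraint concrete, here is a minimal sketch (standard library only, not part of wgcore) that computes member offsets and the total size of a WGSL struct from the size and alignment of its members. The names FieldLayout, F32, VEC4_F32, round_up and struct_layout are hypothetical helpers introduced for this illustration; the sizes and alignments are the ones WGSL assigns to f32 and vec4<f32>.

```rust
/// Size and alignment (in bytes) of a WGSL type.
/// Hypothetical helper type, used only for this illustration.
#[derive(Copy, Clone, Debug, PartialEq)]
struct FieldLayout {
    size: u32,
    align: u32,
}

/// WGSL `f32`: 4 bytes, aligned to 4 bytes.
const F32: FieldLayout = FieldLayout { size: 4, align: 4 };
/// WGSL `vec4<f32>`: 16 bytes, aligned to 16 bytes.
const VEC4_F32: FieldLayout = FieldLayout { size: 16, align: 16 };

/// Rounds `offset` up to the next multiple of `align`.
fn round_up(offset: u32, align: u32) -> u32 {
    offset.div_ceil(align) * align
}

/// Returns the byte offset of each member and the total size of the struct.
fn struct_layout(fields: &[FieldLayout]) -> (Vec<u32>, u32) {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut cursor = 0;
    let mut struct_align = 1;
    for field in fields {
        // Each member starts at the next offset that satisfies its alignment.
        cursor = round_up(cursor, field.align);
        offsets.push(cursor);
        cursor += field.size;
        struct_align = struct_align.max(field.align);
    }
    // The struct size is rounded up to the largest member alignment.
    (offsets, round_up(cursor, struct_align))
}
```

Calling struct_layout(&[F32, VEC4_F32]) describes a struct like the EncaseStruct used later on this page: the vec4<f32> member starts at offset 16 and the struct occupies 32 bytes, whereas five tightly packed f32 values only occupy 20 bytes. This gap is what the conversion to raw bytes has to account for.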
Initialization with bytemuck
Initialization with bytemuck can be done using the ::init constructor of GpuScalar/GpuVector/GpuMatrix.
Note that not all types are capable of implementing the bytemuck::Pod trait due to alignment and padding restrictions.
If you can’t derive bytemuck::Pod for your own type, consider the solution based on encase.
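As a rough illustration of the padding restriction, the sketch below uses only the standard library to compare the size of a struct with the sum of its field sizes. Packed and Padded are hypothetical types that do not appear in wgcore; deriving bytemuck::Pod is rejected for structs like Padded because some of their bytes are padding.

```rust
use std::mem::size_of;

/// No padding: a single `f32`, so every byte of the struct is meaningful.
#[repr(C)]
#[derive(Copy, Clone)]
struct Packed {
    value: f32,
}

/// Padding: `flag` is followed by 3 unused bytes so that `value`
/// starts on a 4-byte boundary.
#[repr(C)]
#[derive(Copy, Clone)]
struct Padded {
    flag: u8,
    value: u32,
}

/// Number of padding bytes in `T`, given the sum of its field sizes.
fn padding_bytes<T>(sum_of_field_sizes: usize) -> usize {
    size_of::<T>() - sum_of_field_sizes
}
```

Here padding_bytes::<Packed>(4) is 0 while padding_bytes::<Padded>(5) is 3, which is the kind of difference that makes a type unsuitable for bytemuck::Pod.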
- If a type T implements the bytemuck::Pod trait, it can be passed to GpuScalar::init.
- Any type implementing AsRef<[T]> (like Vec<T>, &[T], or DVector from nalgebra) can be passed to GpuVector::init.
- Any matrix type, parameterized by T, from the nalgebra crate can be passed to GpuMatrix::init.
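use wgcore::gpu::GpuInstance;
use wgcore::tensor::GpuVector;
use wgpu::BufferUsages;
// `Pod` requires a `repr(C)` layout without any padding bytes.
#[derive(Copy, Clone, bytemuck::Pod, bytemuck::Zeroable)]
#[repr(C)]
pub struct BytemuckStruct {
    value: f32,
}
#[async_std::main]
async fn main() -> anyhow::Result<()> {
    let gpu = GpuInstance::new().await?;
    let data = (0..1000)
        .map(|x| BytemuckStruct { value: x as f32 })
        .collect::<Vec<_>>();
    // The elements are uploaded as raw bytes, without any conversion.
    let gpu_buffer = GpuVector::init(gpu.device(), &data, BufferUsages::STORAGE);
    Ok(())
}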
Initialization based on bytemuck is generally more efficient than initialization with encase: the data is already
laid out as the GPU expects, so it can be uploaded as raw bytes without any per-element conversion.
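The low overhead of the bytemuck path comes from reinterpreting the existing memory as bytes instead of copying it. The sketch below reproduces that idea with the standard library only; Sample and as_bytes are hypothetical names, and the unsafe block is there purely for illustration (bytemuck provides this kind of cast behind a safe API).

```rust
/// Padding-free element type, similar in shape to `BytemuckStruct`.
#[repr(C)]
#[derive(Copy, Clone)]
struct Sample {
    value: f32,
}

/// Views a slice of `Sample` as raw bytes without copying anything.
fn as_bytes(data: &[Sample]) -> &[u8] {
    // SAFETY: `Sample` is `repr(C)` with a single `f32` and no padding,
    // `u8` has an alignment of 1, and the length covers exactly the
    // memory owned by `data`.
    unsafe {
        std::slice::from_raw_parts(data.as_ptr().cast::<u8>(), std::mem::size_of_val(data))
    }
}
```

The returned slice points at the same memory as the input, so no allocation or per-element work takes place before the bytes are handed to the GPU.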
Initialization with encase
If the structure cannot implement the bytemuck::Pod trait, it is possible to rely on the encase::ShaderType trait
instead by calling the ::encase constructor of GpuScalar/GpuVector/GpuMatrix.
- If a type T implements the encase::ShaderType trait, it can be passed to GpuScalar::encase.
- Any type implementing AsRef<[T]> (like Vec<T>, &[T], or DVector from nalgebra) can be passed to GpuVector::encase.
- Any matrix type, parameterized by T, from the nalgebra crate can be passed to GpuMatrix::encase.
use nalgebra::Vector4;
use wgcore::gpu::GpuInstance;
use wgcore::tensor::GpuVector;
use wgpu::BufferUsages;
#[derive(Copy, Clone, encase::ShaderType)]
#[repr(C)]
pub struct EncaseStruct {
    value: f32,
    // This implies some internal padding, so we can’t rely on bytemuck.
    // Encase will handle that properly.
    value2: Vector4<f32>,
}
#[async_std::main]
async fn main() -> anyhow::Result<()> {
    let gpu = GpuInstance::new().await?;
    const LEN: u32 = 1000;
    let data = (0..LEN)
        .map(|x| EncaseStruct {
            value: x as f32,
            value2: Vector4::repeat(x as f32 * 10.0),
        })
        .collect::<Vec<_>>();
    let gpu_buffer = GpuVector::encase(gpu.device(), &data, BufferUsages::STORAGE);
    Ok(())
}
The ::encase function will result in an extra allocation and a conversion of all elements in the provided vector or
matrix. Because of this overhead, it is recommended to initialize with bytemuck whenever possible.
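The extra allocation and per-element conversion can be pictured with the following sketch. It is not encase's actual implementation: Element, GPU_STRIDE and to_gpu_bytes are hypothetical, the 32-byte stride and the offset of 16 for the vector are taken from the WGSL layout of an f32 followed by a vec4<f32>, and little-endian byte order is assumed.

```rust
/// Stride of one element in the WGSL layout:
/// `f32` at offset 0, `vec4<f32>` at offset 16, total size 32.
const GPU_STRIDE: usize = 32;

/// CPU-side element mirroring the fields of `EncaseStruct` with plain arrays.
#[derive(Copy, Clone)]
struct Element {
    value: f32,
    value2: [f32; 4],
}

/// Writes every element into a freshly allocated, padded byte buffer.
fn to_gpu_bytes(elements: &[Element]) -> Vec<u8> {
    // The extra allocation: one zero-initialized padded slot per element.
    let mut bytes = vec![0u8; elements.len() * GPU_STRIDE];
    // The conversion: each field is copied to its offset in the GPU layout.
    for (slot, element) in bytes.chunks_exact_mut(GPU_STRIDE).zip(elements) {
        slot[0..4].copy_from_slice(&element.value.to_le_bytes());
        for (k, component) in element.value2.iter().enumerate() {
            let start = 16 + 4 * k;
            slot[start..start + 4].copy_from_slice(&component.to_le_bytes());
        }
    }
    bytes
}
```

Two elements produce a 64-byte buffer in which bytes 4 to 15 of each slot are padding. Both the allocation and the copy loop grow with the number of elements, which is why the bytemuck path is preferable when the type allows it.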
Complete example
The example consists of two files: main.rs (the Rust host code), followed by kernel.wgsl (the compute shader).
#[cfg(not(feature = "derive"))]
std::compile_error!(
    r#"
    ###############################################################
    ## The `derive` feature must be enabled to run this example. ##
    ###############################################################
"#
);
use nalgebra::Vector4;
use wgcore::gpu::GpuInstance;
use wgcore::kernel::{KernelInvocationBuilder, KernelInvocationQueue};
use wgcore::tensor::GpuVector;
use wgcore::Shader;
use wgpu::{BufferUsages, ComputePipeline};
#[derive(Copy, Clone, PartialEq, Debug, Default, bytemuck::Pod, bytemuck::Zeroable)]
#[repr(C)]
pub struct BytemuckStruct {
    value: f32,
}
#[derive(Copy, Clone, PartialEq, Debug, Default, encase::ShaderType)]
#[repr(C)]
pub struct EncaseStruct {
    value: f32,
    // This implies some internal padding, so we can’t rely on bytemuck.
    // Encase will handle that properly.
    value2: Vector4<f32>,
}
#[derive(Shader)]
#[shader(src = "encase.wgsl", composable = false)]
struct ShaderEncase {
    main: ComputePipeline,
}
#[async_std::main]
async fn main() -> anyhow::Result<()> {
    // Initialize the gpu device and its queue.
    //
    // Note that `GpuInstance` is just a simple helper struct for initializing the gpu resources.
    // You are free to initialize them independently if more control is needed, or reuse the ones
    // that were already created/owned by e.g., a game engine.
    let gpu = GpuInstance::new().await?;
    // Load and compile our kernel. The `from_device` function was generated by the `Shader` derive.
    // Note that its dependency to `Composable` is automatically resolved by the `Shader` derive
    // too.
    let kernel = ShaderEncase::from_device(gpu.device())?;
    // Create the buffers.
    const LEN: u32 = 1000;
    let a_data = (0..LEN)
        .map(|x| EncaseStruct {
            value: x as f32,
            value2: Vector4::repeat(x as f32 * 10.0),
        })
        .collect::<Vec<_>>();
    let b_data = (0..LEN)
        .map(|x| BytemuckStruct { value: x as f32 })
        .collect::<Vec<_>>();
    // Call `encase` instead of `init` because `EncaseStruct` isn’t `Pod`.
    // The `encase` function has a bit of overhead so bytemuck should be preferred whenever possible.
    let a_buf = GpuVector::encase(gpu.device(), &a_data, BufferUsages::STORAGE);
    let b_buf = GpuVector::init(gpu.device(), &b_data, BufferUsages::STORAGE);
    // Queue the operation.
    let mut queue = KernelInvocationQueue::new(gpu.device());
    KernelInvocationBuilder::new(&mut queue, &kernel.main)
        .bind0([a_buf.buffer(), b_buf.buffer()])
        .queue(LEN.div_ceil(64));
    // Encode & submit the operation to the gpu.
    let mut encoder = gpu.device().create_command_encoder(&Default::default());
    queue.encode(&mut encoder, None);
    gpu.queue().submit(Some(encoder.finish()));
    Ok(())
}
@group(0) @binding(0)
var<storage, read_write> a: array<EncaseStruct>;
@group(0) @binding(1)
var<storage, read> b: array<BytemuckStruct>;
struct BytemuckStruct {
    value: f32,
}
struct EncaseStruct {
    value: f32,
    value2: vec4<f32>
}
@compute @workgroup_size(64, 1, 1)
fn main(@builtin(global_invocation_id) invocation_id: vec3<u32>) {
    let i = invocation_id.x;
    if i < arrayLength(&a) {
        a[i].value += b[i].value;
        a[i].value2 += vec4(b[i].value);
    }
}
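The dispatch size and the bounds check in the shader go together. Assuming the value passed to .queue(...) is the number of workgroups to dispatch, which is what LEN.div_ceil(64) suggests, the small sketch below shows why the shader needs the i < arrayLength(&a) guard. The helpers workgroup_count and idle_invocations are hypothetical and only use the standard library.

```rust
/// Workgroup size declared by `@workgroup_size(64, 1, 1)` in the shader.
const WORKGROUP_SIZE: u32 = 64;

/// Number of workgroups needed to cover `len` elements.
fn workgroup_count(len: u32) -> u32 {
    len.div_ceil(WORKGROUP_SIZE)
}

/// Number of invocations that fall past the end of the buffers
/// and must be skipped by the bounds check in the shader.
fn idle_invocations(len: u32) -> u32 {
    workgroup_count(len) * WORKGROUP_SIZE - len
}
```

For the 1000 elements used in the example, workgroup_count(1000) is 16, which launches 1024 invocations; the last 24 have no element to process and are filtered out by the guard.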