Round Egg 6: Compute shaders in Bevy


Pedro Burgos, Dominykas Jogela

Note: Over the course of this project we have gone through several Bevy releases. We have now updated to version 0.11, and we recommend reading through the wonderful release announcements. The 0.11 release in particular changed shader imports.

The next step is to displace the vertices of our sphere to deform it. Our initial idea is to offload the computation to a compute shader. This is definitely unnecessary (and introduces a lot of problems, as we will see later); however, this project is for exploration purposes, so we will just go with it.
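
To make the goal concrete, here is a minimal CPU-side sketch of what "displacing the vertices" could mean. It is not the code from this project: the function names and the height-based offset rule are assumptions chosen purely for illustration, and it assumes a sphere centred at the origin, so that each vertex normal is simply its normalised position.

```rust
/// Moves a point of a sphere centred at the origin along its normal.
/// For such a sphere the normal is the normalised position itself.
fn displace(position: [f32; 3], amount: f32) -> [f32; 3] {
    let [x, y, z] = position;
    let length = (x * x + y * y + z * z).sqrt();
    if length == 0.0 {
        // The centre has no well-defined normal, so leave it untouched.
        return position;
    }
    // Scaling by (length + amount) / length moves the point `amount`
    // units away from (or, if negative, towards) the centre.
    let scale = (length + amount) / length;
    [x * scale, y * scale, z * scale]
}

/// Hypothetical deformation rule: the offset grows with the y coordinate.
fn deform(positions: &[[f32; 3]], strength: f32) -> Vec<[f32; 3]> {
    positions
        .iter()
        .map(|&p| displace(p, strength * p[1]))
        .collect()
}
```

Calling `deform` on the vertex positions of a unit sphere with a strength of 0.5 pushes the top of the sphere outwards, pulls the bottom inwards and leaves the equator where it is. A compute shader would perform the same per-vertex work, one invocation per vertex.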

# Compute shaders (the hard way)

The first step is to set up our pipeline (similarly to how we did at the beginning of the series).

A useful debug tool is bevy_mod_debugdump. It lets us dump our render graph into a .dot file and visualize it later with Graphviz.

let settings = bevy_mod_debugdump::render_graph::Settings::default();
let dot = bevy_mod_debugdump::render_graph_dot(&mut app, &settings);
std::fs::write("", dot).expect("Failed to write");

# Bevy's main world vs render world

Our first idea was to define a compute buffer that we can access both from the shader and from our Bevy code. This compute buffer would be used to calculate the displaced positions of the vertices, which would later be written back into the sphere mesh.

It turns out this is not so simple, and that makes sense: transferring data from CPU memory to the GPU requires copying it. If we look at bevy_render's definition of Buffer, we can see that it is just a handle to a buffer, not the buffer itself.

// bevy_render::render_resource::Buffer
#[derive(Clone, Debug)]
pub struct Buffer {
    id: BufferId,
    value: ErasedBuffer,
}
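
The difference between a handle and the data it points to can be shown with a toy model. The sketch below is not Bevy's implementation: `BufferHandle`, `make_handle` and `same_allocation` are hypothetical names, and a reference-counted `Vec<u8>` merely stands in for GPU memory.

```rust
use std::sync::Arc;

/// Toy stand-in for a GPU buffer handle: an id plus shared ownership of
/// the underlying allocation. Cloning the handle never copies the bytes.
#[derive(Clone, Debug)]
struct BufferHandle {
    id: u32,
    value: Arc<Vec<u8>>,
}

/// Wraps freshly allocated bytes in a new handle.
fn make_handle(id: u32, bytes: Vec<u8>) -> BufferHandle {
    BufferHandle { id, value: Arc::new(bytes) }
}

/// True when both handles refer to the very same allocation.
fn same_allocation(a: &BufferHandle, b: &BufferHandle) -> bool {
    Arc::ptr_eq(&a.value, &b.value)
}
```

Cloning a handle yields a second reference to the same allocation, whereas building a new handle from equal bytes yields a different allocation. This is why holding a Buffer on the CPU side does not give us the vertex data itself.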

We decided to simply duplicate our data structure and have two structs, one for the CPU and one for the GPU (with a reference to the aforementioned Buffer).

#[derive(Debug, Deserialize, TypeUuid, TypePath, Clone, Resource, Deref)]
#[uuid = "3ecbac0f-f545-4473-ad43-e1f4243af51e"] // Be careful with the first digit DO NOT use "8.."
struct ComputeUsableBuffer {
    buffer: Vec<u8>,
}

struct GPUComputeUsableBuffer {
    buffer: bevy_render::render_resource::Buffer,
}
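
Since the CPU-side struct stores raw bytes, the vertex positions have to be flattened into a `Vec<u8>` at some point. The following is a minimal sketch of one way to do that, assuming tightly packed little-endian `f32` components (12 bytes per vertex); the helper names are hypothetical and not part of this project. Note that a shader declaring the data as `array<vec3<f32>>` would expect a 16-byte stride instead, so the layout has to match whatever the shader declares.

```rust
/// Flattens vertex positions into little-endian bytes, 12 bytes per vertex.
fn positions_to_bytes(positions: &[[f32; 3]]) -> Vec<u8> {
    positions
        .iter()
        .flatten()
        .flat_map(|component| component.to_le_bytes())
        .collect()
}

/// Inverse of `positions_to_bytes`. Trailing bytes that do not fill a
/// whole vertex are ignored.
fn bytes_to_positions(bytes: &[u8]) -> Vec<[f32; 3]> {
    bytes
        .chunks_exact(12)
        .map(|vertex| {
            let mut position = [0.0f32; 3];
            for (component, raw) in position.iter_mut().zip(vertex.chunks_exact(4)) {
                *component = f32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]);
            }
            position
        })
        .collect()
}
```

The second function is the direction needed when reading the displaced positions back in order to write them into the mesh.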

To translate between the main (CPU) and render (GPU) worlds, there is the RenderAsset trait, which "prepares" an asset for rendering. We simply implement it for our CPU-side buffer type, and Bevy should handle the conversion automatically.

impl RenderAsset for ComputeUsableBuffer {
    type ExtractedAsset = ComputeUsableBuffer;
    type PreparedAsset = GPUComputeUsableBuffer;
    type Param = SRes<RenderDevice>;

    // Extracts the asset into the render world
    fn extract_asset(&self) -> Self::ExtractedAsset {
        self.clone()
    }

    // Prepares the asset for rendering
    fn prepare_asset(
        buffer: Self::ExtractedAsset,
        render_device: &mut bevy::ecs::system::SystemParamItem<Self::Param>,
    ) -> Result<
        Self::PreparedAsset,
        PrepareAssetError<Self::ExtractedAsset>,
    > {
        let vertex_buffer_data = buffer.buffer;
        let vertex_buffer = render_device.create_buffer_with_data(&BufferInitDescriptor {
            usage: BufferUsages::VERTEX
                | BufferUsages::COPY_DST // Destination for copy operations
                | BufferUsages::UNIFORM  // Can be bound as a uniform buffer
                | BufferUsages::STORAGE, // Can be bound as a storage buffer
            label: Some("Mesh Vertex Buffer"),
            contents: &vertex_buffer_data,
        });
        Ok(GPUComputeUsableBuffer {
            buffer: vertex_buffer,
        })
    }
}
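
The two-phase idea behind this trait (take a snapshot in the main world, then turn it into a device-side resource in the render world) can be modelled without Bevy at all. The sketch below is a toy under stated assumptions: `TwoPhaseAsset`, `FakeDevice`, `CpuBuffer` and `GpuBuffer` are hypothetical names, and the "upload" is just a move of the bytes rather than a real GPU copy.

```rust
/// Toy stand-in for the GPU device; it only counts uploads.
struct FakeDevice {
    uploads: usize,
}

/// CPU-side data living in the main world.
#[derive(Clone)]
struct CpuBuffer {
    bytes: Vec<u8>,
}

/// Device-side counterpart living in the render world.
struct GpuBuffer {
    device_copy: Vec<u8>,
}

/// Simplified two-phase contract; a model of the idea, not Bevy's trait.
trait TwoPhaseAsset {
    type Extracted;
    type Prepared;
    fn extract(&self) -> Self::Extracted;
    fn prepare(extracted: Self::Extracted, device: &mut FakeDevice) -> Self::Prepared;
}

impl TwoPhaseAsset for CpuBuffer {
    type Extracted = CpuBuffer;
    type Prepared = GpuBuffer;

    // Phase 1: snapshot the main-world data so both worlds can proceed
    // independently.
    fn extract(&self) -> CpuBuffer {
        self.clone()
    }

    // Phase 2: turn the snapshot into a device-side resource.
    fn prepare(extracted: CpuBuffer, device: &mut FakeDevice) -> GpuBuffer {
        device.uploads += 1;
        GpuBuffer { device_copy: extracted.bytes }
    }
}
```

The point of the split is that the main-world value is left untouched: the render world only ever works on the extracted copy.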

Two things are left to implement: a pipeline for compute shaders (which is separate from the render pipeline) and the compute shader itself. Doing so is relatively similar to what was covered in the second post. Our rough prototype looks like this:

Note: The code shown here is barely optimized, and it is not recommended for use. This is just a description of our process when trying to implement compute shaders.

A better example is surely:, to which we ended up moving later on.

struct SphereDeformationPipeline {
    texture_bind_group_layout: BindGroupLayout,
    init_pipeline: CachedComputePipelineId,
    update_pipeline: CachedComputePipelineId,
}

// FromWorld is the trait that we need for initialization - to gain 
// access to `&World`.
impl FromWorld for SphereDeformationPipeline {
    fn from_world(world: &mut World) -> Self {
        let texture_bind_group_layout =
            world
                .resource::<RenderDevice>()
                .create_bind_group_layout(&BindGroupLayoutDescriptor {
                    label: None,
                    entries: &[BindGroupLayoutEntry {
                        binding: 0,
                        visibility: ShaderStages::COMPUTE,
                        ty: BindingType::Buffer {
                            ty: BufferBindingType::Storage { read_only: false },
                            has_dynamic_offset: false,
                            min_binding_size: None,
                        },
                        count: None,
                    }],
                });
        let shader = world
        let pipeline_cache = world.resource::<PipelineCache>();
        let init_pipeline = pipeline_cache.queue_compute_pipeline(ComputePipelineDescriptor {
            label: None,
            layout: vec![texture_bind_group_layout.clone()],
            push_constant_ranges: Vec::new(),
            shader: shader.clone(),
            shader_defs: vec![],
            entry_point: Cow::from("init"),
        });
        let update_pipeline = pipeline_cache.queue_compute_pipeline(ComputePipelineDescriptor {
            label: None,
            layout: vec![texture_bind_group_layout.clone()],
            push_constant_ranges: Vec::new(),
            shader,
            shader_defs: vec![],
            entry_point: Cow::from("update"),
        });

        SphereDeformationPipeline {
            texture_bind_group_layout,
            init_pipeline,
            update_pipeline,
        }
    }
}

And we need a function to take care of initialization before we run the pipeline:

fn queue_bind_group(
    mut commands: Commands,
    pipeline: Res<SphereDeformationPipeline>,
    render_device: Res<RenderDevice>,
    buffer: Res<ComputeUsableBuffer>,
) {
    // There is definitely a much better way to do this
    let vertex_buffer_data = buffer.clone().buffer;
    let vertex_buffer = render_device.create_buffer_with_data(&BufferInitDescriptor {
        usage: BufferUsages::VERTEX
            | BufferUsages::COPY_DST
            | BufferUsages::UNIFORM
            | BufferUsages::STORAGE,
        label: Some("Mesh Vertex Buffer"),
        contents: &vertex_buffer_data,
    });

    let buffer = GPUComputeUsableBuffer {
        buffer: vertex_buffer,
    };

    let buffer_binding = BufferBinding {
        buffer: &buffer.buffer,
        offset: 0,
        size: None,
    };

    let bind_group = render_device.create_bind_group(&BindGroupDescriptor {
        label: None,
        layout: &pipeline.texture_bind_group_layout,
        entries: &[BindGroupEntry {
            binding: 0,
            resource: BindingResource::Buffer(buffer_binding),
        }],
    });
    // ...
}

# Packing everything into a Bevy plugin

Bevy has a system of plugins with which users can modularize and distribute functionality. We created a very simple plugin that encapsulates our pipeline by implementing the Plugin trait.

pub struct SphereDeformationPlugin;

impl Plugin for SphereDeformationPlugin {
    fn build(&self, app: &mut App) {
        let render_app = app.sub_app_mut(RenderApp);
        render_app.add_systems(Render, queue_bind_group.in_set(RenderSet::Queue));
    }

    fn finish(&self, app: &mut App) {
        let render_app = app.sub_app_mut(RenderApp);