Async/Await on the GPU
Summary
VectorWare reports a milestone: Rust's async/await can run on the GPU, enabling structured concurrency in GPU-native code. The piece surveys warp specialization, compares approaches such as JAX, Triton, and CUDA Tile, and details how futures and executors (block_on and Embassy) run concurrent workloads on GPUs. It also notes current limitations and future work.
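To make the futures-and-executor idea concrete, here is a minimal sketch of a busy-polling block_on written against the Rust standard library only. It runs on the CPU and is not VectorWare's implementation; the helper names (noop_raw_waker, double, add_one, pipeline) are hypothetical and chosen for illustration. It assumes a toolchain where std::pin::pin! is available (Rust 1.68 or later).

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical helper: a waker that does nothing. This executor never sleeps,
// it just polls again, so there is nothing for a wake-up to do.
fn noop_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

// Illustrative executor: drives a single future to completion by polling it
// in a loop on the calling thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    // SAFETY: the vtable functions ignore the data pointer and do nothing,
    // so the RawWaker contract is trivially upheld.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // Hint to the processor that we are spinning while the future is pending.
        std::hint::spin_loop();
    }
}

// Hypothetical stages of a computation, each expressed as an async fn.
async fn double(x: u32) -> u32 {
    x * 2
}

async fn add_one(x: u32) -> u32 {
    x + 1
}

// Composition with .await: the compiler turns this into one state machine
// that the executor drives by polling.
async fn pipeline(x: u32) -> u32 {
    let doubled = double(x).await;
    add_one(doubled).await
}
```

Calling block_on(pipeline(20)) drives both stages to completion on the calling thread. A spinning executor like this needs no allocator or OS scheduler, which is one reason the pattern is a plausible starting point on constrained targets; an executor such as Embassy schedules multiple tasks instead of blocking on one.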