Most command buffers here are rather small (fewer than 256 words), so
dynamically allocating memory for them is a waste of time when they
could easily fit on the stack.

Use an on-stack command buffer when the size is small enough and fall
back to a dynamically-allocated buffer otherwise. This eliminates the
dynamic allocation most of the time, reducing GPU command submission
latency.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
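
For illustration, the pattern described above can be sketched in plain userspace C. This is a minimal sketch, not the driver code from this change: `STACK_CMD_WORDS`, `submit_cmds()` and `consume_cmds()` are hypothetical names, `malloc()`/`free()` stand in for whatever allocator the driver actually uses, and the 256-word threshold is assumed from the figure quoted in the message.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed threshold, taken from the "fewer than 256 words" figure. */
#define STACK_CMD_WORDS 256

/*
 * Hypothetical stand-in for handing the commands to the hardware.
 * It only sums the words so the caller has something to check.
 */
static uint32_t consume_cmds(const uint32_t *cmds, size_t nwords)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < nwords; i++)
		sum += cmds[i];
	return sum;
}

/*
 * Copy nwords command words into a scratch buffer and consume them.
 * Small requests use the on-stack array; larger ones fall back to the
 * heap. Returns 0 on success or -1 if the heap allocation fails.
 * *used_heap reports which path was taken.
 */
static int submit_cmds(const uint32_t *src, size_t nwords,
		       uint32_t *result, int *used_heap)
{
	uint32_t stack_buf[STACK_CMD_WORDS];
	uint32_t *buf = stack_buf;

	*used_heap = 0;
	if (nwords > STACK_CMD_WORDS) {
		/* Too big for the stack buffer: allocate dynamically. */
		buf = malloc(nwords * sizeof(*buf));
		if (!buf)
			return -1;
		*used_heap = 1;
	}

	memcpy(buf, src, nwords * sizeof(*buf));
	*result = consume_cmds(buf, nwords);

	/* Only free what was actually heap-allocated. */
	if (buf != stack_buf)
		free(buf);
	return 0;
}
```

The single `buf` pointer keeps the rest of the function identical for both paths, and the `buf != stack_buf` check on exit is the only extra cleanup logic the fast path needs. With 32-bit words, the on-stack array in this sketch costs 1 KiB of stack, which is the trade-off to weigh when choosing the threshold.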