zoned-storage

Zoned Block Devices (ZBDs) divide the LBA space into block regions called zones, each of which is larger than a single logical block. A zone may accept only sequential writes, which can reduce write amplification in SSDs and potentially lead to higher throughput and increased capacity. More details about ZBDs can be found at:

https://zonedstorage.io/docs/introduction/zoned-storage
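
The sequential-write rule described above can be illustrated with a small model. The following is only a sketch in Python, not QEMU code: the names Zone, make_zones, and zone_index are hypothetical, all units are sectors, and every zone is assumed to have the same size and to require sequential writes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Zone:
    """Simplified model of one sequential-write-required zone (hypothetical)."""
    start: int   # first sector of the zone
    length: int  # zone size in sectors
    wp: int      # write pointer: the next sector that may be written

    def write(self, sector: int, nr_sectors: int) -> bool:
        """Accept the write only if it starts at the write pointer and fits."""
        if sector != self.wp:
            return False  # not sequential: rejected
        if sector + nr_sectors > self.start + self.length:
            return False  # would cross the zone boundary
        self.wp += nr_sectors  # advance the write pointer
        return True


def make_zones(capacity: int, zone_size: int) -> List[Zone]:
    """Split a device of `capacity` sectors into equally sized zones."""
    return [Zone(start=s, length=zone_size, wp=s)
            for s in range(0, capacity, zone_size)]


def zone_index(sector: int, zone_size: int) -> int:
    """Map a sector to the index of the zone that contains it."""
    return sector // zone_size


zones = make_zones(capacity=1024, zone_size=256)
z = zones[zone_index(300, 256)]  # sector 300 lies in zone 1
print(z.write(256, 8))           # True: the write starts at the write pointer
print(z.write(300, 8))           # False: the write pointer is now 264
```

Because each zone tracks a single write pointer, the device does not need to remap data written at arbitrary offsets within a zone, which is where the reduction in write amplification comes from.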

1. Block layer APIs for zoned storage

The QEMU block layer supports three zoned storage models:

- BLK_Z_HM: The host-managed zoned model only allows sequential write access to zones. It supports ZBD-specific I/O commands that can be used by a host to manage the zones of a device.
- BLK_Z_HA: The host-aware zoned model allows random write operations in zones, making it backward compatible with regular block devices.
- BLK_Z_NONE: The non-zoned model has no zones support. It includes both regular and drive-managed ZBD devices. ZBD-specific I/O commands are not supported.

The block device information resides inside BlockDriverState. QEMU uses the BlockLimits struct (BlockDriverState::bl), which is continuously accessed by the block layer while processing I/O requests. A BlockBackend has a root pointer to a BlockDriverState graph (for example, raw format on top of file-posix). The zoned storage information can be propagated from the leaf BlockDriverState all the way up to the BlockBackend. If the zoned storage model in file-posix is set to BLK_Z_HM, then block drivers will declare support for zoned host devices.

The block layer APIs support the commands needed for zoned storage devices, including report zones, four zone operations (open, close, finish, and reset), and zone append.
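
To make the effect of these commands concrete, the following Python sketch models a single zone. It is not QEMU's implementation: the class and method names are hypothetical, the four zone operations are taken to be open, close, finish, and reset, and the zone state machine is reduced to four states with no distinction between implicitly and explicitly opened zones.

```python
from enum import Enum


class ZoneState(Enum):
    EMPTY = "empty"
    OPEN = "open"
    CLOSED = "closed"
    FULL = "full"


class Zone:
    """Toy zone with a write pointer; all units are sectors (hypothetical)."""

    def __init__(self, start, length):
        self.start = start
        self.length = length
        self.wp = start
        self.state = ZoneState.EMPTY

    @property
    def end(self):
        return self.start + self.length

    def report(self):
        """Report zones: describe this zone as (start, length, wp, state)."""
        return (self.start, self.length, self.wp, self.state.value)

    def open(self):
        if self.state is not ZoneState.FULL:
            self.state = ZoneState.OPEN

    def close(self):
        # A zone closed before anything was written goes back to empty.
        if self.state is ZoneState.OPEN:
            self.state = (ZoneState.CLOSED if self.wp > self.start
                          else ZoneState.EMPTY)

    def finish(self):
        # Move the write pointer to the end so no further writes are accepted.
        self.wp = self.end
        self.state = ZoneState.FULL

    def reset(self):
        # Rewind the write pointer; the zone can be written from its start again.
        self.wp = self.start
        self.state = ZoneState.EMPTY

    def append(self, nr_sectors):
        """Zone append: write at the write pointer, return where the data landed."""
        if self.wp + nr_sectors > self.end:
            raise ValueError("append does not fit in the zone")
        landed = self.wp
        self.wp += nr_sectors
        self.state = ZoneState.FULL if self.wp == self.end else ZoneState.OPEN
        return landed


zone = Zone(start=256, length=256)
print(zone.append(8))   # 256
print(zone.append(8))   # 264
zone.finish()
print(zone.report())    # (256, 256, 512, 'full')
zone.reset()
print(zone.report())    # (256, 256, 256, 'empty')
```

The key difference between a regular write and zone append in this model is that the caller of append names only the zone, and the position of the data is returned once the write completes.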

2. Emulating zoned storage controllers

When the BlockBackend’s BlockLimits model reports a zoned storage device, users like the virtio-blk emulation or the qemu-io-cmds.c utility can use block layer APIs for zoned storage emulation or testing.

For example, to test zone_report on a null_blk device using qemu-io, replace offset and nr_zones with the starting offset and the number of zones to report:

$ path/to/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 -c "zrp offset nr_zones"

To expose the host’s zoned block device through virtio-blk, the command line can be (includes the -device parameter):

-blockdev node-name=drive0,driver=host_device,filename=/dev/nullb0,cache.direct=on \
-device virtio-blk-pci,drive=drive0

Or only use the -drive parameter:

-drive driver=host_device,file=/dev/nullb0,if=virtio,cache.direct=on

Additionally, QEMU has several ways of supporting zoned storage, including:

(1) Using virtio-scsi: -device scsi-block allows SCSI ZBC devices to be passed through, enabling the attachment of ZBC or ZAC HDDs to QEMU.

(2) PCI device pass-through: While NVMe ZNS emulation is available for testing purposes, it cannot yet pass through a zoned device from the host. To pass an NVMe ZNS device to the guest, use VFIO PCI to pass the entire NVMe PCI adapter through to the guest. Likewise, an HDD HBA can be passed through to QEMU together with all HDDs attached to the HBA.