How to reduce data storage costs by up to 50% with Ceph

Faheem

Canonical Ceph with Intel QuickAssist Technology (QAT)

Photo by Mathieu Turle on Unsplash

In our last blog post we talked about how you can use Intel® QAT with Canonical Ceph. Today we’ll cover why this technology is important from a business perspective: in other words, we’re talking about data storage costs.

Retaining and protecting data has an inherent cost based on the underlying architecture of the system used to store it. In the public cloud this is very easy to understand, as each GB stored incurs a per-unit fee, with additional costs based on how frequently the storage is accessed (read more about that in our blog post on cloud storage costs).

For on-premises solutions, a cost-per-gigabyte value is usually more complex to calculate, as you also need to take into account power, cooling, and physical space, as well as the hardware (including networking) and ongoing maintenance. For simplicity’s sake, in this post we will examine a typical storage server configuration, focusing on the hardware components themselves. Environmental costs can vary widely across the globe, but would remain constant whichever configuration was deployed.

The most important aspect is to understand the impact of a hardware decision and how it can affect the total cost of ownership (TCO), as we’ll explore below.

Using a well-known vendor’s online configuration tool we can examine the cost differences between similar server configurations. As we’re focusing on compression, the only difference between the configurations will be the CPUs, with and without hardware offload (QAT).

However, it should be noted that the largest driver of cost in any storage system is the disks themselves. For example, the list price of a 15.36TB NVMe drive is $9,396.63, which means that in a single cluster node the cost of disks alone is over $225,000, with the remainder of the components costing roughly $20,000.

By swapping out the CPU in a server configuration we can explore how much additional cost is introduced by adding QAT offload engines. In the examples below we’ve chosen equivalent CPUs with and without QAT. Both server configurations provide 368.64TB of raw storage space, and the per GB costs below assume a 3-replica protection scheme.

| Server configuration | No QAT | QAT enabled |
| --- | --- | --- |
| Processor | 2x Xeon 6448Y | 2x Xeon 6548N |
| Memory | 256GB RAM | 256GB RAM |
| OSD disks | 24x 15.36TB NVMe | 24x 15.36TB NVMe |
| Networking | Dual 100GbE | Dual 100GbE |
| Boot disks | 2x 1TB SSD | 2x 1TB SSD |
| Total cost | $242,472 | $243,032 |
| Per GB cost (3-replica) | $1.89/GB | $1.90/GB |

Comparable server configurations with and without hardware offload.
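
To put the two totals in context, the arithmetic behind them can be sketched in a few lines of Python. The prices and drive count are the list-price figures quoted in this post; the variable names are ours, and the percentages are simple derived values rather than vendor data.

```python
# List-price figures quoted in this post (USD).
DRIVE_LIST_PRICE = 9_396.63   # one 15.36TB NVMe drive
DRIVES_PER_NODE = 24
TOTAL_NO_QAT = 242_472        # 2x Xeon 6448Y build
TOTAL_QAT = 243_032           # 2x Xeon 6548N build

# How much of a node's price is just the OSD disks?
disk_cost = DRIVE_LIST_PRICE * DRIVES_PER_NODE
disk_share = disk_cost / TOTAL_QAT

# How much extra does the QAT-capable CPU pair add?
qat_premium = TOTAL_QAT - TOTAL_NO_QAT
qat_premium_pct = qat_premium / TOTAL_NO_QAT * 100

print(f"Disks per node: ${disk_cost:,.2f} ({disk_share:.0%} of the QAT build)")
print(f"QAT premium:    ${qat_premium:,} ({qat_premium_pct:.2f}% of the non-QAT build)")
```

Because the disks dominate the bill, a change to the CPU line barely moves the total, while anything that reduces the number of disks needed moves it a lot.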

Based on the undiscounted list prices we can see that adding QAT does indeed increase the cost of a Ceph storage server, by $560, or just under 0.25%. This is easily justified for any dataset that can be compressed. As can be seen below, even data that is already stored in a compressed format, such as JPEG, can benefit from inline compression in a Ceph storage system, with other data types seeing higher compression and thus greater space savings.

| Data set | Type | Compression ratio | Space saving |
| --- | --- | --- | --- |
| MinIO Warp (400GB) | Synthetic CSV | 1.33 | 25% |
| Images (1.1GB) | 5000 JPGs | 1.01 | 1% |
| COVID-19 Research Data (100GB) | JSON, CSV, text | 1.33 | 25% |
| Video (200GB) | RAW YUV | 3.13 | 68% |
| Video (110GB) | H.264 | 1.00 | 0% |

Compression ratios experienced with different data types.
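
The two right-hand columns are related by a simple formula. Assuming the compression ratio is defined as original size divided by compressed size (our reading of the table, not something stated explicitly), the space saving is 1 - 1/ratio. A minimal sketch, with a helper name of our own choosing:

```python
def space_saving(ratio: float) -> float:
    """Fraction of capacity saved for a given compression ratio.

    Assumes ratio = original size / compressed size, so a ratio of 1.0
    means the data did not shrink at all.
    """
    if ratio < 1.0:
        raise ValueError("a compression ratio below 1.0 means the data grew")
    return 1.0 - 1.0 / ratio


# Ratios as reported in the table above.
measured = {
    "MinIO Warp (synthetic CSV)": 1.33,
    "Images (JPG)": 1.01,
    "COVID-19 research data": 1.33,
    "Video (RAW YUV)": 3.13,
    "Video (H.264)": 1.00,
}

for name, ratio in measured.items():
    print(f"{name:<28} ratio {ratio:.2f} -> {space_saving(ratio):.0%} saved")
```

Note how quickly the saving flattens out: going from a ratio of 1.33 to 3.13 more than doubles the ratio, but the saving rises from about a quarter to about two thirds.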

Using the server configuration that includes QAT offload engines, the effective cost to store a single GB of data is $1.90. As the compression ratio increases so do the savings, as can be seen below:

| Compression % | 0 | 10 | 20 | 30 | 40 | 50 |
| --- | --- | --- | --- | --- | --- | --- |
| Capacity (GB) | 383,386 | 425,984 | 479,232 | 547,694 | 638,976 | 766,771 |
| Cost per GB | $1.90 | $1.71 | $1.52 | $1.33 | $1.14 | $0.95 |

Decreasing cost per GB with higher compression levels.
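
The table can be reproduced with a short calculation. This is a sketch under two assumptions of ours: that the 0% capacity figure (383,386 GB) is the node's raw capacity before replication, and that a space saving of s% stretches capacity by a factor of 1/(1 - s/100). The function names are illustrative only.

```python
TOTAL_QAT = 243_032      # USD, QAT-enabled server from this post
CAPACITY_GB = 383_386    # capacity at 0% compression, from the table
REPLICAS = 3             # 3-replica protection scheme


def effective_capacity_gb(saving_pct: float) -> float:
    """Capacity the node behaves like once data shrinks by saving_pct."""
    if not 0 <= saving_pct < 100:
        raise ValueError("saving_pct must be in the range [0, 100)")
    return CAPACITY_GB / (1 - saving_pct / 100)


def cost_per_gb(saving_pct: float) -> float:
    """Cost per GB of user data, after replication and compression."""
    usable_gb = effective_capacity_gb(saving_pct) / REPLICAS
    return TOTAL_QAT / usable_gb


for pct in (0, 10, 20, 30, 40, 50):
    print(f"{pct:>2}%  {effective_capacity_gb(pct):>10,.0f} GB  ${cost_per_gb(pct):.2f}/GB")
```

The printed capacities can differ from the table by a gigabyte in places, which is down to rounding; the cost-per-GB column matches to the cent.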

Key takeaways

It’s natural to assume that adding more hardware, or swapping a processor model for another with more features, will add cost. As we can see in these server examples, that is technically true at first glance. However, in the scenario above changing the CPU model adds less than 0.25% to the total upfront cost, and once we take into account the savings provided by hardware offload compression, the actual TCO per GB stored can fall dramatically. With some types of data, like raw video, a reduction of over 50% is possible!

For any data set that can be compressed, the additional cost is easily recovered, and for data with the highest compression ratios the cost savings become significant.

The greater the compression ratio, the less capacity is required, which in turn reduces the number of storage servers and network ports needed, which leads to a reduction in power consumption and facility costs too! These savings apply to all types of disks, including lower cost NL-SAS, as the compression is applied inline as the object storage daemon (OSD) processes client IO.
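
To illustrate the effect on server count, here is a rough sizing sketch. The 2,000TB dataset is a hypothetical figure chosen for the example, and the model is deliberately simple: it ignores free-space headroom, metadata overhead, and any minimum node count a real cluster design would require.

```python
import math

RAW_TB_PER_NODE = 368.64   # 24x 15.36TB NVMe, as configured in this post
REPLICAS = 3               # 3-replica protection scheme


def nodes_required(dataset_tb: float, saving_pct: float) -> int:
    """Storage servers needed to hold dataset_tb of user data.

    saving_pct is the space saving from inline compression (0 to <100).
    Simplified model: no headroom, no metadata overhead.
    """
    if not 0 <= saving_pct < 100:
        raise ValueError("saving_pct must be in the range [0, 100)")
    stored_tb = dataset_tb * (1 - saving_pct / 100) * REPLICAS
    return math.ceil(stored_tb / RAW_TB_PER_NODE)


DATASET_TB = 2_000  # hypothetical dataset size, for illustration only

for pct in (0, 25, 50):
    print(f"{pct:>2}% saving -> {nodes_required(DATASET_TB, pct)} servers")
```

Every server removed from the design also removes its network ports, power draw, and rack space, which is where the secondary savings come from.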

Be sure to read our previous blog post if you want to try out Ceph with QAT.

Join our webinar

Find out more about how Ceph and QAT can be used to improve storage efficiency in our upcoming webinar, Maximize your storage efficiency with Ceph, on 12th February 2025, at 5PM CET, 11AM ET.
