Replicate or Erasure Code?

When using object-based storage in a private or hybrid cloud, you usually have a choice between replication and erasure coding to protect your data. Replication creates multiple copies of your data objects. Erasure coding splits your data objects into fragments and adds “parity” fragments for data protection. Whether to replicate or erasure code: that’s the question.

Replication typically defaults to three copies, or replicas, of each stored object, which requires 200 percent more storage space than the object itself. For example, if the size of the object is 100MB, then keeping 3 replicas requires a total of 300MB of storage, or 200MB more than the 100MB object. In object-based storage there is no concept of an “original” copy of the object. All of the copies are the same, and they are all called replicas of the object.

Erasure coding, which is derived from Reed-Solomon error-correcting codes, typically requires just 20 to 50 percent more storage space than the object itself. For example, if the size of the object is 100MB, using 4+2 erasure coding would split the object into 4 data fragments and calculate 2 parity fragments, for a total of 6 fragments. Any 4 of the 6 fragments (data or parity) are sufficient to retrieve the object. In this example, erasure coding adds 50 percent in storage overhead, compared to the 200 percent storage overhead of 3x replication.
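The arithmetic behind these two examples can be sketched in a few lines of Python. This is only an illustration of the numbers above, not any vendor's implementation, and the function names are made up for this article.

```python
def replication_footprint(object_mb: float, replicas: int = 3) -> float:
    """Total storage used when keeping `replicas` full copies of an object."""
    return object_mb * replicas


def erasure_footprint(object_mb: float, data_fragments: int, parity_fragments: int) -> float:
    """Total storage used by k+m erasure coding.

    The object is split into `data_fragments` equal pieces, and
    `parity_fragments` extra pieces of the same size are added.
    """
    fragment_mb = object_mb / data_fragments
    return fragment_mb * (data_fragments + parity_fragments)


def overhead_percent(object_mb: float, footprint_mb: float) -> float:
    """Extra storage as a percentage of the object's own size."""
    return (footprint_mb - object_mb) / object_mb * 100


def can_retrieve(available_fragments: int, data_fragments: int) -> bool:
    """An object is readable if at least k of its k+m fragments are available."""
    return available_fragments >= data_fragments


if __name__ == "__main__":
    size_mb = 100
    replicated = replication_footprint(size_mb, replicas=3)  # 300
    coded = erasure_footprint(size_mb, 4, 2)                 # 150.0
    print(replicated, overhead_percent(size_mb, replicated))  # 300 200.0
    print(coded, overhead_percent(size_mb, coded))            # 150.0 50.0
    print(can_retrieve(4, 4), can_retrieve(3, 4))             # True False
```

With 4+2 erasure coding, up to 2 of the 6 fragments can be lost before the object becomes unreadable.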

So, how do you decide between object replication and erasure coding to protect your data? Well, a “rule of thumb” is to use replication for data objects that are “warmer,” meaning likely to be accessed, and to use erasure coding for data objects that are “colder,” meaning not likely to be accessed. However, some types of data objects, like image file backups, are likely to be erasure coded whether they are considered “warm” or “cold.”

The size of your data objects can also indicate whether to replicate or erasure code them. Small data objects, say less than 1MB in size, may not be good candidates for erasure coding because it requires additional computation on the nodes in the storage cluster. If you have lots of very small data objects, erasure coding them will take more compute resources in the storage cluster, and reading them will take more time, than if you had replicated them.
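As a rough illustration, the rule of thumb and the size guideline could be combined into a small helper like the one below. This is a sketch under stated assumptions: the function name is hypothetical, the 1MB cutoff is taken as 1,000,000 bytes, and real decisions (image file backups, for example) will depend on more than these two inputs.

```python
MB = 1_000_000  # assumption: decimal megabytes


def suggest_protection(size_bytes: int, is_warm: bool,
                       small_object_limit: int = 1 * MB) -> str:
    """Suggest a protection scheme from object size and access pattern.

    Hypothetical helper: small objects and warm objects are replicated,
    larger cold objects are erasure coded.
    """
    if size_bytes < small_object_limit:
        # Small objects cost extra compute to encode and extra time to read.
        return "replication"
    if is_warm:
        # Frequently accessed objects favor replication.
        return "replication"
    return "erasure coding"


if __name__ == "__main__":
    print(suggest_protection(200_000, is_warm=False))   # replication
    print(suggest_protection(500 * MB, is_warm=True))   # replication
    print(suggest_protection(500 * MB, is_warm=False))  # erasure coding
```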

Another consideration is whether your data objects will be “dispersed” over more than one physical location.  An object-based storage cluster can “disperse” replicated objects over multiple physical locations.  This can be very useful in positioning data objects close to where they will be needed or to provide a remote data protection location.

If your Internet bandwidth between physical locations is not particularly fast, then using erasure coding to disperse fragments of data objects to multiple physical locations may result in poor read performance.  It also increases the chance of not being able to access your data if one of the physical locations becomes inaccessible.  That said, there is a solution to this problem.  You can erasure code data objects in one physical location, and replicate the erasure coded data objects to another physical location.  The result is replicated, erasure coded objects that are stored in different physical locations.
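To make the availability risk concrete, here is a small sketch that checks whether an erasure coded object stays readable when any single location goes offline. The fragment layouts are hypothetical examples chosen for illustration, and the function names are not from any product.

```python
from typing import Sequence


def readable_after_site_loss(fragments_per_site: Sequence[int],
                             data_fragments: int) -> bool:
    """True if the object stays readable when any ONE site is lost.

    `fragments_per_site` lists how many of the k+m fragments each
    location holds; `data_fragments` is k, the number needed to read.
    """
    total = sum(fragments_per_site)
    return all(total - lost >= data_fragments for lost in fragments_per_site)


def replicated_ec_footprint(object_mb: float, data_fragments: int,
                            parity_fragments: int, sites: int) -> float:
    """Storage used when a full erasure coded copy is kept at each site."""
    per_site = object_mb * (data_fragments + parity_fragments) / data_fragments
    return per_site * sites


if __name__ == "__main__":
    # 4+2 fragments split 3/3 across two sites: losing a site leaves only 3.
    print(readable_after_site_loss([3, 3], data_fragments=4))     # False
    # 4+2 fragments split 2/2/2 across three sites: losing a site leaves 4.
    print(readable_after_site_loss([2, 2, 2], data_fragments=4))  # True
    # A complete 4+2 copy of a 100MB object at each of two sites.
    print(replicated_ec_footprint(100, 4, 2, sites=2))            # 300.0
```

Note the trade-off in the last line: keeping a complete 4+2 copy of a 100MB object in each of two locations uses 300MB in total, the same raw capacity as 3x replication, in exchange for each location being able to serve the object on its own.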

Also, think about the amount of data you need to protect, because it determines the size of the cluster and may have a direct bearing on whether you choose replication or erasure coding. For example, if you have 1PB of data to protect and use 3x replication, you will need 3PB of storage capacity. The same 1PB of data protected by 4+2 erasure coding would need 1.5PB of storage. Remember that you can use replication for some “buckets” and erasure coding for other “buckets,” which gives you some flexibility in how you protect your data and in the amount of physical storage required to do it.
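A quick way to explore these numbers is to total the raw capacity per bucket. The sketch below is illustrative only: the bucket names and the 0.25PB/0.75PB split are invented for the example, and it ignores metadata, free-space headroom, and other real-world sizing factors.

```python
def expansion_factor(scheme: str) -> float:
    """Raw-to-usable ratio for '3x' style replication or 'k+m' erasure coding."""
    if scheme.endswith("x"):
        return float(scheme[:-1])          # "3x" -> 3.0
    k, m = (int(part) for part in scheme.split("+"))
    return (k + m) / k                     # "4+2" -> 1.5


def raw_capacity_pb(buckets: dict) -> float:
    """Sum raw capacity over buckets given as {name: (data_pb, scheme)}."""
    return sum(data_pb * expansion_factor(scheme)
               for data_pb, scheme in buckets.values())


if __name__ == "__main__":
    print(raw_capacity_pb({"all-data": (1.0, "3x")}))    # 3.0
    print(raw_capacity_pb({"all-data": (1.0, "4+2")}))   # 1.5
    # Hypothetical mix: a quarter of the data replicated, the rest erasure coded.
    mixed = {"warm-bucket": (0.25, "3x"), "cold-bucket": (0.75, "4+2")}
    print(raw_capacity_pb(mixed))                        # 1.875
```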

Are all these replication and erasure coding features common to every object-based storage software vendor? Most object storage software vendors support both replication and erasure coding. The exceptions are several vendors who only erasure code data. That said, you still need to investigate how each vendor implements these data protection schemes and how they are managed.

If you want to make short work of your investigation, consider that the above-mentioned replication and erasure coding features are available today in Cloudian’s HyperStore 5.2 appliances and software. As a Cloudian Preferred Partner, MonadCloud is ready to work with you to design and build a Cloudian-powered private storage cluster that meets your requirements for data protection and storage capacity.