Paper excerpt
Fragmentation usually occurs when the data space of the original storage nodes has to be reallocated to newly added storage nodes during the scale-out evolution of a large-scale storage system. It greatly affects the system's performance and makes managing the whole space a challenge. We present an efficient space management framework, called New Balance, that reduces fragmentation with minimum data movement while keeping the storage system load-balanced. The framework has two phases: a collection phase and an allocation phase. For the collection phase, we propose a novel algorithm, called the greedy bi-direction collector, which collects enough space for the new storage nodes. For the allocation phase, we formally represent the problem as a variant of the bin packing problem and then use bin packing heuristics, including first fit and best fit, to allocate the collected intervals to the newly added storage nodes. The experimental results show that the number of intervals can be reduced by 20% to 55%, and that our algorithmic optimization improves data lookup performance by at least 10% and scale-out performance by 2x to 3x.
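To make the allocation phase more concrete, the following is a minimal sketch of the two textbook bin packing heuristics named above, first fit and best fit. It is not the paper's algorithm: the excerpt does not specify how New Balance adapts these heuristics, so the function names, the integer interval sizes, and the per-node capacities are all assumptions made for illustration.

```python
from typing import List, Optional


def first_fit(sizes: List[int], capacities: List[int]) -> List[Optional[int]]:
    """Assign each interval to the first node that still has room.

    sizes      -- hypothetical sizes of the collected intervals
    capacities -- hypothetical space each new node should receive
    Returns the chosen node index per interval, or None if nothing fits.
    """
    remaining = list(capacities)
    placement: List[Optional[int]] = []
    for size in sizes:
        chosen = None
        for node, free in enumerate(remaining):
            if free >= size:
                chosen = node
                break
        if chosen is not None:
            remaining[chosen] -= size
        placement.append(chosen)
    return placement


def best_fit(sizes: List[int], capacities: List[int]) -> List[Optional[int]]:
    """Assign each interval to the node that leaves the least leftover room."""
    remaining = list(capacities)
    placement: List[Optional[int]] = []
    for size in sizes:
        # Candidate nodes as (leftover space after placement, node index).
        candidates = [
            (free - size, node)
            for node, free in enumerate(remaining)
            if free >= size
        ]
        chosen = min(candidates)[1] if candidates else None
        if chosen is not None:
            remaining[chosen] -= size
        placement.append(chosen)
    return placement


if __name__ == "__main__":
    interval_sizes = [3, 4, 3]   # assumed sizes of collected intervals
    node_capacities = [7, 4]     # assumed targets for two new nodes
    print(first_fit(interval_sizes, node_capacities))  # [0, 0, 1]
    print(best_fit(interval_sizes, node_capacities))   # [1, 0, 0]
```

In this small example both heuristics place every interval, but they choose different nodes: first fit fills node 0 first, whereas best fit sends the first interval to the tighter node 1. Neither heuristic is guaranteed to find a placement whenever one exists.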