With low latency, high performance, and low power consumption, solid-state drives (SSDs) are widely deployed as a cache layer between memory and slower back-end storage devices to narrow the performance gap between the CPU and the storage system. In virtualized environments, however, the dense consolidation of virtual machines can introduce many duplicate data blocks into the cache device. Existing cache architectures and replacement algorithms rarely take this into account, which limits how efficiently the cache device is used. To address this, we propose a duplication-aware SSD-based cache architecture that reduces duplicate data blocks and improves the utilization of the cache device. Furthermore, to reduce cache replacement overhead, we propose an improved ARC-based replacement strategy, which we name D-ARC.
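To make the idea of a duplication-aware cache concrete, the following is a minimal sketch, not the architecture evaluated in this work. It assumes SHA-256 content signatures and two in-memory dictionaries loosely named after the address hash table (AHT) and signature hash table (SHT) of Fig. 2. The class name `DedupCache`, its methods, and the `ssd_writes` counter are hypothetical, and capacity limits and the D-ARC replacement policy are deliberately left out.

```python
import hashlib


class DedupCache:
    """Toy duplication-aware cache: many addresses may share one stored block.

    Illustrative only. The names and layout are assumptions; eviction and
    the D-ARC replacement policy are not modeled.
    """

    def __init__(self):
        self.aht = {}        # address -> signature (stand-in for the AHT)
        self.sht = {}        # signature -> [block bytes, reference count] (stand-in for the SHT)
        self.ssd_writes = 0  # number of blocks physically written to the cache device

    def write(self, addr, data):
        sig = hashlib.sha256(data).hexdigest()
        old = self.aht.get(addr)
        if old == sig:
            return  # this address already maps to identical content
        if old is not None:
            self._release(old)  # address is being overwritten with new content
        entry = self.sht.get(sig)
        if entry is None:
            self.sht[sig] = [data, 1]  # first copy: store the block once
            self.ssd_writes += 1
        else:
            entry[1] += 1  # duplicate content: only bump the reference count
        self.aht[addr] = sig

    def read(self, addr):
        sig = self.aht.get(addr)
        return None if sig is None else self.sht[sig][0]

    def _release(self, sig):
        entry = self.sht[sig]
        entry[1] -= 1
        if entry[1] == 0:
            del self.sht[sig]  # no address references this block any more


cache = DedupCache()
cache.write(0x10, b"A" * 4096)
cache.write(0x20, b"A" * 4096)  # same content at a different address
cache.write(0x30, b"B" * 4096)
```

In this sketch, three logical writes produce only two stored blocks and two device writes, because the second write is resolved by a signature lookup. This is the kind of saving a duplication-aware design aims for, although the real system must also handle eviction and dirty blocks.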
Fig. 1. Simplified deployment architecture.
Fig. 2. Functional modules in cache. AHT denotes address hash table, SHT denotes signature hash table, EDB denotes evicted dirty block buffer, RFB denotes refill buffer.
Fig. 3. Conceptual diagram of the D-ARC algorithm. Every data block to be inserted into T2 is compared with the cache block at position IP.
Fig. 4. Cache hit ratio. X-DEDU denotes the duplication-aware cache with replacement strategy X; X-NODE denotes the conventional cache architecture with replacement strategy X.
Fig. 5. SSD write count. The duplication-aware cache reduces SSD write operations and thus prolongs the SSD's lifetime.
Fig. 6. Performance comparison between ARC and D-ARC.