There is a growing trend towards heterogeneous systems, which contain CPUs and GPGPUs in a single chip. Managing those various on-chip resources shared between CPUs and GPGPUs, however, is a big issue and the last-level cache (LLC) is one of the most critical resources due to its impact on system performance. Some well-known cache replacement policies like LRU and DRRIP, designed for a CPU, can not be so well qualified for heterogeneous systems because the LLC will be dominated by memory accesses from thousands of threads of GPGPU applications and this may lead to significant performance downgrade for a CPU. Another reason is that a GPGPU is able to tolerate memory latency when quantity of active threads in the GPGPU is sufficient, but those policies do not utilize this feature.
In this paper we propose a novel shared LLC management policy for CPU-GPGPU heterogeneous systems called Buffer Filter which takes advantage of memory latency tolerance of GPGPUs. This policy has the ability to restrict streaming requests of GPGPU by adding a buffer to memory system and vacate LLC space for cache-sensitive CPU applications. Although there is some IPC loss for GPGPU but the memory latency tolerance ensures the basic performance of GPGPU’s applications. The experiments show that the Buffer Filter is able to filtrate up to 50% to 75% of the total GPGPU streaming requests at the cost of little GPGPU IPC decrease and improve the hit rate of CPU applications by 2x to 7x.