Supporting Superpages and Lightweight Page Migration in Hybrid Memory Systems

Paper

Published: 2023-05-02

1. Paper information

Published in ACM Transactions on Architecture and Code Optimization (TACO), 2019
Supporting Superpages and Lightweight Page Migration in Hybrid Memory Systems

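Authors and affiliations

XIAOYUAN WANG, HAIKUN LIU, XIAOFEI LIAO, JI CHEN, HAI JIN, YU ZHANG, and LONG ZHENG, Huazhong University of Science and Technology
BINGSHENG HE, National University of Singapore
SONG JIANG, University of Texas at Arlington (UTA)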

2. Background

Superpages have long been used to mitigate address translation overhead in large-memory systems. In hybrid memory systems composed of DRAM and NVM, however, superpages often preclude lightweight page migration, which is crucial for performance and energy efficiency.

3. What problem does it solve

Page migration at superpage granularity (e.g., 2MB) can incur unbearable performance overhead, because it wastes a large amount of DRAM capacity and bandwidth when most memory references fall in a small region of the superpage (see Section 2.2 of the paper). The cost may even exceed the benefit of superpage migration. This presents a dilemma for the use of superpages, since the gains from lightweight page migration can outweigh the benefits of extended TLB coverage.
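
To make the waste concrete, the following minimal sketch computes how much of a whole-superpage migration moves cold data. The function name `wasted_fraction` and the example hot-page counts are illustrative assumptions, not figures taken from the paper.

```python
SUPERPAGE_BYTES = 2 * 1024 * 1024   # 2MB superpage
SMALL_PAGE_BYTES = 4 * 1024         # 4KB small page
PAGES_PER_SUPERPAGE = SUPERPAGE_BYTES // SMALL_PAGE_BYTES  # 512 small pages


def wasted_fraction(hot_small_pages: int) -> float:
    """Fraction of a 2MB migration that copies cold data.

    hot_small_pages is a hypothetical count of hot 4KB pages
    inside the superpage being migrated.
    """
    if not 0 <= hot_small_pages <= PAGES_PER_SUPERPAGE:
        raise ValueError("hot_small_pages must be between 0 and 512")
    useful_bytes = hot_small_pages * SMALL_PAGE_BYTES
    return 1.0 - useful_bytes / SUPERPAGE_BYTES


# If only 64 of the 512 small pages are hot, 87.5% of the copied data is cold.
print(wasted_fraction(64))
```

Migrating only the hot 4KB pages avoids this waste, which is the motivation for lightweight migration.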

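How to read the cumulative distribution function (CDF) plot
[Figure: CDF of 2MB superpages versus the number of 4KB small pages touched within a superpage during a given interval (10^8 cycles)]
The figure shows that, for a large share of the workloads, with a probability above 80%, only 12.5% of the 4KB pages in a 2MB superpage (64 of 512) are touched.
A table also illustrates the problem:
[Table: access statistics of hot 4KB pages]
[Table: workloads used]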

Identification of lightweight hot pages: to support lightweight page migration, a large body of work advocates monitoring memory accesses through the memory controller [55, 63]. However, using access counters at per-page granularity (i.e., 4KB) leads to prohibitively high storage overhead when the capacity of main memory becomes large.
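
A quick back-of-the-envelope calculation shows why per-page counters scale poorly. The memory size (1 TiB) and the counter width (2 bytes) below are assumptions chosen only for illustration; the paper's actual parameters may differ.

```python
def counter_storage_bytes(memory_bytes: int, page_bytes: int, counter_bytes: int) -> int:
    """Total storage needed for one access counter per page."""
    return (memory_bytes // page_bytes) * counter_bytes


TIB = 1 << 40  # assumed main memory capacity: 1 TiB
MIB = 1 << 20

# One assumed 2-byte counter per 4KB page versus per 2MB superpage.
per_small_page = counter_storage_bytes(TIB, 4 * 1024, 2)
per_superpage = counter_storage_bytes(TIB, 2 * 1024 * 1024, 2)

print(per_small_page // MIB, "MiB of counters at 4KB granularity")
print(per_superpage // MIB, "MiB of counters at 2MB granularity")
```

Under these assumptions, counting at superpage granularity needs 512 times less storage, since each superpage covers 512 small pages.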

Impact of lightweight page migration on TLB coverage: page migrations often fragment superpages and thus break physical address continuity.

Efficiency of hot page addressing: as hot pages contribute a major portion of applications' memory references, it is essential to further reduce the overhead of address translation for those hot pages in the DRAM.

4. How other researchers approached the problem, and the shortcomings

Previous work has advocated splintering superpages to enable lightweight memory management such as page migration and sharing, at the cost of address translation performance [37,58]. Retaining the improved TLB coverage when the hot small pages within superpages are migrated to the DRAM remains a challenge.

5. How the authors build their solution

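To address these problems, the authors propose Rainbow, a new memory management mechanism. Rainbow manages NVM at superpage granularity and uses DRAM to cache the frequently accessed (hot) small pages within each superpage. Correspondingly, Rainbow exploits the split TLBs [2,7,30,52] available in existing hardware to support different page sizes: one TLB addresses superpages and the other addresses small pages. Rainbow migrates hot small pages within a superpage to DRAM without compromising the integrity of the superpage TLB. In effect, Rainbow architects DRAM as a cache for NVM.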

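To reduce the storage overhead of fine-grained page access counting, counting proceeds in two stages. Within a given interval, Rainbow first counts NVM memory accesses at superpage granularity and selects the top N hot superpages as targets. In the second stage, only those hot superpages are monitored at small-page (4KB) granularity to identify the hot small pages. This history-based policy avoids monitoring the sub-blocks (4KB pages) of the many cold superpages, which significantly reduces the overhead of hot page identification.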
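
The two-stage counting can be sketched in software as follows. This is only a simplified model under stated assumptions: Rainbow does the monitoring in hardware, while this sketch takes lists of physical addresses as input. The function names, the `threshold` parameter, and the use of two separate traces for the two stages are hypothetical.

```python
from collections import Counter

SMALL_PAGE_SHIFT = 12  # 4KB small pages
SUPERPAGE_SHIFT = 21   # 2MB superpages


def top_hot_superpages(addresses, n):
    """Stage 1: count accesses per superpage and keep the top-N superpage numbers."""
    counts = Counter(addr >> SUPERPAGE_SHIFT for addr in addresses)
    return {superpage for superpage, _ in counts.most_common(n)}


def hot_small_pages(addresses, hot_superpages, threshold):
    """Stage 2: count 4KB pages, but only inside the hot superpages."""
    counts = Counter(
        addr >> SMALL_PAGE_SHIFT
        for addr in addresses
        if (addr >> SUPERPAGE_SHIFT) in hot_superpages
    )
    return {page for page, count in counts.items() if count >= threshold}


# Stage 1 trace: superpage 1 is accessed most often.
stage1_trace = [0x203000] * 5 + [0x400000] * 2 + [0x0]
hot_superpages = top_hot_superpages(stage1_trace, n=1)

# Stage 2 trace: accesses to superpage 2 are ignored because it was not selected.
stage2_trace = [0x203000] * 4 + [0x204000] + [0x400000] * 10
hot_pages = hot_small_pages(stage2_trace, hot_superpages, threshold=3)
```

Only the selected superpages need 4KB-granularity counters, so the number of fine-grained counters is bounded by N times 512 rather than by the total number of small pages in NVM.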
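Split TLBs are used to speed up address translation for both DRAM and NVM references. When some small pages are migrated to DRAM, a bitmap in the memory controller identifies the migrated hot pages, so the superpage TLB stays intact and superpages are not splintered.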
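
A migration bitmap of this kind can be modeled with one bit per 4KB page, that is, 512 bits (64 bytes) per 2MB superpage. The class below is a hypothetical software model for illustration; its name, its methods, and the per-superpage dictionary layout are assumptions, not the paper's hardware design.

```python
PAGES_PER_SUPERPAGE = 512  # 2MB / 4KB


class MigrationBitmap:
    """Hypothetical model: one bit per small page marks whether it lives in DRAM."""

    def __init__(self):
        # superpage frame number -> Python int used as a 512-bit bitmap
        self._bits = {}

    def _check(self, index):
        if not 0 <= index < PAGES_PER_SUPERPAGE:
            raise IndexError("small-page index out of range")

    def mark_migrated(self, superpage, index):
        """Set the bit when a hot small page is moved to DRAM."""
        self._check(index)
        self._bits[superpage] = self._bits.get(superpage, 0) | (1 << index)

    def mark_evicted(self, superpage, index):
        """Clear the bit when the page is written back to NVM."""
        self._check(index)
        self._bits[superpage] = self._bits.get(superpage, 0) & ~(1 << index)

    def is_migrated(self, superpage, index):
        """Check whether the given small page currently resides in DRAM."""
        self._check(index)
        return bool((self._bits.get(superpage, 0) >> index) & 1)
```

Because the bitmap lives beside the superpage mapping rather than in the page table, the superpage's single TLB entry remains valid even after some of its small pages have moved.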
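A physical address remapping mechanism gives access to the migrated hot pages in DRAM without a costly page table walk to address DRAM pages. To this end, the destination address of a migrated hot page is stored in its original residence (the superpage). On a TLB miss for a hot page, DRAM page addressing resorts to an indirect access through the superpage. This design logically uses the superpage TLB as a next-level cache for the 4KB page TLB. Because the superpage TLB hit rate is usually very high, Rainbow can significantly speed up DRAM page addressing.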
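
The lookup order described here can be sketched as a small software model. Everything in it is an illustrative assumption: the class `RemapModel`, the dictionaries standing in for the two TLBs, the set standing in for the bitmap, and the `remap_slot` dictionary standing in for the destination address stored at the page's original NVM location. Page table walks are not modeled.

```python
SMALL_SHIFT = 12                      # 4KB small pages
SUPER_SHIFT = 21                      # 2MB superpages
INDEX_BITS = SUPER_SHIFT - SMALL_SHIFT
INDEX_MASK = (1 << INDEX_BITS) - 1    # selects the small-page index (0..511)
OFFSET_MASK = (1 << SMALL_SHIFT) - 1  # byte offset inside a 4KB page


class RemapModel:
    def __init__(self):
        self.small_tlb = {}   # virtual 4KB page number -> DRAM frame number
        self.super_tlb = {}   # virtual superpage number -> NVM superpage frame
        self.migrated = set() # (NVM superpage frame, index): stands in for the bitmap
        self.remap_slot = {}  # NVM 4KB frame -> DRAM frame kept at the original location

    def migrate(self, nvm_super_frame, index, dram_frame):
        """Record that one small page of an NVM superpage now lives in DRAM."""
        self.migrated.add((nvm_super_frame, index))
        self.remap_slot[(nvm_super_frame << INDEX_BITS) | index] = dram_frame

    def translate(self, vaddr):
        vpn = vaddr >> SMALL_SHIFT
        offset = vaddr & OFFSET_MASK
        # 1. Small-page TLB hit: the DRAM address is available directly.
        if vpn in self.small_tlb:
            return ("DRAM", (self.small_tlb[vpn] << SMALL_SHIFT) | offset)
        # 2. Otherwise consult the superpage TLB.
        super_vpn = vaddr >> SUPER_SHIFT
        if super_vpn not in self.super_tlb:
            raise LookupError("superpage TLB miss: page table walk not modeled")
        nvm_super_frame = self.super_tlb[super_vpn]
        index = vpn & INDEX_MASK
        # 3. If the page was migrated, follow the stored destination address.
        if (nvm_super_frame, index) in self.migrated:
            dram_frame = self.remap_slot[(nvm_super_frame << INDEX_BITS) | index]
            self.small_tlb[vpn] = dram_frame  # fill the small-page TLB
            return ("DRAM", (dram_frame << SMALL_SHIFT) | offset)
        # 4. Not migrated: the access goes to NVM inside the superpage.
        return ("NVM", (nvm_super_frame << SUPER_SHIFT) | (index << SMALL_SHIFT) | offset)
```

In this model a small-page TLB miss costs one superpage TLB lookup plus one read of the stored destination, which is the sense in which the superpage TLB acts as a next-level cache for the 4KB page TLB.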