In-Page Logging (IPL), a novel approach to DBMS design, was proposed a decade ago. IPL exploits a characteristic of flash memory: asymmetric read/write speed. It keeps a per-page log within each erase unit of flash memory and, instead of writing a page, saves only its redo log. When the page is later requested, a merge operation, which is an instant recovery process, generates the updated version of the page from the old page and its redo log. IPL has never been implemented in a real DBMS, however, because no fast, persistent, byte-addressable, and affordable device was available. Since NVDIMM matches the concept of IPL, we implemented IPL in PostgreSQL, a commercial-grade open-source DBMS, using NVDIMM as the IPL log device.
In-Page Logging Approach
In-Page Logging is a novel design for flash-based DBMSs that overcomes the limitations of SSDs while exploiting their advantages. IPL keeps a per-page log within each erase unit of flash memory and differs from existing systems in three situations. The first is a write to disk: IPL does not write the data page but only records log data, so the old page remains on disk. The second is a read from disk: IPL combines the old version of the page with its log and performs instant recovery to generate the updated page, which is the merge operation. Because the merge is fast, the host only ever sees the new page. The third is when the log sector becomes full: the log and the old page are merged into the updated version, which is then written to disk.
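The three situations can be sketched in a few lines of Python. This is only a toy model under our own assumptions, not the PostgreSQL implementation: a page is a dictionary of slots, a log record is a `(slot, value)` pair with `None` meaning removal, and the names `IPLStore`, `log_capacity`, and `page_writes` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IPLStore:
    """Toy model of In-Page Logging; all names here are illustrative."""
    log_capacity: int = 4                      # assumed log records per page
    disk: dict = field(default_factory=dict)   # page_id -> old page image
    logs: dict = field(default_factory=dict)   # page_id -> pending log records
    page_writes: int = 0                       # full page writes to "disk"

    def merge(self, page_id):
        """Instant recovery: replay the page's log over a copy of the old page."""
        page = dict(self.disk.get(page_id, {}))
        for slot, value in self.logs.get(page_id, []):
            if value is None:
                page.pop(slot, None)           # a removal record
            else:
                page[slot] = value             # an insert or overwrite record
        return page

    def write(self, page_id, slot, value):
        """Situation 1: record only the log; the old page stays on disk."""
        records = self.logs.setdefault(page_id, [])
        records.append((slot, value))
        if len(records) >= self.log_capacity:
            # Situation 3: log area full, so merge and write the new page once.
            self.disk[page_id] = self.merge(page_id)
            self.logs[page_id] = []
            self.page_writes += 1

    def read(self, page_id):
        """Situation 2: the host only ever sees the merged page."""
        return self.merge(page_id)
```

With `log_capacity=3`, for example, three logged changes to one page cost a single page write in this model, and a read issued before the log fills still returns the up-to-date page.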
This allows the DBMS to replace page writes with log writes, which are smaller, and to write a merged page only once each time the log area fills. In exchange for avoiding one write operation, one log read and one instant recovery are added. However, because SSDs are much slower at writes than at reads, a performance gain is expected even with the added merge step. In addition, because the amount written to disk is reduced, the lifespan of the SSD is expected to increase.
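The trade-off can be written as a rough per-write cost comparison. The symbols are our own assumptions and do not come from the original IPL proposal: T_w is the cost of one page write to the SSD, T_lw the cost of recording the log, T_lr the cost of reading it back, T_m the cost of the merge, and k the number of page writes absorbed before the log area fills and one merged page is written.

```latex
\[
C_{\text{base}} = T_{w},
\qquad
C_{\text{IPL}} = T_{lw} + T_{lr} + T_{m} + \frac{T_{w}}{k}
\]
\[
C_{\text{IPL}} < C_{\text{base}}
\iff
T_{lw} + T_{lr} + T_{m} < \left(1 - \frac{1}{k}\right) T_{w}
\]
```

Under this simplified model, IPL pays off whenever logging plus merging is cheaper than the fraction of page writes it avoids, and a larger log area (larger k) makes the condition easier to satisfy.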
Unfortunately, flash-based IPL has never been implemented in a real DBMS. The reason is that current SSDs support only page-wise writes and cannot efficiently write a few bytes to the IPL log area. There have been attempts to apply IPL using byte-addressable NVRAM such as PCRAM, but such devices are too expensive to be commercialized. NVDIMM, on the other hand, is as fast as DRAM, persistent, and cheap. Applying IPL to a DBMS with NVDIMM as the IPL log area should therefore improve SSD storage performance and lifetime.
The IPL log area was allocated on NVDIMM and managed in a page-wise manner. PostgreSQL manages its log by LSN. Since the IPL approach needs to manage logs in page units, we added a procedure that captures WAL records and organizes them per page. We then implemented the three situations described above. First, when writing to disk, we block page writes issued from the buffer pool or by the background flusher. Second, when reading from disk, the old page is merged with its corresponding IPL logs and the updated version is passed to the host. The merge process is modeled on the recovery logic of PostgreSQL. Lastly, when the log sector is full, we do not block the write: the page is written as in the existing process, and its log area is cleared. Apart from the new merge logic for IPL, only tens of lines of existing code were changed.
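The step of turning an LSN-ordered log into per-page logs can be illustrated as follows. This is a minimal sketch, not the code we added to PostgreSQL: the `WalRecord` tuple and its fields are a hypothetical simplification of a real WAL record, and `organize_by_page` is an illustrative name.

```python
from collections import defaultdict, namedtuple

# Hypothetical, simplified WAL record: real PostgreSQL records carry far more.
WalRecord = namedtuple("WalRecord", "lsn page_id op payload")


def organize_by_page(wal_records):
    """Group WAL records per page, keeping LSN order within each page."""
    per_page = defaultdict(list)
    for rec in sorted(wal_records, key=lambda r: r.lsn):
        per_page[rec.page_id].append(rec)
    return dict(per_page)
```

Keeping LSN order inside each page's list matters because the merge replays the records in the order in which the changes were originally made.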
Because operations such as a B-tree split span several pages and take too long to merge, we defined the target pages and operations suitable for IPL. Heap files and index files are IPLized, and IPL is applied to six operations: insert, delete, update, tuple-level lock, heap clean, and B-tree insert. The heap clean operation here refers to the page-wise vacuum that PostgreSQL performs for every page read. When IPL is not applied, the page is handled exactly as in the existing DBMS.
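The target selection amounts to a simple filter. The sketch below is illustrative only: the string labels and the function `is_ipl_target` are hypothetical, and assigning the first five operations to heap files and B-tree insert to index files is our assumption about how the six operations divide.

```python
# Assumed split of the six IPL operations between the two IPLized file kinds.
IPL_TARGETS = {
    "heap": {"insert", "delete", "update", "tuple_lock", "heap_clean"},
    "index": {"btree_insert"},
}


def is_ipl_target(file_kind, operation):
    """Return True if the change should be kept as an IPL log record.

    Anything else (for example a B-tree split) falls back to the
    existing page write path.
    """
    return operation in IPL_TARGETS.get(file_kind, set())
```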
Some DBMS operations generate a single WAL record but take place across multiple pages. Suppose tuple A is updated, which in PostgreSQL is one delete and one insert. The old version remains on page 100 and the new version is created on page 110. In that case, page 100 will have a delete IPL log and page 110 will have an insert IPL log. When the DBMS requests each page, its merge is performed without regard to the other. From the viewpoint of the IPL merge operation, page 100 and page 110 are independent, although both were changed by one operation. This is possible because the merge is idempotent: for each page, the merged version is identical to the updated version.
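The update example can be traced with a small sketch. The page layout, the log record format, and the `merge` helper below are hypothetical simplifications: a page maps a slot number to a tuple with a `live` flag, and a delete only marks the old version dead, since the old version remains on the page.

```python
def merge(old_page, ipl_log):
    """Replay one page's IPL log over a copy of its old image."""
    page = {slot: dict(tup) for slot, tup in old_page.items()}
    for op, slot, value in ipl_log:
        if op == "insert":
            page[slot] = {"value": value, "live": True}
        elif op == "delete":
            page[slot]["live"] = False   # old version stays, marked dead
    return page


# Old images on disk: tuple A lives on page 100, page 110 is empty.
disk = {100: {1: {"value": "A-old", "live": True}}, 110: {}}

# One logical update of A produced one IPL log record on each page.
ipl_logs = {
    100: [("delete", 1, None)],
    110: [("insert", 1, "A-new")],
}

# The pages can be merged in either order, each without the other.
merged_110 = merge(disk[110], ipl_logs[110])
merged_100 = merge(disk[100], ipl_logs[100])
```

Because `merge` works on a copy of the old image, calling it again for the same page gives the same result, which is the property the independent per-page merge relies on in this model.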
The figure above illustrates how the existing DBMS and the IPL approach work in PostgreSQL. The times shown are for file I/O. The solid line indicates the IPL approach and the dotted line indicates the existing DBMS. As can be seen, IPL replaces one write with one read and a merge, which is much faster.
In the performance evaluation, we used PostgreSQL 9.4.5 as the DBMS. The full source code is available in the GitHub repository [will be added]. NVDIMM was emulated in DRAM using the Linux PMEM interface.
The graph above shows the difference in transactional performance between the original and the IPLized PostgreSQL in each file I/O configuration. With the normal file system, the performance of the IPLized DBMS dropped by 1.2%; in the direct I/O environment it increased by 5.9%; and in the osync environment it increased by 44.8%. On a normal file system, PostgreSQL supports only buffered I/O, which performs I/O operations against the buffer cache. The effort to reduce the amount written to storage was therefore not effective there.
In addition, the cost of storing IPL logs and merging became comparable to the read/write cost, which introduced a minor overhead. The IPL approach performs better in situations where large amounts of writes to storage occur. To demonstrate the maximum performance improvement of the IPL approach, the following experiments were conducted using the sync option.
TABLE II. TPM, READ, WRITE ANALYSIS AT EACH LOG SIZE.
The table and figure above show the TPM and the amount of writes and reads to storage for each log size. As the log size increases, the amount of writes is greatly reduced and the amount of reads increases. This is characteristic of the IPL approach, in which a write is replaced by a read and a merge operation. As a result, transaction throughput improved by up to 74%. The amount of writes was reduced by up to 17%, which would help extend the life of the SSD.
IPLized PostgreSQL performed better in both write amount and transaction throughput. The amount of reads increased, but because of the asymmetric read/write speed of SSDs, this did not degrade performance.
In this research, we implemented the IPL system in a commercial-grade open-source DBMS using NVDIMM. The IPL log area was allocated on NVDIMM and managed in a page-wise manner. The IPL approach replaces one write with one read and a merge, which is much faster. The experiments showed that the reduction in write amount grows in proportion to how often write operations occur in the DBMS. The bigger the log size, the smaller the amount of writes. This led to a performance gain and a longer SSD lifetime.