Journal of Computers, Vol 7, No 6 (2012), 1303-1311, Jun 2012
doi:10.4304/jcp.7.6.1303-1311

LSP: A Locality-Aware Strip Prefetching Scheme for Striped Disk Array Systems with Concurrent Accesses

Xiaodong Shi, Dan Feng

Abstract


In striped disk array systems, the independency of disks for prefetching is more important than parallelism under high concurrency of accesses, based on which strip prefetching with low read cost has more ability to improve the performance of RAID. However, it indiscriminately fetching all the involved strips limits its applicability. To solve this problem, we propose a Locality-aware Strip Prefetching Scheme (LSP), where it keeps track of the users’ accesses and identifies the hot data areas. Only those strips located in the hot data areas will be prefetched and each prefetching request fetches one strip. LSP has several advantages. First, LSP adapts to the evolving workloads in an online and self-tuning fashion and satisfies the striped disk array systems due to the low read cost in each prefetching request. Second, LSP fully exploits the spatial locality in users’ accesses. Here, the spatial locality is more general, which includes multiple simple patterns, such as loop references, sequentiality, reverse references, and other locality-awarded patterns. Third, LSP discriminates the hot data areas from the cold data areas when prefetching, which significantly alleviates the waste of disk bandwidth, optimizes the cache utilization and improves the prefetching accuracy. We have implemented the prototype of LSP algorithm in Linux kernel 2.6.18. The experimental results show that LSP outperforms SP and Sequential prefetching (SEQP) by up to 22.4% and 24.1% in terms of the average response time, and by up to 1.5 times and 2.3 times in terms of throughput, respectively.


Keywords


Striped disks array systems; Independency of disks; Strip prefetching; Spatial locality

References


 

[1] Santos, J.R., Muntz, R.R., Ribeiro-Neto, B., Comparing Random Data Allocation and Data Striping in Multimedia Servers. In Proceedings of the 2000 SIGMETRICS Confer-ence on Measurement and Modeling of Computer Systems, ACM Press, 2000, Santa Clara, CA, 44-55

[2] Baek, S.H., and Park, K.H. Prefetching with Adaptive Cache Culling for Striped Disk Arrays, In Proc. of USENIX Annual Technical Conference, pp. 363-376, June, 2008

[3] Liang, S., Jiang, S., and Zhang, X. STEP: Sequentiality and thrashing detection based prefetching to improve performance of networked storage servers. In Distributed Computing Systems, 2007. ICDCS’07. 27th International Conference on (2007), pp. 64-.

[4] Li, M., Varki, E., Bhatia, S., and Merchant, A. Tap: Ta-ble-based prefetching for storage caches. In Proc. of the 6th USENIX Conference on File and Storage Technolo-gies(Feb,2008), pp. 81-96.

[5] Gill, B.S., and Bathen, L.A.D. AMP: Adaptive multistream prefetching in a shared cache. In Proc. of USENIX 2007 Annual Technical Conference (2007), 5th USENIX Confer-ence on File and Storage Technologies.

[6] D. Patterson, G.Gibson, and R.Katz. A case for redundant arrays of inexpensive disk(DISK). In International conferece on Management of Data, pages 10-116, June 1988

[7] P.Chen, E.Lee, G.Gibson, R.Katz, and D.Patterson. RAID: high-performance, Reliable secondary storage. ACM Com-puting Surveys, 26(2):145-188, June 1994
http://dx.doi.org/10.1145/176979.176981

[8] P.Scheuermann, G.Weikum, P.Zabback: Data partitioning and load balancing in parallel disk systems, VLDB Journal 7(3), 1998

[9] Tracy Kimbrel and Anna R. Karlin. Near-optimal Parallel Prefetching and Caching. In Proceedings of the 1996 IEEE Symposium on Foundations of Computer Science, October 1996.

[10] L. Tian, D. Feng, H. Jiang, K. Zhou, L. Zeng, J. Chen, Z. Wang, and Z. Song. PRO: A Popularity-based Multi- threaded Reconstruction Optimization for RAID Structured Storage Systems. In FAST’07, Feb. 2007.

[11] Storage Performance Council. http://www.storageperform ance.org/home.

[12] Y.Zhu and H. Jiang, “False rate analysis of Bloom filter replicas in distributed systems,” International Conference on Parallel Processing(ICPP), pp.255-262, 2006.

[13] Baek, S.H., and Park, K.H., 2009. Striping-Aware Sequen-tial prefetching for independency and Parallelism in Disk Arrays with Concurrent Accesses. IEEE ToC. 58(8). 1146-1152.

[14] Z.Li, Z.Chen, S.Srinivasan, and Y.Zhou. C-Miner: Mining block correlations in storage systems. In Third USENIX Symposium on File and Storage Technologies(FAST’04), pages 173-186, April 2004.

[15] P. Xia, D. Feng, H. Jiang, L. Tian, F. Wang. FARMER: A novel approach to file access correlation mining and evalua-tion reference model for optimizing peta-scale file system performance. In the International Symposium on High Per-formance Distributed Computing(HPDC’08), June, 2008

[16] R.Arnan, E.Bachmat, T.K.Lam, and R.Michel. Dynamic data reallocation in disk arrays. ACM Transactions on Storage, 3(1), 2007
http://dx.doi.org/10.1145/1227835.1227837

[17] L.Cherkasova and G.Ciardo. Characterizing temporal locality and its impact on web server performance. Technical Report HPL-2000-82, Hewlett Packard Laboratories, Jul. 2000

[18] L. Cherkasova and M.Gupta. Analysis of enterprise media server workloads: access patterns, locality, content evolution, and rates of change. IEEE/ACM Transactions on networking, 12(5):781-794, Oct. 2004.
http://dx.doi.org/10.1109/TNET.2004.836125

[19] W.W.Hsu, A.J.Smith, and H.C.Yong, I/O reference behav-ior of Production Database Wrokloads and the TPC Bench-marks-an analysis at the logical level, ACM Trans. Database Syst. 26, No.1, 96-143 (March 2001)
http://dx.doi.org/10.1145/383734.383737

[20] Peng Gu, Yifeng Zhu, Hong Jiang, Jun Wang: Nexus: A novel weighed-graph-based prefetching algorithm for Metadata Servers in petabyte-scale storage systems IEEE International Sympoium on Cluster Computing and the Grid(CCGRID’06), page:409-416, 2006

[21] R.H.Patterson, G.A.Gibson, E.Ginting, D.Stodolsky, and J.Zelenka, Informed prefetching and caching. In High Per-formance Mass Storage and Parallel I/O: Technologies and Applications. New York, NY:IEEE Computer Society Press and Wiley, 2001, 16, p. 224-244.

[22] KALLAHALLA, M., AND VARMAN, P. J. PC-OPT: Optimal offline prefetching and caching for parallel I/O sys-tems. IEEE Trans. on Computers 51, 11 (Nov. 2002), 1333–1344.
http://dx.doi.org/10.1109/TC.2002.1047757

[23] R. Shah, P. J. Varman, and J. S. Vitter. Online algorithms for prefetching and caching on parallel disks. In Proceedings of the 16th Annual ACM Symposium on Parallelism in Al-gorithms and Architectures, pages 255–264, June 2004.

[24] M. Kallahalla and P. J. Varman. Optimal read-once parallel disk scheduling. In Proceedings of the Sixth ACM Workshop on I/O in Parallel and Distributed Systems, pages 68–77, Atlanta, GA, 1999. ACM Press, New York, 1999.

[25] D.P. Bovet and M. Cesati. Understanding the linux kernel. O’Reilly, 2005

[26] J.Skeppstedt and M.Dubois, Compiler controlled prefetching for multiprocessors using low-overhead traps and prefetch engines. J. Parallel Distrib. Comput, vol.60, no.5, pp. 585-615, 2000.
http://dx.doi.org/10.1006/jpdc.2000.1623


Full Text: PDF


Journal of Computers (JCP, ISSN 1796-203X)

Copyright @ 2006-2013 by ACADEMY PUBLISHER – All rights reserved.