1. Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Communications of the ACM, 2008, 51(1): 107-113
2. Deutsch L P. DEFLATE compressed data format specification version 1.3. IETF RFC 1951. 1996
3. Horspool R N. Improving LZW [data compression algorithm]. Proceedings of the Data Compression Conference (DCC’91), Apr 8-11, 1991, Snowbird, UT, USA. Piscataway, NJ, USA: IEEE, 1991: 332-341
4. Ziv J, Lempel A. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 1977, 23(3): 337-343
5. Jas A, Ghosh-Dastidar J, Ng M E, et al. An efficient test vector compression scheme using selective Huffman coding. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2003, 22(6): 797-806
6. Poess M, Potapov D. Data compression in Oracle. Proceedings of the 29th International Conference on Very Large Data Bases (VLDB’03), Sep 12-13, 2003, Berlin, Germany. San Francisco, CA, USA: Morgan Kaufmann Publishers, 2003: 937-947
7. Fraser C W. An instruction for direct interpretation of LZ77-compressed programs. MSR-TR-2002-90. Redmond, WA, USA: Microsoft Research, 2002
8. Marcelloni F, Vecchio M. A simple algorithm for data compression in wireless sensor networks. IEEE Communications Letters, 2008, 12(6): 411-413
9. He Y, Lee R, Huai Y, et al. RCFile: a fast and space-efficient data placement structure in MapReduce-based warehouse systems. Proceedings of the 27th International Conference on Data Engineering (ICDE’11), Apr 11-16, 2011, Hannover, Germany. Piscataway, NJ, USA: IEEE, 2011: 1199-1208
10. Hinds S C, Fisher J L, D’Amato D P. A document skew detection method using run-length encoding and the Hough transform. Proceedings of the 10th International Conference on Pattern Recognition: Vol 1, Jun 16-21, 1990, Atlantic City, NJ, USA. Piscataway, NJ, USA: IEEE, 1990: 464-468
11. Deutsch L P. GZIP file format specification version 4.3. IETF RFC 1952. 1996
12. Thiagarajan A, Madden S. Querying continuous functions in a database system. Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD’08), Jun 9-12, 2008, Vancouver, Canada. New York, NY, USA: ACM, 2008: 791-804
13. Gandhi S, Nath S, Sur S, et al. GAMPS: compressing multi sensor data by grouping and amplitude scaling. Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD’09), Jun 29-Jul 2, 2009, Providence, Rhode Island, USA. New York, NY, USA: ACM, 2009: 771-784
14. Agrawal R, Faloutsos C, Swami A. Efficient similarity search in sequence databases. Berlin, Germany: Springer, 1993
15. Urbani J, Maassen J, Bal H. Massive semantic Web data compression with MapReduce. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC’10), Jun 21-25, 2010, Chicago, IL, USA. New York, NY, USA: ACM, 2010: 795-802
16. Ekanayake J, Pallickara S, Fox G. MapReduce for data intensive scientific analyses. Proceedings of the 4th International Conference on eScience (eScience’08), Dec 7-12, 2008, Indianapolis, IN, USA. Piscataway, NJ, USA: IEEE, 2008: 277-284
17. Liu X, Thomsen C, Pedersen T B. ETLMR: a highly scalable dimensional ETL framework based on MapReduce. Proceedings of the 13th International Conference on Data Warehousing and Knowledge Discovery (DAWAK’11), Aug 29-Sep 2, 2011, Toulouse, France. LNCS 6862. Berlin, Germany: Springer-Verlag, 2011: 96-111
18. Mackey G, Sehrish S, Wang J. Improving metadata management for small files in HDFS. Proceedings of the IEEE International Conference on Cluster Computing and Workshops (CLUSTER’09), Aug 31-Sep 4, 2009, New Orleans, LA, USA. Piscataway, NJ, USA: IEEE, 2009: 4p