9.5.1 Small File Performance Optimization
Li, in Big Data, 2016

HDFS is designed for the efficient storage of, and access to, massive big files. It cuts large user files into a number of data blocks (e.g., 64 MB). Metadata is stored in a metadata server, while the data blocks are stored in the data servers. When dealing with small files, the number of data blocks in the file system increases dramatically, which raises two issues. The first is a limit on the total number of files: the sharp increase in metadata means that the number of files and data blocks is bounded by the metadata server's memory. The second is a sharp decline in performance when the data servers process small files; traditional file systems likewise perform poorly on small files. Improving the ability of a distributed file system to handle a huge number of small files has therefore become an urgent problem. In recent years, Hadoop has provided three solutions for handling small files, namely archive technology, sequence file technology, and merging file technology.
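As a concrete illustration of the sequence file approach, the sketch below packs a directory of small HDFS files into a single SequenceFile, using each file's name as the key and its raw bytes as the value, so that many small files occupy one large HDFS file and far fewer metadata entries. This is a minimal sketch against the standard Hadoop Java API; the class name SmallFilePacker and the two command-line arguments (input directory, output file) are illustrative and not taken from the book.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFilePacker {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path inputDir = new Path(args[0]); // directory holding the small files
        Path seqFile  = new Path(args[1]); // single SequenceFile to write

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(seqFile),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {

            for (FileStatus status : fs.listStatus(inputDir)) {
                if (status.isDirectory()) {
                    continue; // only pack regular files
                }
                // Read the whole small file into memory (safe because files are small).
                byte[] content = new byte[(int) status.getLen()];
                try (FSDataInputStream in = fs.open(status.getPath())) {
                    in.readFully(content);
                }
                // Key: original file name; value: raw file contents.
                writer.append(new Text(status.getPath().getName()),
                              new BytesWritable(content));
            }
        }
    }
}

The archive approach, by contrast, needs no custom code: the built-in command "hadoop archive -archiveName small.har -p /user/data /user/archives" bundles the files under /user/data into a single HAR file while keeping them individually addressable.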