Hadoop File Compression
Hadoop作为一个较通用的海量数据处理平台,在使用压缩方式方面,主要考虑压缩速度和压缩文件的可分割性。
为了支持多种压缩解压缩算法,Hadoop引入了编码/解码器。与Hadoop序列化框架类似,编码/解码器也是使用抽象工厂的设计模式。
- 运行和调试MapReduce程序只需要有相应的Hadoop依赖包就行,可以完全当成一个普通的JAVA程序。
File Upload
Using Http PUT method for file upload :
The PUT method not as widely used as the POST method is the_ more efficient way of uploading files to a serve_r. This is because in a POST upload the files need to be combined together into a multipart message and this message has to be decoded at the server. In contrast, the PUT method allows you to simply write the contents of the file to the socket connection that is established with the server.When using the POST method, all the files are combined together into a single multipart/form-data type object. This MIME message when transferred to the server, has to be decoded by the server side handler. The decoding process may consume significant amounts of memory and CPU cycles for very large files.
The disadvantage with the PUT method is that if you are on a shared hosting enviorenment it may not be available to you. we need to enable the PUT verb for the server extension
ff