project proposal - hdfs

Upload: ravelstein

Post on 08-Jan-2016


DESCRIPTION

Understanding HDFS


1 Introduction

Hadoop is an indispensable tool for Big Data computing. Much of its effectiveness rests on its distributed file system, HDFS. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications that have large data sets.

2 Requirements

This project will walk through the open-source Hadoop source code to understand and illustrate the HDFS operations listed below. For each operation, the walkthrough will list the functions and libraries called on both the client and the server when the operation occurs.

1) Open
2) Read
3) Seek
4) Write
5) Security of files for operations 1-4

References

1. HDFS source code: http://hadoop.apache.org/hdfs/version_control.html
2. HDFS Java API: http://hadoop.apache.org/core/docs/current/api/
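
Appendix: client entry points for operations 1-5

A possible starting point for the walkthrough is a small client program that exercises each listed operation once through the Hadoop Java API, so that every call can then be traced into the client and server code. The sketch below is illustrative only and is not part of the proposal itself. The class name HdfsOpsSketch, the helper roundTrip, the file path, and the permission mode 0640 are hypothetical. It assumes the Hadoop client libraries are on the classpath. With a default Configuration and no cluster settings, FileSystem.get returns the local file system, so the same calls can be tried without a running HDFS cluster (on a Unix-like system; setPermission may need extra native tooling on Windows).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Hypothetical driver: one call per operation listed under Requirements.
final class HdfsOpsSketch {

    // Writes a short file, restricts its permissions, then opens it,
    // seeks to byte offset 6, and reads 4 bytes back as a string.
    static String roundTrip(FileSystem fs, Path path) throws IOException {
        // 4) Write: create (overwriting if present) and write some bytes.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // 5) Security: set POSIX-style permission bits (0640 is an arbitrary choice).
        fs.setPermission(path, new FsPermission((short) 0640));

        // 1) Open: obtain a seekable input stream for the file.
        try (FSDataInputStream in = fs.open(path)) {
            // 3) Seek: move the read position to byte offset 6.
            in.seek(6);
            // 2) Read: fill a 4-byte buffer from the current position.
            byte[] buf = new byte[4];
            in.readFully(buf);
            return new String(buf, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Picks up fs.defaultFS from the cluster configuration if one is on the
        // classpath; otherwise falls back to the local file system.
        FileSystem fs = FileSystem.get(new Configuration());
        Path path = new Path(args.length > 0 ? args[0] : "/tmp/hdfs-ops-sketch.txt");
        System.out.println(roundTrip(fs, path));
    }
}
```

Setting a breakpoint on each of create, setPermission, open, seek, and readFully would give one entry point per operation from which to follow the call chain on the client side. The server-side path would still have to be identified from the source code.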