A native go client for HDFS
cmd/hdfs Tweak put tests 2018-03-16 12:59:08 -07:00
protocol Fix the merge from upstream (#9) 2018-03-17 13:22:31 -07:00
rpc Fix the merge from upstream (#9) 2018-03-17 13:22:31 -07:00
test Improve conf loading and surrounding tests 2018-03-16 12:59:08 -07:00
vendor/github.com Update dependencies 2018-03-16 12:59:08 -07:00
.gitignore divert minicluster output to a file 2015-02-25 12:23:52 +01:00
.travis.yml travis: attempt to fix hadoop tarball caching 2018-03-16 12:59:08 -07:00
Gopkg.lock Update dependencies 2018-03-16 12:59:08 -07:00
Gopkg.toml Convert from govendor to dep 2018-03-16 12:59:08 -07:00
LICENSE.txt add license (fixes #4) 2014-11-24 20:13:34 +01:00
Makefile Update installation instructions 2017-01-23 13:19:00 +01:00
README.md typo 2017-01-23 14:58:17 +01:00
SECURITY.md Microsoft mandatory file 2022-08-15 21:16:37 +00:00
client.go Deprecate NewForUser and NewForConnection 2018-03-16 12:59:08 -07:00
client_test.go Merge master branch to our branch (#6) 2017-10-26 11:44:59 -07:00
conf.go Improve conf loading and surrounding tests 2018-03-16 12:59:08 -07:00
conf_test.go Improve conf loading and surrounding tests 2018-03-16 12:59:08 -07:00
content_summary.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
content_summary_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
exceptions.go improve exception handling by switching on the exception class name 2014-10-13 12:05:22 +02:00
file_reader.go fix for #28: properly closing BlockReader on Seek() 2016-01-23 12:40:51 -08:00
file_reader_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
file_writer.go Fixed incorrect Append implementation. 2016-07-01 11:05:57 -06:00
file_writer_test.go Align to chunk boundaries when appending (fixes #61) 2017-02-25 20:03:39 +01:00
hdfs.go log doesn't make sense for examples 2014-10-14 16:15:18 +02:00
mkdir.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
mkdir_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
perms.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
perms_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
readdir.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
readdir_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
remove.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
remove_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
rename.go Force overwrite in rename and provide option support in mv 2016-10-26 10:15:58 -07:00
rename_test.go Force overwrite in rename and provide option support in mv 2016-10-26 10:15:58 -07:00
setup_test_env.sh set up releases 2017-01-23 12:52:27 +01:00
stat.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00
stat_fs.go Add StatFs and the df [-h] command 2017-03-14 00:47:54 +01:00
stat_fs_test.go Add StatFs and the df [-h] command 2017-03-14 00:47:54 +01:00
stat_test.go all: run goimports -w -l . 2015-11-23 16:25:37 -08:00

README.md

HDFS for Go


This is a native Go client for HDFS. It connects directly to the namenode using the protocol buffers API.

It tries to be idiomatic by aping the stdlib os package where possible, and it reuses that package's types, including os.FileInfo and os.PathError.

Here's what it looks like in action:

client, _ := hdfs.New("namenode:8020")

file, _ := client.Open("/mobydick.txt")

buf := make([]byte, 59)
file.ReadAt(buf, 48847)

fmt.Println(string(buf))
// => Abominable are the tumblers into which he pours his poison.

For complete documentation, check out the Godoc.

The hdfs Binary

Along with the library, this repo contains a command-line client for HDFS. Like the library, its primary aim is to be idiomatic, by enabling your favorite Unix verbs:

$ hdfs --help
Usage: hdfs COMMAND
The flags available are a subset of the POSIX ones, but should behave similarly.

Valid commands:
  ls [-lah] [FILE]...
  rm [-rf] FILE...
  mv [-fT] SOURCE... DEST
  mkdir [-p] FILE...
  touch [-amc] FILE...
  chmod [-R] OCTAL-MODE FILE...
  chown [-R] OWNER[:GROUP] FILE...
  cat SOURCE...
  head [-n LINES | -c BYTES] SOURCE...
  tail [-n LINES | -c BYTES] SOURCE...
  du [-sh] FILE...
  checksum FILE...
  get SOURCE [DEST]
  getmerge SOURCE DEST
  put SOURCE DEST

Since it doesn't have to wait for the JVM to start up, it's also a lot faster than hadoop fs:

$ time hadoop fs -ls / > /dev/null

real  0m2.218s
user  0m2.500s
sys 0m0.376s

$ time hdfs ls / > /dev/null

real  0m0.015s
user  0m0.004s
sys 0m0.004s

Best of all, it comes with bash tab completion for paths!

Installing the library

To install the library, once you have Go all set up:

$ go get -u github.com/colinmarc/hdfs

Installing the command-line client

Grab a tarball from the releases page and extract it wherever you like.

To tell the client which namenode to connect to, add the following line to your .bashrc or .profile:

export HADOOP_NAMENODE="namenode:8020"

To install tab completion globally on Linux, copy or link the bash_completion file that comes with the tarball into the right place:

ln -sT "$PWD/bash_completion" /etc/bash_completion.d/gohdfs

By default, the HDFS user is set to the currently-logged-in user. You can override this in your .bashrc or .profile:

export HADOOP_USER_NAME=username
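
The fallback described above can be sketched in a few lines of Go. This is an illustration of the behavior only, not the binary's actual code: `resolveUser` is a hypothetical helper that prefers an explicit override and otherwise asks the operating system for the current user.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
)

// resolveUser is a hypothetical helper: it returns override when it is
// non-empty, and otherwise the name of the currently logged-in user.
func resolveUser(override string) (string, error) {
	if override != "" {
		return override, nil
	}
	u, err := user.Current()
	if err != nil {
		return "", fmt.Errorf("look up current user: %w", err)
	}
	return u.Username, nil
}

func main() {
	// HADOOP_USER_NAME is the variable named in the instructions above.
	name, err := resolveUser(os.Getenv("HADOOP_USER_NAME"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("HDFS user:", name)
}
```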

Compatibility

This library uses "Version 9" of the HDFS protocol, which means it should work with Hadoop distributions based on 2.2.x and above. The tests run against CDH 5.x and HDP 2.x.

Acknowledgements

This library is heavily indebted to snakebite.