Add ability to compile against Cloudera or Apache Hadoop.

Added more thorough compilation instructions.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149879 13f79535-47bb-0310-9956-ffa450edef68
Committed by Andrew Bayer on 2011-07-22 20:03:41 +00:00
Parent 0fde7dff8d
Commit 22190b9ba3
5 changed files with 212 additions and 17 deletions

COMPILING.txt (new file)

@@ -0,0 +1,130 @@
= Compiling
This document explains how to compile Sqoop.
== Build Dependencies
Compiling Sqoop requires the following tools:
* Apache ant (1.7.1)
* Java JDK 1.6
Additionally, building the documentation requires these tools:
* asciidoc
* make
* python 2.5+
* xmlto
* tar
* gzip
Furthermore, Sqoop's build can be instrumented with the following:
* findbugs (1.3.9) for code quality checks
* cobertura (1.9.4.1) for code coverage
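A quick way to verify that the core build tools are available on your path
(a minimal sketch; exact version output varies by installation):
++++
ant -version
java -version
asciidoc --version
++++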
== The Basics
Sqoop is compiled with ant. Type +ant -p+ to see the list of available targets.
Type +ant jar+ to compile java sources into jar files. Type +ant package+ to
produce a fully self-hosted build. This will appear in the
+build/sqoop-(version)/+ directory.
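For example, a typical build session using the targets described above:
++++
ant -p        # list available build targets
ant jar       # compile java sources into jar files
ant package   # produce a self-hosted build under build/sqoop-(version)/
++++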
== Testing Sqoop
Sqoop has several unit tests which can be run with +ant test+. This command
will run all the "basic" checks against an in-memory database, HSQLDB.
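For example (the +testcase+ property for running a single test class is an
assumption common to Hadoop-style ant builds, and the class name below is
purely illustrative; check +ant -p+ or build.xml if it is not recognized):
++++
ant test
ant test -Dtestcase=TestSqoopOptions
++++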
Sqoop also has compatibility tests that check its ability to work with
several third-party databases. To enable these tests, you will need to install
and configure the databases, and download the JDBC drivers for each one.
=== MySQL
Install MySQL server and client 5.0. Download MySQL Connector/J 5.0.8 for
JDBC. Instructions for configuring the MySQL database are in MySQLAuthTest
and DirectMySQLTest.
=== Oracle
Install Oracle XE (Express edition) 10.2.0. Instructions for configuring the
database are in OracleManagerTest. Download the ojdbc6_g jar.
=== PostgreSQL
Install PostgreSQL 8.3.9. Download the postgresql 8.4 jdbc driver. Instructions
for configuring the database are in PostgresqlTest.
=== Running the Third-party Tests
After the third-party databases are installed and configured, run:
++++
ant test -Dthirdparty=true -Dsqoop.thirdparty.lib.dir=/path/to/jdbc/drivers/
++++
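For example, the driver jars downloaded above might be collected into a single
directory before running the command (jar file names are illustrative and
depend on the versions you download):
++++
mkdir -p /path/to/jdbc/drivers
cp mysql-connector-java-5.0.8-bin.jar ojdbc6_g.jar \
   postgresql-8.4-701.jdbc4.jar /path/to/jdbc/drivers/
++++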
== Multiple Hadoop Distributions
Sqoop can be compiled against different versions of Hadoop. Both the svn
trunk of Apache Hadoop and Cloudera's Distribution for Hadoop (CDH3)
can be used as the underlying Hadoop implementation.
By default, Sqoop will compile against the latest snapshot from Apache
(retrieved through maven). You can specify the Hadoop distribution to
retrieve with the hadoop.dist property. Valid values are "apache" or
"cloudera":
++++
ant jar -Dhadoop.dist=apache
ant jar -Dhadoop.dist=cloudera
++++
To switch between builds, you will need to clear Ivy's dependency
cache: +ant veryclean+
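For example, to switch from the default Apache build to a CDH-based build:
++++
ant veryclean
ant jar -Dhadoop.dist=cloudera
++++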
== Code Quality Analysis
We have two tools which can be used to analyze Sqoop's code quality.
=== Findbugs
Findbugs detects common errors in programming. New patches should not
trigger additional warnings in Findbugs.
Install findbugs (1.3.9) according to its instructions. To use it,
run:
++++
ant findbugs -Dfindbugs.home=/path/to/findbugs/
++++
A report will be generated in +build/findbugs/+.
=== Cobertura
Cobertura runs code coverage checks. It instruments the build and
checks that each line and conditional expression is evaluated along
all possible paths.
Install Cobertura according to its instructions. Then run a test with:
++++
ant clean
ant cobertura -Dcobertura.home=/path/to/cobertura
ant cobertura -Dcobertura.home=/path/to/cobertura \
-Dthirdparty=true -Dsqoop.thirdparty.lib.dir=/path/to/thirdparty
++++
(You'll need to run the cobertura target twice; once against the regular
test targets, and once against the thirdparty targets.)
When complete, the report will be placed in +build/cobertura/+.
New patches should come with sufficient tests for their functionality
as well as their error recovery code paths. Cobertura can help assess
whether your tests are thorough enough, or where gaps lie.


@@ -37,6 +37,7 @@ provided in the +build/sqoop-(version)/+ directory.
You can build just the jar by running +ant jar+.
See the COMPILING.txt document for more information.
== This is also an Asciidoc file!


@@ -48,6 +48,10 @@
<property name="javac.debug" value="on"/>
<property name="build.encoding" value="ISO-8859-1"/>
<!-- controlling the Hadoop source -->
<!-- valid values for ${hadoop.dist} are 'apache' and 'cloudera' -->
<property name="hadoop.dist" value="apache" />
<!-- testing with JUnit -->
<property name="test.junit.output.format" value="plain"/>
<property name="test.output" value="no"/>
@@ -107,11 +111,13 @@
<pathelement location="${build.classes}"/>
<path refid="lib.path"/>
<path refid="${name}.common.classpath"/>
<path refid="${name}.hadoop.classpath"/>
</path>
<!-- Classpath for unit tests (superset of compile.classpath) -->
<path id="test.classpath">
<pathelement location="${build.test.classes}" />
<path refid="${name}.hadooptest.classpath"/>
<path refid="${name}.test.classpath"/>
<path refid="compile.classpath"/>
</path>
@@ -126,7 +132,8 @@
<target name="init" />
<!-- Compile core classes for the project -->
<target name="compile" depends="init, ivy-retrieve-common"
<target name="compile"
depends="init, ivy-retrieve-common, ivy-retrieve-hadoop"
description="Compile core classes for the project">
<!-- don't use an out-of-date instrumented build. -->
<delete dir="${cobertura.class.dir}" />
@@ -143,7 +150,8 @@
</javac>
</target>
<target name="compile-test" depends="compile, ivy-retrieve-test"
<target name="compile-test"
depends="compile, ivy-retrieve-test, ivy-retrieve-hadoop-test"
description="Compile test classes">
<mkdir dir="${build.test.classes}" />
<javac
@@ -362,6 +370,13 @@
<delete dir="${build.dir}"/>
</target>
<target name="veryclean"
depends="clean"
description="Clean build and remove cached dependencies">
<delete dir="${user.home}/.ivy2/cache/org.apache.hadoop" />
<delete file="${ivy.jar}" />
</target>
<target name="findbugs" depends="check-for-findbugs,jar,compile-test"
if="findbugs.present" description="Run FindBugs">
<taskdef name="findbugs" classname="edu.umd.cs.findbugs.anttask.FindBugsTask"
@@ -467,37 +482,59 @@
<property name="ivy.configured" value="true" />
</target>
<!-- retrieve ivy-managed artifacts for the compile configuration -->
<target name="ivy-resolve-common" depends="ivy-init">
<ivy:resolve settingsRef="${name}.ivy.settings" conf="common" />
</target>
<!-- retrieve ivy-managed artifacts for the compile configuration -->
<target name="ivy-retrieve-common" depends="ivy-resolve-common">
<ivy:retrieve settingsRef="${name}.ivy.settings"
pattern="${build.ivy.lib.dir}/${ivy.artifact.retrieve.pattern}" sync="true" />
<ivy:cachepath pathid="${name}.common.classpath" conf="common" />
</target>
<!-- retrieve ivy-managed artifacts for the test configuration -->
<target name="ivy-resolve-test" depends="ivy-init">
<ivy:resolve settingsRef="${name}.ivy.settings" conf="test" />
</target>
<!-- retrieve ivy-managed artifacts for the test configuration -->
<target name="ivy-retrieve-test" depends="ivy-resolve-test">
<ivy:retrieve settingsRef="${name}.ivy.settings"
pattern="${build.ivy.lib.dir}/${ivy.artifact.retrieve.pattern}" sync="true" />
<ivy:cachepath pathid="${name}.test.classpath" conf="test" />
</target>
<!-- retrieve ivy-managed artifacts for the redist configuration -->
<target name="ivy-resolve-redist" depends="ivy-init">
<ivy:resolve settingsRef="${name}.ivy.settings" conf="redist" />
</target>
<!-- retrieve ivy-managed artifacts for the redist configuration -->
<target name="ivy-retrieve-redist" depends="ivy-resolve-redist">
<ivy:retrieve settingsRef="${name}.ivy.settings"
pattern="${build.ivy.lib.dir}/${ivy.artifact.retrieve.pattern}" sync="true" />
<ivy:cachepath pathid="${name}.redist.classpath" conf="redist" />
</target>
<!-- retrieve ivy-managed artifacts from the Hadoop distribution -->
<target name="ivy-resolve-hadoop" depends="ivy-init">
<ivy:resolve settingsRef="${name}.ivy.settings" conf="${hadoop.dist}" />
</target>
<target name="ivy-retrieve-hadoop" depends="ivy-resolve-hadoop">
<ivy:retrieve settingsRef="${name}.ivy.settings"
pattern="${build.ivy.lib.dir}/${ivy.artifact.retrieve.pattern}" sync="true" />
<ivy:cachepath pathid="${name}.hadoop.classpath" conf="${hadoop.dist}" />
</target>
<!-- retrieve ivy-managed test artifacts from the Hadoop distribution -->
<target name="ivy-resolve-hadoop-test" depends="ivy-init">
<ivy:resolve settingsRef="${name}.ivy.settings" conf="${hadoop.dist}test" />
</target>
<target name="ivy-retrieve-hadoop-test" depends="ivy-resolve-hadoop-test">
<ivy:retrieve settingsRef="${name}.ivy.settings"
pattern="${build.ivy.lib.dir}/${ivy.artifact.retrieve.pattern}"
sync="true" />
<ivy:cachepath pathid="${name}.hadooptest.classpath"
conf="${hadoop.dist}test" />
</target>
</project>

ivy.xml

@@ -32,7 +32,20 @@
<conf name="common" visibility="private"
extends="runtime"
description="artifacts needed to compile/test the application"/>
<conf name="apache" visibility="private"
extends="runtime"
description="artifacts from Apache for compile/test" />
<conf name="cloudera" visibility="private"
extends="runtime"
description="artifacts from Cloudera for compile/test" />
<conf name="test" visibility="private" extends="runtime"/>
<conf name="apachetest" visibility="private"
extends="test"
description="artifacts from Apache for testing" />
<conf name="clouderatest" visibility="private"
extends="test"
description="artifacts from Cloudera for testing" />
<!-- We don't redistribute everything we depend on (e.g., Hadoop itself);
anything which Hadoop itself also depends on, we do not ship.
@@ -46,18 +59,27 @@
<artifact conf="master"/>
</publications>
<dependencies>
<!-- Dependencies for Apache Hadoop -->
<dependency org="org.apache.hadoop" name="hadoop-core"
rev="${hadoop-core.version}" conf="common->default"/>
rev="${hadoop-core.apache.version}" conf="apache->default"/>
<dependency org="org.apache.hadoop" name="hadoop-core-test"
rev="${hadoop-core.version}" conf="common->default"/>
rev="${hadoop-core.apache.version}" conf="apachetest->default"/>
<dependency org="org.apache.hadoop" name="hadoop-hdfs"
rev="${hadoop-hdfs.version}" conf="common->default"/>
rev="${hadoop-hdfs.apache.version}" conf="apache->default"/>
<dependency org="org.apache.hadoop" name="hadoop-hdfs-test"
rev="${hadoop-hdfs.version}" conf="test->default"/>
rev="${hadoop-hdfs.apache.version}" conf="apachetest->default"/>
<dependency org="org.apache.hadoop" name="hadoop-mapred"
rev="${hadoop-mapred.version}" conf="common->default"/>
rev="${hadoop-mapred.apache.version}" conf="apache->default"/>
<dependency org="org.apache.hadoop" name="hadoop-mapred-test"
rev="${hadoop-mapred.version}" conf="test->default"/>
rev="${hadoop-mapred.apache.version}" conf="apachetest->default"/>
<!-- Dependencies for Cloudera's Distribution for Hadoop -->
<dependency org="org.apache.hadoop" name="hadoop-core"
rev="${hadoop-core.cloudera.version}" conf="cloudera->default"/>
<dependency org="org.apache.hadoop" name="hadoop-core-test"
rev="${hadoop-core.cloudera.version}" conf="clouderatest->default"/>
<!-- Common dependencies for Sqoop -->
<dependency org="commons-logging" name="commons-logging"
rev="${commons-logging.version}" conf="common->default"/>
<dependency org="log4j" name="log4j" rev="${log4j.version}"


@@ -19,9 +19,14 @@
commons-io.version=1.4
commons-logging.version=1.0.4
hadoop-core.version=0.22.0-SNAPSHOT
hadoop-hdfs.version=0.22.0-SNAPSHOT
hadoop-mapred.version=0.22.0-SNAPSHOT
# Apache Hadoop dependency version: use trunk.
hadoop-core.apache.version=0.22.0-SNAPSHOT
hadoop-hdfs.apache.version=0.22.0-SNAPSHOT
hadoop-mapred.apache.version=0.22.0-SNAPSHOT
# Cloudera Distribution dependency version
hadoop-core.cloudera.version=0.20.2-CDH3b2-SNAPSHOT
hsqldb.version=1.8.0.10
ivy.version=2.0.0-rc2