Commit Graph

1124 Commits

Author SHA1 Message Date
Matei Zaharia b9fb8d6463 Include date in folder name for Spark local dir. 2012-10-01 15:55:16 -07:00
Matei Zaharia c06b0c7537 Merge pull request #235 from pwendell/publish-local-maven
publish-local should go to maven + ivy by default
2012-10-01 15:49:16 -07:00
Patrick Wendell 6fee76d6d5 publish-local should go to maven + ivy by default 2012-10-01 15:34:47 -07:00
Matei Zaharia bc881e4798 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-01 15:21:56 -07:00
Matei Zaharia 802aa8aef9 Some bug fixes and logging fixes for broadcast. 2012-10-01 15:20:42 -07:00
Matei Zaharia c1db5a849b Ignore file spark-tests.log in git 2012-10-01 15:08:20 -07:00
Matei Zaharia 74a9244255 Write all unit test output to a file 2012-10-01 15:07:42 -07:00
Matei Zaharia 8981804c71 Merge pull request #233 from rxin/dev
Fixed #232: DirectBuffer's cleaner was empty and Spark tried to invoke clean on it.
2012-10-01 14:30:39 -07:00
Reynold Xin f264153162 Fixed #232: DirectBuffer's cleaner was empty and Spark tried to invoke
clean on it.
2012-10-01 14:07:34 -07:00
Matei Zaharia 3b348f909d Improve log messages from BlockManager 2012-10-01 12:01:38 -07:00
Matei Zaharia 22b6b16e56 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-01 10:57:32 -07:00
Matei Zaharia 0b84871dbc Remove some printlns in tests 2012-10-01 10:57:26 -07:00
Matei Zaharia 53f90d0f0e Use underscores instead of colons in RDD IDs 2012-10-01 10:48:53 -07:00
Matei Zaharia 39363b5185 Merge pull request #231 from rxin/dev
Added a new command "pl" in sbt to publish to both Maven and Ivy.
2012-10-01 10:45:03 -07:00
Reynold Xin 5783236ae6 Added a new command "pl" in sbt to publish to both Maven and Ivy. 2012-10-01 00:17:13 -07:00
Matei Zaharia 2314132d57 Added a (failing) test for LRU with MEMORY_AND_DISK. 2012-09-30 22:52:16 -07:00
Matei Zaharia 3128c57f90 Simplified Class / ClassLoader test 2012-09-30 21:48:27 -07:00
Matei Zaharia 83143f9a5f Fixed several bugs that caused weird behavior with files in spark-shell:
- SizeEstimator was following through a ClassLoader field of Hadoop
  JobConfs, which referenced the whole interpreter, Scala compiler, etc.
  Chaos ensued, giving an estimated size in the tens of gigabytes.
- Broadcast variables in local mode were only stored as MEMORY_ONLY and
  never made accessible over a server, so they fell out of the cache when
  they were deemed too large and couldn't be reloaded.
2012-09-30 21:19:39 -07:00
Matei Zaharia fd0374b9de Comment 2012-09-29 21:43:06 -07:00
Matei Zaharia 5718cef2a4 Removed Logging trait from CoalescedRDD since we don't log anything 2012-09-29 21:40:43 -07:00
Matei Zaharia 4a74e8635c Merge pull request #228 from rxin/dev
Added mapPartitionsWithSplit to the programming guide.
2012-09-29 21:33:38 -07:00
Matei Zaharia 143ef4f90d Added a CoalescedRDD class for reducing the number of partitions in an RDD. 2012-09-29 21:30:52 -07:00
Matei Zaharia c45758ddde Comment 2012-09-29 20:27:54 -07:00
Matei Zaharia ebd52347b5 Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-29 20:22:31 -07:00
Matei Zaharia 9b326d01e9 Made BlockManager unmap memory-mapped files when necessary to reduce the
number of open files. Also optimized sending of disk-based blocks.
2012-09-29 20:21:54 -07:00
Reynold Xin f5812d0354 Added mapPartitionsWithSplit to the programming guide. 2012-09-29 01:31:36 -07:00
Matei Zaharia 2f11e3c285 Merge pull request #227 from JoshRosen/fix/distinct_numsplits
Allow controlling number of splits in distinct().
2012-09-28 23:57:24 -07:00
Josh Rosen 8654165e69 Use null as dummy value in distinct(). 2012-09-28 23:55:17 -07:00
Josh Rosen 37c199bbb0 Allow controlling number of splits in distinct(). 2012-09-28 23:44:19 -07:00
Matei Zaharia 56dcad5936 Don't create a Cache in SparkEnv because we don't use it 2012-09-28 23:40:56 -07:00
Matei Zaharia 1d44644f4f Logging tweaks 2012-09-28 23:28:16 -07:00
Matei Zaharia 815d6bd69a Renamed subdirs option 2012-09-28 19:02:41 -07:00
Matei Zaharia e54e1d7043 Made subdirs per local dir configurable, and reduced lock usage a bit 2012-09-28 19:00:50 -07:00
Matei Zaharia ae8c7d6cfa Made disk store use multiple directories, deleted ShuffleManager 2012-09-28 18:28:13 -07:00
Matei Zaharia 3d7267999d Print and track user call sites in more places in Spark 2012-09-28 17:42:00 -07:00
Matei Zaharia 9f6efbf06a Merge pull request #225 from pwendell/dev
Log message which records RDD origin
2012-09-28 16:28:07 -07:00
Matei Zaharia 0121a26bd1 Changed the way tasks' dependency files are sent to workers so that
custom serializers or Kryo registrators can be loaded.
2012-09-28 16:14:05 -07:00
Patrick Wendell 9fc78f8f29 Fixing some whitespace issues 2012-09-28 16:05:50 -07:00
Patrick Wendell bc909c2903 Changes based on Matei's comments 2012-09-28 16:04:36 -07:00
Patrick Wendell c387e40fb1 Log message which records RDD origin
This adds tracking to determine the "origin" of an RDD. Origin is defined by
the boundary between the user's code and the spark code, during an RDD's
instantiation. It is meant to help users understand where a Spark RDD is
coming from in their code.

This patch also logs origin data when stages are submitted to the scheduler.

Finally, it adds a new log message to fix an inconsitency in the way that
dependent stages (those missing parents) and independent stages (those
without) are logged during submission.
2012-09-28 15:51:46 -07:00
Matei Zaharia 2a8bfbca00 Fixed a bug where isLocal was set to false when using local[K] 2012-09-28 14:50:54 -07:00
Matei Zaharia 4a138403ef Fix a bug in JAR fetcher that made it always fetch the JAR 2012-09-27 21:32:06 -07:00
Matei Zaharia 009b0e37e7 Added an option to compress blocks in the block store 2012-09-27 18:45:44 -07:00
Matei Zaharia 7bcb08cef5 Renamed storage levels to something cleaner; fixes #223. 2012-09-27 17:50:59 -07:00
Matei Zaharia 0850d641af Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-26 23:53:48 -07:00
Matei Zaharia bf18e0994e Minor typos 2012-09-26 23:53:38 -07:00
Matei Zaharia a4093f7563 Minor doc fixes 2012-09-26 23:22:15 -07:00
Matei Zaharia 920fab23c3 Merge pull request #222 from rxin/dev
Added MapPartitionsWithSplitRDD.
2012-09-26 23:16:45 -07:00
Matei Zaharia ea05fc130b Updates to standalone cluster, web UI and deploy docs. 2012-09-26 22:54:39 -07:00
Matei Zaharia 1ef4f0fbd2 Allow controlling number of splits in sortByKey. 2012-09-26 19:18:47 -07:00