mirror of https://github.com/microsoft/spark.git
[SPARK-4672][GraphX] Non-transient PartitionsRDDs will lead to StackOverflow error
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672

In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and VertexRDDImpl is non-transient, the serialization chain can grow very long in iterative algorithms and eventually cause a StackOverflowError. More details and explanation can be found in the JIRA.

Author: JerryLead <JerryLead@163.com>
Author: Lijie Xu <csxulijie@gmail.com>

Closes #3544 from JerryLead/my_graphX and squashes the following commits:

628f33c [JerryLead] set PartitionsRDD to be transient in EdgeRDDImpl and VertexRDDImpl
c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark
52799e3 [Lijie Xu] Merge pull request #1 from apache/master
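The mechanism described above can be illustrated with a minimal sketch. It assumes plain Java serialization and uses two hypothetical classes, `EagerNode` and `TransientNode`, that merely stand in for an RDD wrapper holding a reference to its parent; they are not GraphX classes. The sketch compares serialized sizes instead of provoking the overflow itself, since the depth at which the stack overflows depends on JVM settings.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical stand-ins for an RDD wrapper that references its parent.
// EagerNode serializes its parent; TransientNode skips it.
class EagerNode(val parent: EagerNode) extends Serializable
class TransientNode(@transient val parent: TransientNode) extends Serializable

object TransientDemo {
  // Serialize an object with Java serialization and return the byte count.
  def serializedSize(obj: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.size()
  }

  // Build a chain of n nodes, each pointing at the previous one,
  // roughly like n iterations of an iterative graph algorithm.
  def eagerChain(n: Int): EagerNode =
    (1 to n).foldLeft[EagerNode](null)((parent, _) => new EagerNode(parent))

  def transientChain(n: Int): TransientNode =
    (1 to n).foldLeft[TransientNode](null)((parent, _) => new TransientNode(parent))

  def main(args: Array[String]): Unit = {
    // The non-transient chain is written recursively, so its serialized
    // size (and the recursion depth needed to write it) grows with n.
    println(s"eager, 1 node:       ${serializedSize(eagerChain(1))} bytes")
    println(s"eager, 100 nodes:    ${serializedSize(eagerChain(100))} bytes")
    // The transient chain stops at the first node regardless of n.
    println(s"transient, 1 node:    ${serializedSize(transientChain(1))} bytes")
    println(s"transient, 100 nodes: ${serializedSize(transientChain(100))} bytes")
  }
}
```

Because Java serialization walks non-transient references recursively, a sufficiently long `EagerNode` chain would exhaust the thread stack, which is the failure this commit avoids by marking `partitionsRDD` as `@transient`.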
Parent fc0a1475ef
Commit 17c162f668
@@ -26,7 +26,7 @@ import org.apache.spark.storage.StorageLevel
 import org.apache.spark.graphx._
 
 class EdgeRDDImpl[ED: ClassTag, VD: ClassTag] private[graphx] (
-    override val partitionsRDD: RDD[(PartitionID, EdgePartition[ED, VD])],
+    @transient override val partitionsRDD: RDD[(PartitionID, EdgePartition[ED, VD])],
     val targetStorageLevel: StorageLevel = StorageLevel.MEMORY_ONLY)
   extends EdgeRDD[ED](partitionsRDD.context, List(new OneToOneDependency(partitionsRDD))) {
 
@@ -27,7 +27,7 @@ import org.apache.spark.storage.StorageLevel
 import org.apache.spark.graphx._
 
 class VertexRDDImpl[VD] private[graphx] (
-    val partitionsRDD: RDD[ShippableVertexPartition[VD]],
+    @transient val partitionsRDD: RDD[ShippableVertexPartition[VD]],
     val targetStorageLevel: StorageLevel = StorageLevel.MEMORY_ONLY)
     (implicit override protected val vdTag: ClassTag[VD])
   extends VertexRDD[VD](partitionsRDD.context, List(new OneToOneDependency(partitionsRDD))) {