
While debugging a PySpark job on macOS, I ran into the following error:

To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/09/27 10:36:44 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
objc[7304]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[7304]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
23/09/27 10:36:47 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 4)
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:583)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:99)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:75)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:514)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
    at org.apache.spark.scheduler.Task.run(Task.scala:139)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.EOFException
    at java.base/java.io.DataInputStream.readInt(DataInputStream.java:398)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:83)
    ... 24 more
23/09/27 10:36:47 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 4) (192.168.0.171 executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:583)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:99)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:75)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:514)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
    at org.apache.spark.scheduler.Task.run(Task.scala:139)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.EOFException
    at java.base/java.io.DataInputStream.readInt(DataInputStream.java:398)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:83)
    ... 24 more
23/09/27 10:36:47 ERROR TaskSetManager: Task 0 in stage 5.0 failed 1 times; aborting job
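
For reference, the crash surfaced while running a Python UDF, which makes Spark fork Python worker processes on the executor. The snippet below is a hypothetical minimal sketch of the kind of job that can trigger it on macOS (the app, column, and function names are illustrative, not the actual code):

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.master("local[*]").appName("fork-safety-demo").getOrCreate()

# Any Python UDF forces the executor to fork Python worker processes,
# which is where macOS's Objective-C fork-safety check can kill the child.
to_upper = udf(lambda s: s.upper(), StringType())

df = spark.createDataFrame([("hello",), ("world",)], ["word"])
df.withColumn("word_upper", to_upper("word")).show()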

Searching for this issue, I was able to find the cause in the following Stack Overflow question:

Multiprocessing causes Python to crash and gives an error "may have been in progress in another thread when fork() was called" (stackoverflow.com)

The cause of this problem is reportedly a security update, introduced in macOS High Sierra and kept in recent versions, that restricts what multithreaded processes may do across fork().

The fix, then, is said to be setting OBJC_DISABLE_INITIALIZE_FORK_SAFETY to YES:

vi ~/.zshrc

# add to ~/.zshrc
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

source ~/.zshrc
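
If you would rather not export the variable shell-wide, another option is to set it from the driver script itself. This is a minimal sketch under the assumption that you run Spark in local mode, where the JVM is launched as a child of the driver Python process and should inherit its environment; verify this holds for your deployment:

import os

# Must be set before the SparkSession (and the JVM / Python workers it
# spawns) is created, so it comes before anything that starts them.
os.environ["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("fork-safety-demo").getOrCreate()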

Here, OBJC_DISABLE_INITIALIZE_FORK_SAFETY is an environment variable used to disable the Objective-C runtime's fork-safety check on initialization.

This check is one of the security features introduced in macOS High Sierra (10.13) and carried forward into later versions.

Since that update, the Objective-C runtime considers a forked child unsafe if class initialization (+initialize) was in progress on another thread when fork() was called, and it crashes the child rather than risk corrupted state, which is exactly the message in the log above. This provides safety for multithreaded processes that fork, but it can break certain applications and libraries, including PySpark's forked Python workers.
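
The Stack Overflow question above hits the same check without Spark. As a minimal sketch (assuming macOS: urllib.request pulls in _scproxy, which reads proxy settings through an Objective-C-backed system framework), fork-based multiprocessing can crash the workers, and switching to the spawn start method is the usual library-level workaround; note that CPython 3.8+ already defaults to spawn on macOS:

import multiprocessing
import urllib.request  # on macOS this loads _scproxy, backed by an Objective-C framework

def fetch_status(url):
    with urllib.request.urlopen(url) as resp:
        return resp.status

if __name__ == "__main__":
    # With the "fork" start method, children inherit initialized Objective-C
    # runtime state and can die with the "fork() was called" crash above.
    # "spawn" starts a fresh interpreter per worker, sidestepping the check.
    multiprocessing.set_start_method("spawn")
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(fetch_status, ["https://example.com", "https://example.org"]))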

That is why Apple provides the OBJC_DISABLE_INITIALIZE_FORK_SAFETY option in the first place. Still, change this setting with caution: disabling the fork-safety check can lead to data-integrity problems between processes, so weigh the problem and the alternatives carefully before turning it off.
