Let the bug crawl!
"A bug is never just a mistake. It represents something bigger. An error of thinking. That makes you who you are." - Elliot @ Mr. Robot
Today, we spent half a day debugging. It was a tough, really tough bug. Let me tell you the story:
TL;DR
1. Last week, I found that one of the Power8 clusters was performing abnormally when running Spark. It was even slower than an Intel Xeon machine with "worse" specs.
2. As the Power8 machine has more cores, more memory, and even a faster clock speed than the Xeon machine, we had no idea what was causing this.
3. First, we tested with a simple Spark application, but had no luck. We dug into each stage step by step, looking at things like duration, write time, and serialization delay.
4. After that, we turned our attention to the JVM. We suspected that the JVM might be causing the performance difference, since the Power8 runs IBM Java and the Xeon runs Oracle Java. However, even after installing OpenJDK on both machines, a simple Java sorting program still ran at different speeds.
5. As we had run out of ideas on the Java side, we turned to something more fundamental: C.
6. Guess what? After playing with the optimization flags and all sorts of CPU benchmarks, we concluded that even though the Power8 has "higher" specs than the Xeon, the Xeon still outperforms it thanks to its more well-rounded functionality and better-optimized instruction set.
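
For anyone who wants to try the JVM comparison from step 4, here is a minimal sketch of what a simple Java sorting test might look like. It is only an illustration, not the exact program from the story: the class name `SortBench`, the array size, the seed, and the warm-up and run counts are all assumptions. The idea is to sort identical data on both machines with the same JDK, warm up the JIT first, and compare the best time.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical micro-benchmark: sort the same pseudo-random data on each machine.
public class SortBench {

    // A fixed seed produces the same input on every machine, so runs are comparable.
    public static int[] randomInts(int n, long seed) {
        Random rng = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rng.nextInt();
        }
        return data;
    }

    // Sanity check so a broken run is never mistaken for a fast one.
    public static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    // Sort a fresh copy of the input and return the elapsed time in nanoseconds.
    public static long timeSortNanos(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        long start = System.nanoTime();
        Arrays.sort(copy);
        long elapsed = System.nanoTime() - start;
        if (!isSorted(copy)) {
            throw new IllegalStateException("array is not sorted");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 5_000_000; // assumed input size
        int warmups = 5;   // assumed number of warm-up runs for the JIT
        int runs = 10;     // assumed number of measured runs

        int[] data = randomInts(n, 42L);
        for (int i = 0; i < warmups; i++) {
            timeSortNanos(data);
        }

        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            best = Math.min(best, timeSortNanos(data));
        }

        // Print the JVM and architecture so results from each machine can be told apart.
        System.out.printf("java.version=%s os.arch=%s best=%.1f ms%n",
                System.getProperty("java.version"),
                System.getProperty("os.arch"),
                best / 1e6);
    }
}
```

Compile with `javac SortBench.java` and run `java SortBench` on each machine. Note that this is single-threaded, so it compares per-core speed and says nothing about how well the extra cores are used.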

