Dear All,
We recently bought some dual quad-core Nehalem X5550 machines to run VASP 4.6. The machines are connected via ordinary Gigabit Ethernet. We use CentOS, ifort 11.1.038, and Intel MKL 10.2.2.025; the MPI library is OpenMPI 1.3.3. We are happy to see that VASP runs very fast using 8 cores in a single Nehalem box. However, we are very frustrated to find that the scalability across two or more Nehalem boxes is very bad, i.e. VASP doesn't scale at all across boxes. The walltime vs. number of cores for a test system (a 2x2 surface) is shown below:
Cores    Time (s)
    8      85.78
   16     116.93
   24     245.06
   32     261.96
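To put a number on how badly this scales, here is a minimal Python sketch (it just hard-codes the timings from the table above and takes the 8-core single-box run as the baseline) that prints the speedup and parallel efficiency for each core count:

# Walltime per ionic step, taken from the table above;
# the 8-core single-box run is the reference point.
timings = {8: 85.78, 16: 116.93, 24: 245.06, 32: 261.96}

base_cores = 8
base_time = timings[base_cores]

for cores, t in sorted(timings.items()):
    speedup = base_time / t
    # ideal efficiency relative to the 8-core run would be 1.0
    efficiency = speedup / (cores / base_cores)
    print(f"{cores:3d} cores: speedup {speedup:5.2f}, efficiency {efficiency:5.2f}")

At 16 cores the speedup relative to 8 cores is about 0.73, i.e. the two-box run is actually slower than the single box.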
These results surprised me. In fact, VASP scales pretty well on our AMD Barcelona clusters with Gigabit interconnects.
I am wondering whether a low-latency interconnect such as InfiniBand is a must to get good scalability across Nehalem boxes.
I would appreciate it if anybody could share their experience running VASP efficiently on Nehalem with a Gigabit interconnect.
Thanks.
VASP scalability on Nehalem with Gigabit interconnection
- Newbie
- Posts: 14
- Joined: Sat Feb 05, 2005 3:53 am
- License Nr.: 866
- Location: U.S.
VASP scalability on Nehalem with Gigabit interconnection
Last edited by Brane on Tue Oct 20, 2009 4:36 pm, edited 1 time in total.
- Newbie
- Posts: 24
- Joined: Wed Feb 18, 2009 11:40 pm
- License Nr.: 196
- Location: Poznań, Poland
VASP scalability on Nehalem with Gigabit interconnection
Have you tried running the same test on a single core? What is the actual speed-up from using 8 cores in a single node? Are those times for a single iteration? If not, then compare single iterations.
If you have a small test system, you send a lot of information between hosts, and the calculations take less time than the communication between nodes (MPI). When I compare a new computer to one of "the old ones", I use a test case where a single iteration runs for about 1000 seconds. That makes sure the MPI overhead is relatively small and the test resembles the jobs that will actually be run in the near future.
Also, comparing a much slower CPU with a fast Nehalem over MPI makes little sense for small test cases, for the reason above.
Sorry for going slightly off-topic, but I believe that if you rerun the test case this way, your results won't look so dramatic.
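To illustrate the point about iteration length: if the per-iteration communication cost over Gigabit Ethernet is roughly constant, it dominates short iterations and becomes negligible for long ones. A rough Python sketch (the 30-second communication cost is a made-up placeholder, not a measured value):

# Back-of-envelope: fraction of walltime spent in communication,
# assuming a fixed (hypothetical) 30 s of MPI traffic per iteration.
comm = 30.0
for compute in (85.0, 1000.0):
    fraction = comm / (compute + comm)
    print(f"{compute:6.0f} s of computation: communication is {fraction:.0%} of walltime")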
Last edited by pafell on Fri Oct 23, 2009 7:28 am, edited 1 time in total.
- Newbie
- Posts: 14
- Joined: Sat Feb 05, 2005 3:53 am
- License Nr.: 866
- Location: U.S.
VASP scalability on Nehalem with Gigabit interconnection
Hi Pafell,
Thanks for your comments.
As you can see in my previous post, I am interested in the scalability of Nehalem with a Gigabit Ethernet connection. My tests showed that VASP scales very well within a single Nehalem box: a test job on 8 cores in one box is about 6 times faster than on a single core. However, when the job runs across two boxes, the walltime increases and VASP doesn't scale. The times I gave above are the walltime per ionic step (i.e., 'grep LOOP+ OUTCAR'). You may object that my test system is small and the MPI communication between boxes dominates. However, I also tested a big system and got the same result. Meanwhile, I ran the same benchmarks on a small test cluster of InfiniBand-connected Nehalem boxes and found that VASP scales pretty well there. The problem is that InfiniBand is too expensive, and we may have to end up with a Gigabit Ethernet connection. That is why I would very much like to see good scalability on Nehalem with Gigabit Ethernet, as is the case on our AMD Barcelona boxes.
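For reference, a minimal Python sketch that pulls these per-ionic-step walltimes out of OUTCAR; the regular expression assumes the usual "LOOP+: cpu time ...: real time ..." line layout, which may differ slightly between VASP versions:

import re

# Matches lines like "  LOOP+:  cpu time  123.45: real time  130.00"
# (an assumption about the usual OUTCAR layout).
loop_re = re.compile(r"LOOP\+.*real time\s+([0-9.]+)")

times = []
with open("OUTCAR") as f:
    for line in f:
        m = loop_re.search(line)
        if m:
            times.append(float(m.group(1)))

for i, t in enumerate(times, 1):
    print(f"ionic step {i}: {t:.2f} s")
if times:
    print(f"total: {sum(times):.2f} s over {len(times)} ionic steps")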
Thanks.
Last edited by Brane on Mon Oct 26, 2009 3:49 am, edited 1 time in total.
- Hero Member
- Posts: 586
- Joined: Tue Nov 16, 2004 2:21 pm
- License Nr.: 5-67
- Location: Germany
VASP scalability on Nehalem with Gigabit interconnection
Hello Brane,
one way to overcome heavy (and slow) communication is to increase NPAR. Each process will do more numerics but less communication; Gigabit Ethernet typically profits from that.
Cheers
Alex
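For illustration, NPAR is just a tag in the INCAR file; a hypothetical starting point for a 16-core run over Gigabit Ethernet would be the line below (the value is only a placeholder and should be benchmarked against e.g. 4 and 8 on the actual setup):

NPAR = 16

As far as I understand the VASP 4.6 parallelization, the cores are split into NPAR band groups; with NPAR equal to the number of cores, each core keeps whole bands locally, so the latency-sensitive distributed-FFT communication within a band group goes away, and that is exactly the kind of traffic that hurts most over Gigabit Ethernet.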
Last edited by alex on Mon Oct 26, 2009 7:56 am, edited 1 time in total.