Issues with Nvidia GeForce GTX 590, 580, 570, 480, 470 and 465 (Cuda CC 2.0)
Vlad
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 45 - Posted: 21 Nov 2016, 12:31:07 UTC
Last modified: 23 Nov 2016, 19:55:55 UTC

Dear owners of GeForce GTX 590, 580, 570, 480, 470 and 465 GPUs!

I would be very grateful if you would take part in testing that can help solve the issues with your GPUs.
Here is a link to the archive containing two versions of the CUDA app (Windows only) and the data for a simple NaCl crystallite.
To run the test, open a command prompt (cmd) in the folder where you extracted the archive and run:

XaNSoNS_CUDA_v1.exe NaCl1.xml > stdout1

and after that, run:

XaNSoNS_CUDA_v2.exe NaCl2.xml > stdout2

Please upload the files stdout1, stdout2, NaCl_1_xray_1D.txt and NaCl_2_xray_1D.txt somewhere and send me a link.
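
For anyone who prefers to script the two runs, here is a minimal Python sketch. It assumes Python 3 is available on the test machine and that the script is started from the folder with the extracted archive. The names `TESTS`, `run_test` and `run_all` are made up for illustration; the executable, input and output file names are the ones from the commands above.

```python
import subprocess
from pathlib import Path

# (executable, input file, stdout file), exactly as in the commands above.
TESTS = [
    ("XaNSoNS_CUDA_v1.exe", "NaCl1.xml", "stdout1"),
    ("XaNSoNS_CUDA_v2.exe", "NaCl2.xml", "stdout2"),
]


def run_test(exe, xml, out_name, workdir="."):
    """Run one test and save its standard output, like `exe xml > out_name`."""
    workdir = Path(workdir).resolve()
    with open(workdir / out_name, "w") as out:
        # check=False: a failed run is still worth reporting, so do not raise.
        result = subprocess.run(
            [str(workdir / exe), xml], cwd=workdir, stdout=out, check=False
        )
    return result.returncode


def run_all(workdir="."):
    """Run both tests in order and return {executable: exit code}."""
    return {exe: run_test(exe, xml, out, workdir) for exe, xml, out in TESTS}
```

Calling `run_all()` produces the same stdout1 and stdout2 files as the two manual commands, and the returned exit codes show whether either run ended abnormally.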

Attention! Since I do not know the source of the problem yet, you may experience various errors, including a driver crash! Please do not run the test if this is a problem for you.

Thanks in advance!

Steve Hawker*
Joined: 11 Nov 16
Posts: 13
Credit: 44,799
RAC: 0
Message 47 - Posted: 21 Nov 2016, 14:45:35 UTC

What about GT 430 (CC ??)

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 48 - Posted: 21 Nov 2016, 15:13:47 UTC - in response to Message 47.

What about GT 430 (CC ??)

Yes, now I see that you have problems with this GPU. It's CC 2.1. Can you run this test?

Theobald
Joined: 19 Nov 16
Posts: 4
Credit: 6,696
RAC: 0
Message 52 - Posted: 21 Nov 2016, 19:52:10 UTC
Last modified: 21 Nov 2016, 20:10:29 UTC

GT540M WU failed too.

Same 96-core Fermi chip with CC 2.1:
GT420M/GT430M/GT435M
GT525M/GT540M/GT550M
GT620M/GT625M/GT630M
G710M/GT720M
GT820M

http://www.file-upload.net/download-12111195/CUDA_CC_2.0_test_results.zip.html

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 55 - Posted: 21 Nov 2016, 20:21:39 UTC - in response to Message 52.

GT540M WU failed too.

Same 96-core Fermi chip with CC 2.1:
GT420M/GT430M/GT435M
GT525M/GT540M/GT550M
GT620M/GT625M/GT630M
G710M/GT720M
GT820M

http://www.file-upload.net/download-12111195/CUDA_CC_2.0_test_results.zip.html

Thank you!
But you have included the initial XML files in the archive instead of stdout1 and stdout2. I need the stdout to see the kernel execution time.
The diffraction patterns contain no errors.

Theobald
Joined: 19 Nov 16
Posts: 4
Credit: 6,696
RAC: 0
Message 57 - Posted: 21 Nov 2016, 20:42:24 UTC - in response to Message 55.
Last modified: 21 Nov 2016, 20:43:01 UTC

Oooops... http://www.file-upload.net/download-12111299/CUDA_CC_2.0_test_results_fixed.zip.html

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 58 - Posted: 21 Nov 2016, 21:49:21 UTC - in response to Message 57.
Last modified: 22 Nov 2016, 12:01:33 UTC

Oooops... http://www.file-upload.net/download-12111299/CUDA_CC_2.0_test_results_fixed.zip.html

Thanks again!
This particular simulation completed successfully on your GPU. However, the kernel execution time is almost 0.25 s, which is too high.
The app estimates the number of kernel calls from the theoretical peak performance of the GPU so that the execution time of each kernel call stays below 0.1 s.
I'll try to optimize this for Fermi GPUs.
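
To illustrate the idea of splitting the work into several kernel calls, here is a rough Python sketch. It is not the formula used in XaNSoNS, which is not shown in this thread. The function name `estimate_kernel_calls` and the parameters `total_flop` and `efficiency` are assumptions made for the example; only the 0.1 s target comes from the explanation above.

```python
import math


def estimate_kernel_calls(total_flop, peak_gflops, target_time_s=0.1, efficiency=1.0):
    """Estimate how many kernel calls keep each call under target_time_s.

    total_flop  -- floating-point operations in the whole job (assumed known)
    peak_gflops -- theoretical peak performance of the GPU, in GFLOP/s
    efficiency  -- assumed fraction of the peak that the kernel really reaches
    """
    if total_flop < 0 or peak_gflops <= 0 or target_time_s <= 0:
        raise ValueError("total_flop must be >= 0; peak_gflops and target_time_s must be > 0")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # Expected run time if the whole job were done in a single kernel call.
    expected_time_s = total_flop / (peak_gflops * 1e9 * efficiency)
    # Split the job so that each call is expected to finish within the target.
    return max(1, math.ceil(expected_time_s / target_time_s))
```

In a model like this, an `efficiency` value that is too optimistic for a given GPU yields too few calls, so each call runs longer than the target.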

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 62 - Posted: 22 Nov 2016, 11:14:59 UTC - in response to Message 57.

Oooops... http://www.file-upload.net/download-12111299/CUDA_CC_2.0_test_results_fixed.zip.html

I've reduced the target kernel execution time to 20 ms for all GPUs (no performance drop detected on GTX Titan). Additionally, I've changed the coefficient used to estimate the kernel execution time for GPUs with CC<3.0.
This archive contains the initial data for this WU (failed on GT 540M) and the new executable.
Can you run:
XaNSoNS_CUDA.exe start.xml > stdout
and give me a link to stdout and TestJob_neut_1D.txt, please?

Everyone affected by this issue is also invited to run this test.

Theobald
Send message
Joined: 19 Nov 16
Posts: 4
Credit: 6,696
RAC: 0
Message 65 - Posted: 22 Nov 2016, 23:17:39 UTC

Here you go: http://www.file-upload.net/download-12114342/CUDA_KernelTime_test_results.rar.html

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 69 - Posted: 23 Nov 2016, 10:08:37 UTC - in response to Message 65.

Here you go: http://www.file-upload.net/download-12114342/CUDA_KernelTime_test_results.rar.html

Thank you!
I've checked your results. The kernel execution time was OK but the diffraction pattern was not, so I started looking for the bug elsewhere and eventually found two! One affected only GPUs with CC 2.0, and the other affected any Nvidia GPU with theoretical peak performance below a certain value. That's why not all CC 2.1 devices were affected.

The new version (1.5) of the CUDA app for Windows has been uploaded. If the problem is solved, I will upload the updated versions for other platforms.

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 70 - Posted: 23 Nov 2016, 10:48:28 UTC - in response to Message 69.

The kernel execution time was OK but the diffraction pattern was not, so I started looking for the bug elsewhere and eventually found two!

Well, there is another one somewhere...

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 71 - Posted: 23 Nov 2016, 11:07:51 UTC - in response to Message 70.
Last modified: 23 Nov 2016, 11:08:41 UTC

Ok, I figured out how to emulate the behavior of low-performance GPUs on my GTX Titan. Now I can reproduce the bugs.
The buggy 1.5 CUDA version has been removed.

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 73 - Posted: 23 Nov 2016, 16:37:12 UTC

Based on the statistics I have, the problems with CC 2.0 devices and low-performance Nvidia GPUs are solved in version 1.6 (1.06).

Theobald
Joined: 19 Nov 16
Posts: 4
Credit: 6,696
RAC: 0
Message 74 - Posted: 23 Nov 2016, 19:39:20 UTC - in response to Message 73.
Last modified: 23 Nov 2016, 20:22:16 UTC

No errors anymore. I will not continue crunching on my GT540M for now, just wanted to test and help. -> R9 380 is much better and Notebook is getting loud and hot :p

Vlad
Joined: 26 Oct 16
Posts: 231
Credit: 96,990
RAC: 24
Message 77 - Posted: 23 Nov 2016, 20:27:33 UTC - in response to Message 74.

No errors anymore. I will not continue crunching on my GT540M for now, just wanted to test and help. -> R9 380 is much better and Notebook is getting loud and hot :p

Thank you! Your help was very useful.

© 2017 Vladislav Neverov (NRC 'Kurchatov institute'), Nikolay Khrapov (Institute for Information Transmission Problems of RAS)