TeraFLOPS

Message boards : Science : TeraFLOPS

Author Message
PappaLitto
Joined: 5 Jul 17
Posts: 5
Credit: 8,112,628
RAC: 0
Message 461 - Posted: 8 Sep 2017, 22:29:59 UTC

I noticed that your server status page lists 225 teraFLOPS as the compute speed. I find this odd, as I am seeing many fast cards, like the 1080 Ti, each capable of 10+ teraFLOPS in single precision. This leads me to believe that the project is entirely double precision, and NVIDIA consumer GPUs are not known for high double-precision rates (1/32 of single precision). Can you explain what the application is actually doing, what the input data consists of, and what the output is?

Win10
Joined: 10 Jul 17
Posts: 4
Credit: 47,870,721
RAC: 0
Message 464 - Posted: 9 Sep 2017, 6:43:29 UTC

This project uses integer (int64) calculations. It does not use floating point.

Vlad
Project administrator
Project developer
Project tester
Project scientist
Help desk expert
Joined: 26 Oct 16
Posts: 322
Credit: 103,382
RAC: 0
Message 465 - Posted: 9 Sep 2017, 10:11:30 UTC - in response to Message 461.
Last modified: 9 Sep 2017, 10:16:04 UTC

PappaLitto wrote:
I noticed that your server status page lists 225 teraFLOPS as the compute speed. I find this odd, as I am seeing many fast cards, like the 1080 Ti, each capable of 10+ teraFLOPS in single precision. This leads me to believe that the project is entirely double precision, and NVIDIA consumer GPUs are not known for high double-precision rates (1/32 of single precision). Can you explain what the application is actually doing, what the input data consists of, and what the output is?

Double-precision floating point operations are not used on the GPU. Below is information on input and output data of the apps, as well as the computational algorithm.

Input data:

1. An XML file containing the unit cell configuration (extracted from a CIF file) and the simulation parameters,
2. A text file containing the x-ray atomic form factors (for x-ray diffraction) or neutron scattering lengths (for neutron diffraction). These data are provided by the periodictable Python module.

Output data:

1. A stdout file containing some information on the performed calculations, GPU properties (for GPU apps) and errors (if any).
2. A file with the powder diffraction pattern. It contains a two-column array: the first column is the scattering vector magnitude and the second column is the scattering intensity. (It could actually be a single column, because the first column is the same for all WUs.)

The calculation algorithm consists of 3 steps:

1. calculating the atomic ensemble using the lattice parameters and unit cell configuration given in the XML file,
2. calculating the histogram of interatomic distances for this atomic ensemble,
3. calculating the scattering intensity using the histogram of interatomic distances.

The first step is always performed on a CPU. Here, most of the intermediate operations are double-precision, but the final atomic ensemble is a 32-bit float array.
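As a rough illustration of the first step, here is a minimal Python sketch. It is not the project's code: the function name `build_ensemble`, the restriction to an orthorhombic cell, and the example cell are all assumptions made for the illustration. It only shows the pattern described above, double-precision intermediate arithmetic with the result stored as 32-bit floats.

```python
from array import array
from itertools import product

def build_ensemble(cell_lengths, frac_coords, repeats):
    """Replicate an orthorhombic unit cell `repeats` times along each axis.

    Hypothetical helper: a real cell may be non-orthogonal, which would
    need a full lattice-vector matrix instead of three edge lengths.
    """
    a, b, c = cell_lengths
    coords = array("f")  # 32-bit float storage for the final ensemble
    for i, j, k in product(range(repeats), repeat=3):
        for fx, fy, fz in frac_coords:
            # Python floats are double precision, so the intermediate
            # arithmetic is 64-bit; values are narrowed when stored.
            coords.extend(((i + fx) * a, (j + fy) * b, (k + fz) * c))
    return coords

# Made-up body-centred cell with a 3.0 edge, repeated 2 x 2 x 2.
ensemble = build_ensemble((3.0, 3.0, 3.0),
                          [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)], 2)
n_atoms = len(ensemble) // 3  # flat x, y, z layout
```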

The second step is performed on a GPU in the GPU apps. The distances are calculated using floating point operations, of course, but the histogram of interatomic distances is a 64-bit integer array. Updating the bins of this histogram requires atomic operations in global memory on 64-bit integers. This step takes most of the computational time.
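The second step can be sketched on a CPU as follows. This is only an illustration under assumptions: `distance_histogram`, the bin width and the four test points are invented here, and a plain Python loop stands in for the GPU kernel. One plausible reason for 64-bit bins is that the number of atom pairs, N(N-1)/2, already exceeds 2^32 for an ensemble of about 100,000 atoms.

```python
import math

def distance_histogram(points, bin_width, n_bins):
    """Count each unordered pair of points once, binned by distance."""
    hist = [0] * n_bins  # integer counters standing in for 64-bit bins
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            # The distance itself is a floating-point calculation.
            r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
            k = int(r / bin_width)
            if k < n_bins:
                # On a GPU many threads hit the same bin, so this
                # increment would have to be an atomic add.
                hist[k] += 1
    return hist

# Four corners of a unit square: 4 sides of length 1, 2 diagonals.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
hist = distance_histogram(points, bin_width=0.25, n_bins=8)
```

With these points the sides fall into bin 4 and the diagonals into bin 5.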

The third step is also performed on a GPU in the GPU apps. Here, most of the calculations use floating-point operations. Thanks to the intrinsic trigonometry functions, this step runs very fast on GPUs.
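For the third step, the post does not name the formula. The sketch below assumes the standard Debye scattering equation for a single atom type with a constant scattering factor `f`, evaluated at bin centres. The function name and the sample histogram are hypothetical, and `math.sin` stands in for the GPU intrinsics.

```python
import math

def debye_intensity(hist, bin_width, n_atoms, q_values, f=1.0):
    """Debye sum over a distance histogram (single species, constant f)."""
    intensities = []
    for q in q_values:
        total = n_atoms * f * f  # self-scattering term (i == j)
        for k, count in enumerate(hist):
            if count == 0:
                continue
            r = (k + 0.5) * bin_width  # use the bin centre as the distance
            x = q * r
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            # Each unordered pair contributes twice (i,j and j,i).
            total += 2.0 * f * f * count * sinc
        intensities.append(total)
    return intensities

# Made-up histogram: 4 pairs in bin 4 and 2 pairs in bin 5, from 4 atoms.
hist = [0, 0, 0, 0, 4, 2, 0, 0]
pattern = debye_intensity(hist, 0.25, 4, [0.0, 1.0, 2.0])
```

At q = 0 every pair term equals 1, so the intensity reduces to the square of the number of atoms, which is a convenient sanity check.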

You can also see my presentation at the BOINC:FAST 2017 conference if you want more details.

According to this page, the GigaFLOPS number shown on the status page is calculated using a simple formula: (Total RAC)/200. So this is a very approximate value. Also, most users do not crunch for this project all day long.
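To make the arithmetic concrete, the formula can be inverted to see what total RAC corresponds to the 225 teraFLOPS figure from the question. The function names are made up for this example; only the divisor of 200 comes from the formula above.

```python
def gigaflops_from_rac(total_rac):
    """Status-page estimate: GigaFLOPS = (Total RAC) / 200."""
    return total_rac / 200.0

def rac_from_teraflops(teraflops):
    """Invert the estimate: 1 teraFLOPS = 1000 GigaFLOPS."""
    return teraflops * 1000.0 * 200.0

total_rac = rac_from_teraflops(225)
print(total_rac)                      # 45000000.0
print(gigaflops_from_rac(total_rac))  # 225000.0
```

So the listed speed reflects a total RAC of about 45 million credits per day, not a measurement of what the hardware could do at peak.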

PappaLitto
Joined: 5 Jul 17
Posts: 5
Credit: 8,112,628
RAC: 0
Message 466 - Posted: 10 Sep 2017, 3:06:52 UTC

Thank you for your well-written response.



© 2021 Vladislav Neverov (NRC 'Kurchatov institute'), Nikolay Khrapov (Institute for Information Transmission Problems of RAS)