Posts by [B@P] Daniel
1) Message boards : Cafe : Congratulations PDW for earning 300k COD entries badge! (Message 553)
Posted 28 Sep 2017 by [B@P] Daniel
Congratulations! You will probably be the only person who has it!
2) Message boards : Linux : GPU app for ARM Mali T-628 (Message 529)
Posted 23 Sep 2017 by [B@P] Daniel
The limit for the anonymous platform should work fine; my CPU job queue did not grow too much. I suspect a different issue here: OpenCL reports that the Odroid XU4 has 2 Mali T-628 GPUs, one with 4 compute units and one with 2. BOINC sees both, but uses only the first one. I tried adding <use_all_gpus>1</use_all_gpus> to cc_config.xml, but it did not help. I suspect that BOINC does not work properly when 2 GPUs have the same vendor and name but different parameters.

BOINC was able to detect this Mali GPU by itself; I did not have to add anything special to cc_config.xml. It looks like BOINC did not try to benchmark it and reports 0 instead of the actual GFLOPS. Or it does not report this value to the server, I do not know. I have added this info to the BOINC issue on GitHub (the link is in the first post).
3) Message boards : Linux : GPU app for ARM Mali T-628 (Message 521)
Posted 22 Sep 2017 by [B@P] Daniel
The recompiled app works fine. However, I have now noticed that the BOINC client downloaded too many tasks for the GPU: I have 369 of them waiting, and the estimated time to complete is 100 days. My work buffer is set to 0.5 + 0.01 days. In the past I noticed that for some projects the real buffer is twice the configured one, but this can be accepted. In this case, however, it looks like there was no limit at all.

The CPU task queue looks fine; I now have 5 tasks waiting.
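
As a rough sanity check of the numbers in this post, the queue can be compared against the configured buffer. This is only a back-of-the-envelope sketch: it assumes that the 100-day estimate is the total for all 369 queued GPU tasks and that the tasks are of equal length, neither of which is stated in the post.

```python
# Numbers taken from the post; the equal-length assumption is mine.
queued_tasks = 369
estimated_total_days = 100.0
buffer_days = 0.5 + 0.01  # work buffer as configured in the post


def tasks_fitting_buffer(total_days, n_tasks, buffer_days):
    """Return how many equal-length tasks fit into the work buffer."""
    days_per_task = total_days / n_tasks
    return buffer_days / days_per_task


fit = tasks_fitting_buffer(estimated_total_days, queued_tasks, buffer_days)
oversubscription = estimated_total_days / buffer_days
print(f"tasks that fit the buffer: {fit:.1f}")
print(f"queue is about {oversubscription:.0f}x the configured buffer")
```

Under these assumptions about 2 tasks would fill the buffer, so a queue of 369 is nearly 200 times larger than configured, far beyond the factor of two seen with other projects.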
4) Message boards : Linux : GPU app for ARM Mali T-628 (Message 513)
Posted 18 Sep 2017 by [B@P] Daniel
I have uploaded binaries to bitbucket, you can try them if you want:
https://bitbucket.org/sirzooro/boinc-stuff/downloads/

To use them, attach your device to the project, unpack the contents of the xansons_odroid_xu4.tgz archive into the /var/lib/boinc-client/projects/xansons4cod.com_xansons4cod/ directory and restart the BOINC client (a config reload will not work). After doing this, you should see an entry like this in the event log:

XANSONS for COD | Found app_info.xml; using anonymous platform


The CPU app is configured for 8 cores. If your device has a different number of CPU cores, you must edit app_info.xml and change the values in the avg_ncpus, max_ncpus and cmdline tags to match your device. Changes to app_info.xml most probably also require a BOINC restart to take effect.

Thank you very much, that's great!

The app already has zero-copy implemented for Intel GPU, so I enabled it for ARM too (not tested). Also, building the app for Mali should be easier now (if I updated the Makefile correctly), just:
make OpenCL=1 Mali=1
https://gitlab.com/vsnever/xansons_boinc/commit/e4739db337917c6d8273421a16c24c44b353c479

Thanks, I pulled the changes. For some reason linking failed; it started working after I moved -lmali after -lpthread. The new app is running now, and I will see in the morning whether it still works.
5) Message boards : Web site : No more work? (Message 510)
Posted 18 Sep 2017 by [B@P] Daniel
My two hosts have already finished all WUs and switched to a backup project. The status page shows that all WUs for CPU and GPU have been sent out. It also shows that 10k WUs are waiting for validation.
6) Message boards : Linux : GPU app for ARM Mali T-628 (Message 509)
Posted 18 Sep 2017 by [B@P] Daniel
I have uploaded binaries to bitbucket, you can try them if you want:
https://bitbucket.org/sirzooro/boinc-stuff/downloads/

To use them, attach your device to the project, unpack the contents of the xansons_odroid_xu4.tgz archive into the /var/lib/boinc-client/projects/xansons4cod.com_xansons4cod/ directory and restart the BOINC client (a config reload will not work). After doing this, you should see an entry like this in the event log:

XANSONS for COD | Found app_info.xml; using anonymous platform


The CPU app is configured for 8 cores. If your device has a different number of CPU cores, you must edit app_info.xml and change the values in the avg_ncpus, max_ncpus and cmdline tags to match your device. Changes to app_info.xml most probably also require a BOINC restart to take effect.
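
The edit described above could be scripted roughly as follows. This is a sketch, not the project's tooling: the XML excerpt and the app name `xansons_cpu` are hypothetical, and it assumes that the cmdline tag carries the `--nthreads N` option that the CPU app accepts in standalone mode. Check the real app_info.xml from the archive before relying on it.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical excerpt; the real app_info.xml in the archive may differ.
SAMPLE = """<app_info>
  <app_version>
    <app_name>xansons_cpu</app_name>
    <avg_ncpus>8</avg_ncpus>
    <max_ncpus>8</max_ncpus>
    <cmdline>--nthreads 8</cmdline>
  </app_version>
</app_info>"""


def set_cpu_cores(app_info_xml, cores):
    """Set avg_ncpus, max_ncpus and the cmdline thread count to `cores`.

    Only <app_version> blocks whose cmdline contains --nthreads are
    touched, so a GPU app version in the same file is left alone.
    """
    root = ET.fromstring(app_info_xml)
    for version in root.iter("app_version"):
        cmdline = version.find("cmdline")
        if cmdline is None or "--nthreads" not in (cmdline.text or ""):
            continue
        cmdline.text = re.sub(
            r"--nthreads\s+\d+", f"--nthreads {cores}", cmdline.text
        )
        for tag in ("avg_ncpus", "max_ncpus"):
            node = version.find(tag)
            if node is not None:
                node.text = str(cores)
    return ET.tostring(root, encoding="unicode")


print(set_cpu_cores(SAMPLE, 4))
```

After writing the result back to app_info.xml, restart the BOINC client as noted above.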
7) Message boards : Linux : GPU app for ARM Mali T-628 (Message 508)
Posted 18 Sep 2017 by [B@P] Daniel
The WU mentioned earlier finally completed, after 6 hours 42 minutes. The smaller block size helped, and the WU was validated successfully. It looks like it was a very big one; I got over 3k credits for it. Here is a link to it for reference:
http://xansons4cod.com/xansons4cod/result.php?resultid=21913410

The next WU was a lot shorter; it completed in 31 minutes:
http://xansons4cod.com/xansons4cod/result.php?resultid=21913461

This time could be shorter if the kernels were optimized for the Mali GPU: global/local/host memory is shared (no need to copy data between them), it supports SIMD instructions (vector size is 128 bits) and there are no thread blocks (warps/wavefronts). But with only 3 weeks until the end of the main part of the calculations, it probably does not make sense to try to optimize the code for this GPU.

Edit: you were first, I waited for the 2nd WU to finish :). Thanks for the help with this app!
8) Message boards : Linux : GPU app for ARM Mali T-628 (Message 504)
Posted 18 Sep 2017 by [B@P] Daniel
Thanks. I tried it, and now it performs calculations. However, the 1st result is inconclusive and most probably will be invalid - some kernels returned the error CL_OUT_OF_RESOURCES:
http://xansons4cod.com/xansons4cod/result.php?resultid=21913280

Probably not enough registers per work-item. I'm afraid I can't fix this without the device.
By the way, did you change the makefile to build the app for ARM?

Update. Actually, can you change BlockSize1Dsmall value from 256 to 128 in kernelsPDF.cl (line 25) and typedefs.h (line 40), rebuild and try again?

OK, I changed this and rebuilt the app. It looks like it slowed down; I will see whether it can complete the calculations successfully now.

The Makefile needed only a minor update: I changed the path to the BOINC source. OpenCL needed one more change: I had to link with libmali too, because after the last system update libOpenCL stopped exporting cl* functions. In the past linking with libOpenCL was enough.
9) Message boards : Linux : GPU app for ARM Mali T-628 (Message 502)
Posted 17 Sep 2017 by [B@P] Daniel
Thanks. I tried it, and now it performs calculations. However, the 1st result is inconclusive and most probably will be invalid - some kernels returned the error CL_OUT_OF_RESOURCES:
http://xansons4cod.com/xansons4cod/result.php?resultid=21913280

I also spotted one minor issue: at the beginning, progress jumps to 5% and then drops to 1%. This is a cosmetic issue.

Now I am running the 2nd WU and will see whether it finishes successfully.

And one more thing: my device is an Odroid XU4, not a C2 as I wrote above. The C2 does not have an OpenCL-capable GPU.
10) Message boards : Linux : GPU app for ARM Mali T-628 (Message 500)
Posted 17 Sep 2017 by [B@P] Daniel
I am trying to run the Xansons apps on an Odroid C2. The CPU app works fine (http://xansons4cod.com/xansons4cod/results.php?hostid=3878&appid=2). However, I have a problem with the GPU app for the integrated Mali T-628 GPU. I was able to build and install it, but it does not seem to work as expected: progress quickly jumped to 1% and has been stuck there for 1 hour. CPU usage for the app is about 97%, so it looks like the app is doing something. I checked the code briefly and found that the GetGFLOPS function returns 0 for an unrecognized GPU, which affects the calculations that depend on this value. Could you take a look at this?

Part of the scheduler request with details of the Mali T-628 GPU is here: https://github.com/BOINC/boinc/issues/1686.
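
The project's GetGFLOPS code is not shown here, so the following only illustrates why a zero estimate is risky and how a non-zero fallback could look. The GTX 970 figures come from an OpenCL log in another post on this page (13 compute units, 1177 MHz, 3917 GFLOPs), which works out to 256 FLOPs per compute unit per clock. The lookup table, the fallback value, the device name string, the 600 MHz Mali clock and the function name are all hypothetical.

```python
# FLOPs per compute unit per clock for recognized GPUs.
# 256 for the GTX 970 is derived from the logged figures:
# 3917 GFLOPs / (13 compute units * 1.177 GHz).
KNOWN_FLOPS_PER_CU_CLOCK = {"GeForce GTX 970": 256}

# Hypothetical conservative default for unrecognized devices (an assumption).
FALLBACK_FLOPS_PER_CU_CLOCK = 16


def estimate_gflops(name, compute_units, clock_mhz):
    """Return a theoretical peak estimate that is never 0 for a valid device."""
    per_cu = KNOWN_FLOPS_PER_CU_CLOCK.get(name, FALLBACK_FLOPS_PER_CU_CLOCK)
    return compute_units * clock_mhz * per_cu / 1000.0


# Recognized device: reproduces the figure from the log.
print(round(estimate_gflops("GeForce GTX 970", 13, 1177)))
# Unrecognized device: a rough but non-zero estimate instead of 0.
print(estimate_gflops("Mali-T628", 4, 600) > 0)
```

Any code that divides work by the estimate, or scales a block size with it, then gets a usable value even for a GPU that is missing from the table.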
11) Message boards : CPU : No work for cpu (Message 486)
Posted 15 Sep 2017 by [B@P] Daniel
Also check your NCI apps: BOINC has a bug and does not request new WUs for any app if an NCI app cannot download new WUs. I had this problem with QCN apps in the past. This problem is probably limited to QCN apps; Monkey apps do not cause it.
12) Message boards : News : Badges (Message 446)
Posted 29 Aug 2017 by [B@P] Daniel
The 4 lowest badges use different colors for their backgrounds. The diamond one uses a silver background with a different gradient pattern, so it may be confused with the silver badge. Additionally, in the small version it looks more like a corrugated sheet than a diamond: diamond cuts create sharp edges, which are missing here.
13) Message boards : Nvidia : Only 1 of 2 Nvidia GPU's running (Message 195)
Posted 22 Jun 2017 by [B@P] Daniel
You might try a cc_config.xml file, placed in the BOINC Data directory:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
</options>
</cc_config>

I have needed it in the past with two cards, but I don't recall whether it was needed for Nvidia or AMD cards.

As I remember, if you have a few different GPUs, by default BOINC will use only the fastest one, i.e. crunch only one GPU task. This option tells it to also use the slower one(s). You can try it, but I doubt it will help here.
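
If cc_config.xml already exists and contains other options, the flag can be merged in rather than overwriting the file. A minimal sketch, assuming the file uses the `<cc_config><options>` layout quoted in this post; the function name is hypothetical and `max_file_xfers` only stands in for whatever options are already present.

```python
import xml.etree.ElementTree as ET


def enable_all_gpus(cc_config_xml="<cc_config/>"):
    """Return cc_config XML with <use_all_gpus>1</use_all_gpus> under <options>.

    Existing options are preserved; missing elements are created.
    """
    root = ET.fromstring(cc_config_xml)
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    flag = options.find("use_all_gpus")
    if flag is None:
        flag = ET.SubElement(options, "use_all_gpus")
    flag.text = "1"
    return ET.tostring(root, encoding="unicode")


existing = (
    "<cc_config><options>"
    "<max_file_xfers>4</max_file_xfers>"
    "</options></cc_config>"
)
print(enable_all_gpus(existing))
```

The result goes into cc_config.xml in the BOINC Data directory.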
14) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 194)
Posted 22 Jun 2017 by [B@P] Daniel
However, there is an obvious workaround. I need to create an additional Linux plan class for Nvidia with OpenCL.

Done. Now anyone who has Linux with a kernel version < 3.16 and an Nvidia GPU will get the OpenCL app instead of CUDA.

Thanks! The Nvidia GPU app also works under Linux now.
15) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 180)
Posted 21 Jun 2017 by [B@P] Daniel
Could you load them on the server as "official" ones, or change something to allow the anonymous platform? The server still rejects it.

I updated the apps and the preferences for the Linux plan classes, so now you should be able to receive CPU tasks on your CentOS hosts. If not, then I probably should restart the scheduler.

Thanks! I see that one host without app_info.xml got and finished some tasks. Two failed because of the "finish file present too long" BOINC bug; all others passed validation.

Edit: it looks like libcuda.so is installed together with the driver:

But not libcudart... However, there is an obvious workaround. I need to create an additional Linux plan class for Nvidia with OpenCL.

Yes, this is another possible solution. There is also one more which came to my mind now: try to set up a virtual machine with CentOS and use it to build the CUDA app.

Could you paste the error you get when you try to link with the cudart_static lib using g++? Maybe I will be able to help you with this.
16) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 174)
Posted 20 Jun 2017 by [B@P] Daniel
OK, I am back. I have just tried the new CPU and OpenCL versions, and it looks like they work fine. Here are the messages logged to the std*.txt files:
===[stdout from OCL version]===
Parsing calculation parameters...
Calculation-->PrintAtoms is set to No.
Calculation-->PolarFactor is set to No.
Calculation-->wavelength is set to default.
Calculation-->hist_bin is set to 0.001.
Sample-->Rcutoff is set to 0.

Parsing Block 0...
Block-->centered is set to No.
Atoms-->filename is set to default value.
Atom-->occ is set to 1.0.
Atom-->Uiso is set to 0.
Atom-->occ is set to 1.0.
Atom-->Uiso is set to 0.

All blocks have been parsed.

Reading NaCl_FFtable.txt
Atomic ensemble calculation time: 0.00506988 s

Selected OpenCL device:
GPU: GeForce GTX 970
Number of compute units: 13
GPU clock rate: 1177 MHz
Theoretical peak performance: 3917 GFLOPs

Histogram calculation time: 0.00126214 s
1D pattern calculation time: 0.24491 s
Total calculation time: 0.51707 s

===[stderr from OCL version]===
shmget in attach_shmem: Invalid argument
21:23:33 (12065): Can't set up shared mem: -1. Will run in standalone mode.
21:23:34 (12065): called boinc_finish(0)

===[stdout from OMP version]===
Number of OpenMP threads is set to 32

Parsing calculation parameters...
Calculation-->PrintAtoms is set to No.
Calculation-->PolarFactor is set to No.
Calculation-->wavelength is set to default.
Calculation-->hist_bin is set to 0.001.
Sample-->Rcutoff is set to 0.

Parsing Block 0...
Block-->centered is set to No.
Atoms-->filename is set to default value.
Atom-->occ is set to 1.0.
Atom-->Uiso is set to 0.
Atom-->occ is set to 1.0.
Atom-->Uiso is set to 0.

All blocks have been parsed.

Reading NaCl_FFtable.txt
Atomic ensemble calculation time: 0.00522938 s

Histogram calculation time: 2.20116 s
1D pattern calculation time: 0.109375 s
Total calculation time: 2.31856 s

===[stderr from OMP version]===
shmget in attach_shmem: Invalid argument
21:25:13 (12136): Can't set up shared mem: -1. Will run in standalone mode.
21:25:15 (12136): called boinc_finish(0)

Could you load them on the server as "official" ones, or change something to allow the anonymous platform? The server still rejects it.

Edit: it looks like libcuda.so is installed together with the driver:
# ls -l /lib64/libcuda*
lrwxrwxrwx. 1 root root 12 06-11 13:34 /lib64/libcuda.so -> libcuda.so.1
lrwxrwxrwx. 1 root root 17 06-11 13:34 /lib64/libcuda.so.1 -> libcuda.so.367.57
-rwxr-xr-x. 1 root root 8227752 06-11 13:34 /lib64/libcuda.so.367.57
17) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 169)
Posted 20 Jun 2017 by [B@P] Daniel
Thanks, I will test this later today after I return home.

I searched a bit and found that it is possible to use g++ instead of nvcc for linking. By doing so you would be able to link glibc/libstdc++ statically. Please check these links:
https://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/
https://stackoverflow.com/questions/9421108/how-can-i-compile-cuda-code-then-link-it-to-a-c-project.

Thank you!

In these two threads, they link to cudart and not to cudart_static. I can do that too, but g++ does not work with cudart_static. I'm not sure whether the cudart library is present on any linux hosts which have nvidia drivers installed. If not, the executable will not be portable.

nvcc internally calls the host linker (e.g. /bin/ld), so you should be able to use cudart_static with g++ too. Try adding the --verbose option when calling it; you should see the options passed to ld. gcc/g++ also supports this option, so you should be able to find what is missing or different when linking with g++.

Edit: nvcc also supports the -Xlinker option, which passes any option through to the linker, so maybe you will be able to use it to link statically with glibc/libstdc++.
18) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 166)
Posted 20 Jun 2017 by [B@P] Daniel
All versions need this one symbol:
# c++filt _ZSt24__throw_out_of_range_fmtPKcz
std::__throw_out_of_range_fmt(char const*, ...)

OK, I rebuilt the BOINC libraries and am now able to build the CPU and OpenCL versions with -static-libgcc -static-libstdc++. Still no success with the CUDA version: I just cannot link with g++ to cudart_static (linking to cudart works, however), and nvcc does not understand the -static-libgcc -static-libstdc++ options. Can you try the new CPU and OpenCL executables?

Thanks, I will test this later today after I return home.

I searched a bit and found that it is possible to use g++ instead of nvcc for linking. By doing so you would be able to link glibc/libstdc++ statically. Please check these links:
https://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/
https://stackoverflow.com/questions/9421108/how-can-i-compile-cuda-code-then-link-it-to-a-c-project.
19) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 162)
Posted 20 Jun 2017 by [B@P] Daniel
Good to know! Let me know when it is ready. Or maybe upload it somewhere, so I can take it and put it on my host?

Can you try them out on your CentOS host? It would be great! Here is a link. The archive includes input data required to run the test.
You can run the apps in standalone mode. To run the CPU version, do:

./xansons_boinc_OMP --nthreads N

where N is the number of parallel CPU threads.

To run the CUDA version, do:

./xansons_boinc_CUDA --device N

where N is the device number (0 if you have only one GPU).

Also you can run the OpenCL version:

./xansons_boinc_OCL

OpenCL version will work on Nvidia GPU too.

I tried, but all versions still require GLIBCXX_3.4.20. Results for the other binaries are similar:
# ldd -d -r xansons_boinc_CUDA
./xansons_boinc_CUDA: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./xansons_boinc_CUDA)
linux-vdso.so.1 => (0x00007ffd235a2000)
libcuda.so.1 => /lib64/libcuda.so.1 (0x00007fcb7c079000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcb7be5d000)
librt.so.1 => /lib64/librt.so.1 (0x00007fcb7bc54000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fcb7ba50000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fcb7b747000)
libm.so.6 => /lib64/libm.so.6 (0x00007fcb7b444000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fcb7b22e000)
libc.so.6 => /lib64/libc.so.6 (0x00007fcb7ae6d000)
/lib64/ld-linux-x86-64.so.2 (0x00007fcb7ca6c000)
libnvidia-fatbinaryloader.so.367.57 => /lib64/libnvidia-fatbinaryloader.so.367.57 (0x00007fcb7ac1e000)
symbol _ZSt24__throw_out_of_range_fmtPKcz, version GLIBCXX_3.4.20 not defined in file libstdc++.so.6 with link time reference (./xansons_boinc_CUDA)

All versions need this one symbol:
# c++filt _ZSt24__throw_out_of_range_fmtPKcz
std::__throw_out_of_range_fmt(char const*, ...)
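
When several binaries have to be checked, the missing symbol versions can be pulled out of the ldd output automatically. A small sketch: the function name is hypothetical, and the sample text is two lines copied from the ldd output in this post.

```python
import re

# Two lines copied from the ldd output in this post.
LDD_OUTPUT = """\
./xansons_boinc_CUDA: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./xansons_boinc_CUDA)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fcb7b747000)
"""


def missing_symbol_versions(ldd_output):
    """Return the sorted, de-duplicated symbol versions ldd reports as not found."""
    found = re.findall(r"version `([^']+)' not found", ldd_output)
    return sorted(set(found))


print(missing_symbol_versions(LDD_OUTPUT))
```

An empty list means ldd reported no missing versioned symbols for that binary.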
20) Message boards : Linux : No tasks sent to Linux CentOS 7 hosts (Message 157)
Posted 19 Jun 2017 by [B@P] Daniel

Edit: I have manually downloaded the app and created app_info.xml for it. Unfortunately the server refuses this; I got the following reply from the server:

This project doesn't support computers of type anonymous

I've added the anonymous platform to project.xml and run xadd. Did that help?

No, I still get the same error from the server.

I also checked whether all required libs are in place, and it turned out that a newer libstdc++ (GLIBCXX_3.4.20) is needed. But this problem can be resolved if you recompile the app with the options -static-libgcc -static-libstdc++ or if I recompile it locally on my host.

Through some trial and error I finally succeeded in linking the Linux app for CPU statically to libstdc++. Note that the simple approach with -static-libgcc -static-libstdc++ didn't work in my case for some reason. I need to do some testing, and if everything is OK I'll update the Linux apps.

Good to know! Let me know when it is ready. Or maybe upload it somewhere, so I can take it and put it on my host?




© 2020 Vladislav Neverov (NRC 'Kurchatov institute'), Nikolay Khrapov (Institute for Information Transmission Problems of RAS)