At vScaler, we're imploring anyone who has compute resources to spare to donate them towards research efforts into COVID-19 (Coronavirus). To find out more about the Folding@home initiative, read our blog post here.
For HPC administrators wishing to roll out the application on CentOS 7, we have documented the instructions below. For non-HPC admins, and for anyone using SLURM, we've documented separate instructions here.
To start with, we have spun up an HPC cluster on vScaler; for anyone wishing to do the same, here is our video on how to do this. Otherwise, the instructions below pick up after we SSH into the head node.
To begin, let's pull down the RPMs from the Folding@home website.
These links were correct at the time of writing – please visit the Folding@home website for up-to-date installation instructions.
wget https://download.foldingathome.org/releases/public/release/fahclient/centos-5.3-64bit/v7.4/fahclient-7.4.4-1.x86_64.rpm
wget https://download.foldingathome.org/releases/public/release/fahcontrol/centos-5.3-64bit/v7.4/fahcontrol-7.4.4-1.noarch.rpm
wget https://download.foldingathome.org/releases/public/release/fahviewer/centos-5.3-64bit/v7.4/fahviewer-7.4.4-1.x86_64.rpm
Now you can install the RPMs. In this example we have stored them in a location shared across the cluster.
pdsh -w node00[00-02] rpm -ivh --nodeps /opt/ohpc/pub/apps/fah-install-rpms/fah*rpm
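If you want to confirm the packages landed on every node, a quick rpm query over the same node range does the trick:

pdsh -w node00[00-02] rpm -q fahclient fahcontrol fahviewer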
Let's stop the client for the moment to update the configuration file
pdsh -w node00[01-02] /etc/init.d/FAHClient stop
And now we can update the configuration file. The main thing to note is setting power v='full' to use all cores:
[root@vscaler-head ~]# cat /etc/fahclient/config.xml
<config>
  <!-- Folding Slot Configuration -->
  <gpu v='false'/>

  <!-- Slot Control -->
  <power v='full'/>

  <!-- User Information -->
  <user v='vscaler'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
</config>
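If you'd like your work units credited to a team, or tied to a passkey for bonus points, the client's config also accepts team and passkey entries. The values below are placeholders – substitute your own:

  <team v='0'/>
  <passkey v='your-passkey-here'/>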
Now we can sync that file across the rest of the cluster nodes
pdcp -w node00[01-02] /etc/fahclient/config.xml /etc/fahclient/config.xml
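A quick sanity check that the file really is identical everywhere is to compare checksums across the nodes:

pdsh -w node00[01-02] md5sum /etc/fahclient/config.xml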
Now we can start the service back up
pdsh -w node00[01-02] /etc/init.d/FAHClient start
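To verify the client has come back up on each node, the init script can be queried the same way (status is the conventional init script action):

pdsh -w node00[01-02] /etc/init.d/FAHClient status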
Let's jump over to one of the nodes to check things out, starting with the CPU utilisation. Note these are 72-core, dual-socket Xeon 8176 nodes.
top - 23:11:44 up 42 days,  7:08,  2 users,  load average: 30.13, 17.39, 8.84
Tasks: 676 total,   2 running, 674 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 15.0 sy, 85.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 39463936+total, 25307000 free,  9180968 used, 36015139+buff/cache
KiB Swap:        0 total,        0 free,        0 used. 38385616+avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
307957 fahclie+  39  19 4972900 270856   6956 R  7196  0.1  29:44.34 FahCore_a7
308034 root      20   0  162556   2876   1580 R   0.7  0.0   0:00.17 top
   204 root      rt   0       0      0      0 S   0.3  0.0   0:17.97 watchdog/39
 67779 zabbix    20   0   83812   6440   5624 S   0.3  0.0  98:29.71 zabbix_agentd
     1 root      20   0  193812   6800   4036 S   0.0  0.0   0:58.89 systemd
     2 root      20   0       0      0      0 S   0.0  0.0   0:01.18 kthreadd
     3 root      20   0       0      0      0 S   0.0  0.0   0:00.68 ksoftirqd/0
     5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
     8 root      rt   0       0      0      0 S   0.0  0.0   0:07.41 migration/0
     9 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
We can check the logs to see how things are progressing
[root@node0002 ~]# tail -f /var/lib/fahclient/log.txt
23:11:18:WU00:FS00:0xa7: Mode: Release
23:11:18:WU00:FS00:0xa7:************************************ Build *************************************
23:11:18:WU00:FS00:0xa7: SIMD: avx_256
23:11:18:WU00:FS00:0xa7:********************************************************************************
23:11:18:WU00:FS00:0xa7:Project: 14182 (Run 7, Clone 5, Gen 105)
23:11:18:WU00:FS00:0xa7:Unit: 0x0000007c0002894b5cf684c1e093d5ee
23:11:18:WU00:FS00:0xa7:Digital signatures verified
23:11:18:WU00:FS00:0xa7:Calling: mdrun -s frame105.tpr -o frame105.trr -cpi state.cpt -cpt 15 -nt 72
23:11:18:WU00:FS00:0xa7:Steps: first=262500000 total=2500000
23:11:19:WU00:FS00:0xa7:Completed 197572 out of 2500000 steps (7%)
23:11:25:WU00:FS00:0xa7:Completed 200000 out of 2500000 steps (8%)
23:11:57:WU00:FS00:0xa7:Completed 225000 out of 2500000 steps (9%)
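If you'd rather not SSH into each node individually, a rough one-liner like this pulls the most recent progress entry from every node's log:

pdsh -w node00[01-02] "grep Completed /var/lib/fahclient/log.txt | tail -1"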
Now give yourself a pat on the back - you're doing your bit to help out! 🙂