How Do I Fix dataskq Causing High Load On DirectAdmin?

by admin on August 28, 2014

Today, when I opened the DA Service Monitor, the top 5 processes were using 99.9% of the CPU, and they were all the same process:

11477 root      20   0  324m 196m 1580 R 53.3  5.2  17153:50 /usr/local/directadmin/dataskq                                                                 
 7738 root      20   0  315m 183m 1580 R 49.3  4.8  15702:50 /usr/local/directadmin/dataskq                                                                 
17973 root      20   0  307m 170m 1592 R 59.5  4.5  14271:42 /usr/local/directadmin/dataskq                                                                 
15411 root      20   0  285m 159m 1592 R 58.9  4.2   9984:23 /usr/local/directadmin/dataskq                                                                 
18812 root      20   0  299m 158m 1592 R 50.0  4.2  12829:33 /usr/local/directadmin/dataskq                                                                 
32016 root      20   0  292m 146m 1592 R 49.7  3.8  11400:47 /usr/local/directadmin/dataskq                                                                 
10846 root      20   0  254m 125m 1592 R 61.8  3.3   8605:18 /usr/local/directadmin/dataskq                                                                 
22175 root      20   0  248m 114m 1592 R 54.9  3.0   7239:59 /usr/local/directadmin/dataskq                                                                 
28472 root      20   0  241m 104m 1592 R 52.6  2.7   5916:28 /usr/local/directadmin/dataskq                                                                 
 2738 root      20   0  239m  98m 1700 R 49.7  2.6   4753:17 /usr/local/directadmin/dataskq                                                                 
 7807 root      20   0  212m  84m 1836 R 49.3  2.2   3698:22 /usr/local/directadmin/dataskq                                                                 
11449 root      20   0  202m  75m 1836 R 49.7  2.0   1870:38 /usr/local/directadmin/dataskq                                                                 
 6370 root      20   0  205m  73m 1836 R 50.0  1.9   2744:47 /usr/local/directadmin/dataskq                                                                 
22093 root      20   0  178m  52m 1836 R 52.0  1.4   1042:42 /usr/local/directadmin/dataskq                                                                 
26249 root      20   0  152m  27m 1844 R 54.9  0.7 266:05.08 /usr/local/directadmin/dataskq

My server had a high load and dataskq filled the 'top' list. All services had crashed. How do I fix this problem?
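If you want a quick total instead of reading the list by eye, here is a small sketch. The function name summarize_top is my own, and it assumes the default top column order shown above (%CPU in column 9, %MEM in column 10, command last).

```shell
#!/bin/bash
# Sum %CPU and %MEM for every line of top output whose command matches exactly.
# Assumes the default top columns: ... S %CPU %MEM TIME+ COMMAND
summarize_top() {
    awk -v cmd="$1" '
        $NF == cmd { n++; cpu += $9; mem += $10 }
        END { printf "%d processes, %.1f%% CPU, %.1f%% MEM\n", n, cpu, mem }'
}

# Usage (as root), with top in batch mode showing full command lines:
#   top -b -n 1 -c | summarize_top /usr/local/directadmin/dataskq
```

The CPU total can go above 100% on a multi-core server, because top reports each process relative to one core.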

First, I inspect one of the processes with the command below to find the problem

# lsof -p 11477

Output

COMMAND   PID USER   FD   TYPE DEVICE  SIZE/OFF     NODE NAME
dataskq 11477 root  cwd    DIR  259,4      4096 21106008 /usr/local/directadmin
dataskq 11477 root  rtd    DIR  259,4      4096        2 /
dataskq 11477 root  txt    REG  259,4   8893140 21106014 /usr/local/directadmin/dataskq
dataskq 11477 root  mem    REG  259,4     65928  7602205 /lib64/libnss_files-2.12.so
dataskq 11477 root  mem    REG  259,4    122040  7602226 /lib64/libselinux.so.1
dataskq 11477 root  mem    REG  259,4     10192  7602373 /lib64/libkeyutils.so.1.3
dataskq 11477 root  mem    REG  259,4     43728  7602382 /lib64/libkrb5support.so.0.1
dataskq 11477 root  mem    REG  259,4    469528  7602181 /lib64/libfreebl3.so
dataskq 11477 root  mem    REG  259,4    277704  7602374 /lib64/libgssapi_krb5.so.2.2
dataskq 11477 root  mem    REG  259,4    142640  7602213 /lib64/libpthread-2.12.so
dataskq 11477 root  mem    REG  259,4   1921216  7602189 /lib64/libc-2.12.so
dataskq 11477 root  mem    REG  259,4     90880  7602578 /lib64/libgcc_s-4.4.7-20120601.so.1
dataskq 11477 root  mem    REG  259,4    596264  7602197 /lib64/libm-2.12.so
dataskq 11477 root  mem    REG  259,4    987096 21103632 /usr/lib64/libstdc++.so.6.0.13
dataskq 11477 root  mem    REG  259,4    110960  7602215 /lib64/libresolv-2.12.so
dataskq 11477 root  mem    REG  259,4     14664  7602247 /lib64/libcom_err.so.2.1
dataskq 11477 root  mem    REG  259,4    174840  7602378 /lib64/libk5crypto.so.3.1
dataskq 11477 root  mem    REG  259,4    941920  7602380 /lib64/libkrb5.so.3.3
dataskq 11477 root  mem    REG  259,4     19536  7602195 /lib64/libdl-2.12.so
dataskq 11477 root  mem    REG  259,4     98661 21106191 /usr/local/lib/libz.so.1.2.3
dataskq 11477 root  mem    REG  259,4   1950976 21104485 /usr/lib64/libcrypto.so.1.0.1e
dataskq 11477 root  mem    REG  259,4     40400  7602193 /lib64/libcrypt-2.12.so
dataskq 11477 root  mem    REG  259,4    437016 21104487 /usr/lib64/libssl.so.1.0.1e
dataskq 11477 root  mem    REG  259,4    154520  7602580 /lib64/ld-2.12.so
dataskq 11477 root    0r   REG  259,4      2795  8792180 /home/tmp/quota-dump (deleted)
dataskq 11477 root    1r   REG  259,4 710688943 21633577 /usr/local/directadmin/data/users/detvl/bandwidth.tally
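Most of the lsof output is shared libraries. To see only the files each dataskq process holds open, something like the following may help. This is a sketch that reads /proc directly, so it is Linux only, and list_open_files is a name I made up for the example.

```shell
#!/bin/bash
# Print "fd<TAB>target" for every file descriptor a process holds open,
# by reading the symlinks under /proc/<pid>/fd (Linux only).
list_open_files() {
    local pid=$1 fd
    for fd in /proc/"$pid"/fd/*; do
        # Skip the unexpanded glob; keep links whose target was deleted.
        [ -e "$fd" ] || [ -L "$fd" ] || continue
        printf '%s\t%s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Show the open files of every running dataskq process (run as root).
for pid in $(pgrep -x dataskq); do
    echo "== PID $pid =="
    list_open_files "$pid"
done
```

If every process lists the same bandwidth.tally file, that user's data is the likely culprit.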

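The last line shows that the process is reading bandwidth.tally of user detvl, a file of about 710 MB. Next, I ask the running dataskq processes to report what they are doing and then read the error log. On a USR1 signal, dataskq writes its current task to errortaskq.log. Type the following commands

# killall -USR1 dataskq
# tail -n 10 /var/log/directadmin/errortaskq.log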

Output

==> /var/log/directadmin/errortaskq.log <==
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
2014:08:28-09:07:38: Dataskq USR1 signal: Currently processing: Tally::get_bandwidth_breakdown(..., 0) for detvl : done reading, begin parsing
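When the log is long, it can be easier to count the entries per user. Here is a rough sketch; count_stuck_users is my own name, and it assumes the log lines look exactly like the ones above ("... for USER : ...").

```shell
#!/bin/bash
# Count "Currently processing" entries per user in a dataskq error log.
# Assumes lines of the form "... Currently processing: ... for USER : ...".
count_stuck_users() {
    grep 'Currently processing' "$1" \
        | sed -n 's/.* for \([^ ]*\) : .*/\1/p' \
        | sort | uniq -c | sort -rn
}

log=/var/log/directadmin/errortaskq.log
if [ -r "$log" ]; then
    count_stuck_users "$log"
fi
```

The user with the highest count is the first one to look at.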

So the problem seems to be in parsing the bandwidth log of user detvl. Next, I check the size of "/usr/local/directadmin/data/users/detvl/bandwidth.tally". Type the following command

# du -sh /usr/local/directadmin/data/users/detvl/bandwidth.tally

Output

678M	/usr/local/directadmin/data/users/detvl/bandwidth.tally
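Other users may have the same oversized file. A sketch for checking all of them at once follows; the function name find_large_tallies and the 100 MB threshold are my own choices, not DirectAdmin defaults.

```shell
#!/bin/bash
# List bandwidth.tally files larger than MIN_MB megabytes under a users directory.
# Expects the layout USERS_DIR/<user>/bandwidth.tally.
find_large_tallies() {
    local users_dir=$1 min_mb=$2
    find "$users_dir" -maxdepth 2 -type f -name bandwidth.tally \
        -size +"${min_mb}"M -exec du -h {} +
}

users_dir=/usr/local/directadmin/data/users
if [ -d "$users_dir" ]; then
    find_large_tallies "$users_dir" 100
fi
```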

At 678M, the file is very big. To solve the problem, first I kill all dataskq processes with the command below

# killall -9 dataskq

Or run script

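#!/bin/bash
# Kill every running dataskq process, matched by exact process name.
# pgrep -x avoids matching the grep command itself or unrelated processes.
for pid in $(pgrep -x dataskq)
do
    kill -9 "$pid"
done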
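If you prefer not to go straight to kill -9, a gentler variant is to send SIGTERM first and only force-kill what is still alive after a short wait. This is just a sketch: stop_pids is a name I made up, and the 5 second grace period is an arbitrary choice.

```shell
#!/bin/bash
# Send SIGTERM to the given PIDs, wait GRACE seconds,
# then SIGKILL any that are still alive.
stop_pids() {
    local grace=$1; shift
    local pid
    for pid in "$@"; do
        kill -TERM "$pid" 2>/dev/null
    done
    sleep "$grace"
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null
        fi
    done
}

# Stop every running dataskq process (run as root).
pids=$(pgrep -x dataskq)
if [ -n "$pids" ]; then
    stop_pids 5 $pids
fi
```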

Second, I empty the file bandwidth.tally, which discards the bandwidth data recorded in it. Type the following command

# echo "" > /usr/local/directadmin/data/users/detvl/bandwidth.tally
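If you would rather keep a copy of the data before emptying the file, a sketch like this compresses it first. The name archive_and_truncate and the timestamped .gz name are my own conventions; the copy is written next to the original, so check free disk space first.

```shell
#!/bin/bash
# Save a gzip copy of a file next to it, then empty the original in place.
# Prints the backup path; the original is left untouched if gzip fails.
archive_and_truncate() {
    local file=$1
    local backup="$file.$(date +%Y%m%d%H%M%S).gz"
    gzip -c "$file" > "$backup" || return 1
    : > "$file"
    echo "$backup"
}

tally=/usr/local/directadmin/data/users/detvl/bandwidth.tally
if [ -f "$tally" ]; then
    archive_and_truncate "$tally"
fi
```

Emptying the file in place keeps its owner and permissions as they were.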

Finally, I lower the priority of dataskq so that it cannot starve the other services. Edit the cron file

# vi /etc/cron.d/directadmin_cron

and replace the dataskq line with

* * * * * root nice -n 19 /usr/local/directadmin/dataskq
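To confirm later that the cron entry still has the nice prefix, a small check like this can be used. It is only a sketch, and cron_uses_nice is a name I made up for it.

```shell
#!/bin/bash
# Succeed if the cron file has an uncommented dataskq line run through nice.
cron_uses_nice() {
    grep -Eq '^[^#]*[[:space:]]nice[[:space:]].*/dataskq' "$1"
}

cron_file=/etc/cron.d/directadmin_cron
if [ -r "$cron_file" ]; then
    if cron_uses_nice "$cron_file"; then
        echo "dataskq runs under nice"
    else
        echo "dataskq cron line has no nice prefix"
    fi
fi
```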
