Letsencrypt Wildcard Certificates, with acme.sh client

Took me a bit of time to figure this out, so I thought I’d make it public. Letsencrypt announced their new wildcard certs, and because I have to add the SSL cert to a load balancer covering many subdomains, I needed to make use of them.

First thing to note is that not all clients support the new v2 API, which is required for wildcard certs. I looked at the list of v2-supporting clients on the Letsencrypt site and chose the acme.sh bash script. Not sure if I’m going to stick with it at this point, but it got me going.
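
Getting it installed is about as simple as it gets. This is the upstream installer from the acme.sh README (it puts everything under ~/.acme.sh and sets up a renewal cron job), so the usual pipe-to-shell caveats apply:

curl https://get.acme.sh | sh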

The first thing you need to do is run it with the --issue flag. You’ll need to use DNS authentication, as that’s the only supported method for wildcard certs, and you’ll need to request both the root domain AND the wildcard.
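
The issuing command ends up looking something like this. Treat it as a sketch only: dns_cf is the Cloudflare DNS hook and example.com is a placeholder, so swap in your own provider’s hook and API credentials (the acme.sh dnsapi docs list them all).

# Sketch: issue a wildcard cert via the DNS-01 challenge using the Cloudflare hook.
# Export your DNS provider's API credentials first (these are Cloudflare's).
export CF_Key="your-cloudflare-api-key"
export CF_Email="you@example.com"

acme.sh --issue --dns dns_cf -d example.com -d '*.example.com'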


Switch from UFW and fail2ban to CSF

Having played with CSF for a while on one server, I’ve decided I like it more than UFW and fail2ban. It seems much better at blocking brute-force attacks on mail, and at handling distributed attacks against SSH. So anyway, here’s a list of steps to achieve that, as much for my own record as anything. The server is running Ubuntu 16.04, but these general steps should work anywhere. In addition, the server I did it on is also running VestaCP, so there are a couple more steps for that.
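
The bare bones of the switch look roughly like the commands below. This is a sketch for a stock Ubuntu 16.04 box only; the VestaCP integration and your existing firewall rules need handling separately.

# Disable the old stack first (make sure you have console access in case
# the new firewall locks you out)
sudo ufw disable
sudo systemctl stop fail2ban
sudo systemctl disable fail2ban

# Download and install CSF/LFD from ConfigServer
cd /usr/src
wget https://download.configserver.com/csf.tgz
tar -xzf csf.tgz
cd csf
sudo sh install.sh

# Check the required iptables modules are available, then review
# /etc/csf/csf.conf (set TESTING = "0", check TCP_IN/TCP_OUT) and restart
sudo perl /usr/local/csf/bin/csftest.pl
sudo csf -r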


Command to find all image files which are not really image files!

Quick one this … so you’ve got a compromised webserver and you want to check the files on it. Many scanning tools will ignore images, but an image might not always be what it seems! Check them all with this command:

find /path/to/dir -regex ".*\.\(jpg\|png\|gif\)" -exec file {} \; | grep -i -v "image data"

If all is good, you won’t get any output. If your server is seriously borked, then you might see things like this …

./wp-content/uploads/2011/01/22.jpg: HTML document, ASCII text

This is a red flag that the “image” is in fact a PHP file hiding behind an image extension. Investigate!
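
To dig a little further, something like this will list any image-named files that contain a PHP open tag. It’s a hypothetical follow-up rather than part of the one-liner above, and a malicious file won’t always contain a literal <?php, so treat a clean result with some suspicion too:

find /path/to/dir -regex ".*\.\(jpg\|png\|gif\)" -exec grep -l "<?php" {} \;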

If you get this kind of thing,
./wp-content/uploads/2011/01/221.jpg: Minix filesystem, V2, 46909 zones

it’s probably a bug in an old version of file, so check your OS version, copy the suspect file to a machine with a more recent OS, and try again.

Bash script to clean Bots out of Apache Logs

If you’ve ever spent some time looking at webserver logs, you know how much crap there is in there from crawlers, bots, indexers, and all the bottom feeders of the internet. If you’re looking for a specific problem with the webserver, this stuff can quickly become a nuisance, stopping you from finding the information you need. In addition, it’s often a surprise exactly how much of the traffic your website serves up to these bots.

The script below helps with both these problems. It takes stats of a logfile (Apache, but it should also work on nginx), works on a copy so the original is untouched, counts the number of lines it removes for each pattern and each kind of bot, and then repeats the stats for the cleaned copy at the end. Copy the following into a file, e.g. log-squish.sh, and run it with the name of the logfile as an argument, e.g. ./log-squish.sh logfile.log
You’ll definitely want to edit the LOCALTRAFFIC bit to fit your needs. You may also want to add bots to the BOTLIST. Run the script once on a sample logfile and then view it to see what bots are left …

#!/bin/bash

# Pass input file as a commandline argument, or set it here
INFILE=$1
OUTFILE=./$1.squish
TMPFILE=./squish.tmp

if [ -f $TMPFILE ] ; then 
    rm $TMPFILE
fi

# Check before we go ... 
read -p "Will copy $INFILE to $OUTFILE and perform all operations on the file copy. Press ENTER to proceed ..."

cp $INFILE $OUTFILE

# List of installation-specific patterns to delete from logfiles (this example for WP. Also excluding local IPaddress)
# Edit to suit your environment.
LOCALTRAFFIC=" wp-cron.php 10.10.0.2 wp-login.php \/wp-admin\/ "
echo
echo "-------- Removing local traffic ---------"
for TERM in $LOCALTRAFFIC; do
    TERMCOUNT=$( grep "$TERM" $OUTFILE | wc -l )
    echo $TERMCOUNT instances of $TERM removed >> $TMPFILE
    sed -i  "/$TERM/d" $OUTFILE
done
sort -nr $TMPFILE
rm $TMPFILE

# List of patterns to delete from logfiles, space separated
BOTLIST="ahrefs Baiduspider bingbot Cliqzbot cs.daum.net DomainCrawler DuckDuckGo Exabot Googlebot linkdexbot magpie-crawler MJ12bot msnbot OpenLinkProfiler.org opensiteexplorer pingdom rogerbot SemrushBot SeznamBot sogou.com\/docs tt-rss Wotbox YandexBot YandexImages ysearch\/slurp BLEXBot Flamingo_SearchEngine okhttp scalaj-http UptimeRobot YisouSpider proximic.com\/info\/spider "
echo
echo "------- Removing Bots ---------"
for TERM in $BOTLIST; do
    TERMCOUNT=$( grep "$TERM" $OUTFILE | wc -l )
    echo $TERMCOUNT instances of $TERM removed >> $TMPFILE
    sed -i  "/$TERM/d" $OUTFILE
done
sort -nr $TMPFILE
rm $TMPFILE

echo
echo "======Summary======="

#filestats before
PRELINES=$(cat $INFILE | wc -l )
PRESIZE=$( stat -c %s $INFILE )

#filestats after
POSTLINES=$(cat $OUTFILE | wc -l )
POSTSIZE=$( stat -c %s $OUTFILE )
PERCENT=$(awk "BEGIN { pc=100*${POSTLINES}/${PRELINES}; i=int(pc); print (pc-i<0.5)?i:i+1 }")

echo Original file $INFILE is $PRESIZE bytes and contains $PRELINES lines
echo Processed file $OUTFILE is $POSTSIZE bytes and contains $POSTLINES lines
echo Log reduced to $PERCENT percent of its original size.
echo Original file was untouched.

And here is a sample output.

~/temp $ ./log-squish.sh access.log.2017-09-03
Will copy access.log.2017-09-03 to ./access.log.2017-09-03.squish and perform all operations on the file copy. Press ENTER to proceed

-------- Removing local traffic ---------
5536 instances of wp-login.php removed
507 instances of \/wp-admin\/ removed
84 instances of wp-cron.php removed
0 instances of 10.10.0.2 removed

------- Removing Bots ---------
2769 instances of bingbot removed
2342 instances of Googlebot removed
2177 instances of sogou.com\/docs removed
1815 instances of MJ12bot removed
1651 instances of ahrefs removed
1016 instances of opensiteexplorer removed
578 instances of Baiduspider removed
447 instances of Flamingo_SearchEngine removed
357 instances of okhttp removed
295 instances of UptimeRobot removed
122 instances of scalaj-http removed
74 instances of YandexBot removed
60 instances of ysearch\/slurp removed
24 instances of YisouSpider removed
22 instances of magpie-crawler removed
9 instances of linkdexbot removed
7 instances of YandexImages removed
7 instances of SeznamBot removed
5 instances of rogerbot removed
2 instances of tt-rss removed
1 instances of SemrushBot removed
0 instances of Wotbox removed
0 instances of proximic.com\/info\/spider removed
0 instances of pingdom removed
0 instances of OpenLinkProfiler.org removed
0 instances of msnbot removed
0 instances of Exabot removed
0 instances of DuckDuckGo removed
0 instances of DomainCrawler removed
0 instances of cs.daum.net removed
0 instances of Cliqzbot removed
0 instances of BLEXBot removed

======Summary=======
Original file access.log.2017-09-03 is 19395785 bytes and contains 74872 lines
Processed file ./access.log.2017-09-03.squish is 15432796 bytes and contains 54965 lines
Log reduced to 73 percent of its original size.
Original file was untouched.

So you can see that around 20% of the traffic on here is crap. And now the log file is much easier to read.


More Control Over Logwatch Report Dates

I’ve been happily running Logwatch on several servers with the default ‘yesterday’ date range for several years. However, I needed to run it for a client over a larger date range to check out a problem. But the options available for logwatch are only ‘today’, ‘yesterday’ and ‘all’. Or so it told me. And even worse, the ‘yesterday’ option takes the previous calendar date and pulls out all the info for that date, so if you run your logwatch report at 4pm, you’re missing out on 16 hours’ worth of data! But it turns out logwatch is smarter than that …
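
Spoiler: the --range option will in fact take fairly free-form date expressions, which it hands to Date::Manip. These are just sketches of the sort of thing it accepts; run logwatch --range Help for the full list of examples.

# Report on a whole week rather than just yesterday
sudo logwatch --range 'between -7 days and -1 days' --detail High

# Or a specific calendar window
sudo logwatch --range 'between 4/23/2018 and 4/30/2018'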
