Comment Spam strikes back

An illustration of a robot turning web pages into canned meat product. Generated using Bing AI Image Generator

Now that I’m blogging again, comment spam has returned to my blog posts.

Comment spam has always been a problem with blogs – ever since blogs first allowed comments, spam has followed. Despite the advent of the rel="nofollow" link attribute, automated bots still crawl web sites and submit comments with links in the hope that this will boost the linked sites’ rankings in search engines.

In the early days of blogging, blogs often appeared high in Google’s search engine results – by their very nature, they featured lots of links, were updated frequently, and the blogging tools of the time often produced simple HTML which was easily parsed by crawlers. So it was only natural that those wanting to manipulate search engine rankings would try to take advantage of this.

I’ve always used Akismet for spam protection, even before I switched to WordPress, and it does a pretty good job. Even so, I currently have all comments set to be manually approved by me, and last week a few got through Akismet that I had to manually junk.

Humans, or AI?

These five interested me because they were more than just the usual generic platitudes about this being a ‘great post’ and ‘taught me so much about this topic’. They were all questions about the topic of the blog post in question, with unique names. However, as they all came through together, and had the same link in them, it was clear that they were spam – advertising a university in Indonesia, as it happens.
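The giveaway described above (several comments under different names, all carrying the same link and arriving close together) can be sketched as a simple check. This is only an illustration of the heuristic, not how Akismet or WordPress works; the function name `find_link_clusters`, the comment dictionary keys, and the one-hour window and three-author threshold are all assumptions made for the example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Deliberately loose URL matcher; good enough for an illustration.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def find_link_clusters(comments, window=timedelta(hours=1), min_authors=3):
    """Return {url: [comments]} for links shared by several authors in a short time.

    Each comment is a dict with the hypothetical keys "author", "body"
    and "posted_at" (a datetime).
    """
    by_url = defaultdict(list)
    for comment in comments:
        # Use a set so a link repeated within one comment is only counted once.
        urls = {u.rstrip(".,;)") for u in URL_PATTERN.findall(comment["body"])}
        for url in urls:
            by_url[url].append(comment)

    clusters = {}
    for url, group in by_url.items():
        authors = {c["author"] for c in group}
        times = [c["posted_at"] for c in group]
        # Suspicious: many different names, one link, all within the window.
        if len(authors) >= min_authors and max(times) - min(times) <= window:
            clusters[url] = group
    return clusters
```

A check like this only flags candidates for a closer look; a genuinely popular link shared by real readers would trip it too, which is one reason to keep manual approval switched on.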

Had it not been for the prominent spam link and the fact they all came in together, I might not have picked up on them being spam. Either they were actually written by a human, or someone is harnessing an AI to write comment spam posts now. If it’s the latter, then I wonder how much that’s costing. As many will know already, AI requires a huge amount of processing power and whilst some services are offering free and low-cost tools, I can’t see this lasting much longer as the costs add up. But it could also just be someone being paid through a service like Amazon Mechanical Turk, even though such tasks are almost certainly against its terms of service.

I think I’m a little frustrated that comment spam is still a problem even after a few years’ break from blogging. But then email spam is a problem that we still haven’t got a fix for, despite tools like SPF, DKIM and DMARC. I’m guessing people still do it because, in some small way, it does work?

New theme, who dis?

Screenshots of the old and new themes for the blog, side by side

I’ve deployed a new theme on the blog. If you’re reading this in your feed reader, firstly, go you, because so few people do nowadays, but also, please click through and have a look.

The theme I’m using is GeneratePress, with mostly default settings. This replaces one of the default WordPress themes that I was using before.

Why the change? Mainly page bloat; whilst the default WordPress themes are very extensible, the output code includes shedloads of extra JavaScript, CSS and style tags which result in web pages which are bigger than they should be. Whilst I’m at no risk of exceeding the data transfer limits offered by my hosting company, it does affect the speed of the site, and not everyone has unlimited mobile data or a fast connection.
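One rough way to compare how much a theme adds to a page is to count the script and style elements in the HTML it outputs, alongside the size of the document itself. The sketch below uses only Python’s standard library; the names `BloatCounter` and `summarise_page` are made up for this example, and it only measures the HTML it is given, not the external files that HTML goes on to load.

```python
from html.parser import HTMLParser


class BloatCounter(HTMLParser):
    """Counts the elements that typically pad out a theme's output."""

    def __init__(self):
        super().__init__()
        self.counts = {
            "script": 0,             # <script> elements, inline or external
            "style": 0,              # <style> blocks
            "stylesheet_link": 0,    # <link rel="stylesheet">
            "inline_style_attr": 0,  # elements carrying a style="..." attribute
        }

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self.counts["script"] += 1
        elif tag == "style":
            self.counts["style"] += 1
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.counts["stylesheet_link"] += 1
        if "style" in attrs:
            self.counts["inline_style_attr"] += 1


def summarise_page(html):
    """Return the HTML size in bytes plus the counts gathered above."""
    counter = BloatCounter()
    counter.feed(html)
    counter.close()
    return {"bytes": len(html.encode("utf-8")), **counter.counts}
```

Running this over the saved source of the same post under two different themes gives a like-for-like comparison, though a browser’s developer tools will give a fuller picture of what actually gets downloaded.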

I learnt HTML at a time when it was the done thing to hand-code pages – indeed, back when I used Blogger and later Movable Type as my blogging tools, for the most part I used themes that I had written entirely myself. JavaScript was used very sparingly, and the HTML and CSS code was nice, clean and simple. So seeing the code soup that was being output by the default themes was off-putting.

I also think about this blog post by Terence Eden, ‘the unreasonable effectiveness of simple HTML’, where he gives an example of someone applying for housing benefit on a PlayStation Portable (PSP). This is presumably because it’s the only portable device with a web browser that she can use. But because the HTML on gov.uk is so clean and lightweight, the old, under-powered web browser on the PSP is still able to render it, and she’s able to get the information that she needs. A big, flashy web site oozing with various JavaScript frameworks, loads of tracking scripts and adverts everywhere just isn’t going to work on such an old device.

And then I saw this toot today:

I can't help but notice the new Apple laptops rate "Video Playback 22 hours, Web Browsing 15 hours" under battery life.

Congratulations web developers everywhere, it's now more computationally intense to render a webpage than video playback!

Brad L. (@reyjrar), 5 November 2023

Web pages are getting so full of cruft that rendering them takes more processing power than playing video.

So, that’s why I’m going with a lightweight theme. It makes the web site much more accessible to more people. GeneratePress seems to output lighter code that displays fast, and it offers a good balance between extensibility and speed. It won’t be for everyone, but it seems to work well for me.

The times, they are upgrading

An AI generated image of a superhero emerging from a server cabinet, generated using Microsoft's Bing AI Image Creator

Hello – if you can read this, then the server upgrade worked!

I’ve wiped the previous server image (yes, I remembered to do more than one type of backup this time), and installed a freshly upgraded version of Linux. This means it’s running on Debian 12 (codenamed ‘bookworm’), and version 12 of Sympl. Sympl is a set of tools for Debian that makes managing a web server remotely a little easier, and is forked from Symbiosis, which was originally developed by my hosting company Bytemark.

Going nuclear and starting from a fresh installation was for two reasons:

  1. The next version of WordPress, which will be 6.4, will have a minimum recommended PHP version of 8.1. This server was running PHP version 7.3, and whilst I’m sure future versions would work up to a point, it’s a good opportunity to upgrade.
  2. I’ve had a few issues with the previous installation. The FTP server software never seemed to work correctly, and the database (MariaDB) would lock up almost every time I posted a new blog post. Hopefully, this won’t happen anymore.

As this is a fresh WordPress installation, there may be a few things which don’t quite work yet. I’ve imported the existing blog posts and pages, and the theme is mostly the same, but I need to re-install the plugins and probably need to amend some settings. I’ll sort these issues out over the next few days.

WordPress in the Fediverse

A screenshot of the settings page for the ActivityPub plugin for WordPress

If I’ve set up everything correctly, then you should be able to subscribe to this blog in your favourite Fediverse app, such as Mastodon, by following @nrturner@neilturner.me.uk.

To do the same on your own WordPress site, you’ll need to install the ActivityPub plugin, and then it should just work, with your Fediverse username being @your-wordpress-username@your-domain.tld. If you’ve used a plugin to disable author archives, such as Yoast’s SEO plugin, you’ll need to re-enable them for this to work.

I found this guide particularly useful, as it links to Webfinger to test that you’ve set it up correctly.
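If you would rather check the WebFinger response yourself, the lookup is just an HTTPS request to a well-known path on your domain (as defined in RFC 7033). Here is a minimal sketch that builds that URL from a Fediverse handle and fetches the result. The function names `webfinger_url` and `fetch_webfinger` are made up for this example, and what comes back depends entirely on how your own site and plugin are set up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def webfinger_url(handle):
    """Turn '@user@domain' into the WebFinger lookup URL for that account."""
    user, _, domain = handle.strip().lstrip("@").partition("@")
    if not user or not domain:
        raise ValueError(f"Expected a handle like @user@domain, got {handle!r}")
    # The account goes in the 'resource' query parameter as an acct: URI.
    query = urlencode({"resource": f"acct:{user}@{domain}"})
    return f"https://{domain}/.well-known/webfinger?{query}"


def fetch_webfinger(handle, timeout=10):
    """Fetch and decode the WebFinger JSON document (makes a network request)."""
    with urlopen(webfinger_url(handle), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_webfinger("@your-wordpress-username@your-domain.tld")` with your own handle should return a JSON document describing the account if everything is wired up; an HTTP error suggests that the plugin, or the author archives it relies on, is not reachable.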

(Update: since this post was written almost 12 months ago, the ActivityPub plugin has been formally adopted by Automattic and so enjoys wider support)

What’s this? A blog post?

Well, hello. This is my first blog post in almost four years.

I last wrote a post on here in September 2018, and then took an unplanned break from blogging. This was exacerbated at the end of 2018, when I attempted to upgrade the server that this web site runs on, and ended up wiping everything. And I mean everything, including the backups that I thought I’d saved elsewhere but hadn’t.

Just like that, 16 and a half years of blog posts were gone, along with all the comments. Now, it’s possible that I could have re-built most of the blog posts, using things like the Web Archive and help from others, but between working full-time and being a parent, I just didn’t have the time or the inclination to do so.

Furthermore, I was beginning to become uncomfortable with how much I had shared about my life over the years. Back when I started the blog, aged 17, I had a tendency to over-share. Over time I reined that in; I was in a relationship with someone between 2005 and 2009 where I agreed not to share her real name on here, and though we’ve both moved on I’m keeping that commitment – not least because we’re still in touch and actually met up recently.

But I also wanted to rein in how much I talk about my child, who is now six. I’m happy to share their age, but I’m afraid you won’t be knowing their name or seeing recent photos, and I’m even keeping their gender off here now too. It’s about consent and privacy – as a parent, I want to protect my child, and they’re too young to really know what a blog is, never mind have lots of information about their life made public.

I am hoping to get back into the habit of blogging regularly, though not on a daily basis as I had aimed for in the past. Initially I’m aiming for twice a week, as there are four years of news to catch up on, but my minimum aspiration is for one new blog post per week.

Why now? Well, I’ve wanted to get back into writing for pleasure again. I’ve written a few things on Medium, but it feels like writing for a magazine; I’d rather stick to somewhere more personal that’s just about and run by me. I feel like I have things to say now, and hopefully the time to put those things into written words.

If you’re an old-time reader of my blog, welcome back, and I hope that this wasn’t too much of a surprise when it popped up in your RSS reader. And if you’re a new reader, hello. You can read my very dry ‘about me’ page which is more focussed on my work, but I hope you’ll stick around and will get to know me better.