Tag Archives: comments

Safely delete spam comments across a large WP network

I’m currently working on a university WordPress network that’s been running for four or five years (an MU veteran!) and has almost 5000 blogs, most of which are defunct (because they’re from previous semesters). Akismet is activated across the network, so there’s not much of a public spam problem. However, even spam comments are stored in the database, and some of the blogs have tens of thousands of spam comments sitting in their tables. I’m going to implement a couple of tricks to keep this from happening in the future (a lightweight honeypot for non-logged-in users, and telling Akismet to auto-delete spam comments on old posts). But for now, I’ve got to clean up this mess, because the very large comments and commentmeta tables are causing resource issues.

I wrote a simple script that gradually cycles through all the blogs on the network and deletes comments that have been marked as spam by Akismet. Here it is, with some comments afterward:

Notes:

  • The number of blogs is hardcoded (4980)
  • The ‘qw_delete_in_progress’ key is a throttle, ensuring that only one of these routines is running at a time. You might call this the poor man’s poor man’s cron.
  • I’ve limited it to 10 comments per pageload, but you could change that if you wanted
  • Put it in an mu-plugins file. When it’s finished running (check the ‘qw_delete_next_blog’ flag in the wp_sitemeta table – it’s done if it’s greater than the total number of blogs on the system), be sure to remove it, or at least comment out the register_shutdown_function line.
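
For reference, a routine matching the notes above might look roughly like the following. This is a minimal sketch under stated assumptions, not the original script: the function name `qw_delete_spam_comments` and the two constants are hypothetical, and it assumes the ‘qw_delete_next_blog’ counter starts at blog 1 and that spam comments are the rows with `comment_approved = 'spam'`.

```php
<?php
// Hypothetical sketch of the routine described in the notes; names other than
// the two site-option keys are made up for illustration.
define( 'QW_TOTAL_BLOGS', 4980 );      // hardcoded number of blogs on the network
define( 'QW_COMMENTS_PER_LOAD', 10 );  // spam comments deleted per pageload

function qw_delete_spam_comments() {
    global $wpdb;

    // Throttle: do nothing if another pageload is already running the routine.
    if ( get_site_option( 'qw_delete_in_progress' ) ) {
        return;
    }

    // Assumption: the counter starts at blog 1 when the option is not yet set.
    $blog_id = (int) get_site_option( 'qw_delete_next_blog', 1 );
    if ( $blog_id > QW_TOTAL_BLOGS ) {
        return; // every blog has been processed
    }

    update_site_option( 'qw_delete_in_progress', 1 );

    switch_to_blog( $blog_id );
    $spam_ids = $wpdb->get_col( $wpdb->prepare(
        "SELECT comment_ID FROM {$wpdb->comments} WHERE comment_approved = 'spam' LIMIT %d",
        QW_COMMENTS_PER_LOAD
    ) );
    foreach ( $spam_ids as $comment_id ) {
        // Second argument true = delete permanently instead of moving to Trash.
        wp_delete_comment( (int) $comment_id, true );
    }
    restore_current_blog();

    // Advance to the next blog only when this one had fewer than a full batch left.
    if ( count( $spam_ids ) < QW_COMMENTS_PER_LOAD ) {
        update_site_option( 'qw_delete_next_blog', $blog_id + 1 );
    }

    delete_site_option( 'qw_delete_in_progress' );
}
register_shutdown_function( 'qw_delete_spam_comments' );
```

One caveat with a flag-based throttle like this: if a pageload dies partway through, the ‘qw_delete_in_progress’ row stays set and has to be cleared by hand before the routine will run again.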

Use at your own risk – I’m posting here primarily for my own records 🙂

New BuddyPress plugin: BP Include Non-Member Comments

I wrote a plugin this afternoon that solves a small but potentially annoying limitation of BuddyPress: its inability to show comments from non-members in the sitewide activity stream. In a streak of extreme creativity, I dubbed the plugin “BP Include Non-Member Comments”. Read more about it, and download it for your own use, here.

True cross-platform comment syncing with Disqus and WordPress

FeedWordPress works well if you want to syndicate content from various sources into a single WordPress blog. Syndicating comments is, of course, more difficult. I’m finishing up a job for a client who wanted real-time synced comments, and suggested that Disqus might do the trick. I quickly discovered that Disqus is clearly not made to do what I wanted it to do. But, being the cool guy that I am, I hacked something together that is more or less functional.

Here were the requirements: Comments on a blog post needed to be synchronized between the source blogs and the hub blog. Readers had to be able to comment in both places and have the comments sync. While I’d be using WordPress to create the hub blog, the source blogs would be hosted on various platforms: Tumblr, Typepad, Blogger, self-hosted WordPress. (The distributed requirement is especially important. If the blogs were all on the same installation of WPMU, the job would be trivial and would not require a third-party solution like Disqus.) Because bloggers would be coming from different platforms, I not only had to be able to accommodate those platforms, but I also had to make sure that the system would work with the platforms’ stock configuration. That is, since I (and, generally speaking, the bloggers) don’t have access to the platform code, all custom modifications need to happen at the hub blog.

I don’t particularly recommend that anyone try to replicate what I’ve done here. But hopefully it will point the way toward what might be a viable third-party system for true comment syncing.

The details

Here’s my strategy with regard to Disqus. If all the source blogs were registered to the same Disqus Comments account (i.e., corresponding to a single shortname), then they’d all have the same forum_key, which is to say they’d be accessible by the same API request. Thus the strategy is to make Disqus unable to distinguish between API calls from the source blogs (which are, recall, making stock API calls to Disqus) and API calls from the corresponding posts on the hub blog.

I installed the Disqus Comment System plugin for the WordPress hub blog and registered with the same credentials that would be given to the source blogs. When feeds started syndicating to the hub blog, however, I found that the comment section on the source post wasn’t matching the comment section on the hub post. The URL for each comment thread’s RSS feed showed me why: Disqus indexes a forum’s comment thread based on some post information that it gets from the client platform, and each platform was formatting the information in a different way.

First problem: The WordPress Disqus plugin uses a post variable called $thread_meta, which is set in disqus-comment-system/lib/api.php thus:
[code language="php"]$thread_meta = $post->ID . ' ' . $post->guid;[/code]
Disqus would then create a comment thread based on this string. The problem is that $post->ID is the post ID number for the hub blog, and has nothing to do with the source blog (which, depending on platform, does not include post IDs in its API request at all). So the source blog’s thread would be identified as test_post (for example) while the hub blog’s would be 34_test_post. I replaced the code above with
[code language="php"]$thread_meta = $post->guid;[/code]
which manages to stay pretty consistent across platforms. (NB: The same change has to be made on the source blog version of the Disqus plugin, if the source blog is running a self-hosted installation of WordPress.)
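
To see why dropping the post ID matters, here is a small standalone sketch. The two post objects, their URLs, and the function names are hypothetical stand-ins, and it assumes the syndicated copy on the hub keeps the source post’s guid while getting its own local ID.

```php
<?php
// Hypothetical stand-ins for post objects; only ID and guid matter here.
// Assumption: the hub's syndicated copy keeps the source post's guid.
$source_post = (object) array( 'ID' => 7,  'guid' => 'http://source.example.com/?p=7' );
$hub_post    = (object) array( 'ID' => 34, 'guid' => 'http://source.example.com/?p=7' );

// Stock behavior: the local post ID is part of the thread identifier.
function stock_thread_meta( $post ) {
    return $post->ID . ' ' . $post->guid;
}

// Modified behavior: the guid alone identifies the thread.
function guid_thread_meta( $post ) {
    return $post->guid;
}

// The stock identifiers differ because the local IDs differ...
var_dump( stock_thread_meta( $source_post ) === stock_thread_meta( $hub_post ) ); // bool(false)
// ...while the guid-only identifiers match, so both posts map to one thread.
var_dump( guid_thread_meta( $source_post ) === guid_thread_meta( $hub_post ) );   // bool(true)
```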

Second problem: Getting a stable and unique identifier for each post thread is only the first step. You also need to make sure that the identifier is concatenated correctly when the actual API request is made. Disqus comment sections work by loading a piece of JavaScript, built from an API request to disqus.com for the proper thread, that finds the comment section on the post page and replaces the native comment code with the code returned from disqus.com. But I found (again, by looking at the URL for the RSS feeds) that each platform was making the request a little bit differently. At the end of disqus-comment-system/comments.php, the stock WP plugin reads
[code language="html"]