Using Human Queue Worker to process comments

Some time ago, I released Human Queue Worker, a module that takes the concept of the Drupal Queue system, but where the processing of the items is done by human users rather than an automated process. I say 'takes the concept'; it in fact uses the Drupal Queue to create and claim queue items, but instead of declaring your queue with hook_cron_queue_info(), you declare it to Human Queue Worker as a queue that humans will be working on.

This was written for my current project and for a fairly specific need, and I didn't imagine many sites would be using it. However, it has an obvious and popular application: approving comments. I always figured it would be nice if someone wrote a little module to define a comment processing human queue.

Well, that someone is me, and the time is now. You see, I'm an idiot: when I set up this new blog site of mine, I totally forgot to set up a CAPTCHA, and then when I added Mollon, I didn't set it up properly. So this site has a few hundred spammy comments that I need to delete.

The problem is that comment management takes time. Unless there are some magical area of the core UI I've completely missed, I can either visit each node and delete them one by one, or use the comment admin form. There, I can mass-delete the ones with obvious spammy titles, but all the others will still need individual inspection.

The Human Queue UI simplifies this hugely. There's just one page for the queue. When you go to that page, you're presented with an item to process. In the case of comment approval, that's the comment itself, plus the parent node and parent comment to give you some context. To process the comment, click one of two buttons: 'Publish' or 'Delete'. The comment is dealt with, and the form reloads, with a brand new comment for you to process. Which means that the only clicking you do is the action buttons: Publish; Delete; Publish; Delete. (Though with the amount of spam on my site, it's probably Delete; Delete; Delete, like the Cybermen.)

I've not timed it, but I reckon I can probably go at quite a rate. And that's with just one of me: the core Queue system guarantees that only one worker can claim an item at any one time, and that applies to human workers too. So if another user were to work the queue too, by going to the same page, they would be getting shown different comments to work on, and we'd work through the comments at twice the rate.

Now I just need to find a compliant friend and make them into my worker drone. If you're interested, please don't post a comment!

The ripple effect

Hello! I'm back. I've not made any blog posts in over a year and a half due to the site where my blog was before,, closing down. And while it took me some time to get round to writing a Migrate script to import my posts from the old site's database, it was actually getting round to setting up this new domain that took the longest.

So what have I been doing all this time? Especially as I still don't have a single Drupal 7 site out there to my name? Well, these days I work on a humongous web application which has kept me busy for the last 18 months; it's a large Drupal site (we hit a million of one of its several custom entity types recently), but to the general public it's just a login page. I may talk more about the development challenges in future posts.

Prior to that, I was building what would have been one of the earliest big Drupal Commerce sites to launch... except that very shortly before launch in October 2012, the whole project got canned.

That was obviously rather demoralizing, as it had been a year in the building. However, it's interesting to look back and see just how much a large-scale project can contribute back to the Drupal community. The following is a list of the contrib modules that were created as part of the development of the project. Many of them are pretty simple; many of them don't get that much use. But they get some, and so it's interesting to see how much code a project can share if development is steered towards reusability, and also how a single project, even one that never sees the light of day, can have effects that ripple out and benefit many others.

Reference field option limit

This was built to improve the UI for selecting terms from large taxonomies, where one vocabulary groups another. For example, you could have a huge vocabulary of cities, and a smaller one of cities. Each city term has a term reference to the country it belongs to (the glee of the early days of Drupal 7: all the fields on all the things!), and the entity that you want to tag (in my case a product) has term reference fields to both the country and the city. The way this module works is that the city terms shown in one widget are filtered each time you change the selected country term in the other field's widget. So you select 'France' and the city field widget updates with AJAX to show only cities in France.

It sounds simple to explain (I hope!), but involves a fair amount of work under the hood in the form alteration to pull it off. It's one of my most popular modules, and has received a fair number of patches from other users.

Devel Contrib and Field Tools

These two are developer modules. Devel Contrib arose from my constantly needing to dsm() the data from info hooks such as hook_views_data() and hook_entity_property_info(). When the thing you're developing keeps going wrong, one of first things to do is poke around to check you've properly declared everything to the APIs you're using. I started out just having this as a couple of menu items declared in a custom module, but pretty soon I added more, and I wanted it on other development sites, so the obvious thing to do was clean it up and release it.

Field Tools was something I wrote because the Commerce site in question had dozens of different product types and corresponding node types, with lots of common fields, and I really didn't want to spend hours clicking through the field admin UI to set them up. This may seem like a case of condiment-passing, but I'm sure it's saved me hours of tedium: create the fields once, and then quickly clone them to any other entity types and bundles.

Field Instance Cardinality

A very small tweak module, this lets you override a particular instance of a multi-valued field and set it to be single-valued only. It's a hook_form_alter() hack made reusable by adding some field admin settings, and a good example of how site customization can often be done as modules rather than alter hacks.

Field Value Link Formatter

Very simple module: your taxonomy term fields (or other entity references) should link to a view with an entity ID argument. I guess the more traditional way of doing this is with a lot of path aliases for your taxonomy terms. I think we may also have used this to create Profile-style lists of related entities.

Commerce Shipping Weight Tariff

This was created to deal with the complex business rules we had for shipping costs. Written very near to our planned launch date, it's an example of what I call the 'skimp on the admin UI' contrib module: hardcode the settings, but do so in a way that's cleanly separated from the rest of the module, so that later on an admin UI can be added, perhaps contributed as a patch. Though I think that's yet to materialize for this one.

Views Grouped Table

This was created to help keep track of how the product entities and product display nodes all related to one another.

Views Dependent Filters

Allows the presence of exposed filters on a view to be controlled by values in another exposed filter. This was intended to handle filtering Views of products, though we later switched to using Views with Solr.

Taxonomy add previous

A little bit of UX sugar for when you're adding lots of taxonomy terms that are very similar. In our case, this was terms representing sizes. On creating a new term, this takes you straight back to the form for adding another term, and prefills the field values from the one you just added.

Flag Expire

Allows flags to have either an expiry date, or an expiry period. This was going to be used to feature products, or mark them as new, or discounted. (For new product, you could just automatically mark all products that were created in the last x days, say. But as I recall, the client didn't want ALL new products marked, just selected ones. The way clients do.)

As well as new contrib modules, the project resulted in work on existing contrib modules, in particular Flag, Data, Views Hacks, and Commerce Delivery.

Corralling permissions into a grid

I've just released Permissions Grid. It does what the name suggests: it presents related permissions in a grid, rather than the usual long list.

How are permissions structured into a grid? Well, only the ones that form natural groups are included: every set of permissions of the form 'create foo, edit foo, delete foo, create bar, edit bar, delete bar' is turned into a matrix of checkboxes with the verbs 'create, edit, delete' along the top, and the objects 'foo, bar' along the top. When modules such as node, taxonomy, and commerce define related permissions for nodes, vocabularies, and products respectively, that gives you something like this:

This gives an easy to grasp overview of what a role can do with different objects on the site: which node types can this role create? which can they edit, or delete? which product types can they edit? which vocabularies can they create terms in?

If this sounds and looks vaguely familiar, that's probably because this module has an ancestor: my Drupal 6 module node permissions grid module, which I wrote back when a site's content types started to become too numerous to easily make sense of. That operated only on node types, and like a great many contrib modules porting to Drupal 7, it's had to 'drop the node' and generalize. But in fact nothing restricts Permissions Grid to entities: all it cares about is permissions.

Structured permissions are declared to the module in an info hook, and each module may declare multiple sets of permissions. This allows for the fact that some modules add further vocabulary-related permissions which do not have the same pattern, and that commerce has entity permissions in both singular and plural form.

Are there any groups of permissions I've missed, whether in core or contrib? Post a feature request, or better still, take a look at the hook implementations already there and file a patch.

On rules versus hooks, or, abstraction shock

I need to add a bit of business logic to my Commerce site: a boolean field on product nodes marks that the corresponding products can't be delivered outside the UK.

And I know the way to do this in Commerce is to create a rule: react to the cart completing checkout, iterate over line items, check the corresponding products, and block the completion if the field in question is set.

Rules is great: with Rules, site builders can change site functionality and cause it to react to events. When non-techy people ask if my job involves designing websites, to put them right I say, 'I make websites go "bing!"'; and now, site builders can make them go 'bing!' too.

But I have a confession: I'm reluctant about using Rules. It's partly that I find the UI confusing, and it feels time-consuming to test them, but deeper than that I think it's just that I feel too far removed from the actual thing I'm trying to make.

And that makes me wonder: am I becoming a Drupal dinosaur?

Because I can imagine when Views first came along, developers who were used to writing their own query and formatting the result themselves, looking at the Views UI and thinking 'I don't feel in control of my lists of stuff any more'.

Or before CCK, developers wrote exactly the form elements they needed in the node form and saved it themselves in the database. I still sometimes speak to non-Drupal developers who want to be able to dump data into the node table directly (or pull it out) and when I tell them they can't, because the data that actually makes a node is spread out over the node table, the node_revision table, and then a multitude of field tables. And their feeling of disconnection at not being able to get their hands on 'the node' as a solid lump of database stuff must surely be akin to what I feel with Rules. And I'm going to call this feeling 'abstraction shock'.

I want to write a hook. I want to write the code for it, for it to feel like a solid thing. I know that my rule can (indeed, should) be exported to code, but I want code that I can read and see exactly what it says, rather than code that Rules will consume and understand. And most of all, I want to be able to put in debug statements to understand what I'm getting as I write it, and after I've written it when it's going wrong or when the site functionality has to change.

If that makes me a dinosaur, save me a seat next to the brontosaurus.

Git tricks: repatching for an issue branch

My workflow for making patches is to use a feature branch for a single issue. Whether you're a contributor or a maintainer it lets you advance the fixing of the problem in small increments, and safely experiment knowing you can roll back.

But where it goes wrong is when your patch is superseded by a newer one in the issue queue, and you want to work on it some more. How do you update your branch for the ongoing work? As ever, with git there's a way.

Let's start with the basics first: you're making a feature branch to work on an issue. I tend to follow the naming pattern '123456-fix-all-the-bugs', but for this example I'll call it 'issue'.

// Make a new branch and switch to it.
$ git co -b issue
// Make lots of commits.
// Ready to make a patch:
$ git diff > 123456.project.issue.patch

(Note that if you can make your patch to show all your commits one by one, which can sometimes aid in making it clear what you're changing, but that's for another day.)

You've now got a patch which you're uploading to the issue queue, and your tree looks something like this:

* [issue] Last commit, ready to roll a patch!
* Fixed the foobar.
* Added a bizbax.
* [master]

Now someone else comes along to the issue queue, reviews your patch, and posts a new patch of their own. You in turn look at patch 2, and while it's an improvement, you think it needs still more work.

The problem is how to apply the patch to your repository. It won't apply to the tip of the issue branch, and if you checkout master, you can't get back to your issue branch. You can of course just discard your original issue branch, and create a branch issue2 for patch 2.

Or you can do this:

// Start on the issue branch.
// Stash any work in progress!
$ git stash
// Checkout just the *files* of master, while keeping the HEAD pointer on the
// issue branch.
$ git checkout master -- .
// This puts the files from master into the working tree, but keeps the index
// on the issue branch. In simpler terms, the reverse of patch 1 will appear
// staged (as git believes that your files *ought* to look like patch 1, but
// actually look like master).
// We want the index clean, so unstage everything:
$ git reset HEAD .
// Now apply the new patch.
$ patch -p1 < patch-2.patch
// Now commit this as patch 2.
// Remember to stash pop when you're done!

Because the working tree files (that is, the actual files on your system) look like the master branch, the patch applies cleanly. But because git still believes its on the tip of the issue branch, the commit you make goes on the tip of that branch, and the diff it records is effectively the interdiff between your patch-1 and the other contributor's patch-2. Your tree looks like this:

* [issue] Applied patch 2 from Ada Lovelace.
* Last commit, ready to roll a patch!
* Fixed the foobar.
* Added a bizbax.
* [master]

Result: you can now do more work on this branch, and make more commits, and when you're ready, diff against master to make patch-3, ready to upload to the issue queue.


Subscribe to Joachim's Drupal blog