Drupal Planet Posts

Making builder code look like output code

By joachim, Tue, 26/06/2018 - 21:30

One of the big challenges with updating Drupal Code Builder for Drupal 8 has been the sheer variety of code to be output. On earlier versions of Drupal, it was just about hooks, and all that needed to be done was to take the API documentation code and replace 'hook_' with the module name. There were info files too, and Drupal 7 added the placing of hooks into different .inc files, but compared to this, Drupal 8 has things like plugin annotations, fluent method calls for content entity baseFieldDefinitions(), FormAPI arrays, not to mention PHP class methods, and more.

But one of the things I enjoy about working on DCB is that I am free to experiment with different ideas, much more so than with work on core or even contrib. It is its own system, without any need to work with what a framework supplies, and it has no need to be extensible. So I can try a new way of doing things as often as I want, and clean up when I've had time to figure out which way works best.

For example, up until recently, the code for a field definition in baseFieldDefinitions() was getting generated in three different ways.

First, the old-fashioned way of doing it line by line, then concatenating the array with a "\n" to make the final code. This is the way most of the old code in DCB was done, but with things that need handling of terminal commas or semicolons, and nesting indents and so on, it was starting to get really clunky.

So then I tried writing something loosely inspired by Drupal's RenderAPI. Because that's a nice big hammer that seems to fit a lot of nails: make a big array of data, chuck your stuff into it, then hand it over to something that makes the output. Except, not so good. Writing the code to make the right sort of array was fiddly. The array of data needed to combine actual data and metadata (such as the class of an annotation), which added levels to the nesting.

Then I hit on an idea: baseFieldDefinitions() fields are a fluent interface, like this:

$fields['changed'] = BaseFieldDefinition::create('changed')
  ->setLabel(t('Changed'))
  ->setDescription(t('The time that the node was last edited.'))
  ->setRevisionable(TRUE)
  ->setTranslatable(TRUE);

What if the code that builds this could be the same, to the point where you could just copy-paste code from, say, the node entity class, and make a few tweaks? Creating the code in DCB would be much simpler, and having the DCB code look like the output code would make debugging easier too.

Using a class with the magic __call() method lets us have just that: a renderer object that treats a method call as some information about code to render. Here's what the builder code for the base field definition code looks like now:

$changed_field_calls = new FluentMethodCall;
$changed_field_calls
  ->setLabel(FluentMethodCall::t('Changed'))
  ->setDescription(FluentMethodCall::t('The time that the entity was last edited.'));
if ($use_revisionable) {
  $changed_field_calls->setRevisionable(TRUE);
}
if ($use_translatable) {
  $changed_field_calls->setTranslatable(TRUE);
}
$method_body = array_merge($method_body, $changed_field_calls->getCodeLines());

It's not yet perfect, as the first line isn't done by this, and the handling of the t() calls could do with some polish; probably by creating a separate class called something like FunctionCall, such that FunctionCall::somefunction() returns the code for a call to somefunction().

But the efficiency and elegance of this approach has led me to devise a new principle for DCB: builder code should look as much as possible like that code that it outputs.

So applying this approach to outputting annotations, the code now looks like this:

$annotation = ClassAnnotation::ContentEntityType([
  'id' => 'cat',
  'label' => ClassAnnotation::Translation("Cat"),
  'label_count' => ClassAnnotation::PluralTranslation([
    'singular' => "@count content item",
    'plural' => "@count content items",
  ]),
]);
$annotation_lines = $annotation->render();

Magic methods used there as well, this time for static calls. The similarity to the output code isn't as good, as annotations aren't PHP code, but it's still close enough that you can copy the code you want to output, make a few simple changes, and you have the builder code.

This work has embodied another principle that I've come to follow: complexity and ugliness should be pushed down, hidden, and encapsulated. Here, the ClassAnnotation and FluentMethodCall have to do fiddly stuff like quoting string values, recurse into nested arrays. They have to handle special cases, like the last line of a fluent call has a semicolon and the last line of an annotation has no comma. All of that is hidden from the code that uses them. That can get on with doing the interesting bits.

The quick and dirty debug module

By joachim, Tue, 15/05/2018 - 06:28

There's a great module called the debug module. I'd give you the link… but it doesn't exist. Or rather, it's not a module you download. It's a module you write yourself, and write again, over and over again.

Do you ever want to inspect the result of a method call, or the data you get back from a service, the result of a query, or the result of some other procedure, without having to wade through the steps in the UI, submit forms, and so on?

This is where the debug module comes in. It's just a single page which outputs whatever code you happen to want to poke around with at the time. On Drupal 8, that page is made with:

an info.yml file
a routing file
a file containing the route's callback. You could use a controller class for this, but it's easier to have the callback just be a plain old function in the module file, as there's no need to drill down a folder structure in a text editor to reach it.

(You could quickly whip this up with Module Builder!)

Here's what my router file looks like:

joachim_debug:
  path: '/joachim-debug'
  defaults:
    _controller: 'joachim_debug_page'
  options:
    _admin_route: TRUE
  requirements:
    _access: 'TRUE'

My debug module is called 'joachim_debug'; you might want to call yours something else. Here you can see we're granting access unconditionally, so that whichever user I happen to be logged in as (or none) can see the page. That's of course completely insecure, especially as we're going to output all sorts of internals. But this module is only meant to be run on your local environment and you should on no account commit it to your repository.

I don't want to worry about access, and I want the admin theme so the site theme doesn't get in the way of debug output or affect performance.

If you sometimes want to see themed output, you can add a second route with a different path, which serves up the same content but without the admin theme option:

joachim_debug_theme:
  path: '/joachim-debug-theme'
  defaults:
    _controller: 'joachim_debug_page'
  requirements:
    _access: 'TRUE'

The module file starts off looking like this:

opcache_reset();

function joachim_debug_page() {
  $build = [
    '#markup' => “aaaaarrrgh!!!!”,
  ];

  /*
  // ============================ TEMPLATE


  return $build;
  */

  return $build;
}

The commented-out section is there for me to quickly copy and paste a new section of code anytime I want to do something different. I always leave the old code in below the return, just in case I want to go back to it later on, or copy-paste snippets from it.

Back in the Drupal 6 and 7 days, the return of the callback function was merely a string. On Drupal 8, it has to be a proper render array. The return text used to be 'It's going wrong!' but these days it's the more expressive 'aaaaarrrgh'. Most of the time, the output I want will be the result of dsm() call, so the $build is there just so Drupal's routing system doesn't complain about a route callback not returning anything.

Here are some examples of the sort of code I might have in here.

  // ============================ Route provider
  $route_provider = \Drupal::service('router.route_provider');
  
  $path = 'node/%/edit';
  $rs = $route_provider->getRoutesByPattern($path);
  dsm($rs);

  return $build;

Here I wanted to see the what the route provider service returns. (I have no idea why, this is just something I found in the very long list of old code in my module's menu callback, pushed down by newer stuff.)

  // ============================ order receipt email
  $order = entity_load('commerce_order', 3);
  
  $build = [
    '#theme' => 'commerce_order_receipt',
    '#order_entity' => $order,
    '#totals' => \Drupal::service('commerce_order.order_total_summary')->buildTotals($order),
  ];

  return $build;

I wanted to work with the order receipt emails that Commerce sends. But I don't want to have to make a purchase, complete and order, and then look in the mail logger just to see the email! But this is quicker: all I have to do is load up my debug module's page (mine is at the path 'joachim-debug', which is easy to remember for me; you might want to have yours somewhere else), and vavoom, there's the rendered email. I can tweak the template, change the order, and just reload the page to see the effect.

As you can see, it's quick and simple. There's no safety checks, so if you ever put code here that does something (such as an entity_delete(), it's useful for deleting entities in bulk quickly), be sure to comment out the code once you're done with it, or your next reload might blow up! And of course, it's only ever to be used on your local environment; never on shared development sites, and certainly never on production!

I once read something about how a crucial piece of functionality required for programming, and more specifically, for ease of learning to program with a language or a framework, is being able to see and understand the outcomes of the code you are writing. In Drupal 8 more than ever, being able to understand the systems you're working with is vital. There are tools such as debuggers and the Devel and Devel Contrib modules' information pages, but sometimes quick and dirty does the job too.

Regenerating plugin dependency injection with Module Builder

By joachim, Wed, 21/02/2018 - 21:52

Dependency injection is a pattern that adds a lot of boilerplate code, but Drupal Code Builder makes it easy to add injected services to plugins, forms, and service classes.

Now that the Drupal 8 version of Module Builder (the Drupal front-end to the Drupal Code Builder library) uses an autocomplete for service names in the edit form, adding injected services is even easier, and any of the hundreds of services in your site’s codebase (443 on my local sandbox Drupal 8 site!) can be injected.

I often used this when I want to add a service to an existing plugin: re-generate the code, and copy-paste the new code I need.

This is an area in which Module Builder now outshines its Drush counterpart, because unlike the Drush front end for Drupal Code Builder, which generates code with input parameters every time, Module Builder lets you save your settings for the generated module (as a config entity).

So you can return to the plugin you generated to start with, add an extra service to it, and generate the code again. You can copy and paste, or have Module Builder write the file and then use git to revert custom code it’s removed. (The ability to insert generated code into existing files is on my list of desirable features, but is realistically a long way off, as it would be rather complex, a require the addition of a code parsing library.)

But why stop at generating code for your own modules? I recently filed an issue on Search API, suggesting that its plugins could do with tweaking to follow the standard core pattern of a static factory method and constructor, rather than rely on setters for injection. It’s not a complex change, but a lot of code churn. Then it occurred to me: Drupal Code Builder can generate that boilerplate code: simply create a module in Module Builder called ‘search_api’, and then add a plugin with the name of one that is already in Search API, and then set its injected services to the services the real plugin needs.

Drupal Code Builder already knows how to build a Search API plugin: its code analysis detects the right plugin base class and annotation to use, and also any parameters that the constructor method should pass up to the base class.

So it’s pretty quick to copy the plugin name and service names from Search API’s plugin class to the form in Module Builder, and then save and generate the code, and then copy the generated factory methods back to Search API to make a patch.

I’m now rather glad I decided to use config entities for generated entities. Originally, I did that just because it was a quick and convenient way to get storage for serialized data (and since then I discovered in other work that map fields are broken in D8 so I’m very glad I didn’t try to make then content entities!). But the ability to save the generating settings for a module, and then return to it to add to them has proved very useful.

Triggering events on the fly

By joachim, Fri, 12/01/2018 - 21:45

As far as I know, there's nothing (yet) for triggering an arbitrary event. The complication is that every event uses a unique event class, whose constructor requires specific things passing, such as entities pertaining to the event.

Today I wanted to test the emails that Commerce sends when an order completes, and to avoid having to keep buying a product and sending it through checkout, I figured I'd mock the event object with Prophecy, mocking the methods that the OrderReceiptSubscriber calls (this is the class that does the sending of the order emails). Prophecy is a unit testing tool, but its objects can be created outside of PHPUnit quite easily.

Here's my quick code:

  $order = entity_load('commerce_order', ORDER_ID);
 
  $prophet = new \Prophecy\Prophet;
  $event = $prophet->prophesize('Drupal\state_machine\Event\WorkflowTransitionEvent');

  $event->getEntity()->willReturn($order);
 
  $subscriber = \Drupal::service('commerce_order.order_receipt_subscriber');
 
  $subscriber->sendOrderReceipt($event->reveal());

Could some sort of generic tool be created for triggering any event in Drupal? Perhaps. We could use reflection to detect the methods on the event class, but at some point we need some real data for the event listeners to do something with. Here, I needed to load a specific order entity and to know which method on the event class returns it. For another event, I'd need some completely different entities and different methods.

We could maybe detect the type that the event method return (by sniffing in the docblock... once we go full PHP 7, we could use reflection on the return type), and the present an admin UI that shows a form element for each method, allowing you to enter an entity ID or a scalar value.

Still, you'd need to look at the code you want to run, the event listener, to know which of those you'd actually want to fill in.

Would it same more time than cobbling together code like the above? Only if you multiply it by a sufficiently large number of developers, as is the case with many developer tools.

It's the sort of idea I might have tinkered with back in the days when I had time. As it is, I'm just going to throw this idea out in the open.

Drupal Code Builder unit testing: bringing in the heavy stuff

By joachim, Mon, 13/11/2017 - 21:49

I started adding unit tests to Drupal Code Builder about 3 years ago, and I’ve been gradually expanding on them ever since. It’s made refactoring the code a pleasant rather than stressful experience.

However, while all generator types are covered, the level of detail the tests go into isn’t that deep. Back when I wrote the tests, they mostly needed to check for hook implementations that were generated, and so quick and dirty regexes on the generated code did the job: match 'mymodule_form_alter' in the generated code, basically. I gradually extended those to cover things like class declarations and methods, but that approach is very much cracking at the seams.

So it’s time to switch to something more powerful, and more suited to the task.

I’ve already removed my frankly hideous attempt at verifying that generated code is correctly-formed PHP, replacing it with a call to PHP’s own code linter. My own code was running the generated PHP code files through eval() (yes, I know!) to check nothing crashed, which was quick and worked but only up to a point: tests couldn’t create the same function twice, as eval()ing code that contains a function declaration brings it into the global namespace, and it didn’t work at all for classes where while tests were being run, as the parent classes in Drupal core or contrib aren't present.

It's already proved worthwhile, as once I'd converted the tests, I found an error in the generated code: a stray quote mark in base field definitions for a content entity, which my approach wasn't picking up, and never would have.

The second phase is going to be to use PHPCS and Drupal Coder to check that generated code follows Drupal Coding Standards. I'm currently getting that to work in my testing base class, although it might be a while before I push it, as I suspect it's going to complain about quite a few nipicks in the generated code that I'll then have to spend some time fixing.

The third phase (this is a 3-phrase programme, all the best ones are) is going to be to look into using PHP-Parser to check for the presence of functions and methods in the code, rather than my regex-based approach. This is going to allow for much more thorough checking of the generated code, with things such as method order in the code, service injection, and more.

After that, it'll be back to some more refactoring and code clean-up, and then some more code generators! I have a few ideas of what else Drupal Code Builder could generate, but more ideas are welcome in the issue queue on github.

Brief update on Drupal Code Builder

By joachim, Fri, 15/09/2017 - 20:31

I've completely revamped the Drush commands for Drupal Code Builder:

First, they're now in their own project on Github
Second, I've rewritten them completely for Drush 9, completely interactive.
Third, they are now geared towards adding to existing modules, rather than generating a module as a single shot. That approach made sense in the days of Drupal 6 when it was just hook implementations, but I increasingly find I want to add a plugin, a service, a form, to a module I've already started.

The downside is that installing these is rather tricky at the moment due to some current limitations in Drush 9 beta; see details in the project README, which has full instructions for workarounds.

Now that's out of the way, I'm pushing on with some new generators for the Drupal Code Builder library itself. On my list is:

plugin types (as in the plugin manager service, base class and interface, and declaration for Plugins module)
entity type
entity type handlers
your suggestions in the issue queue...

And of course more unit tests and refactoring of the codebase.

Dorgflow: a tool to automate your drupal.org patch workflow

By joachim, Wed, 22/02/2017 - 23:14

I’ve written previously about git workflow for working on drupal.org patches, and about how we don’t necessarily need to move to a github-style system on drupal.org, we just maybe need better tools for our existing workflow. It’s true that much of it is repetitive, but then repetitive tasks are ripe for automation. In the two years since I released Dorgpatch, a shell script that handles the making of patches for drupal.org issues, I’ve been thinking of how much more of the drupal.org patch workflow could be automated.

Now, I have released a new script, Dorgflow, and the answer is just about everything. The only thing that Dorgflow doesn’t automate is uploading the patch to drupal.org (and that’s because drupal.org’s REST API is read-only). Oh, and writing the code to actually fix bugs or create new features. You still have to do that yourself, along with your cup of coffee.

So assuming you’ve made your own hot beverage of choice, how does Dorgflow work?

Simply! To start with, you need to have an up to date git clone of the project you want to work on, be it Drupal core or a contrib project.

To start work on an issue, just do:

$ dorgflow https://www.drupal.org/node/123456

You can copy and paste the URL from your browser. It doesn’t matter if it has an anchor link on the end, so if you followed a link from your issue tracker and it has ‘#new’ at the end, or clicked down to a comment and it has ‘#comment-1234’ that’s all fine.

The first thing this comment does it make a new git branch for you, using the issue number and the name. It then also downloads and applies all the patch files from the issue node, and makes a commit for each one. Your local git now shows you the history of the work on the issue. (Note though that if a patch no longer applies against the main branch, then it’s skipped, and if a patch has been set to not be displayed on the issue’s file list, then it’s skipped too.)

Let’s see how this works with an actual issue. Today I wanted to review the patch on an issue for Token module. The issue URL is https://www.drupal.org/node/2782605. So I did:

$ dorgflow https://www.drupal.org/node/2782605

That got me a git history like this:

  * 6d07524 (2782605-Move-list-of-available-tokens-from-Help-to-Reports) Patch from Drupal.org. Comment: 35; URL: https://www.drupal.org/node/2782605#comment-11934790; file: token-move-list-of-available-tokens-2782605-34.patch; fid 5784728. Automatic commit by dorgflow.
 * 6f8f6e0 Patch from Drupal.org. Comment: 15; URL: https://www.drupal.org/node/2782605#comment-11666939; file: 2782605-13.patch; fid 5710235. Automatic commit by dorgflow.
 /
* a3b68cc (8.x-1.x) Issue #2833328 by Berdir: Handle bubbleable metadata for block title token replacements
* [older commits…]

What we can see here is:

Git is now on a feature branch, called ‘2782605-Move-list-of-available-tokens-from-Help-to-Reports’. The first part is the issue number, and the rest is from the title of the issue node on drupal.org.
Two patches were found on the issue, and a commit was made for each one. Each patch’s commit message gives the comment index where the patch was posted, the URL to the comment, the patch filename, and the patch file entity ID (these last two are less interesting, but are used by Dorgflow when you update a feature branch with newer patches from an issue).

The commit for patch 35 will obviously only show the difference between it and patch 15, an interdiff effectively. To see what the patch actually contains, take a diff from the master branch, 8.x-1.x.

(As an aside, the trick to applying a patch that’s against 8.x-1.x to a feature branch that already has commit for a patch is that there is a way to check out files from any git commit while still keeping git’s HEAD on the current branch. So the patch applies, because the files look like 8.x-1.x, but when you make a commit, you’re on the feature branch. Details are on this Stack Overflow question.)

At this point, the feature branch is ready for work. You can make as many commits as you want. (You can rename the branch if you like, provided the ‘2782605-’ part stays at the beginning.) To make your own patch with your work, just run the Dorgflow script without any argument:

$ dorgflow

The script detects the current branch, and from that, the issue number, and then fetches the issue node from drupal.org to get the number of the next comment to use in the patch filename. All you now have to do is upload the patch, and post a comment explaining your changes.

Alternatively, if you’re a maintainer for the project, and the latest patch is ready to be committed, you can do the following to put git into a state where the patch is applied to the main development branch:

$ dorgflow commit

At that point, you just need to obtain the git commit command from the issue node. (Remember the drupal standard git message format, and to check the attribution for the work on the issue is correct!)

What if you’ve previously reviewed a patch, and now there’s a new one? Dorgflow can download new patches with this command:

$ dorgflow update

This compares your feature branch to the issue node’s patches, and any patches you don’t yet have get new commits.

If you’ve made commits for your own work as well, then effectively there’s a fork in play, as your development in your commits and the other person’s patch are divergent lines of development. Appropriately, Dorgflow creates a separate branch. Your commits are moved onto this branch, while the feature branch is rewound to the last patch that was already there, and then has the new patches applied to it, so that it now reflects work on the issue. It’s then up to you to do a git merge of these two branches in order to combine the two lines of development back into one.

Dorgflow is still being developed. There are a few ideas for further features in the issue queue on github (not to mention a couple of bugs for some of the various possible cases the update command can encounter). I’m also pondering whether it’s worth the effort to convert the script to use Symfony Console; feel free to chime in with any opinions on the issue for that.

There are tests too, as it’s pretty important that a script that does things to your git repository does what it’s supposed to (though the only command that does anything destructive is ‘dorgflow cleanup’, which of course asks for confirmation). Having now written this, I’m obviously embarking upon cleaning it up and to some extent rewriting it, though I do have the excuse that the early weeks of working on this were the days after the late nights awake with my newborn daughter, and so the early versions of the code were written in a haze of sleep deprivation. If you’d like to submit a pull request, please do check in with me first on an issue to ensure it’s not going to clash with something I’m partway through changing.

Finally, if you find this as useful as I do (this was definitely an itch I’ve been wanting to scratch for a long time, as well as being a prime case of condiment-passing), please tell other Drupal developers about it. Let’s all spend less time downloading, applying, and rolling patches, and more time writing Drupal code!

Changing the type of a node

By joachim, Mon, 16/01/2017 - 22:22

There’s an old saying that no information architecture survives contact with the user. Or something like that. You’ll carefully design and build your content types and taxonomies, and then find that the users are actually not quite using what you’ve built in quite the way it was intended when you were building it.

And so there comes a point where you need to grit your teeth, change the structure of the site’s content, and convert existing content.

Back on Drupal 7, I wrote a plugin for Migrate which handled migrations within a single Drupal site, so for example from nodes to a custom entity type, or from one node type to another. (The patch works, though I never found the time to polish it sufficiently to be committed.)

On Drupal 8, without the time to learn the new version of Migrate, I recently had to cobble something together quickly.

Fortunately, this was just changing the type of some nodes, and where all the fields were identical on both source and destination node types. Anything more complex would definitely require Migrate.

First, I created the new node type, and cloned all its fields from the old type to the new type. Here I took the time to update some of the Field Tools module’s functionality to Drupal 8, as it pays off to have a single form to clone fields rather than have to add them to the new node type one by one.

Field Tools also copies display settings where form and view modes match (in other words, if the source bundle has a ‘teaser’ display mode configured, and the destination also has a ‘teaser’ display mode that’s enabled for custom settings, then all of the settings for the fields being cloned are copied over, with field groups too).

With all the new configuration in place, it was now time to get down to the content. This was plain and simple a hack, but one that worked fine for the case in question. Here’s how it went…

We basically want to change the bundle of a bunch of nodes. (Remember, the ‘bundle’ is the generic name for a node type. Node types are bundles, as taxonomy vocabularies are bundles.) The data for a single node is spread over lots of tables, and most of these have the bundle in them.

On Drupal 8 these tables are:

the entity base table
the entity data table
the entity revision data table
each field data table
each field data revision table

(It’s not entirely clear to me what the separation between base table and data table is for. It looks like it might be that base table is fields that don’t change for revisions, and data table is for fields that do. But then the language is on the base table, and that can be changed, and the created timestamp is on the data table, and while you can change that, I wouldn’t have thought that’s something that has past values kept. Answers on a postcard.)

So we’re basically going to hack the bundle column in a bunch of tables. We start by getting the names of these tables from the entity type storage:

$storage = \Drupal::service('entity_type.manager')->getStorage('node');

// Get the names of the base tables.
$base_table_names = [];
$base_table_names[] = $storage->getBaseTable();
$base_table_names[] = $storage->getDataTable();
// (Note that revision base tables don't have the bundle.)

For field tables, we need to ask the table mapping handler:

$table_mapping = \Drupal::service('entity_type.manager')->getStorage('node')
  ->getTableMapping();

// Get the names of the field tables for fields on the service node type.
$field_table_names = [];
foreach ($source_bundle_fields as $field) {
  $field_table = $table_mapping->getFieldTableName($field->getName());
  $field_table_names[] = $field_table;

  $field_storage_definition = $field->getFieldStorageDefinition();
  $field_revision_table = $table_mapping
    ->getDedicatedRevisionTableName($field_storage_definition);

  // Field revision tables DO have the bundle!
  $field_table_names[] = $field_revision_table;
}

(Note the inconsistency in which tables have a bundle field and which don’t! For that matter, surely it’s redundant in all field tables? Does it improve the indexing perhaps?)

Then, get the IDs of the nodes to update. Fortunately, in this case there were only a few, and it wasn’t necessary to write a batched hook_update_N().

// Get the node IDs to update.
$query = \Drupal::service('entity.query')->get('node');
// Your conditions here!
// In our case, page nodes with a certain field populated.
$query->condition('type', 'page');
$query->exists(‘field_in_question’);
$nids = $query->execute();

And now, loop over the lists of tables names and hack away!

// Base tables have 'nid' and 'type' columns.
foreach ($base_table_names as $table_name) {
  $query = \Drupal\Core\Database\Database::getConnection('default')
    ->update($table_name)
    ->fields(['type' => 'service'])
    ->condition('nid', $service_nids, 'IN')
    ->execute();
}
// Field tables have 'entity_id' and 'bundle' columns.
foreach ($field_table_names as $table_name) {
  $query = \Drupal\Core\Database\Database::getConnection('default')
    ->update($table_name)
    ->fields(['bundle' => 'service'])
    ->condition('entity_id', $service_nids, 'IN')
    ->execute();
}

Node-specific tables use ‘nid’ and ‘type’ for their names, because those are the base field names declared in the entity type class, whereas Field API tables use the generic ‘entity_id’ and ‘bundle’. The mapping between these two is declared in the entity type annotation’s entity_keys property.

This worked perfectly. The update system takes care of clearing caches, so entity caches will be fine. Other systems may need a nudge; for instance, Search API won’t notice the changed nodes and its indexes will literally need turning off and on again.

Though I do hope that the next time I have to do something like this, the amount of data justifies getting stuck into using Migrate!

What goes on in Drupal Code Builder?

By joachim, Sun, 15/05/2016 - 13:13

Drupal Code Builder library is the new library which powers Module Builder. I recently split Module Builder up, so Drupal Code Builder (DCB) is the engine for generating Drupal code, while what remains in the Module Builder module is just the UI.

DCB is an extensible framework, so if you wanted to have DCB create scaffold code for a particular Drupal component or system, you can.

DCB's API is documented in the README. It's based on the idea of tasks: for example, list the hooks and plugin types that DCB has parsed from the site code, analyze the site code to update that list, or generate code for a module. There are Task classes, and you call public methods on these to do something.

The generators

Broadly, there are three things you want to do with DCB: collect and analyze data about a Drupal codebase to learn about hooks and plugin types, report on that data, and actually generate some code.

The Generate task class is where the work of creating code begins. The other task classes are all pretty simple, or at least self-contained, but the Generate task is accompanied by a large number of classes in the DrupalCodeBuilder\Generate namespace. You can see from the file names that these represent all the different components that make up generated code.

Furthermore, as well as all inheriting from BaseGenerator, there are hierarchies which can probably be deduced from the names alone, where more specialized generators inherit from generic ones. For example, we have:

File
- PHPFile
  - ModuleCodeFile
  - PHPClassFile
    - Plugin
    - Service
  - API (this one's for your mymodule.api.php file)
- YMLFile
- Readme

and also:

PHPFunction
- HookImplementation
  - HookMenu
  - HookPermission

However, these hierarchies are only about code re-use. In terms of PHP code, HookImplementation is only related to ModuleCodeFile by the common BaseGenerator base class. As the process of code generation takes place, there will be a tree of components that represents components containing each other, but it's important to remember that class inheritance doesn't come into it.

Also, while the generators in the hierarchies above clearly represent some tangible part of the code we're going to generate, some are more abstract, such as Module and Hooks. These aren't abstract in the OO sense, as they will get instantiated, but I think of them as abstract in the sense that they're not concrete and are responsible for code across different files. (Suggestions for a better word to describe them please!)

The process of generating code starts with a call to the Generate task's generateComponent() method. The host UI application (such as Module Builder module, or the Drush command) passes it an array of data that looks something like this:

[
  'base' => 'module',
  'root_name' => 'mymodule,
  'readable_name' => 'My module',
  'hooks' => [
    'form_alter' => TRUE,
    'install' => TRUE,
  ],
  'plugins => [
    0 => [
      'plugin_type' => 'block',
      'plugin_name' => 'my_plugin',
      'injected_services' => [
        'current_user',
      ],
    ],
  ],
  'settings_form' => TRUE,
  'readme' => TRUE,
]

(How you get the specification for this array as a list of properties and their expected format is a detailed topic of its own, which will be covered later. For now, we're jumping in at the point where code is generated.)

Assembling components

The first job for the Generate task class is to turn this array of data into a list of generator classes for the necessary components.

This list is built up in a cascade, where each component gets to request further components, and those get to request components too, and so on, until we reach components that don't request anything. We start with the root component that was initially requested, Module, let that request components, and then repeat the process.

This is best illustrated with the AdminSettingsForm generator. This implements the requiredComponents() method to request:

a permission
a router item (on Drupal 7 that's a menu item, but in DCB we refer to these a router item whatever the core Drupal version)
a form

In turn, the Permission generator requests a permissions YAML file. You'll see that there are two Permission generators, each with a version suffix. The Permission7 generator requests a hook_permission() hook, which in turn requests a .module file. The Permission8 generator is somewhat simpler, and just requests a YMLFile component.

Meanwhile, the router item requests a routing.yml file on D8, and a hook_menu() on D7.

These two parts of the cascade end when we reach the various file generators: ModuleCodeFile and YMLFile don't request anything. The process that gathers all these generators works iteratively: every iteration it calls requiredComponents() on all the components the previous iteration gave it, and it only stops once an iteration produces no new components. It's safe to request the same component multiple times; in the D7 version of our example, both our hook_menu() and hook_permission() will request a ModuleCodeFile component that represents the .module file. The cascade system knows to either combine these two requests into one component, or ignore the second if it's identical to what's already been requested.

We now have a list of about a dozen or so components, each of which is an instantiated Generator object. Some represent files, some represent functions, and some like Hooks represent a more vague concept of the module 'having some hooks'. There's also the Module generator which started the whole process, whose requiredComponents() did most of the work of interpreting the given array of data.

Assembling a tree of components

The second part of the process is to assemble this flat list of components into a tree. This is where the notion of which component contains others does come into play. This is a different concept from requested components: a component can request something that it won't end up containing, as we saw with the AdminSettingsForm, which requests a permission.

The Generate task calls the containingComponent() method on each component, and this is used to assemble an array of parentage data. There's nothing fancy or recursive going on here; the tree is just an array whose keys are the identifiers of components, and whose values are arrays of the child component identifiers.

This tree now represents a structure of components where child items will produce code to be included in their parents. One part of this structure could be represented like this:

module
- routing.yml
  - router item
- permission.yml
  - permission
- .install
  - hook_install()

Some components, such as the Hooks component, are no longer around now: their job was to be a sort of broker for other components in the requesting phase, and they're no longer involved. The root component, Module, is the root of the tree. All the files we'll be outputting are its immediate children. (This is not a file hierarchy, folders are not represented here.)

Assembling file contents

We now have everything we need to start actually generating some code. This is done in a way that's very similar to Drupal's Render API: we recurse into the tree, asking each component to return some content both from itself and its children.

So for example, the router items contribute some lines to the routing.yml file, which then turns them into YAML. The .install component, which is an instance of ModuleCodeFile, produces a @file docblock, and then gets the docblock, function declaration, and function body from the hook_install component, and glues them all together.

Finally, each file component (the immediate children of the module component in the tree) gets to say what its file name and path should be.

So the Generate task has an array of data about files, where each item has a file name, file path, and file contents. This is returned to the caller to be output to the user, or written to the filesystem. Module Builder presents the files in a form, and allows the files to be written. The Drush command outputs them to terminal and optionally writes them too.

Extending it with new components

The best way to add new things for DCB to generate is to inherit from existing basic classes. If these don’t provide the flexibility, there’s always a case to be made to give them more configurable options: for example, the AdminSettingsForm class inherits from Form, but neither of those do very little for the actual generated form class, as the work for that is mostly done by the PHPClass class.

The roadmap for DCB at the moment consists of the following:

Generalize the injected services functionality that’s already in Plugins, so generated Form classes and Services can have them too.
Add Forms as a basic component that you can request to generate. (It’s currently there only as a base for the AdminSettingsForm generator.)

And as ever, keep adding tests, keep refactoring and improving the code. But I'm always interested in hearing new ideas (or you know, better yet, patches) in the issue queue.

The Lazy Maintainer's Handbook, Part 1: Frequent Releases

By joachim, Thu, 31/03/2016 - 00:59

Every now and then I think about writing a series of posts called 'The Lazy Maintainer's Handbook', covering various aspects of how to maintain a project (or several) on drupal.org without it being a huge burden on your time (which a lot of people, and companies, think is the case). However, I never get much past the pondering stage. Since it's safe to say that I'm never going to manage to come up with the right order to write these in, I'm just going to start in the middle. Here goes.

Like with all software, bugs are a problem in the world of Drupal. In Drupal contrib, like with any software that has releases, we can classify bugs into three types:

bugs which are reported, but have no fix,
bugs which have a fix, but the patch hasn't been committed,
bugs which have been fixed, but are not part of a release yet.

The first and second type can involve a fair amount of work, and I will cover in another post in the future how much a LM can or should do about them.

The third type though is where the LM can really shine: all that needs to be done here is to make a release. What could be simpler? And let's be clear, making a release is a very quick job. I can do the whole thing in about a minute (though I've not timed it, yet).

For one thing, if you're still writing the release notes by hand, then stop: Git Release Notes for Drush does that for you.

Here's my process:

$ git tag This lists all the existing tags, so I can see what the next release number should be.
$ git tag 7.x-1.2 This creates the tag for the new release.
At this point, it's a good idea to check this in a graphical git client, to check for stupid mistakes like making the tag on a local development branch, or the wrong major branch. (I've done that at least once.)
$ git push origin 7.x-1.2 This creates the tag on the remote repository.
$ drush release-notes This creates the release notes, using all the commit messages between the previous commit and the commit you just made. (Another reason to use the standard format for commit messages: it will turn the #12345 issue numbers into tags for d.org to then render as links.)
Select the output from the command and copy it.
Go to your project's page on d.org and click the 'Add new release' link.
Select the new tag.
Paste the release notes and save the node.

I speed this up even more by having a bash alias for 'drush release-notes | pbcopy', which on OS X puts the text output by the Drush command onto the clipboard, so I can skip the selecting and copying step.

It's quick, right? Fun, even! Why don't maintainers do it more often? The reasons I can think of are:

the current branch HEAD (and thus -dev release) is unstable and badly broken
maintainers are worried about making releases too frequent and making site builders update all the time
maintainers have fallen prey to the 'just one more fix' syndrome, and are waiting for another issue (or issues) to be resolved.

Let's address these shall we?

The branch is unstable

This can happen when you're still on alpha releases, and something's caused you to take a new direction in development. This is a tough case: there's no going back, and you're stuck going forward. The only thing I would advise here is to look at the git history since the last release and see if there's a commit between then and now that could be tagged as the next alpha: for instance if the first few commits after the alpha were simple bug fixes. To try to prevent this problem, I recommend making a release immediately before you take a new direction in development, and if it's a very large rewrite, starting a new major branch, even if that means abandoning the 1.x branch at the alpha stage.

If a major rewrite happens and you're on beta releases or stable releases, then you're doing it wrong: a major rewrite should be cause to start a new major branch.

The last release was recent, and users may dislike frequent releases

Inspecting and testing new releases takes time. More importantly, perhaps, it has quite a high cognitive cost, as for each module you update, you need to review the release notes to look for any changes that might affect your site's functionality, and any parts of your site's codebase that make use of that module's code. It's also a bit stressful, because if something does break, it could be in a part of the site you don't think to check, so the first you'll hear of it is when your client or project manager calls in a panic two days later.

Understandably then, a lot of developers and site builders put off module updates, or don't bother with them until there's a security release.

This may be a time saving, but I don't believe it's an effort saving. Suppose you're on release 1.0 and releases 1.1 and then 1.2 come along. You can either upgrade to each one when it comes out, or you wait for 1.2 and then upgrade straight to that.

Doing two upgrades seems like more work, because you're only having to check the site once.

But I would counter that it's less overall cognitive load, because each single release has fewer changes. If releases are frequent, and include only up to a dozen or so commits, then it's easier to scan down the list of issues in the release notes, and maybe see that they're all very minor bug fixes, or clearly only affect functionality that your site doesn't use.

Ultimately, I don't think postponing upgrades pays off, because eventually, you'll have to bite the bullet and upgrade past several version numbers. Worse, you may have your hand forced when a security update comes along, and you'll be in the situation of having to read the release notes for all the versions you skipped and assessing them, while your site is on fire.

So I think small, frequent releases are actually a good thing, even from the point of view of existing users.

From the point of view of new users, they're a great thing: new users get better code, with fixed bugs. And that applies to existing users as well of course.

(A future episode in this series will cover an idea I've had for ages, of a metric for when you should do a release: after so many commits or weeks have passed.)

You're waiting for just one more fix

Don't. It might never come. In my experience, it probably won't. You'll think to yourself, 'just one more week and someone will review this patch', or 'I'll write a patch for this in the next week'. That week becomes two, and a month, and a year, and even more. I've seen comments on issues called 'Plan for a 1.0 release' where the maintainer says 'I'll make a release in the next two weeks' and that comment is over a year old. I saw one of those comments recently that was two years old. It was mine.

Fight the urge to wait for more fixes. Yes, you want your module to be good, even maybe perfect. But tell yourself: if you release now, you're still making it better. So here's what you need to do:

If you're still not on a stable release, release another alpha or beta. The real purpose of unstable releases should be to get people to test your code. You really want them to be testing something recent, not code that's six months old.
If you're on stable releases, just release another one. Those issues you were waiting for can go in the next release (or the one after…)

Hopefully that should assuage concerns regarding one's responsibility as a maintainer.

But as a lazy maintainer, what's the benefit to you?

Your recent fixes are out there and in use. A bug isn't truly eradicated until a release is made that includes the fix.
Fewer duplicate issues filed. With fewer bugs that are fixed in dev but not in a release, there's less chance of people encountering them.
More users, because the age of the most recent release is a metric people use when evaluating a module.
Users are using a more recent version of the code, which means newer bugs are more likely to get caught. (Because seriously, how many people are actually trying out the dev release, unless they're forced to by unreleased fixes?)

Get the code out. Dev code serves nobody. Releases are what matters.

Drupal Planet Posts

Making builder code look like output code

Tags

The quick and dirty debug module

Tags

Regenerating plugin dependency injection with Module Builder

Tags

Triggering events on the fly

Tags

Drupal Code Builder unit testing: bringing in the heavy stuff

Tags

Brief update on Drupal Code Builder

Tags

Dorgflow: a tool to automate your drupal.org patch workflow

Tags

Changing the type of a node

Tags

What goes on in Drupal Code Builder?

The generators

Assembling components

Assembling a tree of components

Assembling file contents

Extending it with new components

Tags

The Lazy Maintainer's Handbook, Part 1: Frequent Releases

The branch is unstable

The last release was recent, and users may dislike frequent releases

You're waiting for just one more fix

Tags