There’s an old saying that no information architecture survives contact with the user. Or something like that. You’ll carefully design and build your content types and taxonomies, and then find that users aren’t actually using what you’ve built quite the way you intended.
And so there comes a point where you need to grit your teeth, change the structure of the site’s content, and convert existing content.
Back on Drupal 7, I wrote a plugin for Migrate which handled migrations within a single Drupal site, so for example from nodes to a custom entity type, or from one node type to another. (The patch works, though I never found the time to polish it sufficiently to be committed.)
On Drupal 8, without the time to learn the new version of Migrate, I recently had to cobble something together quickly.
Fortunately, this was just changing the type of some nodes, where all the fields were identical on both the source and destination node types. Anything more complex would definitely require Migrate.
First, I created the new node type, and cloned all its fields from the old type to the new type. Here I took the time to update some of the Field Tools module’s functionality to Drupal 8, as it pays off to have a single form to clone fields rather than have to add them to the new node type one by one.
Field Tools also copies display settings where form and view modes match (in other words, if the source bundle has a ‘teaser’ display mode configured, and the destination also has a ‘teaser’ display mode that’s enabled for custom settings, then all of the settings for the fields being cloned are copied over, with field groups too).
With all the new configuration in place, it was now time to get down to the content. This was plain and simple a hack, but one that worked fine for the case in question. Here’s how it went…
We basically want to change the bundle of a bunch of nodes. (Remember, ‘bundle’ is the generic name for a node type: node types are bundles, just as taxonomy vocabularies are bundles.) The data for a single node is spread over lots of tables, and most of these have the bundle in them.
On Drupal 8 these tables are:
- the entity base table
- the entity data table
- the entity revision data table
- each field data table
- each field data revision table
(It’s not entirely clear to me what the separation between the base table and the data table is for. It looks like it might be that the base table is for fields that don’t change between revisions, and the data table is for fields that do. But then the language is on the base table, and that can be changed; and the created timestamp is on the data table, and while you can change that, I wouldn’t have thought it’s something whose past values are kept. Answers on a postcard.)
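As an illustration, for the node entity type with, say, a body field, the tables in question on a default Drupal 8 install are:

```
node                  -- entity base table
node_field_data       -- entity data table
node_field_revision   -- entity revision data table
node__body            -- dedicated data table for the body field
node_revision__body   -- dedicated revision table for the body field
```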
So we’re basically going to hack the bundle column in a bunch of tables. We start by getting the names of these tables from the entity type storage:
```php
$storage = \Drupal::service('entity_type.manager')->getStorage('node');

// Get the names of the base tables.
$base_table_names = [];
$base_table_names[] = $storage->getBaseTable();
$base_table_names[] = $storage->getDataTable();
// (Note that revision base tables don't have the bundle.)
```
For field tables, we need to ask the table mapping handler:
```php
$table_mapping = \Drupal::service('entity_type.manager')->getStorage('node')
  ->getTableMapping();

// Get the names of the field tables for fields on the source node type.
// (Here $source_bundle_fields holds the configurable field definitions on
// the source bundle, e.g. as returned by the 'entity_field.manager'
// service's getFieldDefinitions() method.)
$field_table_names = [];
foreach ($source_bundle_fields as $field) {
  $field_table = $table_mapping->getFieldTableName($field->getName());
  $field_table_names[] = $field_table;

  $field_storage_definition = $field->getFieldStorageDefinition();
  $field_revision_table = $table_mapping
    ->getDedicatedRevisionTableName($field_storage_definition);

  // Field revision tables DO have the bundle!
  $field_table_names[] = $field_revision_table;
}
```
(Note the inconsistency in which tables have a bundle field and which don’t! For that matter, surely it’s redundant in all field tables? Does it improve the indexing perhaps?)
Then, get the IDs of the nodes to update. Fortunately, in this case there were only a few, and it wasn’t necessary to write a batched hook_update_N().
```php
// Get the node IDs to update.
$query = \Drupal::service('entity.query')->get('node');
// Your conditions here!
// In our case, page nodes with a certain field populated.
$query->condition('type', 'page');
$query->exists('field_in_question');
$service_nids = $query->execute();
```
And now, loop over the lists of tables names and hack away!
```php
// Base tables have 'nid' and 'type' columns.
foreach ($base_table_names as $table_name) {
  \Drupal\Core\Database\Database::getConnection('default')
    ->update($table_name)
    ->fields(['type' => 'service'])
    ->condition('nid', $service_nids, 'IN')
    ->execute();
}

// Field tables have 'entity_id' and 'bundle' columns.
foreach ($field_table_names as $table_name) {
  \Drupal\Core\Database\Database::getConnection('default')
    ->update($table_name)
    ->fields(['bundle' => 'service'])
    ->condition('entity_id', $service_nids, 'IN')
    ->execute();
}
```
Node-specific tables use ‘nid’ and ‘type’ as their column names, because those are the base field names declared in the entity type class, whereas Field API tables use the generic ‘entity_id’ and ‘bundle’. The mapping between the two is declared in the entity_keys property of the entity type annotation.
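For reference, here’s the relevant excerpt of that mapping from the @ContentEntityType annotation on core’s \Drupal\node\Entity\Node class:

```php
 *   entity_keys = {
 *     "id" = "nid",
 *     "revision" = "vid",
 *     "bundle" = "type",
 *     "label" = "title",
 *     "langcode" = "langcode",
 *     "uuid" = "uuid",
 *   },
```

So when the entity system needs the bundle column for nodes, it looks up the ‘bundle’ key and gets ‘type’.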
This worked perfectly. The update system takes care of clearing caches, so entity caches will be fine. Other systems may need a nudge; for instance, Search API won’t notice the changed nodes, and its indexes will literally need turning off and on again.
Though I do hope that the next time I have to do something like this, the amount of data justifies getting stuck into using Migrate!