Batch Scripting with Drupal (Featuring the EntityFieldQuery API)
Engr. Ranel O. Padon
DrupalCamp Manila 2014
https://github.com/ranelpadon
About Me
Full-time Drupal developer for CNN Travel
Part-time Python lecturer.
Involved in computational Java and Python projects before.
Plays competitive football and futsal.
TOPICS
Why do batch scripting?
How to leverage Entities and the EntityFieldQuery API.
How to implement a batch module?
Sample Actual Use Cases
Why do batching?
There are things that are hard to teach to robots:
spatial awareness and image interpretation.
But there are things that machines could do and easily beat humans:
doing repetitive tasks.
In Drupal, you don't want your site editors to do repetitive tasks:
e.g. updating a field to the same value.
Why do batching?
Avoids PHP Timeout
For implementing installation profiles
Why do batching?
Batch processing is the execution of a series of programs on a
computer without manual intervention.
The Batch API is designed to integrate nicely with the Form API, but can also be used
by non-Form API scripts.
Why do batching?
Avoids PHP Timeout (max_execution_time errors)
For long and complex data processing
You can give the admins real-time feedback or summary of results.
When to do batching?
Implementing installation profiles
Used by Drupal's install.php and update.php
Updating the value of a field for all Event nodes.
Deleting all nodes older than 3 years.
Migrating Column nodes to Column taxonomy terms.
Creating custom nodes when content with an uploaded Excel file is saved:
the Batch API is triggered by hook_node_presave() and
goes through each row of the attached Excel file.
The Rise of Entities
“Oh no, I left my stuff in our house.”
Stuff, just like entities, is a useful abstraction.
They could change meaning depending on the context.
The Rise of Entities
In Physics, you could treat each object under study as particles.
In Drupal, they are called entities.
Facilitates a unified way to work
with different data units
Concept simplification contributes
to better modularity, flexibility and maintainability.
http://evolvingweb.ca/story/understanding-entity-api-module
The Rise of Entities
Before D7, users and comments didn't have the same power
that nodes (content) had.
They had no translations, fields, versioning, and so on.
Views (which relies on fields) didn’t work with comments and users.
The Rise of Entities
A field is a reusable piece of content. Each field is a primitive data type,
with custom validators and widgets for editing and formatters for display.
Entity types group fields together (use the Entity API for custom ones):
Nodes (content)
Comments
Taxonomy Terms
User Profiles
The Rise of Entities
Bundles are an implementation of an entity type.
They are subtypes of an entity type.
Bundles (subtypes) like articles, events, blog posts, or products could be
generated from the node entity type. You could add a file download field on
Basic Pages and a subtitle field on Articles.
You could also assign a geographic coordinates field to all bundles/entities.
The Rise of Entities
An entity is one instance of a particular entity type
(a specific article or a specific user loaded via entity_load()).
Drupal 7 Core provides entity_load(), while the Entity API contrib
module provides entity_save() and entity_delete().
The Rise of Entities
In terms of Object-Oriented Programming:
An entity type is a base class
A bundle is an extended/derived class
A field is a class member, attribute, or property,
An entity is an object or instance of a base or extended class
EntityFieldQuery API
Tool for querying Entities (compared to db_select())
Can query entity properties and fields
Can query field data across entity types:
Fetch all pages and users tagged with taxonomy term “premium”
Returns entity IDs that you could load using entity_load()
Database-agnostic (no issues when migrating from MySQL to PostgreSQL)
EntityFieldQuery API
Fetch all nodes.
EntityFieldQuery API
Fetch all nodes of type “Article”.
EntityFieldQuery API
Fetch all nodes of type “Article”, Published only.
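
The three queries above could be written roughly as follows. This is a minimal sketch, not the code from the original slides: the helper name mybatch_fetch_nids() is hypothetical, "article" is assumed to be the bundle's machine name, and the function only works inside a bootstrapped Drupal 7 site.

```php
<?php
/**
 * Hypothetical helper: returns the IDs of nodes matching optional filters.
 *
 * @param string|null $bundle
 *   Bundle machine name such as 'article' (assumed), or NULL for all nodes.
 * @param bool $published_only
 *   TRUE to restrict the result to published nodes.
 *
 * @return array
 *   A list of node IDs; empty if nothing matched.
 */
function mybatch_fetch_nids($bundle = NULL, $published_only = FALSE) {
  $query = new EntityFieldQuery();

  // Fetch all nodes.
  $query->entityCondition('entity_type', 'node');

  // Fetch only nodes of one type, e.g. "Article".
  if ($bundle !== NULL) {
    $query->entityCondition('bundle', $bundle);
  }

  // Published only: "status" is a property of the node, not a field.
  if ($published_only) {
    $query->propertyCondition('status', NODE_PUBLISHED);
  }

  // execute() returns stub entities keyed by entity type, then by ID.
  $result = $query->execute();
  return isset($result['node']) ? array_keys($result['node']) : array();
}
```

Calling node_load_multiple(mybatch_fetch_nids('article', TRUE)) would then load the published articles as full node objects.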
Custom Batch Module
Form API meets Batch API
Custom Batch Module
Batch 12 things, process 5 things at a time
http://www.ent.iastate.edu/it/Batch_and_Queue.pdf
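
The "12 things, 5 at a time" idea can be sketched in plain PHP. The operation callback below follows the Batch API convention of tracking progress in $context['sandbox'] and reporting completion through $context['finished']. The function names are hypothetical, and mybatch_simulate() is only a stand-in for the loop that Drupal itself runs across requests, so the example can be tried without a Drupal site.

```php
<?php
/**
 * Operation callback (hypothetical name): handles $limit items per call.
 */
function mybatch_process_chunk(array $items, $limit, &$context) {
  // First call: initialise the progress counters in the sandbox.
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($items);
  }

  // Take the next slice, starting where the previous call stopped.
  $chunk = array_slice($items, $context['sandbox']['progress'], $limit);
  foreach ($chunk as $item) {
    // Placeholder for the real work (saving a node, updating a field, ...).
    $context['results'][] = $item;
    $context['sandbox']['progress']++;
  }

  // A value below 1 asks for the callback to be invoked again.
  $context['finished'] = $context['sandbox']['max'] > 0
    ? $context['sandbox']['progress'] / $context['sandbox']['max']
    : 1;
}

/**
 * Stand-in for the Batch API loop; returns the pass count and the results.
 */
function mybatch_simulate(array $items, $limit) {
  $context = array('sandbox' => array(), 'results' => array(), 'finished' => 0);
  $passes = 0;
  do {
    mybatch_process_chunk($items, $limit, $context);
    $passes++;
  } while ($context['finished'] < 1);
  return array($passes, $context['results']);
}
```

With 12 items and a limit of 5, the callback runs three times, handling 5, 5, and 2 items.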
Custom Batch Module
Form Submit trigger
Custom Batch Module
Multiple callback operations and the dynamic $context array (stored in db)
Custom Batch Module
Post Processing
Custom Batch Module
The usual suspects:
mybatch.info
mybatch.module
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
Custom Batch Module
Information required in mybatch.module, without a form:
I. URL that will be utilized.
II. Function callback definition for that URL.
Usually contains the setup for the batch process.
III. The batch operation's function definition.
IV. The function definition that will be called after
the batch operation.
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
Custom Batch Module
I. The batch module URL (http://localhost/admin/mybatch):
Custom Batch Module
II. Function callback for the URL; sets up the batch process.
Custom Batch Module
III. Define the operation callback.
Custom Batch Module
IV. Define the post-operation callback.
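
Parts I to IV might fit together as in the sketch below. It assumes Drupal 7 and is not the code from the original slides: the callback names, the two sample operations, the permission, and the redirect path "admin" are illustrative choices.

```php
<?php
/**
 * I. Implements hook_menu(): registers the URL admin/mybatch.
 */
function mybatch_menu() {
  $items['admin/mybatch'] = array(
    'title' => 'My batch',
    'page callback' => 'mybatch_page',
    'access arguments' => array('administer site configuration'),
  );
  return $items;
}

/**
 * II. Page callback for the URL: sets up and starts the batch.
 */
function mybatch_page() {
  $batch = array(
    'title' => t('Processing items'),
    // Each operation is array(callback name, array of arguments).
    'operations' => array(
      array('mybatch_operation', array('first item')),
      array('mybatch_operation', array('second item')),
    ),
    'finished' => 'mybatch_finished',
  );
  batch_set($batch);
  // Outside a form submit handler the batch must be started explicitly.
  // Redirect somewhere other than admin/mybatch so it does not restart.
  batch_process('admin');
}

/**
 * III. Operation callback: receives its arguments, then &$context.
 */
function mybatch_operation($label, &$context) {
  // Placeholder for the real work.
  $context['results'][] = $label;
  $context['message'] = t('Processed @label', array('@label' => $label));
}

/**
 * IV. Post-operation callback: reports the outcome to the admin.
 */
function mybatch_finished($success, $results, $operations) {
  if ($success) {
    drupal_set_message(format_plural(count($results),
      '1 item processed.', '@count items processed.'));
  }
  else {
    drupal_set_message(t('The batch finished with an error.'), 'error');
  }
}
```

After enabling the module and clearing the menu cache, visiting admin/mybatch should show the progress bar and then the summary message.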
III. Define the operation callback (Enhancement).
II. Function callback for the URL (Enhancement).
Custom Batch Module
Information required in mybatch.module, when using a form:
I. URL that will be utilized.
II. Form callback definition for that URL (hook_form).
III. Form Submit definition.
Usually contains the setup for the batch process.
IV. The batch operation's function definition.
V. The function definition that will be called after the batch operation.
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
Custom Batch Module
Batch API with Form API:
Custom Batch Module
I. URL that will be utilized.
Custom Batch Module
II. Form callback definition for that URL (hook_form).
Custom Batch Module
III. Form Submit definition.
Custom Batch Module
Parts IV and V don't need to be changed.
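
A minimal sketch of parts I to III for the form variant, assuming Drupal 7. The form ID mybatch_form and the button label are hypothetical; mybatch_operation and mybatch_finished stand for whichever operation and post-operation callbacks the module already defines.

```php
<?php
/**
 * I. Implements hook_menu(): the URL now renders a form.
 */
function mybatch_menu() {
  $items['admin/mybatch'] = array(
    'title' => 'My batch',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('mybatch_form'),
    'access arguments' => array('administer site configuration'),
  );
  return $items;
}

/**
 * II. Form builder: a short description and a submit button.
 */
function mybatch_form($form, &$form_state) {
  $form['description'] = array(
    '#markup' => '<p>' . t('Click the button to start the batch.') . '</p>',
  );
  $form['submit'] = array(
    '#type' => 'submit',
    '#value' => t('Start batch'),
  );
  return $form;
}

/**
 * III. Form submit handler: sets up the batch.
 */
function mybatch_form_submit($form, &$form_state) {
  $batch = array(
    'title' => t('Processing items'),
    'operations' => array(
      array('mybatch_operation', array('first item')),
      array('mybatch_operation', array('second item')),
    ),
    'finished' => 'mybatch_finished',
  );
  // No batch_process() call here: the Form API starts the batch itself
  // once the submit handlers have run.
  batch_set($batch);
}
```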
Our Use Case 1
We need to update the Short Bio field text format of User Profiles
from Full HTML to Editorial Filter:
Our Use Case 1
Using our last batch module example (with form), all parts
except IV and V will be almost the same.
IV (1/4):
IV (2/4):
IV (3/4):
IV (4/4):
V:
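
The original slide code for this use case is not part of the transcript. One way part IV could look is sketched below, assuming Drupal 7, a text field with the hypothetical machine name field_short_bio, and text format machine names such as full_html and editorial_filter passed in as arguments. The batch size of 20 is an arbitrary choice.

```php
<?php
/**
 * IV. Operation callback (sketch): switches the Short Bio text format.
 *
 * $from and $to are text format machine names (assumed values such as
 * 'full_html' and 'editorial_filter').
 */
function mybatch_update_bio_format($from, $to, &$context) {
  $field = 'field_short_bio';  // Hypothetical field machine name.

  // First call: count the affected users and initialise the sandbox.
  if (!isset($context['sandbox']['max'])) {
    $count = new EntityFieldQuery();
    $context['sandbox']['max'] = $count
      ->entityCondition('entity_type', 'user')
      ->fieldCondition($field, 'format', $from)
      ->count()
      ->execute();
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['last_uid'] = 0;
  }

  // Fetch the next 20 users that still use the old format.
  $query = new EntityFieldQuery();
  $result = $query
    ->entityCondition('entity_type', 'user')
    ->fieldCondition($field, 'format', $from)
    ->propertyCondition('uid', $context['sandbox']['last_uid'], '>')
    ->propertyOrderBy('uid', 'ASC')
    ->range(0, 20)
    ->execute();

  if (empty($result['user'])) {
    $context['finished'] = 1;
    return;
  }

  foreach (user_load_multiple(array_keys($result['user'])) as $account) {
    // Field values are keyed by language code, then by delta.
    foreach ($account->{$field} as $langcode => $items) {
      foreach ($items as $delta => $item) {
        if ($item['format'] === $from) {
          $account->{$field}[$langcode][$delta]['format'] = $to;
        }
      }
    }
    user_save($account);
    $context['results'][] = $account->uid;
    $context['sandbox']['progress']++;
    $context['sandbox']['last_uid'] = max($context['sandbox']['last_uid'], $account->uid);
  }

  $context['finished'] = min(1,
    $context['sandbox']['progress'] / $context['sandbox']['max']);
}
```

Filtering on uid greater than the last processed uid keeps each pass from re-reading users that were already updated. It would be registered in the submit handler as array('mybatch_update_bio_format', array('full_html', 'editorial_filter')).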
Our Use Case 2
Preview map images for City/Country taxonomy terms with no set
value: (http://travel.cnn.com/malaysia)
Our Use Case 2
Preview map images for City/Country taxonomy terms with no set
value: (http://travel.cnn.com/japan)
Our Use Case 2
Batch API with Form API (updating the preview images;
parts I and V are almost the same as in Use Case 1)
II. Form callback definition for that URL (hook_form).
III. Form Submit definition.
IV. Batch operations callback (highlights only)
Summary
Information required in mybatch.module, without a form:
I. URL that will be utilized.
II. Function callback definition for that URL.
Usually contains the setup for the batch process.
III. The batch operation's function definition.
IV. The function definition that will be called after
the batch operation.
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
Summary
Information required in mybatch.module, when using a form:
I. URL that will be utilized.
II. Form callback definition for that URL (hook_form).
III. Form Submit definition.
Usually contains the setup for the batch process.
IV. The batch operation's function definition.
V. The function definition that will be called after the batch operation.
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
Summary
A Batch API implementation without a form can also be
integrated with Drupal hooks or even Drush.
Integrating EntityFieldQuery into the batch operation callback
makes data fetching in the batch process more readable,
flexible, and maintainable.
Recommended Links
Batch API Docs:
https://drupal.org/node/180528
Examples Module (Batch API):
http://d7.drupalexamples.info/examples/batch_example
Batch API integration with Drush:
http://www.metaltoad.com/blog/using-drupal-batch-api
EntityFieldQuery Docs:
https://drupal.org/node/1343708
EntityFieldQuery as alternative to Views:
http://treehouseagency.com/blog/tim-cosgrove/2012/02/16/entityfieldquery-let-drupal-do-heavy-lifting-pt-1
Long Live Open-Source!