Tag Archives: anthologize

Extending Anthologize: Part 1

Anthologize has seen two version releases since its initial launch in August. Much of the progress since then (aside from some smallish – but needed – features and bugfixes) has been centered on the development of a plugin architecture for Anthologize, a system that will allow individuals to build custom output formats for their Anthologize content in a relatively straightforward way. Anthologize development team member (and metadata badass) Patrick Murray-John has been hard at work on this project: creating prototype plugins, writing blog posts about the plugin process, and of course building large parts of the API itself. In this series of posts, I’d like to augment some of Patrick’s specific musings with a general explanation of how Anthologize works, and how the plugin API lets you tap into it. Think of it as an introduction to Anthologize plugin building.

First, a brief peek under the hood. Anthologize’s tagline is “Use the power of WordPress to transform online content into an electronic book.” How does Anthologize take you from point A – your WordPress content – to point B – your ebook? When you press the Export button on the final Export Project screen, you set into motion a two-step process.

Step 1: WordPress to TEI(ish)

WordPress content is stored in the wp_posts database table, and is typically accessed using WP’s “loop”, which is a set of iterative and indexical functions that make it easy to get and display post data in whichever way you’d like. When you hit the final Export button, Anthologize catches the form submit and hands things off to the format translator, which will get the content out of the database. We’ll use PDF as our example. The main translator file is base.php. That file acts like a top-level manager for all the things that have to be done in order for PDFs to be generated. The first important thing that any export format has to do is to call up the content of the project, which it does by instantiating the TeiDom class.

In many ways, TeiDom is the workhorse of Anthologize. Using the session data passed to it from the PDF base.php file – data that includes the project id, the desired page size, the content of the dedication, stuff like that – it uses a variation on The Loop to collect each part and item from an Anthologize project. Those objects, along with their metadata, are then fed into an empty TEI template file before getting handed back to the individual export format translator.

Why the middleman? Early in the development process, the team made a few decisions about the way that Anthologize ought to operate. First, though it’s currently being developed as a WordPress plugin, we should anticipate a time when much of Anthologize could be ported to another CMS or to a standalone application. If format translators like PDF had to dig directly into the WP database, or were forced to use The Loop to get project data, then they’d have to be refactored when the data lived somewhere other than WP. TEI is a platform-independent format, and since format translators like the PDF generator communicate only with the TeiDom (not directly with WordPress), they should be fairly platform-independent as well. Second, if we were going to have a middleman, we wanted it to be one with extremely broad expressive power, and one for which standards and translation techniques already exist. By choosing TEI, we open the door for armies of archivists, librarians, and other such format wizards, armed with XSLT ninjitsu. (In fact, that’s how Anthologize dev team member Patrick Rashleigh built the Anthologize epub generator!)

In the header for this section, I say TEI(ish) rather than TEI. That’s because the Anthologize middleman TEI layer is not the kind of TEI that your local text-encoder might expect. In particular, the content of your WordPress blog posts, which is already marked up with HTML, is untouched in the export process. A true TEI markup of your text would mean lots of manual encoding, so we just pass it along as-is. This untouched HTML post content, however, is embedded in a larger TEI framework for holding the metadata and generally explaining the structure of the project document. It’s not necessarily the kind of document you could use to build a richly marked-up text visualization, but it works well for the purposes of simple presentation.
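To make the TEI(ish) idea a little more concrete, here’s a rough sketch of what wrapping project content in a TEI frame might look like. To be clear, this is not the real TeiDom class: the function name sketch_build_tei_ish(), the shape of the input array, the simplified element layout, and the choice of CDATA for carrying the HTML are all my assumptions for the sake of illustration.

```php
<?php
// Illustrative sketch only: this is NOT Anthologize's actual TeiDom class.
// The function name, input array shape, and element layout are assumptions.
function sketch_build_tei_ish( array $project ) {
    $dom = new DOMDocument( '1.0', 'UTF-8' );

    // Small helper: create <$name>, optionally with text, and append it.
    $add = function ( DOMNode $parent, $name, $text = null ) use ( $dom ) {
        $el = $dom->createElement( $name );
        if ( null !== $text ) {
            $el->appendChild( $dom->createTextNode( $text ) );
        }
        return $parent->appendChild( $el );
    };

    // Outer TEI frame (the TEI namespace is omitted to keep things short).
    $tei = $add( $dom, 'TEI' );

    // Project metadata lives in the header.
    $title_stmt = $add( $add( $add( $tei, 'teiHeader' ), 'fileDesc' ), 'titleStmt' );
    $add( $title_stmt, 'title', $project['title'] );

    $body = $add( $add( $tei, 'text' ), 'body' );

    foreach ( $project['parts'] as $part ) {
        $part_div = $add( $body, 'div' );
        $part_div->setAttribute( 'type', 'part' );
        $add( $part_div, 'head', $part['title'] );

        foreach ( $part['items'] as $item ) {
            $item_div = $add( $part_div, 'div' );
            $item_div->setAttribute( 'type', 'item' );
            $add( $item_div, 'head', $item['title'] );

            // The post's HTML rides along untouched. CDATA is used here so
            // the HTML does not have to be well-formed XML.
            $item_div->appendChild( $dom->createCDATASection( $item['html'] ) );
        }
    }

    return $dom;
}

// Example input: one part containing one item.
$dom = sketch_build_tei_ish( array(
    'title' => 'My Book',
    'parts' => array(
        array(
            'title' => 'Part One',
            'items' => array(
                array( 'title' => 'First Post', 'html' => '<p>Hello <em>world</em></p>' ),
            ),
        ),
    ),
) );
```

Calling $dom->saveXML() on the result gives you a document where the structure and metadata are TEI-flavored, while the post body is still plain old HTML.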

Step 2: TEI to your format

Once the format generator (remember our friend templates/pdf/base.php?) gets the TEI document back from the TeiDom workhorse – which is essentially shared by all Anthologize export formats – the format-specific work can begin. PDF uses its own custom TEI-to-PDF class, along with an included PDF generation library, to parse the TeiDom object and turn it into something that, when delivered to your browser, is understood as a PDF. This, of course, is the hard part of building a translator, and is very format-specific.

The cool part about this part of the process, though, is the amount of flexibility that is emerging from the Anthologize architecture. Different format translators can deal with data in different ways; to wit:

  • The built-in PDF generator uses XPaths and some basic PHP loopage to format the final document. It’s also got some custom helper methods (eg get_book_title()) that it uses to make the rendering code a bit easier to use.
  • The built-in ePub generator uses XSL transformations to move from the TEI document to the HTML-esque ePub output.
  • Patrick MJ has been working on a set of theming functions that will allow plugin authors to construct a loop very similar to the WordPress post Loop for the display of their data.
  • Because of the nature of PHP applications, export formats could always bypass Anthologize’s TEI and other API options and head directly for the WP database, using some embedded WP_Query/have_posts() loops.
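As a taste of the first approach (XPaths plus a bit of PHP looping), here’s a minimal sketch. The class name Sketch_Tei_Reader, the get_outline() method, the sample XML, and the XPath expressions are assumptions of mine. get_book_title() borrows its name from the helper mentioned above, but the body is a guess, not the PDF generator’s actual code.

```php
<?php
// Illustrative only: a tiny stand-in for the TEI(ish) document. The element
// layout is an assumption, not Anthologize's real output.
$xml = <<<XML
<TEI>
  <teiHeader><fileDesc><titleStmt><title>My Book</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div type="part"><head>Part One</head>
      <div type="item"><head>First Post</head></div>
      <div type="item"><head>Second Post</head></div>
    </div>
  </body></text>
</TEI>
XML;

// Hypothetical reader class, not part of Anthologize.
class Sketch_Tei_Reader {
    private $dom;
    private $xpath;

    public function __construct( $xml ) {
        $this->dom = new DOMDocument();
        $this->dom->loadXML( $xml );
        $this->xpath = new DOMXPath( $this->dom );
    }

    // Helper in the spirit of get_book_title(); the XPath is a guess.
    public function get_book_title() {
        return $this->xpath->evaluate( 'string(/TEI/teiHeader/fileDesc/titleStmt/title)' );
    }

    // Walk parts, then the items inside each part, building a flat outline.
    public function get_outline() {
        $lines = array();
        foreach ( $this->xpath->query( '//div[@type="part"]' ) as $part ) {
            $lines[] = $this->xpath->evaluate( 'string(head)', $part );
            foreach ( $this->xpath->query( 'div[@type="item"]/head', $part ) as $head ) {
                $lines[] = '  ' . $head->textContent;
            }
        }
        return $lines;
    }
}

$reader = new Sketch_Tei_Reader( $xml );
echo $reader->get_book_title(), "\n";
echo implode( "\n", $reader->get_outline() ), "\n";
```

A real translator would hand each of those pieces to its rendering library instead of echoing them, but the pattern (query, loop, format) would be much the same.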

Once your parser has turned Anthologize TEI data into your chosen output format, you’ll need to deliver it to the browser by sending the right headers (here’s how ePub does it as an example).
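In rough terms, the delivery step might look something like the sketch below. The function names sketch_download_headers() and sketch_deliver() are hypothetical, and this is not the ePub generator’s actual code. The header names and the application/epub+zip MIME type are standard, though.

```php
<?php
// Hypothetical helpers: the names and signatures are assumptions,
// not part of Anthologize.

// Build the header lines that tell the browser "this is a file download".
function sketch_download_headers( $filename, $mime_type, $byte_length ) {
    return array(
        'Content-Type: ' . $mime_type,
        'Content-Disposition: attachment; filename="' . basename( $filename ) . '"',
        'Content-Length: ' . (int) $byte_length,
    );
}

// Send the headers, then the file contents. Headers must go out before
// any other output reaches the browser.
function sketch_deliver( $filename, $mime_type, $contents ) {
    foreach ( sketch_download_headers( $filename, $mime_type, strlen( $contents ) ) as $line ) {
        header( $line );
    }
    echo $contents;
}

// Example call (commented out so the sketch has no side effects):
// sketch_deliver( 'my-book.epub', 'application/epub+zip', $epub_bytes );
```

Keeping the header list in its own function makes it easy to check what will be sent without actually sending anything.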

In my next blog post, I’ll use an example to show how a plugin can register itself with Anthologize to take advantage of all these goodies.

Anthologize 0.4-alpha is released

The Anthologize team has been hard at work over the last week, fixing bugs behind some of the most commonly reported problems, and adding features to make Anthologizing easier and more fun. We’ve just tagged version 0.4-alpha in the WordPress repository. Visit your WordPress Dashboard’s Plugins page to upgrade.

Read more about the changes in 0.4-alpha.

Questions or thoughts about Anthologize? Visit the Anthologize home page or the Anthologize users group.

Hiding WordPress custom post type menu items without disabling edit access

WordPress 3.0’s custom post types are really cool, opening up a whole new world of use cases for WordPress. We used custom post types extensively when developing Anthologize. But there are still some rough spots.

For instance, the ‘show_ui’ parameter of register_post_type() is a little bit too coarse-grained for our purposes. For Anthologize, we wanted to allow the user to edit custom post types with the standard Edit page, but we didn’t want users to be able to access most of these post types through the menu items automatically created by register_post_type() (all links to the edit pages would appear on our custom Dashboard panel, in order to reduce redundancy and confusion). With ‘show_ui’ set to true, users could access the edit screens, but they could also access the unwanted menu items; with ‘show_ui’ set to false, the menu items were hidden, but navigating to the Edit pages (directly, via URL) threw a “You don’t have permission to access this page” error.

Here’s how we resolved the dilemma. Note that it’s a bit hackish at the moment. In the future, I hope the WordPress team will split ‘show_ui’ into multiple, separate arguments.

  1. In your register_post_type() call, set ‘show_ui’ to true. Here’s an example from Anthologize:
    [code language="php"]
    // Register the post type with the standard admin UI enabled.
    register_post_type( 'library_items', array(
        'label'           => __( 'Library Items', 'anthologize' ),
        'public'          => true,
        '_builtin'        => false,
        'show_ui'         => true, // keeps the Edit screens reachable
        'capability_type' => 'page',
        'hierarchical'    => true,
        'supports'        => array( 'title', 'editor', 'revisions' ),
        'rewrite'         => array( 'slug' => 'library_item' ),
    ) );
    [/code]
  2. To remove the unwanted menu items, we’ll take advantage of the fact that WordPress has built-in support for custom menu order. First, we have to tell WordPress to expect a custom menu order. (The following two functions are modified from Anthologize, where they’re methods on a loader class.)
    [code language="php"]
    // Returning true tells WordPress to apply the 'menu_order' filter.
    function toggle_custom_menu_order() {
        return true;
    }
    add_filter( 'custom_menu_order', 'toggle_custom_menu_order' );
    [/code]
  3. Once custom_menu_order has been set to true (step 2), WordPress makes a new filter hook available, menu_order. As the name says, it’s really meant to reorder menu items, but we’ll use it to erase menu items altogether.
    [code language="php"]
    function remove_those_menu_items( $menu_order ) {
        global $menu;

        foreach ( $menu as $mkey => $m ) {
            $key = array_search( 'edit.php?post_type=library_items', $m );

            // array_search() returns false when there is no match, so
            // compare strictly rather than relying on truthiness.
            if ( false !== $key ) {
                unset( $menu[ $mkey ] );
            }
        }

        return $menu_order;
    }
    add_filter( 'menu_order', 'remove_those_menu_items' );
    [/code]

    Here’s what’s happening. The filter hook is meant to modify $menu_order. That’s why remove_those_menu_items() takes $menu_order as an argument, and returns it back to WordPress untouched on the last line of the function. On the first line of the function, we’re taking advantage of the fact that the $menu variable – where menu items are stored for construction into markup later on – is in the global scope. Once we’ve declared that we’ll be using $menu on the first line, we loop through each of the menu items, and when we find one that matches our custom post type (ie, when we find one that contains the string ‘edit.php?post_type=library_items’ – you’ll have to replace the post_type with your own, obviously), it gets removed from the $menu global.

You can iterate this for as many different custom post types as you’d like – just add more potential keys to the foreach loop in remove_those_menu_items(), eg
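[code language="php"]
$key  = array_search( 'edit.php?post_type=library_items', $m );
$keyb = array_search( 'edit.php?post_type=some_other_post_type', $m );

// Strict comparisons, since array_search() returns false on no match.
if ( false !== $key || false !== $keyb ) {
    unset( $menu[ $mkey ] );
}
[/code]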
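If the list of post types gets long, one variable per post type will get old fast. Here’s a sketch of a list-driven version that you can run outside of WordPress. The helper name sketch_strip_menu_items() is hypothetical, and the $menu array below is a simplified stand-in for the WordPress global, with made-up entries.

```php
<?php
// Hypothetical helper, not part of WordPress or Anthologize.
// Removes every menu item that contains one of the given slugs.
function sketch_strip_menu_items( array $menu, array $hidden_slugs ) {
    foreach ( $menu as $mkey => $m ) {
        // array_intersect() is non-empty when any hidden slug appears
        // somewhere in this menu item's values.
        if ( array_intersect( $hidden_slugs, $m ) ) {
            unset( $menu[ $mkey ] );
        }
    }
    return $menu;
}

// Simplified stand-in for the $menu global (entries are invented).
$menu = array(
    5  => array( 'Posts', 'edit_posts', 'edit.php' ),
    26 => array( 'Library Items', 'edit_pages', 'edit.php?post_type=library_items' ),
    27 => array( 'Other Things', 'edit_pages', 'edit.php?post_type=some_other_post_type' ),
);

$menu = sketch_strip_menu_items( $menu, array(
    'edit.php?post_type=library_items',
    'edit.php?post_type=some_other_post_type',
) );
```

Inside the real filter callback you’d declare global $menu and assign the result back to it, then return $menu_order as before.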

Introducing Anthologize, a new WordPress plugin

The moment has arrived!



The product of One Week | One Tool, a one-week digital humanities tool barn-raising hosted by CHNM and sponsored by the NEH Office of Digital Humanities, is Anthologize. Anthologize is a WordPress plugin that lets you collect and curate content, organize and edit it into a form that works for you, and publish it in one of a number of ebook formats.

As I said in my last post, I was the lead developer for Anthologize. This stemmed from the fact that, for reasons of market penetration and ease of use, we’d chosen WordPress as a platform, and I was “the WordPress guy”. As such, I was the natural person to oversee the various parts of the development process, and to make sure that they fit together in a neat WordPress plugin package. It was an incredible and humbling experience to work with a group of developers who were, to a person, more talented and experienced than I am.

Anthologize PDF output

Today, the plugin is shipping with four different formats for exporting: PDF, ePub, RTF, and a modified version of TEI that leaves most content in HTML form. None of these export processes are perfect. Some require that certain libraries be installed on your server; some do not offer the kind of layout flexibility that we like; some are not great at text encoding; etc. This release is truly an alpha, a proof-of-concept. The goal is to show not only what a group of devoted individuals can conceive and develop in six short days, but also to provide the framework for further development in the world of independent authorship, publishing, and distribution.

As such, the plugin is designed, and will continue to be developed, with an eye toward maximum flexibility and modularity. Content can be created in WordPress or pulled in by RSS feeds, providing for greater choice of authoring platform. Export formats are generated by translators that work not with native WordPress data, but with an intermediary layer structured with TEI metadata markup. That means that you don’t have to know anything about WordPress to build a new export translator for yourself – you only have to know some PHP and XSLT. And we’re working on expanding Anthologize’s action and filter hooks to allow for true pluggability in the manner of WordPress itself.

I’m hoping that Anthologize will be a useful tool that draws development interest from folks who might not otherwise be interested in WordPress or web development, especially those who are working in the academic, cultural heritage, and digital humanities worlds. Get involved by checking out our GitHub repository at http://github.com/chnm/anthologize, joining our development list at http://groups.google.com/group/anthologize-dev, or stopping in to chat with the dev team at #oneweek or #anthologize-dev on freenode.