Handling LaTeX in WordPress and React.js

I’m in the midst of building a WordPress plugin that interfaces with WeBWorK, a web application for math and science homework. The plugin features a JavaScript app powered by React and Redux, which lives inside of a WordPress page template and communicates with WP via custom API endpoints. The project is for City Tech OpenLab and it’s pretty cool. I’ll write up some more general details when it’s closer to release.

The tool is designed for use in undergraduate math courses, so LaTeX-formatted content features heavily. LaTeX is beautiful when used on its own, but it’s hard to handle in the context of web applications. A few lessons and strategies are described below.

Delimiters

I’m dealing with “mixed” content: plain text that is interspersed with LaTeX. Some of this content is coming directly from WeBWorK, which formats LaTeX to be rendered with MathJax. WeBWorK is sending markup that, when simplified, looks like this:

This is plain text.
Here is an inline equation: <script type="math/tex">E = mc^2</script>
And here is the same equation as a standalone block:
<script type="math/tex; mode=display">E = mc^2</script>

It’s also possible for users to enter LaTeX from the front-end of the WP application. In this case, we’ve settled on the verbose delimiters \begin{math} and \end{math}, but we also support a shorthand delimiter borrowed from Jetpack’s LaTeX renderer:

This is plain text.
Here is an inline equation: \begin{math}E = mc^2\end{math}
And here is the same equation as a standalone block:
$latex E = mc^2$

Some of these delimiter formats are more fragile than others, for various reasons – script tags that cannot be escaped, false positives due to use of literal dollar signs, etc – so I standardized on some invented delimiters for data storage. Before saving any LaTeX content to the database, I’m converting all delimiters to the following formats (of my own invention, which explains their Great Beauty):

This is plain text.
Here is an inline equation: {{{LATEX_DELIM_INLINE_OPEN}}}E = mc^2{{{LATEX_DELIM_INLINE_CLOSE}}}
And here is the same equation as a standalone block:
{{{LATEX_DELIM_DISPLAY_OPEN}}}E = mc^2{{{LATEX_DELIM_DISPLAY_CLOSE}}}

This way, the delimiters are unique and easy to process on display.
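As a rough illustration of that conversion, here is a minimal sketch in JavaScript. It is not the plugin's actual code: the function name normalizeDelimiters and the DELIM map are hypothetical, and it assumes (following the examples above) that \begin{math} marks inline content and $latex marks a standalone block.

```javascript
// Storage delimiters, as described in the text above.
const DELIM = {
  inlineOpen: '{{{LATEX_DELIM_INLINE_OPEN}}}',
  inlineClose: '{{{LATEX_DELIM_INLINE_CLOSE}}}',
  displayOpen: '{{{LATEX_DELIM_DISPLAY_OPEN}}}',
  displayClose: '{{{LATEX_DELIM_DISPLAY_CLOSE}}}',
};

// Hypothetical helper: rewrite each supported input format to the storage delimiters.
// Replacer functions are used so that "$" inside the TeX is never treated as a
// special replacement pattern.
function normalizeDelimiters(text) {
  const inline = (_, tex) => DELIM.inlineOpen + tex + DELIM.inlineClose;
  const display = (_, tex) => DELIM.displayOpen + tex + DELIM.displayClose;

  return text
    // MathJax-style script tags, as sent by WeBWorK.
    .replace(/<script type="math\/tex; mode=display">([\s\S]*?)<\/script>/g, display)
    .replace(/<script type="math\/tex">([\s\S]*?)<\/script>/g, inline)
    // Verbose user-entered delimiters (assumed inline).
    .replace(/\\begin\{math\}([\s\S]*?)\\end\{math\}/g, inline)
    // Jetpack-style shorthand (assumed display, per the example above).
    .replace(/\$latex\s+([\s\S]*?)\$/g, display);
}

console.log(normalizeDelimiters('Inline: \\begin{math}E = mc^2\\end{math}'));
```

The $latex pattern is the fragile one: it stops at the next dollar sign, so a literal dollar sign inside the TeX would end the match early.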

Slashes

I’m using WordPress custom post types to store data. The lowest-level function available for saving post data is wp_insert_post(). Thanks to a glorious accident of history, this function expects data to be “slashed”, as in addslashes() and magic quotes. If you’re looking for a fun way to spend a few hours, read through this WordPress ticket and figure out a solution, please.

LaTeX uses \ as its escape character, which makes LaTeX-formatted content look “slashed” to WordPress. As a result, saving chunks of LaTeX with wp_insert_post() strips the backslashes and breaks the formatting. So I had two choices: don’t use wp_insert_post(), or don’t use backslashes.

I chose the latter. Before saving any content to the database, I identify text that appears between LaTeX delimiters, and I replace every backslash with a custom placeholder. In all its glory:

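public function swap_latex_escape_characters( $text ) {
    // Match a whole LaTeX chunk: opening delimiter, body, and the closing delimiter of the same mode (\2).
    $regex = ';(\{\{\{LATEX_DELIM_((?:DISPLAY)|(?:INLINE))_OPEN\}\}\})(.*?)(\{\{\{LATEX_DELIM_\2_CLOSE\}\}\});s';
    $text = preg_replace_callback( $regex, function( $matches ) {
        // Swap each backslash for a placeholder so that WordPress cannot strip it on save.
        $tex = str_replace( '\\', '{{{LATEX_ESCAPE_CHARACTER}}}', $matches[3] );
        return $matches[1] . $tex . $matches[4];
    }, $text );

    return $text;
}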

So a chunk of LaTeX like E = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} goes into the database as

E = {{{LATEX_ESCAPE_CHARACTER}}}frac{mc^2}{{{{LATEX_ESCAPE_CHARACTER}}}sqrt{1-{{{LATEX_ESCAPE_CHARACTER}}}frac{v^2}{c^2}}}

The placeholders get converted back to backslashes before the content is served via the API endpoints.
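To make the round trip concrete, here is a minimal JavaScript sketch of both directions. It is an illustration rather than the plugin's code (which does this in PHP); the names swapLatexEscapeCharacters and restoreLatexEscapeCharacters are hypothetical, and the restore step assumes the placeholder never appears in content for any other reason.

```javascript
const ESCAPE_PLACEHOLDER = '{{{LATEX_ESCAPE_CHARACTER}}}';

// A whole LaTeX chunk: opening delimiter, body, and the closing delimiter of the same mode (\2).
const LATEX_CHUNK =
  /(\{\{\{LATEX_DELIM_(DISPLAY|INLINE)_OPEN\}\}\})([\s\S]*?)(\{\{\{LATEX_DELIM_\2_CLOSE\}\}\})/g;

// Before saving: replace backslashes with the placeholder, but only inside LaTeX chunks.
function swapLatexEscapeCharacters(text) {
  return text.replace(LATEX_CHUNK, (match, open, mode, tex, close) =>
    open + tex.split('\\').join(ESCAPE_PLACEHOLDER) + close);
}

// Before serving: turn every placeholder back into a backslash.
function restoreLatexEscapeCharacters(text) {
  return text.split(ESCAPE_PLACEHOLDER).join('\\');
}

const original =
  '{{{LATEX_DELIM_INLINE_OPEN}}}E = \\frac{mc^2}{\\sqrt{1-\\frac{v^2}{c^2}}}{{{LATEX_DELIM_INLINE_CLOSE}}}';
const stored = swapLatexEscapeCharacters(original);
console.log(stored);
console.log(restoreLatexEscapeCharacters(stored) === original);
```

Backslashes outside of LaTeX chunks are left alone by the swap, so plain text still goes through WordPress's usual slashing behavior.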

Rendering in a React component

I decided to stick with MathJax for rendering the LaTeX. MathJax has some preprocessors that allow you to configure custom delimiters, but I had a hard time making them work inside of my larger JavaScript framework. So I decided to skip the preprocessors and generate the <script type="math/tex"> tags myself.

React escapes output pretty aggressively. This is generally a good thing. But it makes it hard to print script tags and unescaped LaTeX characters. To make it work, I needed to live dangerously. But I didn’t want to print any more raw content than necessary. So I have two components for rendering text that may contain LaTeX. Click the links for the full source code; here’s a summary of what’s going on:

  1. FormattedProblem uses a regular expression to find pieces of text that are set off by my custom delimiters. Text within delimiters gets passed to a LaTeX component. Text between LaTeX chunks becomes a span.
  2. LaTeX puts content to be formatted as LaTeX into a script tag, as expected by MathJax. Once the component is rendered by React, I then tell MathJax to queue it for (re)processing. See how updateTex() is called both in componentDidMount() and componentDidUpdate(). Getting the MathJax queue logic correct here was hard: I spent a lot of time dealing with unrendered or double-rendered TeX, as well as slow performance when a page contains dozens of LaTeX chunks.
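The two steps can be sketched without React as plain functions. This is a simplified illustration, not the components' real source: splitLatexChunks, scriptTypeFor and queueTypeset are hypothetical names, and the queue call assumes the MathJax v2 Hub API, with the MathJax object passed in as a parameter.

```javascript
const CHUNK_REGEX =
  /\{\{\{LATEX_DELIM_(DISPLAY|INLINE)_OPEN\}\}\}([\s\S]*?)\{\{\{LATEX_DELIM_\1_CLOSE\}\}\}/g;

// Step 1: split mixed content into plain-text and LaTeX chunks, in order.
function splitLatexChunks(text) {
  const chunks = [];
  let lastIndex = 0;
  for (const match of text.matchAll(CHUNK_REGEX)) {
    if (match.index > lastIndex) {
      chunks.push({ type: 'text', content: text.slice(lastIndex, match.index) });
    }
    chunks.push({
      type: 'latex',
      mode: match[1] === 'DISPLAY' ? 'display' : 'inline',
      content: match[2],
    });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < text.length) {
    chunks.push({ type: 'text', content: text.slice(lastIndex) });
  }
  return chunks;
}

// Step 2a: the script "type" attribute that MathJax expects for each mode.
function scriptTypeFor(mode) {
  return mode === 'display' ? 'math/tex; mode=display' : 'math/tex';
}

// Step 2b: once the script tag is in the DOM, ask MathJax (v2) to typeset it.
function queueTypeset(element, mathJax) {
  mathJax.Hub.Queue(['Typeset', mathJax.Hub, element]);
}

console.log(splitLatexChunks(
  'Solve {{{LATEX_DELIM_INLINE_OPEN}}}x^2 = 4{{{LATEX_DELIM_INLINE_CLOSE}}} for x.'
));
```

In a component, each text chunk would become a span and each latex chunk a script tag, with queueTypeset called after mount and after every update.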

A reminder that React’s dangerouslySetInnerHTML is dangerous. I’m running LaTeX content through WP’s esc_html() before delivering it to the endpoint. And because I control both the endpoint and the client, I can trust that the proper escaping is happening somewhere in the chain.

I use a very slightly modified technique to provide a live preview of formatted LaTeX. Here it is in action:

[Image: output]

Pretty cool TeXnology, huh? ROFLMAO

Improved ‘equalto’ validation for Parsley.js

I’ve been playing with Parsley.js for form validation on a client project. It’s pretty nice, but I was unhappy with the ‘equalto’ implementation. ‘equalto’ allows you to link two fields whose entries should always match, such as when you have password or email confirmation fields during account registration. The problem is that parsley-equalto is not symmetrical. If you enter some text into A, and enter non-matching text into B, B will not validate. If you correct B so that it matches A, then B will validate. So far, so good. But if you instead correct A so that it matches B, B’s validation state won’t change.

So I wrote a custom implementation that triggers validation on the paired field, making the link between the fields symmetrical. It’s pretty ugly (to avoid recursion) and doesn’t have any error handling, but it should point you in the right direction. (I’ve called it iff, which you can look up.)

The markup:

<input
  name="password"
  id="password"
  data-parsley-trigger="blur"
  data-parsley-iff="#password-confirm"
  data-parsley-iff-message=""
/>

<input
  name="password-confirm"
  id="password-confirm"
  data-parsley-trigger="blur"
  data-parsley-iff="#password"
  data-parsley-iff-message="Passwords must match."
/>

The validator:

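// Guards against infinite recursion: each field's validator triggers the other's.
var iffRecursion = false;
window.Parsley.addValidator( 'iff', {
    validateString: function( value, requirement, instance ) {
        var $partner = $( requirement );
        var isValid = $partner.val() === value;

        // Re-validate the partner field, unless this call was itself triggered by the partner.
        if ( ! iffRecursion ) {
            iffRecursion = true;
            try {
                $partner.parsley().validate();
            } finally {
                // Always reset the flag, even if the partner's validator never ran.
                iffRecursion = false;
            }
        }

        return isValid;
    }
} );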

Indirect funding and the limits of free software patronage

A few thoughts about direct and indirect funding for free software development.

Proprietary software is often sold at retail, which makes for a diverse economic model. Let’s say that a million people buy a $10 yearly license for your iWidgetFactoryPro™. This year you’ll have $10,000,000, much of which can be used to fund future iWidgetFactoryPro™ development. If you lose fifty thousand users next year, you’ll have $500,000 less to work with. That’s a lot of cabbage. But you’ll still have $9,500,000, enough to carry on your product development, perhaps with a slightly reduced scope. Retail pricing spreads the risk around.

Most free software is not sold in a retail fashion. A single company might pay for the development of a tool, and then decide to release it under a free license. Or a tool’s builders may fund development themselves, with their own money or their own free time. Or a project that starts off as a labor of love may become important enough that a number of companies volunteer resources to improve it. In any of these cases, the funding model is highly centralized. Instead of a million users who share the financial burden roughly equally, as with retail software, a free tool may have a million users but just a small handful of funders, each of which is footing a disproportionately large part of the bill. It’s a precarious setup not merely because the sheer number of funders is so small, but also because the costs are distributed in a way that’s so uneven – and unfair! – that individual funders have arguably more inclination to walk away than the retail licensee who’s paid a lousy ten bucks for a copy of iWidgetFactoryPro™.

The asymmetrical funding model for free software is the cause of much hand-wringing among the individuals who maintain free software projects. How does a volunteer, a solopreneur, a small businessperson, an underfunded or unfunded Desperado, take on the huge and often unrewarding task of maintaining a popular project, while still managing to make money? My friend Daniel Bachhuber has recently written a number of posts in which he struggles with this problem, and he is experimenting with a couple of different models:

The key idea behind Sparks is to create a space where WordPress-based businesses can contribute to an open source roadmap, collaboratively prioritize, and then share the cost of development and maintenance […] When I explain the concept of Sparks to a prospective customer, they get it. It makes a ton of sense to share the burden of building and maintaining boring, business-critical infrastructure.

The strategy is to hedge against some of the instability of the “patronage” model – free software tools being supported by a very small group of generous funders – by moving toward a more diversified financial base.

The structure of “Sparks” – crowdfunding, but where the members of the “crowd” are businesses with budgets – moves toward retail software in two ways that are worth considering. First, by getting interested individuals to take an equal share in funding the software, it tries to make the funding model more fair. Second, by tying the decision of which tools to build to the number of voters (or backers, or whatever) in support of that specific tool, it tries to make the funding model more direct.

Can this work? Direct funding models are kind of like health insurance exchanges: the economics only work if there’s a mandate that everyone participate. Proprietary software licensing is one such mandate. With free software, it’s an uphill slog, and a hard sell.

I have mostly given up on direct client funding for my free software work. There are a couple of interconnected problems:

  • Clients can be convinced to pay for something new and shiny and released with public credit to their name. But fewer want to pay for maintaining something old and boring and anonymous.
  • Once a project is released and in broad use, the client’s ongoing needs for improvement often (usually) diverge from the needs of the broader community.
  • When you are a maintainer of a large software project, quid pro quo contributions – “we’ll pay you to add this feature to WordPress” – are fraught with ethical and practical difficulties.
  • The things that clients want to build are not usually the things that I want to build, or the things that I think need to be built.

As direct funding has become less attractive and more difficult to manage, I’ve turned more toward indirect models. I’ve spoken and written at length about what I call “the reputation cycle”. This is the idea that time spent contributing to free software can improve your reputation, which allows you to increase your rates, which allows you to bill fewer hours, which allows you to contribute more time to free software. Over the last few years, I’ve ratcheted myself up to the point where I spend roughly 50% of my working time doing work that is not paid for by a client.

Or, at least, not paid for directly. Client work subsidizes free software work, but the subsidy is indirect. This indirectness avoids most of the problems sketched above. My decisions about how and where to contribute to the projects I’m involved with are made based on my own interests and my own assessment of project priorities and needs. Since the work is not being done under the aegis of any specific client relationship, I’m not bound by any specific client expectations.

If not framed correctly, the indirect model can feel vaguely dishonest. It involves charging a higher rate to paying customers, without providing any direct benefit for the increased cost. In one sense, this feeling is clearly misguided; the software I choose to work on in my “free” time is the very same software that powers my clients’ sites. They’re reaping benefits that are indirect – but not that indirect.

More importantly, the sense of dishonesty is misplaced because there’s no deception involved. The model is indirect, but it’s not implicit. My message for potential clients is pretty explicit: When you hire me, you are not only buying top-quality technical work, but you are also funding the more general improvement of the free software projects in which I’m involved. It’s part of the brand. It’s less Robin Hood – illicit redistribution of funds – than, say, buying organic milk: you pay more for a slightly better product, knowing that part of the extra cost goes toward the normalization of a system of production that’s superior to the conventional system.

Can this funding model – indirect, but explicit about it – be scaled? Probably not. Like patronage, it depends on the good will of a fairly small number of benevolent folks – developers and clients – to shoulder the burden of the other 99% of users. But there’s something noble about it too. A totally “fair” system, in which each user pays an equal amount to use a piece of software, ignores the fact that not all users have equal resources. One of the beauties of free software is that the generosity of those who can afford to contribute can benefit those who cannot.

WordCamp Chicago 2016 slides

I just finished giving a talk at WordCamp Chicago titled “Backward Compatibility as a Design Principle”, in which I discussed WordPress’s approach to backward compatibility, how it’s evolved over the years, and its costs and benefits when compared to the alternatives. I’m not sure that the slides are very helpful in isolation, but someone asked me to post them, and I am not one to disappoint my Adoring Fans. Embedded below.



BuddyPress Docs 1.9.0 and Folder support

BuddyPress Docs is one of my more popular WordPress plugins. For years, one of the most popular feature requests has been the ability to sort Docs into folders. Docs 1.9.0, released earlier this week, finally introduced folder functionality.

The feature is pretty cool. When editing a Doc within the context of a group, you can select an existing folder, or create a new one, in which the Doc should appear. Folders can be nested arbitrarily. Breadcrumbs at the top of each Doc and each directory help to orient the reader. And a powerful, AJAX-powered directory interface makes it easy to drill down through the folder hierarchy. (Folders are currently limited to groups, which simplifies the question of where a given folder “lives”. An experimental plugin allows individual users to use folders to organize their personal Docs.)

[Image: docs-folders]

I’ve got a couple reasons for drawing attention to this release. First, the Folders feature was developed as part of contract work I did for University of Florida Health. They use WordPress and BuddyPress for some of their internal workspaces, and the improvements to BuddyPress Docs have helped them to build a platform customized to their users’ specific needs. My partnership with UF Health is a great instance of a client commissioning a feature that then gets rolled into a publicly available tool – the type of patronage that demonstrates the best parts of free software development as well as IT in the public sector.

A bonus side note: UF Health Web Services is currently hiring a full-time web developer. If you know PHP, and want a chance to work with cool people on cool projects – including WordPress and BuddyPress – check out the job listing.

The other fun thing about this release is that it’s the first major release of Docs where I’ve worked closely with David Cavins, master luthier and BuddyPress maven. He’s a longtime contributor to Docs, and has done huge amounts of excellent work to bring 1.9.0 to fruition. Many thanks to David for his work on the release!

2015

Previously: 2014, 2013, 2012, 2011, 2010, 2009.

I wrote one year ago that 2015 would be a hard year. And so it was. Here’s the requisite Dec 31 braindump.

In January, I became a dad again. Seeing my two kids grow together and become friends has been one of the privileges of my life. But the logistics of having two kids are pretty different from (and much more exhausting than) when you’ve got just one child. The process of finding balance is ongoing.

The other big event of the year is that, in July, our family moved from New York City to Chicago. Moving sucks. It’s expensive, it’s disorienting, it’s inconvenient. My possessions were in limbo with the moving company for something like 13 days. Practicalities aside, it’s hard to leave NYC. While I grew up in the Midwest, I spent my entire adult life in New York and feel like a New Yorker. New York features more prominently in its residents’ inner ideas about who they are than a place like, say, Ohio does. In the same way as when I left graduate school, I’ve had to face this miniature identity crisis by reevaluating those aspects of my former life that are actually (i.e., not just conventionally) central to what makes me tick, and then find a way to fit them into the context of my new life. This project is also ongoing 🙂

Partly in response to my man-without-a-country malaise, and partly out of philosophical motivations, I poured myself into free software contribution in 2015. More than 50% of my working year was spent doing unpaid work on WordPress, BuddyPress, and related projects. (More details.) I’m a vocal proponent for structuring your work life in such a way that it subsidizes passion projects, though numbers like these make me wonder whether there’s a limit to how far this principle can be pushed. I guess I’ll continue to test these boundaries in 2016.

One of the things I’d like to do in 2016, as regards work balance, is to find more ways to work with cool people. I am a proud lone wolf, but sometimes I feel like there’s a big disconnect between my highly social free software work and my fairly solitary consulting work.

Happy new year!

WordCamp NYC talk on the history of the WordPress taxonomy component

At WordCamp NYC 2015, I was pleased to present on the history of the WordPress taxonomy component. Of all the WordCamp talks I’ve given, this one was the most fun to prepare. I spent days reading through old Trac tickets, the wp-hackers archives, and interview transcripts. The jokes are mostly mediocre and the Photoshopping is (mostly intentionally) lousy, but I think the talk turned out OK. Check it out below, or on wordpress.tv.


How to cherry-pick commits using Subversion (for WordPress at least)

I generally use git-svn for my work on WordPress and related projects. On occasion, I’m forced to touch svn directly. This occurs most often when merging commits from trunk to a stable branch: it’s best to do this in a way that preserves svn:mergeinfo, and git-svn doesn’t do it properly. Nearly every time I have to do these merges, I have to relearn the process. Here’s a reminder for Future Me.

  1. Commit as you normally do to trunk. Make note of the revision number. (Say, 36040.)
  2. Create a new svn checkout of the branch, or svn up your existing one. Eg: $ svn co https://develop.svn.wordpress.org/branches/4.4 /path/to/wp44/
  3. From within the branch checkout: $ svn merge -c 36040 ^/trunk (in other words: merge the specific commit, or comma-separated commits, from trunk of the same repo).
  4. This will leave you with a dirty index (or whatever svn calls it). Use svn status to verify. Run unit tests, etc.
  5. $ svn ci. Copy the trunk commit message to EDITOR, and add a line along the lines of “Merges [36040] to the 4.4 branch”
  6. Drink 11 beers.
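
For Future Me's convenience, the steps above can be turned into a tiny Node helper that prints the commands for a given revision and branch. This is only a sketch: the function name svnCherryPickPlan is hypothetical, it assumes an existing checkout of the branch, and the multi-revision commit message format is a guess extrapolated from the single-revision example.

```javascript
// Build the svn commands and commit-message line for merging trunk revisions to a branch.
function svnCherryPickPlan(revisions, branch) {
  const revs = revisions.join(',');                        // e.g. "36040" or "36040,36041"
  const refs = revisions.map((r) => `[${r}]`).join(', ');  // e.g. "[36040]"
  return {
    commands: [
      'svn up',                          // refresh the branch checkout
      `svn merge -c ${revs} ^/trunk`,    // merge the specific commit(s) from trunk
      'svn status',                      // verify the pending changes
      'svn ci',                          // commit, reusing the trunk commit message
    ],
    messageLine: `Merges ${refs} to the ${branch} branch`,
  };
}

const plan = svnCherryPickPlan([36040], '4.4');
plan.commands.forEach((command) => console.log('$ ' + command));
console.log(plan.messageLine);
```

Run the printed commands by hand from within the branch checkout, and run the unit tests before the final svn ci.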

Willa and Wally

My kids are named Wilhelmina and Walter. So when I saw this image in a New York Times article about items recently unearthed in Dr Seuss’s filing cabinets, I swooned:

[Image: willy-and-wally]

I sent a link to my wife, and told her I wished I could get it in poster size.

Yesterday, I got an early birthday present – my wife had sent the link to my mother, who painted me a slightly modified version:

[Image: DSC_0294]

Thanks, Mom and Rebecca 🙂