This is my first entry for this programming-related blog, and
there's really only one way I can open it.
My name is Jesse Millikan, and I am a language nut.
Have you ever been to a programming Meetup? I show up from time to
time at the local Ruby and Python meetups, IndyAdobe, Indy Alt.Net
and have been (once) to the local PHP meetup. Invariably, the first
thing that happens at these meetups is the organizer says, "Okay,
we're going to go around the room and I want each of you to justify
your presence here." Joy.
Why am I here, anyway? Oh, that's right. "I'm here
because I'm a language nut."
And then... Some weak bit about what I do all day. "But... I do
C# at work." "I'm a .NET monkey during the day." You can't really
be a language nut at work. It doesn't take.
So you might think I would be a little bit bored doing .NET
content management stuff all day. There's not a lot of language
variety there, right? You've got C#, and then you've got VB.NET,
which is C#. What else is there? Some XSLT, some JavaScript, some
HTML and CSS. The usual.
That's it, right?
Yeah. Basically. Pretty much. But... Well. Okay. It's just...
Every time I do something in one of those languages, I end up using
something that isn't just the language I thought
I was working in. Like XML for config files. And all those complex
bits of XPATH you need to keep your sanity while writing
XSLT. Sitecore query and Fast query for finding Sitecore
items. Lucene query syntax for using custom
indexes. Regular expressions for testing form values. CSS
selectors for setting up stuff in jQuery. Fixing Emacs keyboard
macros that didn't quite take. .NET string formats. Specially
formatted Sitecore fields. Markdown for using Stack Overflow.
PowerShell, globs, snippets.
I thought I was a C# monkey. Now I'm just confused. And probably
overworked! I mean, I'm using twenty-odd languages now, and that's
just my day job!
What's going on here?
Little Languages
Little Languages. That appears to be what.
The term is self-explanatory, mostly. A little language is a
simple language for a narrow purpose. Sometimes you'd call one a
domain-specific language, or a macro language, or a query language.
Or something like that. But they're all small, and they're all
there for a reason better than "We have to have a language to have
a platform."
These things come in lots of flavors. You'll find them
where there's something that turns out to be just a little (or
maybe a lot) too complicated to handle sanely in a so-called
general language. A lot of the time, it has to do
with searching and filtering. Sometimes it's templating, or macros,
or math, or... Could be about anything, really. If it's complicated
and doesn't fit nicely into an OO or procedural library, it's
likely to end up as a little language.
And where there's a little language, the pain and tedium of
using a not-terribly-good general language sort of fades away. The
little language shoulders the load. Let me show you what I'm
talking about.
The Zoo
XPATH
XPATH is used in XSLT to do searching and filtering operations,
math, concatenation, and almost everything else useful with
whatever data you have.
XPATH expressions tend to be really useful for building
navigation elements in Umbraco and Sitecore (and anywhere else you
have a hierarchical database that you can view as XML).
Want to build a breadcrumb?
The hard way is to start at your current node and recursively
climb to the root of your site. Miserable. So, you write a
template, call it recursively with the parent of the context and
then write out the current link... And write out a separator
before, no after, no, between the
current... No, only if it's... And then you have to call it with the
maximum number of levels you want and there's xsl:param and
xsl:with-param everywhere...
Miserable.
Here's the easy way (in Umbraco):
<xsl:for-each select="ancestor-or-self::* [@isDoc]">
  <!-- link -->
  <xsl:if test="position() != last()">
    <!-- separator -->
  </xsl:if>
</xsl:for-each>
There. Done. XPATH has this nice bit called axes
and they do most of the heavy lifting when you're trying to write
navigation.
Want to only show certain items in the breadcrumb according to
the depth? Or halt at a certain level? Or exclude some
non-displayed items from the breadcrumb based on a property? Easy.
Just throw it in that XPATH expression.
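For a sense of what that one expression is doing for you, here is a rough sketch of the same walk done by hand in JavaScript. The node shape (`name`, `parent`, `isDoc`, `hidden`) and the `breadcrumb` helper are made up for illustration and are not Umbraco's API. In XPATH, the extra filter would just be another predicate on the same expression.

```javascript
// Hypothetical node shape for illustration: { name, parent, isDoc, hidden }.
// Hand-rolled equivalent of ancestor-or-self::*[@isDoc], plus an optional
// extra filter standing in for a second XPath predicate.
function breadcrumb(node, keep = () => true) {
  const trail = [];
  for (let n = node; n; n = n.parent) {
    // unshift keeps the trail in root-first order as we climb upward
    if (n.isDoc && keep(n)) trail.unshift(n);
  }
  return trail;
}

const root = { name: 'Home', parent: null, isDoc: true, hidden: false };
const products = { name: 'Products', parent: root, isDoc: true, hidden: true };
const widget = { name: 'Widget', parent: products, isDoc: true, hidden: false };

// Skip items flagged as hidden, then join with a separator.
const names = breadcrumb(widget, n => !n.hidden).map(n => n.name).join(' > ');
console.log(names); // Home > Widget
```

Every new rule means another argument or another branch here. In the XSLT version it stays a one-line change to the `select` attribute.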
XSLT
Okay, this one barely counts. Maybe it doesn't count. I included
it in the list of "big" languages up top, didn't I?
But then, it doesn't have its own parser, except for the one
parsing XPATH expressions; the rest is plain old XML. And it's for
a very narrow purpose; it's only for transforming XML into other
formats. 99% of the time, that means outputting simple HTML or
XHTML.
And XSLT 1.0 is an awfully small language, not counting XPATH.
Most of the power in XSLT is in XPATH, and the rest of the power of
XSLT... Is also in XPATH. And then, there's some simplicity gained
if you're outputting XML or HTML, because most of the time you just
use the tags you want to spit out...
Sitecore Query & Fast Query
These two are loosely based on XPATH, but they're not XPATH.
These are used exclusively for finding items in Sitecore. They are
used mainly through the .NET API, but also in other little places -
notably to set up filters for treelists, droptrees and so
forth.
Sitecore query is the more complicated of the two, reflecting a
great deal of the XPATH language but substituting Sitecore item
names for element names and tags. Axes, predicates and functions
are still there.
Fast query is similar, but much slimmed down so that any fast
query statement can be run as a SQL query against Sitecore's
database structure. This makes fast query operations (usually)
acceptably fast even in situations where you need to search the
entire content tree for something.
Lucene Query Syntax
Lucene is an open source Java document indexing and searching
technology. Lucene.net is the .NET equivalent used by Sitecore for
internal search indexing, and also made available by Sitecore for
custom indexes.
Lucene recently proved to be very useful in
building a cross-cutting taxonomy across a fairly large site.
The Lucene query syntax isn't essential; you can actually query
an index at length through the .NET API using lots and lots and
lots of classes.
But frankly, I'd rather just say
index.Search("keywords:'armadillo'");
than go through a bunch of instantiations.
Regular Expressions
People love to hate on RegEx. That's a discussion for a
different day, certainly. I'm going to leave it that I like RegEx
for what it's good at.
Anyway, you can use regular expressions for a lot of things, but
what they're best at is digesting text in very simple formats,
numbers, phone numbers, a useful subset of email addresses, simple
ad-hoc protocols, that sort of thing.
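As a small illustration, here is a sketch in JavaScript of the kind of job regex is good at: checking a form value against a simple format. The pattern and the `isUsPhone` name are my own assumptions for the example (a deliberately loose North American phone format), not a validation rule from any particular project.

```javascript
// Loose North American phone format: optional parentheses around the area
// code, with an optional space, dot, or dash between groups. Illustrative only.
const US_PHONE = /^\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}$/;

function isUsPhone(value) {
  // Trim so stray whitespace around the form value doesn't fail the match.
  return US_PHONE.test(value.trim());
}

console.log(isUsPhone('(317) 555-0123')); // true
console.log(isUsPhone('317.555.0123'));   // true
console.log(isUsPhone('555-0123'));       // false
```

One line of pattern replaces a small pile of character-by-character checks, which is about the right size of problem for it.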
You should never, ever use them to parse HTML or XML, or any
language of any serious complexity.
Seriously. Don't do it.
CSS Selectors
It might seem unfair to call CSS selectors a little language.
But, if you look closely at a lot of jQuery code, you'll very often
see them being used to select elements outside of CSS. And there's
certainly a lot of
jQuery around these days.
From jQuery for Designers:
$(function () {
  var tabContainers = $('div.tabs > div');
  $('div.tabs ul.tabNavigation a').click(function () {
    tabContainers.hide().filter(this.hash).show();
    $('div.tabs ul.tabNavigation a').removeClass('selected');
    $(this).addClass('selected');
    return false;
  }).filter(':first').click();
});
This code creates tabs (with an appropriate stylesheet). It's
about 7 lines of code when you trim out the boilerplate. Look at
what's doing all the work: CSS selectors. Wonderful.
JavaScript is a nice language, but it's not enough to make me
write a bunch of loops and filters and finds instead of a single
nice little CSS selector.
But then, what about CSS property lists? Those can be used
separately from selectors too. In fact, considering that you use
selectors separate from property lists, and property lists separate
from selectors, most of CSS appears to just be the sum
of two completely separate little languages, rather than one single
language!
Thusly
You begin to see how it doesn't affect my sanity so much that I
supposedly spend all day working in the same one or two languages.
My sanity is intact because, in fact, I don't.
The little languages are here, keeping me company, walking by
my side.
XPATH, I CHOOSE YOU!
I'm Jesse Millikan, and I'm a language nut. Even, apparently, at
work.