Welcome to the FLOG

The F648 Blog
Beating technology into submission,
one client at a time...

Languages and Little Languages

Tuesday, October 26, 2010

This is my first entry for this programming-related blog, and there's really only one way I can open it.

My name is Jesse Millikan, and I am a language nut.

Have you ever been to a programming Meetup? I show up from time to time at the local Ruby and Python meetups, IndyAdobe, Indy Alt.Net and have been (once) to the local PHP meetup. Invariably, the first thing that happens at these meetups is the organizer says, "Okay, we're going to go around the room and I want each of you to justify your presence here." Joy.

Why am I here, anyway? Oh, that's right. "I'm here because I'm a language nut."

And then... Some weak bit about what I do all day. "But... I do C# at work." "I'm a .NET monkey during the day." You can't really be a language nut at work. It doesn't take.

So you might think I would be a little bit bored doing .NET content management stuff all day. There's not a lot of language variety there, right? You've got C#, and then you've got VB.NET, which is C#. What else is there? Some XSLT, some JavaScript, some HTML and CSS. The usual.

That's it, right?

Yeah. Basically. Pretty much. But... Well. Okay. It's just... Every time I do something in one of those languages, I end up using something that isn't just the language I thought I was working in. Like XML for config files. And all those complex bits of XPATH you need to keep your sanity while writing XSLT. Sitecore query and Fast query for finding Sitecore items. Lucene query syntax for using custom indexes. Regular expressions for testing form values. CSS selectors for setting up stuff in jQuery. Fixing Emacs keyboard macros that didn't quite take. .NET string formats. Specially formatted Sitecore fields. Markdown for using Stack Overflow. PowerShell, globs, snippets.

I thought I was a C# monkey. Now I'm just confused. And probably overworked! I mean, I'm using twenty-odd languages now, and that's just my day job!

What's going on here?

Little Languages

Little Languages. That appears to be what.

The term is self-explanatory, mostly. A little language is a simple language for a narrow purpose. Sometimes you'd call one a domain-specific language, or a macro language, or a query language. Or something like that. But they're all small, and they're all there for a reason better than "We have to have a language to have a platform."

These things come in lots of flavors. You'll find these things where there's something that turns out to be just a little (or maybe a lot) too complicated to handle sanely in a so-called general language. A lot of the time, it has to do with searching and filtering. Sometimes it's templating, or macros, or math, or... Could be about anything, really. If it's complicated and doesn't fit nicely into an OO or procedural library, it's likely to end up as a little language.

And where there's a little language, the pain and tedium of using a not-terribly-good general language sort of fades away. The little language shoulders the load. Let me show you what I'm talking about.

The Zoo

XPATH

XPATH is used in XSLT to do searching and filtering operations, math, concatenation, and almost everything else useful with whatever data you have.

XPATH expressions tend to be really useful for building navigation elements in Umbraco and Sitecore (and anywhere else you have a hierarchical database that you can view as XML).

Want to build a breadcrumb?

The hard way is to start at your current node and recursively climb to the root of your site. Miserable. So, you write a template, call it recursively with the parent of the context and then write out the current link... And write out a separator before, no after, no, between the current... No, only if it's... And then you have to call it with the maximum number of levels you want and there's xsl:param and xsl:with-param everywhere...

Miserable.

Here's the easy way (in Umbraco):

<xsl:for-each select="ancestor-or-self::*[@isDoc]">
    <!-- link -->
    <xsl:if test="position() != last()">
        <!-- separator -->
    </xsl:if>
</xsl:for-each>

There. Done. XPATH has this nice bit called axes and they do most of the heavy lifting when you're trying to write navigation.

Want to only show certain items in the breadcrumb according to the depth? Or halt at a certain level? Or exclude some non-displayed items from the breadcrumb based on a property? Easy. Just throw it in that XPATH expression.
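
If axes are new to you, here is a rough JavaScript sketch of what ancestor-or-self plus a predicate is doing for you. It is only an illustration: the tree of plain objects, the isDoc flag (standing in for whatever property you filter on) and the function names are all made up for this example, and none of it is Umbraco or Sitecore API.

```javascript
// Hypothetical content tree: each node has a name, a parent and an
// isDoc flag. These names are assumptions for illustration only.
function makeNode(name, parent, isDoc = true) {
  return { name, parent, isDoc };
}

// Roughly what ancestor-or-self::*[predicate] selects: the node itself
// plus every ancestor, root first, keeping only nodes that pass the
// predicate.
function ancestorOrSelf(node, predicate = () => true) {
  const chain = [];
  for (let n = node; n !== null; n = n.parent) {
    chain.unshift(n); // climb toward the root, building root-first order
  }
  return chain.filter(predicate);
}

// Put the separator between items only, which is the job the
// position() != last() test does in the XSLT version.
function breadcrumb(node, predicate) {
  return ancestorOrSelf(node, predicate)
    .map((n) => n.name)
    .join(" > ");
}

const home = makeNode("Home", null);
const products = makeNode("Products", home);
const settings = makeNode("Settings", products, false); // non-displayed item
const widget = makeNode("Widget", settings);

console.log(breadcrumb(widget, (n) => n.isDoc));
// prints "Home > Products > Widget"
```

Changing the breadcrumb rules means changing the predicate, which is the same move as adding a condition inside the square brackets of the XPATH expression.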

XSLT

Okay, this one barely counts. Maybe it doesn't count. I included it in the list of "big" languages up top, didn't I?

But then, it doesn't have its own parser, except for the one parsing XPATH expressions; the rest is plain old XML. And it's for a very narrow purpose; it's only for transforming XML into other formats. 99% of the time, that means outputting simple HTML or XHTML.

And XSLT 1.0 is an awfully small language, not counting XPATH. Most of the power in XSLT is in XPATH, and the rest of the power of XSLT... Is also in XPATH. And then, there's some simplicity gained if you're outputting XML or HTML, because most of the time you just use the tags you want to spit out...

Sitecore Query & Fast Query

These two are loosely based on XPATH, but they're not XPATH. These are used exclusively for finding items in Sitecore. They are used mainly through the .NET API, but also in other little places - notably to set up filters for treelists, droptrees and so forth.

Sitecore query is the more complicated of the two, reflecting a great deal of the XPATH language but substituting Sitecore item names for element names and tags. Axes, predicates and functions are still there.

Fast query is similar, but much slimmed down so that any fast query statement can be run as a SQL query against Sitecore's database structure. This makes fast query operations (usually) acceptably fast even in situations where you need to search the entire content tree for something.

Lucene Query Syntax

Lucene is an open source Java document indexing and searching technology. Lucene.net is the .NET equivalent used by Sitecore for internal search indexing, and also made available by Sitecore for custom indexes.

Lucene recently proved to be very useful in building a cross-cutting taxonomy across a fairly large site.

The Lucene query syntax isn't essential; you can actually query an index at length through the .NET API using lots and lots and lots of classes.

But frankly, I'd rather just say

index.Search("keywords:'armadillo'");

than go through a bunch of instantiations.

Regular Expressions

People love to hate on RegEx. That's a discussion for a different day, certainly. I'm going to leave it that I like RegEx for what it's good at.

Anyway, you can use regular expressions for a lot of things, but what they're best at is digesting text in very simple formats: numbers, phone numbers, a useful subset of email addresses, simple ad-hoc protocols, that sort of thing.
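
For a sense of scale, here is a small JavaScript sketch of that kind of job. Both formats are assumptions picked for the example: North American style phone numbers, and a made-up "KEY=value" line protocol. The function names are hypothetical too, and neither pattern is meant as a general-purpose validator.

```javascript
// Assumed format: numbers such as 317-555-0199, (317) 555-0199 or
// 317.555.0199. Anything else is rejected.
const PHONE = /^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$/;

// Returns the number as NNN-NNN-NNNN, or null if it doesn't match.
function normalizePhone(input) {
  const match = PHONE.exec(input.trim());
  if (match === null) return null;
  const [, area, exchange, line] = match;
  return `${area}-${exchange}-${line}`;
}

// Assumed ad-hoc protocol: one "KEY=value" pair per line, with keys
// made of capital letters and underscores.
const PAIR = /^([A-Z_]+)=(.*)$/;

// Returns { key, value }, or null if the line isn't a valid pair.
function parsePair(line) {
  const match = PAIR.exec(line);
  return match === null ? null : { key: match[1], value: match[2] };
}

console.log(normalizePhone("(317) 555-0199")); // prints "317-555-0199"
console.log(parsePair("MODE=fast"));           // key "MODE", value "fast"
console.log(normalizePhone("call me maybe"));  // prints null
```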

You should never, ever use them to parse HTML or XML, or any language of any serious complexity.

Seriously. Don't do it.

CSS Selectors

It might seem unfair to call CSS selectors a little language. But, if you look closely at a lot of jQuery code, you'll very often see them being used to select elements outside of CSS. And there's certainly a lot of jQuery around these days.

From jQuery for designers:

$(function () {
    var tabContainers = $('div.tabs > div');
    
    $('div.tabs ul.tabNavigation a').click(function () {
        tabContainers.hide().filter(this.hash).show();
        
        $('div.tabs ul.tabNavigation a').removeClass('selected');
        $(this).addClass('selected');
        
        return false;
    }).filter(':first').click();
});

This code creates tabs (with an appropriate stylesheet). It's about 7 lines of code when you trim out the boilerplate. Look at what's doing all the work: CSS selectors. Wonderful.

JavaScript is a nice language, but it's not enough to make me write a bunch of loops and filters and finds instead of a single nice little CSS selector.
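
To put a number on that, here is roughly what just the first selector from the tabs example, 'div.tabs > div', costs when written by hand. This is a sketch over a hypothetical tree of plain objects (tag, classes, children) rather than real DOM elements, so it runs anywhere; the el helper and the function name are invented for the example.

```javascript
// Hypothetical stand-in for a DOM element: a tag name, a list of
// classes and a list of child elements.
function el(tag, classes = [], children = []) {
  return { tag, classes, children };
}

// Hand-rolled version of the selector 'div.tabs > div': walk the whole
// tree, and for every div with class "tabs", collect its direct
// children that are divs.
function tabContainersByHand(root) {
  const found = [];
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.tag === "div" && node.classes.includes("tabs")) {
      for (const child of node.children) {
        if (child.tag === "div") found.push(child);
      }
    }
    // Keep walking: there may be more div.tabs elements further down.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return found;
}

const page = el("body", [], [
  el("div", ["tabs"], [
    el("ul", ["tabNavigation"]),
    el("div", ["first"]),
    el("div", ["second"]),
  ]),
  el("div", ["footer"], [el("div", ["note"])]),
]);

console.log(tabContainersByHand(page).map((d) => d.classes[0]));
// prints [ 'first', 'second' ]
```

That is about fifteen lines to replace fourteen characters, and it still doesn't handle IDs, attributes or pseudo-classes like :first.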

But then, what about CSS property lists? Those can be used separately from selectors too. In fact, considering that you use selectors separate from property lists, and property lists separate from selectors, most of CSS appears to just be the sum of two completely separate little languages, rather than one single language!

Thusly

You begin to see how it doesn't affect my sanity so much that I supposedly spend all day working in the same one or two languages. My sanity is intact because, in fact, I don't. The little languages are here, keeping me company, walking by my side.

XPATH, I CHOOSE YOU!

I'm Jesse Millikan, and I'm a language nut. Even, apparently, at work.

This entry was written by Administrator, posted on Tuesday, October 26, 2010.