Category Archives: Identifiers

Test support for short URL auto-discovery

Short URL autodiscovery sounds like a good idea to me. The easiest way for a developer to integrate this mechanism is to use the link element in your document’s <head>, e.g:

<link rel="shorturl" href="http://irama.org/673" />

Or if you are using WordPress there are few handy plugins that do this automatically (for example: Short link maker and Shorter links).

So what?

But who’s going to know you’ve gone to all this trouble? How far has support for auto-discovery penetrated? I think it would be great if social media client applications would offer my short URL to users when they choose to shorten URLs within their posts, but I can’t find much documentation about which if any apps will pick up and use short URLs exposed in this way.

Go forth and test

This post can serve as a test, and a collection point for data, leave a comment if you have tested a new client application (include the version if known), and I’ll do the same.

Posted in

Slightly nicer URLs

As we know, all unique online resources should be addressable with a unique URL.

However, not all URLs were created equal. Some URLs are “nicer” than others. For example, URLs with query string parameters are often considered to belong to the “not so nice” URL category:
http://example.com/?p=1234&vH=10&Session_ID=er5DKJn838JK2dfs

In general, what I consider to be “nice” or “not so nice” URLs is a lengthy topic, and I’ll only touch on part of it today. Suffice to say, that for some purposes, I believe using query string parameters is not the worst crime you can commit. In fact, in some cases, I believe they are perfectly acceptable.

Take the following URL for instance:
http://example.com/books/?format=html&order=alphabetical&page=2

. Although query string parameters mean this URL is a little tricky to read, at least it uses human-readable parameter keys and values. And because slashes
/

in URLs imply heirarchy, the only good alternative for this type of URL would be a Matrix URL, like this:
http://example.com/books/;format=html;order=alphabetical;page=2

.

Implementing Matrix URLs within web applications can be difficult, requiring extra server-side redirects or client-side trickery because by default, a HTML form won’t submit data formatted as a Matrix URL.

That’s why I believe query strings aren’t so bad, sometimes they really come in handy.

Repeated parameters

That said, when using checkboxes (or heaven-forbid) multi-select controls to submit data using the GET method, some server-side languages (like PHP) require that you add [] to the end of the name attribute of each control, for example: <input type="checkbox" name="items[]" value="item1" /><input type="checkbox" name="items[]" value="item2" />

For my money, this results in “not so nice” URLs, for example:
http://example.com/books/?items[]=item1&items[]=item2

I know it’s a subtle difference, but I much prefer:
http://example.com/books/?items=item1&items=item2

The other benefit is that your HTML wouldn’t need to contain the [] either: <input type="checkbox" name="items" value="item1" /><input type="checkbox" name="items" value="item2" />

A problem

The problem is, by default, if [] doesn’t appear in your URLs, only the last ‘items’ parameter will be accessible to PHP in the $_GET array.

A solution

I spent some time thinking about this, and decided the best thing to do would be to parse the URL myself.

/**
 * Returns query string parameters more intelligently from the URL than by using the $_GET array.
 * 
 * When multiple parameters are encountered with the same name, they are stacked into an
 * array. This means all URL data can be accessed without using brackets in name attributes
 * For example, typically you would use: <input name="items[]" /> resulting in &items[]=id1&items[]=id2
 * However, using this method you can use: <input name="items" /> resulting in &items=id1&items=id2
 * 
 * @author Andrew Ramsden
 * @see: http://irama.org/news/2009/10/17/slightly-nicer-urls/
 * @license GNU GENERAL PUBLIC LICENSE (GPL) <http://www.gnu.org/licenses/gpl.html>
 * 
 * @param String $url (optional) A URL to parse for query string variables. If not set, the 
 *        current requested URI will be parsed.
 * @return Array An associative array with all query string variables. Multiple parameters
 *         are stacked into a nested array.
 */
function getURLVariables ($url='') {
	
	$url = !empty($url) ? parse_url($url) : parse_url($_SERVER['REQUEST_URI']);
	$result = array();
	$queryStrParams = explode('&',$url['query']);
	
	foreach ($queryStrParams as $param) {
		$paramKeyVals = explode('=',$param, 2);
		
		if (!isset($paramKeyVals[0])) continue;
		
		$key = $paramKeyVals[0];
		$val = isset($paramKeyVals[1])?$paramKeyVals[1]:'';
		
		if (substr($key,-6) == '%5B%5D') { // support ugly urls too
			$result[substr($key,0,-6)][] = $val;
		} else if (!isset($result[$key])) { // add new param to the results array
			$result[$key] = $val;
		} else { // this param already exists, stack into an array
				if (is_array($result[$key])) {
					$result[$key][] = $val; // add to existing array
				} else {
					$result[$key] = array($result[$key], $val); // create new array
				}
		}
	}
	return $result;
}

Now instead of using: $items = $_GET['items']; you can use $items = getURLVariables()['items']; and access all the data from your slightly nicer URLs.

Feedback appreciated, let me know what you think.

Posted in