The Power of the URL Service

Introduction

The "A Beginner's Guide to v2.1" article covers how to implement a very basic video Channel. It relied on URL Service support for URLs hosted by blip.tv. In this article, we're going to discover what a URL Service actually is and how powerful it can be! The actual blip.tv service is a little complicated for this short post, so instead we will focus on a nice simple one for the Euronews website.

A URL Service is essentially a mechanism for Plex to translate a given URL into the associated metadata and available media items. The specific framework documentation can be found here, but basically, as a developer, you are required to implement two specific functions:

MetadataObjectForURL(url)
Returns a metadata object for the given URL (VideoClipObject, 
MovieObject, EpisodeObject, TrackObject, PhotoObject)
MediaObjectsForURL(url)
Returns a list of MediaObjects which represent the streams 
available for the specific video, photo or music item.

This function is expected to execute and return very quickly. It 
should avoid making any HTTP request which could cause a delay.

There is also an optional function that should be implemented when a single site provides multiple URLs for the same video. This is as follows:

NormalizeURL(url)
Returns the standard normalised URL 

This function is expected to execute and return very quickly. It 
should avoid making any HTTP request which could cause a delay.
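
As an illustration only, a normaliser could strip the query string and fragment so that every variant of a link collapses to one canonical form. Whether Euronews URLs actually vary in this way is an assumption; the sketch uses plain string operations so that it never makes an HTTP request.

```python
def NormalizeURL(url):
    # Assumption: variants of the same page differ only by their query
    # string or fragment, so dropping both yields the canonical URL.
    url = url.split('#', 1)[0]
    url = url.split('?', 1)[0]
    return url
```

With this sketch, a link shared with tracking parameters and the plain link to the same page would both resolve to the same URL.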

Structure and ServiceInfo.plist

Let's start by looking at how these are defined within a plug-in. Plex Media Server needs to know some basic information about the implemented URL Service. All Service-related code is placed within a separate subfolder within the bundle called "Services". This folder has its own plist file, ServiceInfo.plist. The following picture gives a good illustration of the file hierarchy:

It should be pretty obvious that the service's code is contained within the ServiceCode.pys file. However, we should take a quick look over the information contained within the plist file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>URL</key>
  <dict>
    <key>Euronews</key>
    <dict>
      <key>URLPatterns</key>
      <array>
        <string>http://([^.]+\.)?euronews\.net/.+</string>
      </array>
      <key>TestURLs</key>
      <array>
        <string>http://www.euronews.net/nocomment/2012/01/04/high-winds-and-heavy-rain-lash-uk/</string>
      </array>
    </dict>
  </dict>
</dict>
</plist>

The initial key defines the name, and therefore the sub-directory under the URL folder, where the actual code resides. This can be essentially anything, but it is always best to keep it short but descriptive. The next point of interest is the URLPatterns element. This array of strings defines the regular expressions which represent the URLs supported by this specific service. Think of it as an easy way for Plex to work out which URL Service should be executed against a requested URL. Last, but not least, the TestURLs element defines a collection of URLs which can be used to test the service. This is extremely important in order to quickly determine when a URL Service breaks due to a site change, or other dependent changes.
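
To get a feel for how a pattern selects a service, the check can be reproduced with Python's standard re module. This is only a sketch: how Plex performs the match internally is an assumption, and matches_service is a hypothetical helper name.

```python
import re

# The pattern from ServiceInfo.plist, with the literal dots escaped.
URL_PATTERN = re.compile(r'http://([^.]+\.)?euronews\.net/.+')


def matches_service(url):
    # Hypothetical helper: True when the URL would belong to this service.
    return URL_PATTERN.match(url) is not None
```

Running the TestURLs entry from the plist through matches_service is a quick way to confirm that a pattern accepts the URLs you expect before loading the plug-in.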

Now that the service is correctly defined within the ServiceInfo.plist, we must create the necessary folder structure in which the code will reside. As seen in the above image, there are a number of sub-folders required that contain all associated Services, along with one specifically for URL Service.
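
If you prefer to script this step, the folders can be created with a few lines of Python. This is a rough sketch that assumes the usual bundle layout, in which the Services folder sits under Contents; the create_service_skeleton name and the bundle path passed to it are hypothetical.

```python
import os


def create_service_skeleton(bundle_path, service_name):
    # Assumed layout: <bundle>/Contents/Services/URL/<service_name>/
    # ServiceInfo.plist itself lives one level up, in Contents/Services.
    service_dir = os.path.join(
        bundle_path, 'Contents', 'Services', 'URL', service_name)
    if not os.path.isdir(service_dir):
        os.makedirs(service_dir)

    # Create an empty ServiceCode.pys unless one already exists.
    code_path = os.path.join(service_dir, 'ServiceCode.pys')
    if not os.path.exists(code_path):
        open(code_path, 'w').close()
    return code_path
```

The service_name argument should match the key used in ServiceInfo.plist ("Euronews" in this article).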

ServiceCode.pys

Once these have been created, the actual code file, named “ServiceCode.pys”, needs to be created. This file should contain all the code associated with your service, implementing all mandatory functions defined earlier. There are a few things we need to do in this file to get our URL Service up and running.

Metadata

Let's start by looking at an implementation to obtain the associated metadata:

def MetadataObjectForURL(url):
 
    # Request the URL
    page = HTML.ElementFromURL(url)
 
    # Extract the details available directly from the page.
    title = page.xpath("//head//meta[@property='og:title']")[0].get('content')
    description = page.xpath("//head//meta[@name='description']")[0].get('content')
    thumb = page.xpath("//head//meta[@property='og:image']")[0].get('content')
 
    return VideoClipObject(
        title = title,
        summary = description,
        thumb = thumb) 

As you can see, this is fairly basic. The Euronews website provides a number of video clips of topical news reports. You'll often find that some URL-specific information can be obtained from the HEAD of the HTML document, rather than through more complicated XPath queries into the body of the page. The functions available to you as a developer are identical to those you know (and love) in the framework. The common scenario is:

  • Make a request for the page
  • Extract information via XPath
  • Return the suitable metadata object

The amount of information available really depends on what's provided by the site. It's normal to have the title, description and thumb for a given item. However, sometimes obtaining further metadata can be a little bit more tricky but definitely worth it in the long run!
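
To experiment with this kind of extraction outside of Plex, the same meta tags can be read with Python's standard library. This is a sketch only: MetaCollector and extract_metadata are hypothetical names, and inside a real service you would keep using HTML.ElementFromURL and XPath as shown above. Unlike indexing the XPath result with [0], it returns None for a tag that is missing instead of raising an error.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class MetaCollector(HTMLParser):
    """Collects <meta> tags keyed by their property or name attribute."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        attrs = dict(attrs)
        key = attrs.get('property') or attrs.get('name')
        if key and 'content' in attrs:
            self.meta[key] = attrs['content']


def extract_metadata(html):
    # Returns the three fields used by MetadataObjectForURL; a field is
    # None when the page does not provide the corresponding tag.
    collector = MetaCollector()
    collector.feed(html)
    return {
        'title': collector.meta.get('og:title'),
        'summary': collector.meta.get('description'),
        'thumb': collector.meta.get('og:image'),
    }
```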

MediaObjects

Once we've got all of our nice juicy metadata, we can start looking at how the video is made available. The Euronews site actually provides a single FLV file which the embedded player utilises. This was simply found by viewing the source of a page, and searching for common video file extensions. There is only a single quality available, so we can only return a single MediaObject. Here's the code:

def MediaObjectsForURL(url):
    return [
        MediaObject(
            video_codec = VideoCodec.VP6,
            audio_codec = AudioCodec.MP3,
            container = 'flv',
            parts = [PartObject(key=Callback(PlayVideo, url = url))]
        )
    ]

You might wonder how to determine the correct video/audio codecs to report. Generally, the best application to use is a free program called MediaInfo. It will give you lots of information about the specific file. You'll often find that for one particular site, the video/audio codecs are always the same. You'll also notice from the above code that the parts define a PartObject for which a Callback is required. As you might remember, MediaObjectsForURL needs to return quickly and should not make any HTTP requests. So although we know that an FLV file will be available, we don't yet know where it is. The callback will only be executed if the user actively selects that video item to play. When they do, the PlayVideo function is called. Here's the code for that:

BASE_URL = 'http://video.euronews.net/'
RE_VIDEO_URL = Regex('videofile:"(?P<video_url>[^"]+)"')
 
def PlayVideo(url):
 
    # Request the URL
    page = HTTP.Request(url).content
 
    # The source of the page actually contains a link to the associated flv file. We can simply find
    # this by using a regular expression to find it. Then, we just redirect.
    video_url = RE_VIDEO_URL.search(page).group('video_url') + ".flv"
 
    return Redirect(BASE_URL + video_url)

This performs a basic HTTP request and uses a regular expression to locate the known FLV file. The use of the Regex class provides a pre-compiled version of the expression, similar to re.compile. Once the actual video URL is found, returning a simple Redirect will result in the client being redirected to the actual video file.
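
The extraction step can be tried on its own with the standard re module. In the sketch below, find_video_url is a hypothetical helper and the page fragment in the usage note is made up for illustration. It also returns None when the pattern is not found, a case in which PlayVideo as written above would raise an error because search() returns None.

```python
import re

BASE_URL = 'http://video.euronews.net/'
RE_VIDEO_URL = re.compile(r'videofile:"(?P<video_url>[^"]+)"')


def find_video_url(page_source):
    # Look for the videofile entry in the page source.
    match = RE_VIDEO_URL.search(page_source)
    if match is None:
        # The page layout may have changed, or the page has no video.
        return None
    return BASE_URL + match.group('video_url') + '.flv'
```

For a made-up fragment such as `videofile:"flv/2012/storm"`, the helper builds a URL under BASE_URL that ends in `flv/2012/storm.flv`.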

Testing Your URL Service

Once you're happy with the implementation and you want to try it out, it's very easy! You need to manually construct a URL Service lookup URL and, well, hit it. The Channel documentation covers this here. You basically take the following PMS URL and add an encoded version of the URL that you want to test. The easiest way to encode your test URL is to use some quick and easy online tool (one example):

http://localhost:32400/system/services/url/lookup?url=INSERT_TEST_URL
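
If you would rather not rely on an online tool, the encoding can also be done with Python's standard library. This is a small sketch; build_lookup_url is a hypothetical helper name, and the host and port are simply those from the lookup URL above.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

LOOKUP_URL = 'http://localhost:32400/system/services/url/lookup?url='


def build_lookup_url(test_url):
    # safe='' also encodes ':' and '/', so the whole test URL travels
    # as a single query parameter value.
    return LOOKUP_URL + quote(test_url, safe='')
```

Paste the returned string into a browser to see what the service gives back for that URL.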

So, that's basically it! We've managed to provide support for Plex to translate a URL from a specific site and convert it into the associated metadata and media content, allowing for a number of cool features. However, there is still more magic to explain.

TestURLs

Some of you might be thinking that the test URLs defined within the plug-in might expire. This happens frequently when content is only available for a limited amount of time. Tests can then begin failing because the test URL is no longer valid, rather than because the service is broken. Luckily, the framework also provides a mechanism to define the test URLs programmatically via a function, rather than statically. Here's a little example of how to do this:

def TestURLs():
    test_urls = []
 
    page = HTML.ElementFromURL('http://www.euronews.net/')
 
    for link in page.xpath("//span[@class='vid']/.."):
        if len(test_urls) < 3:
            url = link.get('href')
            url = "http://www.euronews.net" + url
 
            if url not in test_urls:
                test_urls.append(url)
        else:
            break
 
    return test_urls

All this is really doing is obtaining the main page for the site, and using XPath to find a few recent videos to be used. If you want to double check that this function has worked correctly, the easiest thing to do is to hit the Test URLs PMS URL to see what your plug-in has returned. You can either obtain the test URLs for all channels, or those for a specific one, using the following PMS URLs:

http://localhost:32400/:/plugins/com.plexapp.system/serviceTestURLs 
http://localhost:32400/:/plugins/com.plexapp.system/serviceTestURLs/com.plexapp.plugins.blip 

URL Services and Watch Later

OK, so creating test URLs might not be the sexiest thing to do, but there's one more thing! Plex's Watch Later feature can add items via a bookmarklet. The bookmarklet basically loads some JavaScript, which captures the current URL from the browser and sends it to the Plex servers. Once the server has that URL, how do you think it converts it to the metadata and media you access in the clients? That's right: URL Services!

Now there's only one small change to support Watch Later. Instead of Plex needing to look through all Channels available from the store, the URL Services are moved into a centralized Services.bundle. This code is available via GitHub here. It's as simple as copying the files that you've already written and moving them into the appropriate sub-directories. Fork the repo and give it a go!