PubForge Blog

September 24, 2008

PubMedia CMS feature request

Filed under: open source, content management, drupal — Jack Brighton @ 11:21 am

(This post began life as an email thread, but maybe needs to be more public so here it is.  Edited and expanded for obsessive clarity…)

It strikes me as somewhat simple (OK maybe not exactly simple) to develop a Drupal-based CMS with enough commonly-needed features for public radio/TV stations.  You’d have your pre-built data types, skinnable templates, forms, and possibly a set of pre-defined roles and workflows.  All nicely documented, etc.

But what we all really want is a system that knows about media files.  You could upload (or link to) a media object, and the CMS would extract its available metadata.  The system would then save that metadata in its database for processing and display in various ways.  On web pages where media is published, the system would display its media type, length, bitrate, framerate, whatever.  Then of course we’d be adding other metadata by hand, like title, subject, author, keywords, description, etc., as we add media content to the website.  Ideally, the system would be able to automatically read ID3, MXF, and EXIF metadata for both technical and descriptive information.  The idea is to automate the capturing of metadata as much as possible.
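To make the extraction idea a bit more concrete, here is a minimal sketch in Python (not the PHP a Drupal module would actually use). It assumes only the standard library, so it can read technical details from WAV files alone; ID3, EXIF, or MXF would each need their own parser. The function and field names are made up for illustration and are not part of any existing CMS.

```python
import mimetypes
import os
import wave


def extract_technical_metadata(path):
    """Return a dict of technical metadata read from the file itself."""
    mime_type, _ = mimetypes.guess_type(path)
    record = {
        "filename": os.path.basename(path),
        "filesize": os.path.getsize(path),  # bytes
        "mimetype": mime_type or "application/octet-stream",
    }
    # WAV is the one audio container the standard library can parse.
    # ID3, EXIF, or MXF would need a dedicated parser at this point.
    if path.lower().endswith(".wav"):
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            record.update(
                channels=wav.getnchannels(),
                sample_rate=rate,
                bit_depth=wav.getsampwidth() * 8,
                duration_seconds=frames / rate if rate else 0.0,
            )
    return record


def merge_descriptive(record, **descriptive):
    """Combine extracted fields with hand-entered ones (hand-entered wins)."""
    merged = dict(record)
    merged.update(descriptive)
    return merged
```

The point of the second function is the workflow described above: the machine fills in what it can, and a person adds title, subject, keywords, and so on afterwards.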

For web pages, we’d probably want to display mostly descriptive metadata, and not things like sampling rate, bit depth, color format, etc.

But for RSS feeds we need some of that technical metadata like filesize and mimetype.
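For example, a podcast-style RSS 2.0 item points at its media file with an `<enclosure>` element, which carries the file’s URL, its length in bytes, and its MIME type. Here is a rough Python sketch of building one from stored metadata; the function name and the example.org URL in the test are placeholders, not anything from our CMS.

```python
import xml.etree.ElementTree as ET


def build_rss_item(title, media_url, filesize, mimetype):
    """Build an RSS <item> whose <enclosure> uses stored technical metadata."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(
        item,
        "enclosure",
        {
            "url": media_url,
            "length": str(filesize),  # size in bytes, as a string attribute
            "type": mimetype,
        },
    )
    return ET.tostring(item, encoding="unicode")
```

If the CMS already captured filesize and mimetype at upload time, nobody has to look them up again when the feed is generated.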

And here’s the good part: If we capture enough information about our media objects, we can easily express it as “shareable metadata” via PBCore-compliant XML, and other standard schemas.  So the CMS becomes a powerful tool for creating a large index of public media.  We can then write applications to search that index at a very fine level of detail.  Think Technorati, only focused entirely on media objects expressed as detailed XML records.

At WILL we currently catalog media objects (as I call them) using our CMS, but there’s no automatic extraction of anything.  We have to key in all the data.  But once that’s done, the output looks like this:

http://will.illinois.edu/metadata/pbcore/pf2008-04-17-a

Seems to me this is the beginning of a system-wide super API that doesn’t depend on any central organization, and is truly open source.

Existing open source PHP functions for automated metadata extraction could be integrated in a Drupal-like CMS.  The PHP ID3 function allows for reading and manipulating ID3 tags; the PHP Exif Functions can extract all kinds of metadata from JPEGs and TIFFs.  Similar functions may already exist for video files.

If we have a CMS that understands how to read existing metadata from the digital objects we feed it, we’re half-way to building an online digital asset management system.  More on that in Part Two…

Jack Brighton

August 21, 2008

Stupid API Tricks

Filed under: open source, collaboration, content management — Jack Brighton @ 2:45 pm

What have we done with the new NPR API? This would be a good place for people to share examples of cool mashups and apps they’ve devised to tap NPR’s open content. Or to suggest ideas on which we could perhaps collaborate.

Here’s one of mine: What if I tagged my news stories, interviews, and other content with keywords based on the NPR taxonomy (or even just my own keywords) and when I publish that content on my website, it generates a query to the NPR API? I could have a widget sitting next to my content pulling in related stuff dynamically. As the NPR API expands its reach to other public media sources, each content entry on my site becomes an entry point to a growing universe of related content. Of course that universe might get pretty big, so what if we wrote a script that could then parse everything and generate a navigational structure based on the metadata returned with the results of the query? The query results would evolve over time, and so would the navigational structure.
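A back-of-the-napkin sketch of that idea in Python, with loud caveats: the endpoint URL, the `q` and `limit` parameter names, and the shape of the results (a list of stories, each with a `title` and a list of `topics`) are all assumptions for illustration, not the real NPR API.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Hypothetical placeholder endpoint, NOT the real NPR API address.
API_ENDPOINT = "https://api.example.org/query"


def build_related_query(keywords, limit=5):
    """Turn a story's keywords into a query URL (parameter names are assumed)."""
    params = {"q": " ".join(keywords), "limit": limit}
    return API_ENDPOINT + "?" + urlencode(params)


def group_by_topic(results):
    """Build a simple navigation structure: topic -> sorted story titles."""
    nav = defaultdict(list)
    for story in results:
        for topic in story.get("topics", []):
            nav[topic].append(story["title"])
    return {topic: sorted(titles) for topic, titles in sorted(nav.items())}
```

Rerunning the query as the index grows would give different results, and the grouping step would reshape the navigation to match.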

What if next we expand the range of sources to query based on the metadata of our initial content? Scientific and cultural institutions have large collections of content, many accessible through an API. Funding is increasingly premised on open collections and public research results. What if we tap into that, so a given media object can serve semantically and programmatically as merely a starting point to explore a growing web of deep knowledge and perspective?

Maybe that kind of language sounds cheesy or something, but it seems like a fun thing to me.  On the other hand, maybe let’s start with the NPR API and go from there…

Jack Brighton
WILL Public Media

The PBCore Saga: An Update

Filed under: open source, best practices, content management — Jack Brighton @ 11:30 am

Those of us consumed with passion about metadata for A/V objects (and who isn’t…) have been excited by the emergence of the PBCore. We present here an update.

In our last dramatic PBCore episode, CPB funded a multi-year project to develop a standard for shareable metadata about audio and video productions and files. This culminated in the release of the PBCore Data Dictionary and an associated XML schema, with Version 1.0 in April 2005, and an improved Version 1.1 in January 2007.

We’ll leave an actual description of PBCore for another time and place; meanwhile, you can get full details on the PBCore site.

It turns out PBCore is darn useful. Film archives, academic media collections, and media curators including the Library of Congress are actively pursuing systems that speak PBCore. Not to mention PBS, NPR, and a growing number of local stations. At recent AMIA conferences PBCore has been a central topic. PBCore has become relevant and possibly important to all moving image archivists, because it fills a black hole in the metadata universe concerning digital media.

Color us surprised when the initial CPB project to develop and support PBCore ran out of money last August. Forthwith, the principal developers at WGBH and elsewhere proposed a second phase, to establish a PBCore change-management process, plus funding to maintain the website, workshops, and other support activities. So far the response from CPB has been “PBCore who?”

What’s at stake? Considerable time and intellectual effort to develop a really good standard for A/V metadata, something everyone in the moving image community needs. Plus a certain (large) degree of credibility, because CPB was leading the PBCore effort and now we’re in some danger of abandoning PBCore. Like we somehow just forgot about it. With this project, Public Broadcasting has been a hero to the librarians and archivists, but it looks like we’re dropping the ball just when everyone wants to play.

So let’s some of us carry the PBCore torch for the next bit while pushing for further CPB action. We might have to get a bit militant. When someone ticks off the librarians, you don’t want to see what happens. Or maybe you do…

Jack Brighton
WILL Public Media

May 17, 2008

“Radio Engage” Collaboration Enlists Participation, Leverages Open Source

Filed under: open source, collaboration, content management — johntynan @ 5:14 am

Bill Haenel, Dale Hobson and Jack Brighton at Public Media 2008 (Photo Credit: John Tynan)

I’ve worked as a webmaster in public broadcasting for almost a decade. Over the last several years, I’ve seen a slow, pragmatic shift towards increased collaboration in online ventures between local public broadcasting stations and national organizations and producers, as evidenced in NPR’s Podcasting initiative, their relaunch of NPR Music, and the ongoing Election Collaboration. At the recent Public Media Conference in Los Angeles, Bruce Theriault recalled how he motivated national organizations to collaborate around the 2008 Election by saying, “we will only fund this project if there is collaboration across silos - and if it’s shared with stations.”

And, while this initiative has made great strides towards increased cooperation across numerous organizations, it is my opinion that we have yet to come into our own as a network. As Bruce Theriault also said, “we need to get out of the walled garden of public media and allow the public and other institutions a chance to play.” To a greater or lesser degree, these initiatives are still fairly centrally controlled and (aside from the NPR podcasting initiative and Public Media Metrics Project) have yet to truly leverage the unique character of public broadcasting as a distributed network in general, and the potential of an open source model of collaboration more specifically.

Imagine what we could accomplish if we leveraged the combined efforts of the fifty or so interested and capable web professionals all working at public broadcasting stations (not to mention the larger community of programmers and the general public, many of whom happen to love public media… a lot) who would welcome the opportunity to work together towards a number of shared solutions (many of which would have clear benefits to our audience directly).

With that in mind, two weeks ago, I sent out an email to a half-dozen of my colleagues explaining why it would be useful to begin collaborating around open standards, common practices, and a common software and scripting platform distributed through an open source license. My email went something like this. I proposed that we form:

1) A co-op for public broadcasters to share code - and costs - where we agree on a similar solar system of scripting resources and practices, where we build upon an existing codebase and (ideally) share our efforts among stations and with the open source community as well, where we can collectively raise money when needed to pay outside developers to tailor code to our needs, and where we are literally invested in the success of this venture and of each other’s sites.

2) Rather than relying on our own expertise alone to steer this ship, I propose we talk with a hosting provider or an organization like NPower or NTen or grassroots.org (which specializes in supporting non-profits with their technology needs) about providing hosting and (some of) the ongoing support. This way, we could focus on initiatives where we band together and leverage shared code and programming costs, and not have to rely on each other for the maintenance of the system.

Anyone who has gotten to know me over the years knows that this is my bailiwick (as evidenced by this post from last year’s conference). However, it turns out that this idea is not just important to me… or to a few of my friends… just recently…

The Knight Foundation awarded a $327,000.00 grant to Quiddities to develop an open source website and content management tool for KUSP as a model for public radio stations nationwide.

I’m sure the bright folks at the Knight Foundation and KUSP had given this idea a great deal of thought… and I know there are a ton of other excellent ideas percolating within public broadcasting right now as well… but I can’t help feeling like the guy who happened to step in front of the right parade at the right time. What I’m trying to say is this: I can’t take any credit for this grant, but I can say that I’ve seen it coming, and I could not be more delighted for us all!

With that in mind, as a first step in enlisting input from other stations on this project, Steve Laufer from KUSP got on the phone with Bill Haenel from the Integrated Media Association, Dale Hobson from North Country Public Radio, Jack Brighton from WILL, John Tynan (me) from KJZZ, and Matthew Tift from Wisconsin Public Radio to begin to discuss how we might work together on such a project and what first steps we would begin to take.

Some of the tasks that came out of today’s call were to:

  • Set up a wiki to generate and focus some specific questions about what people would want to see in an open source CMS for their radio or television station.
  • Create a survey to identify and prioritize features of the proposed CMS.
  • Identify the skills and interests of people wanting to be involved in this project.
  • Identify which existing projects people would be willing to contribute to this endeavor.
  • Identify how this could promote participation (and interoperability) between stations and national producers and our audience.

Please know that these initial impressions of the project are more personal than they are official. Aside from our conference call, I had only talked with Steve Laufer a few times between sessions at Public Media 2008. I have not been privy to the discussions between KUSP, Quiddities and the Knight Foundation. However, I know I’ve been thinking about this for a long time. I am sure that there are more than a handful of people (like me) to whom the principal parties can turn for assistance and who will be happy to devote their energies to the project’s success.

Cross posted on my personal site at johntynan.com.

April 2, 2007

Suggested Next Steps from IMA Presenter

Filed under: open source, content management, drupal — johntynan @ 9:37 am

Just got off the phone with Seth Gotlieb (formerly of optaros.com, now at contenthere.net), who presented at IMA2007 as part of the discussion on choosing a CMS.

Seth had some great advice that helped me form my thinking about how I should proceed as a technologist as well as how the folks rallying together at pubforge.org might best proceed as a group.

As someone who has built a good part of a station site using a particular brand of open source technologies (let’s say, I’ve chosen to drive our station around in the open source equivalent of a Ford), I will be facing a decision, given that there seems to be some considerable momentum in the Chevy camp. But now may not be the time to jump from one moving car to another, at least not yet.

Seth suggested that some good first steps would be for us to:

  • Identify a group of stations (or individuals) who are willing to work together around a specific technology or goal.
  • Arrange for a week-long training session for the group in a single physical location. Either decide as a group which city you would like to hold this in, or choose the city based on where the training is being held. (For Plone users, he suggested contacting Joel Burton about a Plone Bootcamp; for Drupal users, he suggested talking with Jeff Robbins at lullabot.com.)

He went on to say that getting together in the same place would:

  • be an indicator of commitment - those who would be willing to travel would be more invested
  • get us out of the office, allowing us to focus better
  • be an opportunity to forge bonds socially and increase networking opportunities

He suggested we identify which projects are currently in development (such as the Drupal station modules project, or find/start a broadcasting equivalent to the ploneforartists project). He suggested we identify which aspects of these projects we would like to see improved or added to. He suggested that we could achieve an economy of scale by either collaborating on code as a group, or by pooling our cash to pay for additions to the codebase.

He suggested that we check into the price points for training. If we have x number of participants, what will it cost us?

He suggested, in looking for people who would be willing to attend the training, that we should start with the folks who initially put the module together. For instance, the Drupal station modules were originally designed for KPSU, a college radio station in Portland, Oregon. Maybe this station would be a good place to start with a partnership, and then look outward from there.

I guess that leads to the question: is there a listing of folks from the latest IMA conference who were interested in using Drupal, Plone or Alfresco (or perhaps frameworks such as JBoss, Ruby, or Django, or even closed source CMSs, like Jack Brighton’s work with ExpressionEngine)? The list goes on. Do you think such a list should be put together at pubforge.org?

To get a better idea how these discussions might be beneficial to Seth in his work, I asked “what was in it for him?” He replied that he wanted to keep tabs on the progress of these initiatives, and that he would be interested in helping us form an organization, in helping us decide how such an entity would be structured, and in working out how we are going to go about making decisions. His emphasis is on identifying the requirements for a product, on product selection, and on enabling developers and companies to work together using collaborative techniques / open source tools. Perhaps we’ll draw on his expertise again further down the road?

Tags: beyondbroadcast, ima2007, opensourcebroadcasting, pubforge

P.S. I did not realize that the blog at pubforge.org was set up, so I had posted this at the old site at webresources.org. Here are some comments on this post that we’ll want to move over here as well:

Jeff Robbins Says:

I heard my name mentioned and I figured I’d say “hello!” Yes, if you’ve got any questions or need help, we’d be happy to point you in the right direction and/or help you out directly. I’ve got a lot of interest in audio and broadcasting and we use many of the modules that Andrew Morton has written for KPSU on Lullabot.com.

Tim Olson Says:
Perhaps there is an upcoming developer conference/training that is focused on one of those (Ruby on Rails, Plone..) that we could tie this to?

