PubForge Blog

June 1, 2009

How to: Get your content on the map via geoRSS

Filed under: Uncategorized, mapping, geotag, best practices, content management, How-to — Dale Hobson @ 2:09 pm

Bill Haenel of Haenel Communications Technologies has added geotagging functions to the main content modules of North Country Public Radio’s custom implementation of his open-source CMS, Public Media Manager. These features operate within both the news module and the events calendar module, adding precise geocoordinates to individual stories and events, derived from text addresses entered by the content creator.

These coordinates are exported from the CMS database into the site RSS feeds in “georss:point” format. By passing the geoRSS feed address as a search query to Google Maps, the feed items can then be rendered on a Google map that will update whenever a new item is added to the feed.
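As a rough illustration of "passing the feed address as a search query," here is a minimal Python sketch that builds such a link. The base URL and the `q` parameter name are assumptions for illustration; the post itself only says that the feed address goes in as the search query.

```python
from urllib.parse import urlencode

# Assumption: this endpoint and its "q" parameter are illustrative only.
MAPS_BASE = "http://maps.google.com/maps"


def map_url_for_feed(feed_url: str) -> str:
    """Return a map link that carries the feed address as its search query."""
    return MAPS_BASE + "?" + urlencode({"q": feed_url})


print(map_url_for_feed("http://www.northcountrypublicradio.org/topicalRSS.php"))
```

The feed address has to be URL-encoded, because it may carry its own query string (for example `?topic=hardchoices`).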

Why bother?

The internet is structured to serve communities of affinity much better than it serves communities of residence. A topical search term, for example, gives you a much better presentation of relevant search results than a geographical search term does. That is in part because content creators have traditionally had much better tools available to provide topical metadata, than they have had to provide location metadata. GeoRSS can provide that missing dimension. This is of increasing importance as traditional media continue to disinvest in local reporting. Geocoding resources also add an extra dimension to content syndication and collaborations where place may be as important as topic.

Some example maps:

North Country News Map: the latest twenty audio news features from North Country Public Radio. GeoRSS feed source: http://www.northcountrypublicradio.org/topicalRSS.php

A Year of Hard Choices Series Map: All the stories in an audio series on the economic impact of the recession in Northern New York. GeoRSS feed source: http://www.northcountrypublicradio.org/topicalRSS.php?topic=hardchoices

NCPR Community Calendar Map: All today’s events from the North Country Public Radio Community Calendar. GeoRSS feed source: http://www.northcountrypublicradio.org/upnorth/comcal/rss.php

Basic work strategy:

To get buy-in from content creators, the process had to be dirt simple from their end. On the news side, reporters were provided with an Ajax look-up tool as part of the story submission form. They enter a text-based address, such as “Canton, NY” or “80 E. Main St., Canton NY.” This queries the Google Maps API, returning latitude and longitude for the address. These coordinates are written into the story metadata in the news database when the form submits. On the calendar side, preexisting venue addresses were converted into latitude and longitude via a bulk query to the Google Maps API and were added to the venue table in the calendar database. A tool was then added to the venue creation form that would do the look-up as a background process upon creation of new event venues. In this way, content creators were not required to enter anything but a standard text address.

Feed modification:

To exploit the new geodata, the existing site feeds needed to be modified. The feed type had to be modified to reflect the geoRSS namespace. At the item level, the latitude and longitude had to be called out in valid georss:point form. And, of course, the feed needs to be valid RSS in all the usual ways as well.
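The two feed changes described above (declaring the GeoRSS namespace and adding a point to each item) can be sketched in Python as follows. This is not the Public Media Manager code; the function name, the item dictionary keys, and the sample coordinates (roughly Canton, NY) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
# Ask ElementTree to use the conventional "georss" prefix when serializing.
ET.register_namespace("georss", GEORSS_NS)


def build_feed(title, link, items):
    """Return an RSS 2.0 document (as text) with a georss:point per item."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # A georss:point is latitude, then longitude, separated by a space.
        point = ET.SubElement(node, "{%s}point" % GEORSS_NS)
        point.text = "%.6f %.6f" % (item["lat"], item["lon"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample item; the coordinates are approximate.
feed_xml = build_feed(
    "Example News Map",
    "http://example.org/",
    [{"title": "Village board meets", "link": "http://example.org/s/1",
      "lat": 44.5956, "lon": -75.1691}],
)
print(feed_xml)
```

The namespace declaration ends up on the root `rss` element, and each item carries its own point.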

Address to geopoint “on the fly”

Why use latitude and longitude in the database when the Google Maps API can do the look-ups on the fly? This can be done in theory, but runs into problems in application. If you are mapping only a few items and your traffic is low, this can work. We found that we quickly used up our query limit (15,000 queries per IP in a 24-hour period) at the Google Maps API using this method. Once the limit is reached, Google will not process the feed until the new 24-hour period begins. The better approach is to query the API once when the content is created. Thereafter, the feed item delivers latitude and longitude, which is read without further processing. Also, the number of lookups requested by a text-addressed feed hugely slowed the rendering of the map, and would often time out before completion.
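The "query once, then read from the database" approach might look something like this Python sketch. The table layout and function names are invented for the example, and `fake_geocode` is a hypothetical stand-in for the real API call so that the sketch runs offline.

```python
import sqlite3


def save_story(db, title, address, geocode):
    """Look the address up once, at creation time, and store the result."""
    lat, lon = geocode(address)  # the only lookup ever made for this story
    db.execute(
        "INSERT INTO stories (title, address, lat, lon) VALUES (?, ?, ?, ?)",
        (title, address, lat, lon),
    )


def feed_points(db):
    """Read stored coordinates for the feed; no geocoding happens here."""
    rows = db.execute("SELECT title, lat, lon FROM stories ORDER BY id")
    return [(title, "%.6f %.6f" % (lat, lon)) for title, lat, lon in rows]


# Hypothetical stand-in for the geocoding service; it records each call.
calls = []


def fake_geocode(address):
    calls.append(address)
    return (44.5956, -75.1691)  # approximate coordinates, for illustration


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE stories "
    "(id INTEGER PRIMARY KEY, title TEXT, address TEXT, lat REAL, lon REAL)"
)
save_story(db, "Village board meets", "Canton, NY", fake_geocode)
for _ in range(3):  # three feed requests, still only one lookup
    points = feed_points(db)
print(len(calls), points)
```

However many times the feed is requested, the lookup count stays at one per item, which is what keeps the query limit out of the picture.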

Untagged content:

We found that it was important to have a default location for when no specific location had been applied to the content. In our implementation, we use our headquarters location as the default. This way, all items in the feed have a valid georss:point tag.
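A minimal Python sketch of that fallback, assuming each item is a dictionary whose `lat` and `lon` may be missing. The default coordinates here are a placeholder, not NCPR's actual headquarters point.

```python
# Assumption: placeholder coordinates standing in for a headquarters location.
DEFAULT_POINT = (44.5956, -75.1691)


def point_for(item, default=DEFAULT_POINT):
    """Return 'lat lon' text for an item, falling back to the default."""
    lat, lon = item.get("lat"), item.get("lon")
    if lat is None or lon is None:
        lat, lon = default
    return "%.6f %.6f" % (lat, lon)


print(point_for({"title": "Untagged story"}))
print(point_for({"title": "Tagged story", "lat": 42.6526, "lon": -73.7562}))
```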

Multiple items at the same location:

Google Maps does not stagger map pins for multiple items at the same location. This means that only the newest item at the location can be delivered by clicking on the map. This is a practical problem for NCPR, where we might have multiple reports from our state capital correspondent all located as “Albany, NY.” This can be addressed in a number of ways. Applying different street addresses to each item will differentiate them on the map. In practice, I have been manually editing the coordinates, offsetting them by one minute of latitude or longitude in different directions. A better approach we are looking into is to introduce a random “fuzziness” factor into locations that do not have street addresses. This would provide a unique latitude and longitude to each item, exposing them as discrete pins on the map.
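The random “fuzziness” idea could be sketched in Python like this. It is only an illustration of the approach we are considering, not working site code; the function name, the `has_street_address` flag, and the one-minute default offset (1/60 of a degree, matching the manual edits described above) are assumptions.

```python
import random


def fuzz_point(lat, lon, has_street_address, max_offset_deg=1 / 60, rng=random):
    """Nudge city-level points by up to max_offset_deg in each direction.

    Points that came from a street address are returned unchanged.
    """
    if has_street_address:
        return lat, lon
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Two stories filed as "Albany, NY" (approximate coordinates) get distinct pins.
rng = random.Random(0)  # seeded only so the demo is repeatable
first = fuzz_point(42.6526, -73.7562, False, rng=rng)
second = fuzz_point(42.6526, -73.7562, False, rng=rng)
print(first, second)
```

The offset would need to be applied once, when the item is saved, and stored with it. Applying it each time the feed is built would make the pins wander around the map.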

More information:

For strategic implementation and other topics fuzzy enough for an English major, write to me: dale at ncpr dot org. For code and other deep geek: write to bill at hcomtech dot com.

April 11, 2009

Curating content

Filed under: content management, social media, open content — John McMellen @ 4:33 pm

Recalling a theme I heard throughout the Public Media Conference this year, I have been experimenting with a Google tool designed to tag and curate content. I have used Google Reader before, but never really thought it did anything useful that my Outlook didn’t do. Then I found the Shared Items feature. What’s neat about this is not that you can share interesting information with other Google Reader users, which you can, but that you can pull in RSS feeds as well as make note of any webpage using the Google Reader bookmarklet, tag individual items, and output the stream of information as a standard RSS feed that could be subscribed to by anyone, or even fed to another CMS or social media system. This seems like a very simple way to ingest just about any kind of interesting content (text, podcast, video, etc.) and aggregate it into a standard format with very little editing or coding. I think it is a great way for staff at a media organization to share items that they think might edify their audience, and since it produces a standard format, it could easily be integrated into the organization’s website or fed directly to subscribers.

 You can find an RSS feed of items that I have tagged as #publicmedia here.

April 10, 2009

How to: Use Twitter as part of the public media toolbox

Filed under: Uncategorized, best practices, content management, social media, How-to — Dale Hobson @ 1:46 pm

It took me a long time to warm up to Twitter. On first glance, it looked like a huge potential time-suck with little payback–one more social media platform to distract my audience from my main site. I’d see that question header “What are you doing?” and the answer was always the same–trying to figure out what this is good for. However, over time I began to see instances where it served my immediate needs as a web manager, and could serve my audience. A successful strategy to use Twitter will focus on places where those two converge.

Example 1: Tweeting the pledge drive

For years, my membership director had wanted a simple mechanism to project frequently-changing messages onto the home page during pledge drives–to update totals and new members, to promote special drawings and to point to special content. Twitter makes that easy, even in environments where non-web staff are not normally able to publish directly to the website. Here’s how:

  1. Open a Twitter account.
  2. Share log-in info with the person/people who will be posting. They can either post directly from the Twitter site (http://twitter.com/yourusername), or they can download a Twitter client such as twhirl that will allow them to post from their desktop or mobile device.
  3. Log in to Twitter. Select “Apps” from the menu at the bottom of the page. Select “Widgets” from the Applications page.
  4. Select “other” if the feed is going to display on your domain site.
  5. Choose either the Flash widget or the HTML widget. (You can control the appearance of the HTML widget using CSS)
  6. Copy the code for your selected widget and paste it into the page at your site where you want the feed to appear.

This same process could be used to make the home page available to a reporter covering a breaking news story, or to report quickly evolving content such as election returns.

Example 2: Syndicating existing news and blog content automatically via Twitter

Tweeting in real time is burdensome. However, you may have content from your site CMS and from blogs that can be automatically fed to Twitter via RSS feed. Here’s how (using twitterfeed):

  1. Open a Twitter account
  2. Open a twitterfeed account
  3. Log in at twitterfeed and select “go to my twitter feeds (or create a new one)”
  4. Select “Create new feed”
  5. Enter your Twitter username and password, and the URL of the RSS feed you want to send to Twitter. For best results at Twitter, configure the feed to show title only, to include the item link, and to shorten the link address.
  6. Select “Create”
  7. Repeat 4-6 to include multiple feeds.

Each time a new item enters your RSS feed/s, a tweet will go out containing the headline and link.
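For anyone curious what a service like twitterfeed is doing behind the scenes, here is a rough Python sketch of the "title plus link" formatting step. It does not call Twitter or twitterfeed at all; the function name and the sample feed are made up, and it assumes the links have already been shortened.

```python
import xml.etree.ElementTree as ET

TWEET_LIMIT = 140  # Twitter's character limit at the time of writing


def tweets_from_rss(rss_text, seen_links):
    """Return tweet text for feed items whose links have not been seen yet."""
    tweets = []
    for item in ET.fromstring(rss_text).iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if not link or link in seen_links:
            continue
        # Leave room for the link and one space; trim long headlines.
        room = TWEET_LIMIT - len(link) - 1
        if len(title) > room:
            title = title[: max(room - 3, 0)].rstrip() + "..."
        tweets.append(title + " " + link)
        seen_links.add(link)
    return tweets


# Hypothetical sample feed.
SAMPLE_RSS = """<rss version="2.0"><channel><title>News</title>
<item><title>Dairy farms face a hard year</title><link>http://example.org/s/1</link></item>
<item><title>School budget vote set</title><link>http://example.org/s/2</link></item>
</channel></rss>"""

seen = set()
first_pass = tweets_from_rss(SAMPLE_RSS, seen)
second_pass = tweets_from_rss(SAMPLE_RSS, seen)  # nothing new, so no tweets
print(first_pass, second_pass)
```

Keeping track of links already sent is what stops the same headline from going out every time the feed is polled.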

Example 3: Aggregating your content from multiple sources into one location using Twitter

One of the disadvantages of the Web 2.0 environment is that it can fragment your content and your audience across many locations. You may have a main news site, several blogs, multiple comment mechanisms, a Facebook page, a Flickr photo-sharing account, etc. Twitter provides a simple mechanism that can bring all that content together back on your site. It employs a combination of examples 1 & 2 above. Here’s how:

  1. Open a Twitter account (see Example 1. above)
  2. Open a twitterfeed account
  3. Add all the RSS feeds for the services you want to aggregate to twitterfeed (see Example 2. above)
  4. Get a Twitter widget (see Example 1. above)
  5. Paste the widget embed code where you want the aggregation to appear in your page.

Now, whenever anyone posts a news story, makes a blog post or comment, writes on your Facebook wall, uploads a photo, whatever–it will appear on your main site in a single location.

April 7, 2009

Jeffrey Zeldman on the future of open source, CSS, and CMSs

Filed under: open source, collaboration, best practices, content management, sustainability — Jack Brighton @ 4:13 pm

Jeffrey Zeldman, publisher and editor-in-chief of A List Apart, spoke to several interesting web development issues in a new video posted on Big Think. Though not as smart or celebrated as Zeldman, I’ve been thinking similar things: we’re at a point where open source tools, and open sharing of core ideas and code, enable any one of us to build great standards-compliant websites.

I was a bit floored at about 3:30 into this video when Zeldman reveals that A List Apart is moving from a Ruby on Rails CMS to ExpressionEngine.  As PubForge people may well know, I’m a big fan of EE and have been using it to build all my websites for the past two years.  Some in the open source community regard EE as a compromise in principles or something, because it’s produced, sold, and maintained by a company, so it’s not technically open source.  My feeling has always been that EE is based on open source technologies (Apache, MySQL, and PHP), has an open API, and a huge community of developers working collaboratively around it.  The company that produced EE, EllisLab, supports it fanatically and continues to improve the product. And it only costs $99 for nonprofits (plus potentially additional dollars for certain add-ons).  EE isn’t open source in the same way Red Hat Linux isn’t.

I just want to say without offending anyone that while the whole world seems to be drinking the Drupal Kool-Aid, I think there are other legitimate CMS choices.  It seems to me the bottom line is adhering to web standards, good information architecture, sustainability of URIs, and the ability to interoperate with other systems.  That might be a good place to start for building a truly smart online public media system, no matter what CMS is involved.

January 15, 2009

Notes on PubForge Conference Call, 1/15/2009

Filed under: open source, collaboration, best practices, content management — Jack Brighton @ 7:12 pm

Today Matthew, TC, and I had a nice phone conversation as part of the PubForge coordinating group.  In the absence of Bill and Dale, the agenda was a bit unfocused, but that led us down some interesting paths.

For one thing, we decided PubForge could be useful for publishing notes on a wide variety of topics, including software and tools we’re using or evaluating for possible use, and just stuff that might be cool, interesting, or whatever.  With that in mind, in no particular order, here are some notes on today’s call:

1) TC says she is being asked to develop some Flash animations for her site, apparently in the form of an animated logo.  We feel her pain, and hope her people will listen to PubForge when we say “don’t do it!” Flash can be great for certain things, but animated logos are no longer cool. We could also talk about accessibility here.

Cool things can certainly be done with Flash as an interface to content, for example I mentioned this page from The New York Times: http://www.nytimes.com/interactive/2009/01/15/us/politics/20090115_HOPE.html?hp.

2) A discussion of CRM software occurred, in which the name Convio figured prominently.  Jack said his view of Convio is favorable but limited.  Matthew said integrating it with Allegiance is problematic.  We all said this could be a useful discussion to have with more people, to find some good experiences and best practices.

3) TC mentioned a couple of webmasterly things that look really interesting:

  • An open source CMS called Concrete5 (http://www.concrete5.org/) she is working with.  We didn’t discuss lots of details about it, but on the surface it looks easy to use, and follows the MVC (model view controller) approach like Drupal.
  • A cool-looking CSS framework called Bluetrip (http://bluetrip.org/), enabling simple and valid multicolumn layouts, including support for the dreaded IE.

4) Jack mentioned an online video tool called Tubemogul (http://www.tubemogul.com/index.php) which allows you to upload a video to many of the top websites (including Google Video, MetaCafe, MySpace, AOL, Yahoo!, Revver, YouTube, etc) in one fell swoop.  It also gives you analytics.  I haven’t explored it yet, but PBS has adopted it for publishing its online video.  That either makes it cool or not-cool, I can’t keep track…

5) Finally, let me mention that I accepted the task of organizing a CMS Roundtable session at the IMA Conference next month. Which means I have to find panelists to talk about various CMSs.  So I’m looking for panelists with experience using Drupal, Plone, Joomla, and Rails.  Would also consider expanding the lineup to include other systems and frameworks, especially open source ones.  I invited TC to come talk about Concrete5, and I hope it gets her to the conference!

I think we might end up with a formal CMS Roundtable session, and another informal CMS discussion, because one session isn’t enough to cover the topic.  The informal CMS session will likely also include beer.

If I forgot something, I hope the other participants in today’s PubForge call will follow up.

Yours,

Jack

December 8, 2008

Metadata for Social Networking sites

Filed under: best practices, content management, social media — John McMellen @ 5:19 pm

Cross-posted from Publist

I thought I would throw this bit of information out there FWIW. I had been wondering how people were able to get such great meta information into the links and stories that they posted on Facebook, especially when it came to stories from news websites or blogs. I found the page on Facebook’s sharing service here http://www.facebook.com/share_partners.php. It listed several meta and link tags that help the share service work better. I added these tags

  • <meta name="title" content="page_title" />
  • <meta name="description" content="audio_description" />
  • <link rel="image_src" href="audio_image_src url (eg. album art)" />
  • <link rel="audio_src" href="audio_src url" />
  • <meta name="audio_type" content="Content-Type header field" />
  • <meta name="audio_title" content="audio_title (eg. song name)" />
  • <meta name="audio_artist" content="audio_artist_name" />

systematically in our CMS, modifying the code to pick up the relevant content from the database when the page is hit. It works really well, and I noticed that Digg uses the same format, leading me to believe that this is becoming an important technique for distributing our content. I apologize if this is old hat to you, but I thought that someone else might benefit from this little tidbit.
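Our CMS code is not shown here, but the general idea of filling those tags from a database record can be sketched in Python. The record keys and the function name are hypothetical; the tag names are the ones listed above.

```python
from html import escape


def share_tags(record):
    """Render the sharing meta/link tags for one audio page record."""
    metas = [
        ("title", record["title"]),
        ("description", record["description"]),
        ("audio_type", record["audio_type"]),
        ("audio_title", record["audio_title"]),
        ("audio_artist", record["audio_artist"]),
    ]
    links = [
        ("image_src", record["image_url"]),
        ("audio_src", record["audio_url"]),
    ]
    # Escape values so quotes or ampersands in a title cannot break the markup.
    lines = ['<meta name="%s" content="%s" />' % (name, escape(value, quote=True))
             for name, value in metas]
    lines += ['<link rel="%s" href="%s" />' % (rel, escape(url, quote=True))
              for rel, url in links]
    return "\n".join(lines)


# Hypothetical record, as it might come back from the database.
tags_html = share_tags({
    "title": 'Farms & "Hard Choices"',
    "description": "An audio report",
    "audio_type": "audio/mpeg",
    "audio_title": "Hard Choices, part 1",
    "audio_artist": "Example Station",
    "image_url": "http://example.org/img/story.jpg",
    "audio_url": "http://example.org/audio/story.mp3",
})
print(tags_html)
```

Escaping matters here because headlines routinely contain quotation marks and ampersands.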

September 24, 2008

PubMedia CMS feature request

Filed under: open source, content management, drupal — Jack Brighton @ 11:21 am

(This post began life as an email thread, but maybe needs to be more public so here it is.  Edited and expanded for obsessive clarity…)

It strikes me as somewhat simple (OK maybe not exactly simple) to develop a Drupal-based CMS with enough commonly-needed features for public radio/TV stations.  You’d have your pre-built data types, skinnable templates, forms, and possibly a set of pre-defined roles and workflows.  All nicely documented etc.

But what we all really want is a system that knows about media files.  You could upload (or link to) a media object, and the CMS would extract its available metadata.  The system would then save that metadata in its database for processing and display in various ways.  On web pages where media is published, the system would display its media type, length, bitrate, framerate, whatever.  Then of course we’d be adding by hand other metadata like title, subject, author, keywords, description, etc as we add media content to the website.  Ideally, the system would be able to automatically read ID3 tags, MXF, and EXIF metadata for both technical and descriptive information.  The idea is to automate the capturing of metadata as much as possible.

For web pages, we’d probably want to display mostly descriptive metadata, and not things like sampling rate, bit depth, color format, etc.

But for RSS feeds we need some of that technical metadata like filesize and mimetype.
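To make that concrete, here is a small Python sketch of deriving filesize and mimetype from a file and putting them into an RSS enclosure element. The function name and the example URL are assumptions, and the MIME type is guessed from the file extension only, which is cruder than what a real system would want.

```python
import mimetypes
import os
import tempfile
import xml.etree.ElementTree as ET


def enclosure_for(path, url):
    """Build an RSS <enclosure> from a local media file and its public URL."""
    mime, _encoding = mimetypes.guess_type(path)
    return ET.Element("enclosure", {
        "url": url,
        "length": str(os.path.getsize(path)),        # size in bytes
        "type": mime or "application/octet-stream",  # fallback when unknown
    })


# Demo with a throwaway file standing in for real audio.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "story.mp3")
    with open(sample, "wb") as handle:
        handle.write(b"\x00" * 2048)
    enclosure = enclosure_for(sample, "http://example.org/audio/story.mp3")

print(ET.tostring(enclosure, encoding="unicode"))
```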

And here’s the good part: If we capture enough information about our media objects, we can easily express it as “shareable metadata” via PBCore-compliant XML, and other standard schema.  So the CMS becomes a powerful tool for creating a large index of public media.  We can then write applications to search that index at a very fine level of detail.  Think Technorati only focused entirely on media objects expressed as detailed XML records.

At WILL we currently catalog media objects (as I call them) using our CMS, but there’s no automatic extraction of anything.  We have to key in all the data.  But once that’s done, the output looks like this:

http://will.illinois.edu/metadata/pbcore/pf2008-04-17-a

Seems to me this is the beginning of a system-wide super API that doesn’t depend on any central organization, and is truly open source.

Existing open source PHP functions for automated metadata extraction could be integrated in a Drupal-like CMS.  The PHP ID3 function allows for reading and manipulating ID3 tags; the PHP Exif Functions can extract all kinds of metadata from JPEGs and TIFFs.  Similar functions may already exist for video files.
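To show how little is involved in the simplest case, here is a Python sketch (not the PHP functions mentioned above) that reads the older ID3v1 tag, which is a fixed 128-byte block at the end of an MP3 file. ID3v2 tags, which most current files carry, live at the start of the file and need a fuller parser; the function name and demo data are made up for the example.

```python
import os
import tempfile


def read_id3v1(path):
    """Return ID3v1 fields from the last 128 bytes of a file, or None."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() < 128:
            return None
        handle.seek(-128, os.SEEK_END)
        tag = handle.read(128)
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }


def _field(value, width):
    """Pad a demo value to the fixed field width used by ID3v1."""
    return value.encode("latin-1").ljust(width, b"\x00")


# Demo: fake audio bytes followed by a hand-built 128-byte tag.
fake_tag = (b"TAG" + _field("Hard Choices", 30) + _field("Example Station", 30)
            + _field("News", 30) + _field("2009", 4) + _field("", 30) + b"\x0c")

with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "story.mp3")
    with open(demo_path, "wb") as handle:
        handle.write(b"\xff" * 500 + fake_tag)
    tags = read_id3v1(demo_path)

print(tags)
```

A CMS upload hook could call something like this and pre-fill the title and author fields, leaving staff to correct rather than retype.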

If we have a CMS that understands how to read existing metadata from the digital objects we feed it, we’re half-way to building an online digital asset management system.  More on that in Part Two…

Jack Brighton

August 21, 2008

Stupid API Tricks

Filed under: open source, collaboration, content management — Jack Brighton @ 2:45 pm

What have we done with the new NPR API? This would be a good place for people to share examples of cool mashups and apps they’ve devised to tap NPR’s open content. Or to suggest ideas on which we could perhaps collaborate.

Here’s one of mine: What if I tagged my news stories, interviews, and other content with keywords based on the NPR taxonomy (or even just my own keywords) and when I publish that content on my website, it generates a query to the NPR API? I could have a widget sitting next to my content pulling in related stuff dynamically. As the NPR API expands its reach to other public media sources, each content entry on my site becomes an entry point to a growing universe of related content. Of course that universe might get pretty big, so what if we wrote a script that could then parse everything and generate a navigational structure based on the metadata returned with the results of the query. So the query results would evolve over time, and so would the navigational structure.
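The last step, turning query results into navigation, might be sketched in Python like this. It leaves the API call out entirely and assumes the results have already been parsed into dictionaries with a `title` and a list of `topics`; those field names are my invention, not the NPR API's.

```python
from collections import defaultdict


def build_navigation(stories):
    """Group story titles by topic, with the busiest topics first."""
    by_topic = defaultdict(list)
    for story in stories:
        for topic in story.get("topics", []):
            by_topic[topic].append(story["title"])
    # Sort by number of stories (descending), then alphabetically.
    return sorted(by_topic.items(), key=lambda pair: (-len(pair[1]), pair[0]))


# Hypothetical parsed results.
results = [
    {"title": "Dairy prices fall", "topics": ["Economy", "Agriculture"]},
    {"title": "Plant closing announced", "topics": ["Economy"]},
    {"title": "Maple season begins", "topics": ["Agriculture", "Environment"]},
    {"title": "State budget talks", "topics": ["Economy", "Politics"]},
]
navigation = build_navigation(results)
for topic, titles in navigation:
    print(topic, len(titles))
```

Because the structure is rebuilt from whatever the query returns, it would evolve along with the results.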

What if next we expand the range of sources to query based on the metadata of our initial content? Scientific and cultural institutions have large collections of content, many accessible through an API. Funding is increasingly premised on open collections and public research results. What if we tap into that, so a given media object can serve semantically and programmatically as merely a starting point to explore a growing web of deep knowledge and perspective?

Maybe that kind of language sounds cheesy or something, but it seems like a fun thing to me.  On the other hand, maybe let’s start with the NPR API and go from there…

Jack Brighton
WILL Public Media

The PBCore Saga: An Update

Filed under: open source, best practices, content management — Jack Brighton @ 11:30 am

Those of us consumed with passion about metadata for A/V objects (and who isn’t…) have been excited by the emergence of the PBCore. We present here an update.

In our last dramatic PBCore episode, CPB funded a multi-year project to develop a standard for shareable metadata about audio and video productions and files. This culminated in the release of the PBCore Data Dictionary and an associated XML schema, with Version 1.0 in April 2005, and an improved Version 1.1 in January 2007.

We’ll leave an actual description of PBCore for another time and place, or get full details on the PBCore site.

It turns out PBCore is darn useful. Film archives, academic media collections, and media curators including the Library of Congress are actively pursuing systems that speak PBCore. Not to mention PBS, NPR, and a growing number of local stations. At recent AMIA conferences PBCore has been a central topic. PBCore has become relevant and possibly important to all moving image archivists, because it fills a black hole in the metadata universe concerning digital media.

Color us surprised when the initial CPB project to develop and support PBCore ran out of money last August. Forthwith, the principal developers at WGBH and elsewhere proposed a second phase, to establish a PBCore change-management process, plus funding to maintain the website, workshops, and other support activities. So far the response from CPB has been PBCore who?

What’s at stake? Considerable time and intellectual effort to develop a really good standard for A/V metadata, something everyone in the moving image community needs. Plus a certain (large) degree of credibility, because CPB was leading the PBCore effort and now we’re in some danger of abandoning PBCore. Like we somehow just forgot about it. With this project, Public Broadcasting has been a hero to the librarians and archivists, but it looks like we’re dropping the ball just when everyone wants to play.

So let’s some of us carry the PBCore torch for the next bit while pushing for further CPB action. We might have to get a bit militant. When someone ticks off the librarians, you don’t want to see what happens. Or maybe you do…

Jack Brighton
WILL Public Media

May 17, 2008

“Radio Engage” Collaboration Enlists Participation, Leverages Open Source

Filed under: open source, collaboration, content management — johntynan @ 5:14 am

Bill Haenel, Dale Hobson and Jack Brighton at Public Media 2008 (Photo Credit: John Tynan)

I’ve worked as a webmaster in public broadcasting for almost a decade. And over the last several years, I’ve seen a slow, pragmatic shift towards increased collaboration in online ventures between local public broadcasting stations and national organizations and producers, as evidenced in NPR’s Podcasting initiative, their relaunch of NPR Music, and the ongoing Election Collaboration. At the recent Public Media Conference in Los Angeles, Bruce Theriault recalled how he motivated national organizations to collaborate around the 2008 Election by saying, “we will only fund this project if there is collaboration across silos - and if it’s shared with stations.”

And, while this initiative has made great strides towards increased cooperation across numerous organizations, it is my opinion that we have yet to come into our own as a network. As Bruce Theriault says again, “we need to get out of the walled garden of public media and allow the public and other institutions a chance to play.” To a greater or lesser degree, these initiatives are still fairly centrally controlled and (aside from the NPR podcasting initiative and Public Media Metrics Project) have yet to truly leverage the unique characteristic of public broadcasting as a distributed network in general, and more specifically the potential of an open source model of collaboration.

Imagine what we could accomplish if we leveraged the combined efforts of the fifty or so interested and capable web professionals all working at public broadcasting stations (not to mention the larger community of programmers and the general public, many of whom happen to love public media… a lot) who would welcome the opportunity to work together towards a number of shared solutions (many of which would have clear benefits to our audience directly).

With that in mind, two weeks ago, I sent out an email to a half-dozen of my colleagues citing my reasons for why it would be useful to begin collaborating around open standards, common practices, and a common software and scripting platform distributed through an open source license. My email went something like this. I proposed that we form:

1) A co-op for public broadcasters to share code - and costs - where we agree on a similar solar system of scripting resources and practices - where we leverage upon an existing codebase and (ideally) share our efforts among stations and among the open source community as well. When needed, we can collectively raise money to pay outside developers to tailor code to our needs and - where we are literally invested in the success of this venture and of each others sites.

2) Rather than relying on our own expertise alone to steer this ship, I propose we talk with a hosting provider or an organization like NPower or NTen or grassroots.org (which specializes in supporting non-profits with their technology needs) about providing hosting and (some of) the ongoing support. This way, we could focus on initiatives which we could band together and leverage shared code and programming costs and not have to be reliant on each other for the maintenance of the system.

Anyone who has gotten to know me over the years knows that this is my bailiwick (as evidenced by this post from last year’s conference). However, it turns out that this idea is not just important to me… or to a few of my friends… just recently…

The Knight Foundation awarded a $327,000.00 grant to Quiddities to develop an open source website and content management tool for KUSP as a model for public radio stations nationwide.

I’m sure the bright folks at the Knight Foundation and KUSP had given this idea a great deal of thought… and I know there are a ton of other excellent ideas percolating within public broadcasting right now as well… but I can’t help feeling like the guy who happened to step in front of the right parade at the right time. What I’m trying to say is this, I can’t take any credit for this grant, but I can say that I’ve seen it coming, and I could not be more delighted for us all!

With that in mind, as a first step in enlisting input from other stations on this project, Steve Laufer from KUSP got on the phone with Bill Haenel from the Integrated Media Association, Dale Hobson from North Country Public Radio, Jack Brighton from WILL, John Tynan (me) from KJZZ, and Matthew Tift from Wisconsin Public Radio to begin to discuss how we might work together on such a project and what first steps we would begin to take.

Some of the tasks that came out of today’s call were to:

  • Set up a wiki to generate and focus some specific questions about what people would want to see in an open source CMS for their radio or television station.
  • Create a survey to identify and prioritize features of the proposed CMS.
  • Identify the skills and interests of people wanting to be involved in this project.
  • Identify what existing project people would be willing to contribute to this endeavor.
  • Identify how this could promote participation (and interoperability) between stations and national producers and our audience.

Please know that these initial impressions of the project are more personal than they are official. Aside from our conference call, I had only talked with Steve Laufer a few times between sessions at Public Media 2008. I have not been privy to the discussions between KUSP, Quiddities and the Knight Foundation. However, I know I’ve been thinking about this for a long time. I am sure that there is more than a handful of people (like me) to whom the principal parties can turn for assistance and who will be happy to devote their energies to the project’s success.

Cross posted on my personal site at johntynan.com.
