March 14, 2008

Embedded Journalism

I want you to place the text of this blog post on your own site. But I don't want you to do it just by copying and pasting it into your own blogging tool. I think there might be a different way to do it.

Now, I probably obsess over embedded objects and copying and pasting even more than most geeks. When I attended the recent Graphing Social Patterns conference, one of my great frustrations was that people were talking about platforms like Facebook and OpenSocial and MySpace and widgets, but leaving out fundamentals like copy and paste. It's a basic capability, but none of these platforms addresses even basic interoperability for the applications that are built on top of them.

I don't know how we get there; I've written in the past about Live Clipboard, Ajax Linking and Embedding, and more.

Despite all these developments, what's actually taken off with real users is the plain old browser and operating system's copy-and-paste, combined with <embed> or <script> tags to pull in content from other sites. It's powered the rise of YouTube and many of the biggest widget providers. (APIs are of course a big part of this, too; Flickr and Delicious propagated themselves by posting directly to blogs using standard APIs.) But regular people on the web have settled on copying inscrutable, nonstandard HTML markup as a pretty effective way of getting the functionality they want.

But we've only been using this stuff for the most complicated parts of the web, like rich media. What about text?

My blog is mostly text, with some bits of video and images embedded. So, I've created a JavaScript embed tag at the bottom of every post on my blog, to let you embed the title, an excerpt of the post, and a list of commenters on the post in your own blog or site.
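
To make that concrete, here is a rough sketch of what the server side of such an embed might generate. It is written in Python purely for illustration and is not the actual Movable Type template; the function name `render_embed_js`, the `embedded-post` class, and the example.com URL are all made up. The idea is that a host page pulls this output in with a script tag, and the script writes the title, excerpt, and commenter list into the page.

```python
import html
import json


def render_embed_js(title, permalink, excerpt, commenters):
    """Return JavaScript source that writes a post summary into the host page."""
    # Escape every value for HTML first, so post text cannot inject markup.
    names = "".join(f"<li>{html.escape(name)}</li>" for name in commenters)
    markup = (
        '<div class="embedded-post">'
        f'<h3><a href="{html.escape(permalink, quote=True)}">{html.escape(title)}</a></h3>'
        f"<p>{html.escape(excerpt)}</p>"
        f"<ul>{names}</ul>"
        "</div>"
    )
    # json.dumps yields a valid JavaScript string literal for the markup.
    return f"document.write({json.dumps(markup)});"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_embed_js(
        "Embedded Journalism",
        "https://example.com/embedded-journalism",
        "I want you to place the text of this blog post on your own site.",
        ["Commenter One", "Commenter Two"],
    ))
```

Because the script is regenerated whenever the post or its comments change, every page that embeds it picks up the new version on its next load.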

What use is that? I have no idea. Obviously, you could copy and paste the raw text to excerpt it. And certainly, pulling in a JavaScript file from my site to live on your site means you've got to trust my content, unless it's sandboxed somehow.

But there seems to me to be something really interesting, some kind of potential, to including our posts (or parts of our posts) in other blogs that way, and while I'm no great coder, making the Movable Type templates to do this took about five minutes. I'm hoping something even more interesting comes from the world of compound objects or compound embeds, with a text post containing a video clip or image, and then being included on another page.

So: Has someone done this before? I've made blog templates that output widgets before, but what if we assume every blog post is a widget? How could we address the security issues? What does it mean that the included text and content can be updated remotely? What purpose does this serve, or is it just a really complicated way of copying and pasting text?

1 TrackBack

links for 2008-03-25 from One Man and His Blog on March 25, 2008 4:22 AM

Teaching Online Journalism » Happy newsrooms, sad newsrooms A very positive take on moving journalists into the digital age - help, mentor and enthuse, don't impose. (tags: journalism journalists digitaljournalism training mentoring) Anil Dash: Embedd...

28 Comments

This just sounds like REST and offering up HTML/JSON/XML/JS equivalents based on the format requested.

I think readers get a better feature set from your previous recommendation: http://happynetbox.com/

It's an interesting thought process, at least. No use to us LJers, though.

I've always been sad that reblogging never really took off. Every time I share something in Google Reader I wish I could add a little note to it, which is really only a short step away from publishing that to a blog.

Some RSS readers are starting to include blog comments in the interface. I think in a lot of ways the "interface" for a reader has been standardized and maybe they will all focus a lot more on added social functionality.

I would die if I could highlight text in Google Reader and hit reblog and have some pre-formatted quote ready for me to make snarky remarks about.

haven't blog aggregator sites of various sorts been doing this for years? i'm thinking of things like damnhellasskings.com and such, though there are lots of others...

This is how the reblog feature on tumblr works, isn't it? Granted it's only of use to people using tumblr, but still, it's out there and happening.

Ted Nelson called this idea "transclusion" and it was one of two essential properties of Xanadu that distinguished it from the Web. The key idea here is that you're not including a stale copy of old text, you're including a live copy of whatever the other site has on it at the moment. We do this with images all the time, with occasional amusing results, but as you say it's not really done with text on the web. It's an interesting idea, particularly with dynamic content at fixed links like Wikipedia articles. It's also similar to what people do when they include RSS feeds in little boxes on another page.

The other essential Xanadu property that the Web lacks is bidirectional links. Unfortunately one non-essential property of Xanadu was actually building something anyone would use, so both of the other ideas now languish in obscurity. You might find this 1995ish paper "Where World Wide Web Went Wrong" amusing, or at least breathtaking in its misplaced arrogance. http://www.aus.xanadu.com/xanadu/6w-paper.html

Is this about that little girl on the bike?

My mammalbrain amplified your point about not trusting the outside world's sources with respect to security...

That said, it feels to me like we've been dancing around this (Semantic-Web-ish) inclusion concept for years. In 2004 Simon Willison showed me the output of a widget he built in ad-hoc collab with Tim Bray wherein individual sections or paragraphs of posts are automatically assigned anchors, the point being to make these components easily linkable.

This post tackles the same idea from a different bearing.

(Jebus on a pogo stick, I need more coooffeee...)

The "holy grail" would manifest the following characteristics:

- excerpts at a more or less arbitrary level of detail
- tagging facility in arbitrary context (own site, del.icio.us account, technorati, etc.)
- the option to copy desired excerpts over to one's own system per license (to conserve resources and avoid UX breakage)
- PKI mechanisms for verifying the trustworthiness of a resource (straight out of the def'n of the Semantic Web)

For both the author and the excerpting party, there would be a need to tag material not only in relation to coverage, but also context. Verbiage about, say, RSS would have touched on matters of implementation if written five years ago, but is more likely to touch on usability or versatility now. And in the event that a stretch was excerpted, an automated mechanism for notifying the original author of the context to which the excerpt was assigned would be present in the most useful implementations or platforms.

At the very least, such a sub-medium (where properly used) would make newsgathering simple beyond (current) belief, regardless of whether you're a reporter or a reader.

"I want you to place the text of this blog post on your own site. But I don't want you to do it just by copying and pasting it into your own blogging tool."

Do you not want people to just copy it because of possible lack of attribution and Google's inability to intelligently handle copied content?

It'd be ideal if the copying technique worked with search, e.g., I could copy your post into my site, and search for it in the context of my site.

The thing about "real" transclusion is that the transcluded text / object is literally in more than one place at a time, and that allows one to add more context to things, e.g., someone could put your post in the context of a page about transclusion-like techniques, and it literally would be part of the page.

But, your idea of the blog as a sharable widget is a good idea.

Brad Neuberg (of Dojo.Offline and others, now with Google) has been working on a web version of transclusion called Purple Include. Essentially, it lets you include a normal HTML tag with a link to an external resource. You can reference a portion of the page by simply indicating the start and end text you want to match (or you can also use XPath to select specific portions of the document). Here is some info on the idea for those who have not heard of it: http://codinginparadise.org/weblog/2007/12/straw-man-proposal-for-purple-include.html
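
The start-and-end matching described here can be sketched in a few lines. This is only an illustration of the general technique on plain text, not Purple Include's actual code or syntax; the function name `extract_between` is made up, and a real implementation would work on the fetched page's markup rather than a bare string.

```python
def extract_between(text, start, end):
    """Return the span of text from `start` through `end`, or None if absent."""
    begin = text.find(start)
    if begin == -1:
        return None
    # Look for the end marker only after the start marker.
    finish = text.find(end, begin + len(start))
    if finish == -1:
        return None
    return text[begin:finish + len(end)]
```

For example, calling `extract_between(page_text, "The key idea", "at the moment.")` would return just that stretch of the page, or `None` if either marker has since been edited away, which is itself a useful signal that the source changed.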

I love the idea of making every blog a widget. For the person who likes to share things, it's great. For the ultimate reader, it's convenient.

Interesting thought. It would also help Google to find out where the source of a bit of text is, as Google won't associate embedded content with another domain, but rather, just index the source domain. On the other hand, this leaves the blog open to abuse; just imagine someone whose content you disagree with who then goes and changes their embedded content, dynamically, to either shock pictures... or to a text that is subtly altered from the original you were citing.
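
One modest way to at least notice that kind of change, sketched here as an assumption rather than anything the embed described in the post does: record a fingerprint of the text at the moment you cite it, then compare later. The function names are hypothetical, and this only detects an edit; it does not prevent one.

```python
import hashlib


def fingerprint(text):
    """Return a SHA-256 hex digest of text with whitespace runs collapsed."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(current_text, recorded_fingerprint):
    """Return True if the text no longer matches the recorded fingerprint."""
    return fingerprint(current_text) != recorded_fingerprint
```

Collapsing whitespace before hashing means a reflowed paragraph still matches, while any change to the words themselves does not.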

Do you give me permission to do so on my site? Since I am running Drupal, I have a module that can incorporate your feed into my blog using your RSS --and it will provide a linkback to your site and the actual permalink of each post.

The problem though is that either your way or my way wouldn't allow for fisking vis-a-vis quoting. Two entirely different blogging techniques.

I proposed a similar embed idea, not for blog posts but for images, about a year and a half ago after I got into hot water with Time-Warner over some images of Brangelina and their new baby Shiloh.

It's thoroughly stupid for image agencies to not offer embedables of their images. Watermarking makes their images look like crap and really, is a picture of George Clooney coming out of a car worth $150 an impression?

It's stupid and ridiculous to try to impose old market standards of pricing photography on their digital distribution ... but we knew that with mp3s.


Going back to the EMBED idea though, the issue here is that sometimes I just want to quote a particular paragraph. So the embed really kills that purpose.

Leeching (the Drupal term) or aggregating your post through RSS as a blog post into my site is just another form of distribution and it's truly what syndication is all about. Again, it doesn't allow for actual quoting.

Quoting is a beautiful thing and people shouldn't be paranoid about people taking bits and pieces of their stuff as inspiration for their writing.

Which is why I consider the EMBED more logical for images and other media as opposed to actual text.

here's the article I wrote about EMBEDs

Will Splash News sue themselves out of business by suing Perez Hilton?
http://www.culturekitchen.com/liza/blog/will_splash_news_sue_themselves_out_of_business

Can you imagine how revolutionary it would be if Getty decided to treat most of their editorial photos as stock photography (btw, they bought iStockphoto) and either offered branded EMBEDs or used the same price scale?

They'd blow all the competition out of business.

When will these people learn?

embedded. put in blockquote though i see it came with a purple border... might want that to be customizable... at least, as i noted in my entry, it's not an autoplagiarism mechanism like we had with Radio.

Hi Anil,

I've recently done some related work for sharedcopy - putting commentary (instead of posts) as embeds.

This should provide more info - http://sharedcopy.com/public/widget

Would love your thoughts on it!

just to add that I do try to be conscious of crawlers, etc... thus there's an effort to add sensible blockquote as default, and upgrading to js/iframe when possible
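
A minimal sketch of that blockquote-first approach, assuming nothing about how sharedcopy actually implements it: the pasted snippet carries a plain blockquote that works everywhere, followed by a script that can replace it with something richer when JavaScript is available. The function name and the URLs in the usage note are placeholders.

```python
import html


def render_embed_snippet(quote, source_url, script_url):
    """Return HTML a host page can paste: a plain blockquote plus an optional script."""
    safe_url = html.escape(source_url, quote=True)
    return (
        f'<blockquote cite="{safe_url}">'
        f"<p>{html.escape(quote)}</p>"
        f'<p><a href="{safe_url}">Source</a></p>'
        "</blockquote>\n"
        # Readers and crawlers without JavaScript still see the blockquote above.
        f'<script src="{html.escape(script_url, quote=True)}" async></script>'
    )
```

Calling `render_embed_snippet("Some quoted text", "https://example.com/post", "https://example.com/embed.js")` returns markup in which the quote and the link back to the source survive even if the script never loads.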

http://www.tumblr.com handles reblogging between different tumblogs really well - it includes a link to the original source and who posted it, along with a history of how it was reblogged if you didn't take it from the original source yourself.

I believe that the limitations in traditional blogging only serve to illuminate a more fundamental challenge to the current model. In a traditional blog model, you make a reference to another blog or online post either via a link or, as you state, straight copy/paste of the original blog text. This may be ok, but it is out of context to the original article itself. And it introduces inefficiencies in creating blog discussions and tracking comments to those discussions. Hence, it is really difficult for the reader to fully comprehend context in the post.

There is a new service available by a social media startup called Bookgoo – www.bookgoo.com – that I believe addresses some of these shortcomings. Bookgoo is different from traditional blogging. First, both online and offline documents can be uploaded as files for commentary. Documents are uploaded unmodified and represent the starting point for the document conversation. Second, users are given tools to markup the document. A document uploaded by a user can be marked up with highlighting, annotations, and free hand scribbles. Third, users can collaborate with their documents in real time by sharing out their document and allowing other users to apply their own highlights and annotations to the document. Each user has their own virtual copy of the document and can apply their own notes and highlights. All user notes applied to the document can then be reviewed by toggling through each user and reviewing the individual markup. The more highlights and annotations users apply onto the document, the louder the conversation becomes.

It is an interesting model and one that I believe addresses some of the issues raised in this discussion. Thoughts?

Cool thing. It's the way to go. I wrote a blog post on reporTwitters' blog (about Twitter and journalism)
http://blog.reportwitters.com/2008/03/23/embeddable-newspaper-content-its-slow-take-off/

Wondering if you get any responses from the people that matter here: the newspaper bosses. In my experience, all is very slow. But people WILL see at some point. The very notion of 'copy and paste' has been refined into widgets and the like, presumably with issues like authors' rights in mind (but those are issues that only contribute to the unreal sense that 'online' tends to invoke; smoking-gun content would trigger upheaval, but the guns don't smoke so often in the blog-o-sphere, believe it or not). In my view, newspapers that successfully embed their content elsewhere (like those that distribute links to their breaking news stories on Twitter) are bringing a sense of reality to the outfit that embeds them. This is where the newspapers have a leading edge; their brand is 'reality'.

Thanks again for a great idea and I will subscribe to your blog.
Regards,
Angelique van Engelen

Isn't that what Scribd.com is for?

Hashimer! what the hell's a scribber? Coo, coo ca choo! Isn't scribd a repository looking to make fast money on the ignorance of users?

There are limitations with Scribd. And the same problem exists: you cannot have an in-context discussion over the document. Comments reside off to the side of the original doc. The Bookgoo experience has all comments on top of the original document or URL, right in context with the item being highlighted. It's cool and very compelling.

Hi Anil,

Just had a conversation with Matt Mullenweg and Om about Data Portability and Matt pointed me to this page. Strangely I just wrote about this in a more general form at http://tagschema.com/blogs/tagschema/2008/05/how-many-times-do-i-have-to-tell-you.html
This needs to be part of the HTML standard so that bare text can have a URL that can be invoked and rendered in-place, just as images can, without forcing the user to use an iframe. This sort of "include" mechanism needs to be widely standardized.
Good stuff.


isn't this the same as transclusion?

john
www.cutcaster.com

You may be interested in http://www.tynt.com/

The service allows you to know what's being copied from your site. Also when people copy/paste it inserts a link back to your site.
