Protocol Summary

Introduction

Conversations are becoming distributed and fragmented on the Web. Content is increasingly syndicated and re-aggregated beyond its original context. Technologies such as RSS, Atom, and PubSubHubbub allow a real-time flow of updates to readers, but this fragments the conversation: comments, ratings, and annotations increasingly happen at the aggregator and are invisible to the original source.

The Salmon Protocol is an open, simple, standards-based solution that lets aggregators and sources unify the conversations. It focuses initially on public conversations around public content.

A detailed specification for Salmon is available, along with a separate specification for the signature mechanism. Please refer to those specifications for the most up-to-date information.

Protocol Flow

A source provides an RSS/Atom feed of content. It includes a Salmon link in its feed:

<link rel="salmon" href="http://example.org/salmon-endpoint"/>

An aggregator reads the feed (ideally via a push mechanism such as PubSubHubbub), and sees from the link that it is Salmon-enabled. It remembers the endpoint URL for later use.
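
As a rough illustration of this discovery step, the following Python sketch scans an Atom feed for a feed-level link with rel="salmon" and returns its href. The function name find_salmon_endpoint and the sample feed are assumptions made for this example; they are not part of the specification.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def find_salmon_endpoint(feed_xml):
    """Return the href of the first feed-level rel="salmon" link, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall("{%s}link" % ATOM_NS):
        if link.get("rel") == "salmon":
            return link.get("href")
    return None


# Hypothetical feed used only for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <link rel="self" href="http://example.org/feed"/>
  <link rel="salmon" href="http://example.org/salmon-endpoint"/>
</feed>"""

# The aggregator would store this URL alongside the feed for later POSTs.
endpoint = find_salmon_endpoint(SAMPLE_FEED)
print(endpoint)
```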

When an aggregator's user leaves a comment on a feed item, the aggregator stores the comment as usual, and then also POSTs a salmon version of it to the source's Salmon endpoint:

POST /salmon-endpoint HTTP/1.1
Host: example.org
Content-Type: application/atom+xml

<?xml version='1.0' encoding='UTF-8'?>
<me:env xmlns:me="http://salmon-protocol.org/ns/magic-env">
    <me:data type='application/atom+xml'>
    PD94bWwgdmVyc2lvbj0nMS4wJyBlbmNvZGluZz0nVVRGLTgnPz4KPGVudHJ5IHhtbG5zPS
    dodHRwOi8vd3d3LnczLm9yZy8yMDA1L0F0b20nPgogIDxpZD50YWc6ZXhhbXBsZS5jb20s
    MjAwOTpjbXQtMC40NDc3NTcxODwvaWQ-ICAKICA8YXV0aG9yPjxuYW1lPnRlc3RAZXhhbX
    BsZS5jb208L25hbWUPHVyaT5hY2N0OmpwYW56ZXJAZ29vZ2xlLmNvbTwvdXJpPjwvYXV0a
    G9yPgogIDx0aHI6aW4tcmVwbHktdG8geG1sbnM6dGhyPSdodHRwOi8vcHVybC5vcmcvc3l
    uZGljYXRpb24vdGhyZWFkLzEuMCcKICAgICAgcmVmPSd0YWc6YmxvZ2dlci5jb20sMTk5O
    TpibG9nLTg5MzU5MTM3NDMxMzMxMjczNy5wb3N0LTM4NjE2NjMyNTg1Mzg4NTc5NTQnPnR
    hZzpibG9nZ2VyLmNvbSwxOTk5OmJsb2ctODkzNTkxMzc0MzEzMzEyNzM3LnBvc3QtMzg2M
    TY2MzI1ODUzODg1Nzk1NAogIDwvdGhyOmluLXJlcGx5LXRvPgogIDxjb250ZW50PlNhbG1
    vbiBzd2ltIHVwc3RyZWFtITwvY29udGVudD4KICA8dGl0bGUU2FsbW9uIHN3aW0gdXBzdH
    JlYW0hPC90aXRsZT4KICA8dXBkYXRlZD4yMDA5LTEyLTE4VDIwOjA0OjAzWjwvdXBkYXRl
    ZD4KPC9lbnRyeT4KICAgIA
    </me:data>
    <me:encoding>base64url</me:encoding>
    <me:alg>RSA-SHA256</me:alg>
    <me:sig>
    EvGSD2vi8qYcveHnb-rrlok07qnCXjn8YSeCDDXlbhILSabgvNsPpbe76up8w63i2f
    WHvLKJzeGLKfyHg8ZomQ
    </me:sig>
</me:env>
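
The me:data payload above is the Atom entry in base64url encoding. A minimal Python sketch of that encoding step follows. It assumes that whitespace inside the element is not significant and that trailing "=" padding may be absent; the Magic Signatures specification is the authority on both points, and the helper names are ours.

```python
import base64


def b64url_encode(raw: bytes) -> str:
    """Encode bytes with the URL-safe base64 alphabet ('-' and '_')."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, tolerating line wraps and missing padding."""
    compact = "".join(text.split())                 # drop spaces and newlines
    padded = compact + "=" * (-len(compact) % 4)    # restore '=' padding
    return base64.urlsafe_b64decode(padded)


# Hypothetical entry used only for this example.
entry = (b"<entry xmlns='http://www.w3.org/2005/Atom'>"
         b"<content>Salmon swim upstream!</content></entry>")

encoded = b64url_encode(entry)
# Wrap the text the way the envelope example does, then decode it again.
wrapped = "\n    ".join(encoded[i:i + 70] for i in range(0, len(encoded), 70))
assert b64url_decode(wrapped) == entry
```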

The source responds to the salmon with standard HTTP status codes: 2xx for success, 4xx for a problem with the input, and 5xx for a source or server error. The usual result is for the salmon to be published along with other comments on the source's web page. Note that sources are not obligated to publish the salmon; they may instead moderate them, filter them as spam, or aggregate or analyze them. However, if the source does publish the salmon in a comment feed, it has to maintain certain fields to make the protocol work end-to-end.
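
On the aggregator side, the status classes above could be interpreted along these lines. The function name and the outcome labels are assumptions chosen for illustration; what an aggregator does with each outcome (for example, whether it retries after a 5xx) is left open here.

```python
def interpret_salmon_response(status: int) -> str:
    """Map the source's HTTP status code to a coarse outcome label."""
    if 200 <= status < 300:
        return "accepted"        # the source took the salmon
    if 400 <= status < 500:
        return "rejected"        # something was wrong with the input
    if 500 <= status < 600:
        return "source-error"    # the source or server failed
    return "unexpected"          # anything else, e.g. a redirect


for code in (200, 400, 503):
    print(code, interpret_salmon_response(code))
```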

Source Republishing Requirements

These requirements kick in if a source republishes salmon alongside native comments, and are intended as traffic signals to ensure smooth operation of the protocol for everybody.

Include the standard rel="comments" link in the original entry, pointing at all published comments, including salmon.
Publish only salmon whose signatures validate (see below).
Maintain and re-syndicate the me:provenance element so that it can be used for downstream re-validation.
If an aggregator sees a comment come back to it with a known atom:id or crosspost:source/id, it should use that for any necessary correlation or de-duping. (See the draft spec at martin.atkins.me.uk/specs/atomcrosspost.)
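
The correlation rule in the last item can be sketched as follows. Comments are modeled as plain dicts with hypothetical keys atom_id and crosspost_source_id standing in for atom:id and crosspost:source/id; a real aggregator would extract these values from the parsed feed.

```python
def correlation_key(comment):
    """Prefer the crosspost source id; fall back to the comment's own atom:id."""
    return comment.get("crosspost_source_id") or comment.get("atom_id")


def merge_comments(local, incoming):
    """Append incoming comments that do not match an already known id."""
    seen = {correlation_key(c) for c in local}
    merged = list(local)
    for comment in incoming:
        key = correlation_key(comment)
        if key is not None and key in seen:
            continue  # this is our own comment coming back from the source
        merged.append(comment)
        if key is not None:
            seen.add(key)
    return merged


local = [{"atom_id": "tag:aggregator-example.com,2009:cmt-1"}]
incoming = [
    # Republished by the source under a new id, pointing back at ours.
    {"atom_id": "tag:example.org,2009:c-9",
     "crosspost_source_id": "tag:aggregator-example.com,2009:cmt-1"},
    # A native comment left directly at the source.
    {"atom_id": "tag:example.org,2009:c-10"},
]
print(len(merge_comments(local, incoming)))
```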

The end result of the protocol is that sources and aggregators can co-operate to present a unified view of the global conversation around any topic represented by an RSS/Atom feed item.

User Experience

Users should be made aware of the publishing scope of the comments they leave. For some aggregators, this may be implied (all data is public), for others a warning or a checkbox may be necessary. We suggest enabling Salmon only when the original content is itself publicly visible. For simplicity, Salmon does not attempt to deal with private data or distributed access control, though these can be addressed in future extensions.

Security and Abuse Prevention

A major concern with this type of distributed protocol is how to prevent spam and abuse. Salmon provides building blocks to allow in-depth defense against attacks. Specifically, every salmon has a verifiable author and user agent. The basic security flow when salmon swims upstream looks like this:

1. aggregator-example.com: "Here is a salmon, authored and signed by 'acct:johndoe@aggregator-example.com'; please accept it."
2. Recipient: Uses LRDD/Webfinger/XRD to discover the IdP for acct:johndoe@aggregator-example.com, which turns out to be owned by aggregator-example.com.
3. Recipient: Verifies the signature using the retrieved public key, which checks out. "Since the signature verifies, I'll accept this salmon!" (Returns HTTP 200 to aggregator-example.com.)
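
The verification in the last step is an RSA signature check. The sketch below verifies an RSASSA-PKCS1-v1_5 signature with SHA-256 using only the Python standard library. It assumes that the RSA-SHA256 algorithm named in the envelope denotes this scheme, that the caller has already discovered the signer's public key (modulus n and exponent e), and that the caller passes in the exact bytes that were signed, as defined by the Magic Signatures specification. Key discovery is not shown, and the function names are ours.

```python
import hashlib
import hmac

# ASN.1 DigestInfo prefix for SHA-256 (RFC 8017, section 9.2).
SHA256_DIGEST_INFO = bytes.fromhex("3031300d060960864801650304020105000420")


def emsa_pkcs1_v15_sha256(message: bytes, em_len: int) -> bytes:
    """Build the padded block 00 01 FF..FF 00 || DigestInfo || SHA-256 hash."""
    t = SHA256_DIGEST_INFO + hashlib.sha256(message).digest()
    if em_len < len(t) + 11:
        raise ValueError("modulus too short for SHA-256")
    padding = b"\xff" * (em_len - len(t) - 3)
    return b"\x00\x01" + padding + b"\x00" + t


def rsa_sha256_verify(message: bytes, signature: bytes, n: int, e: int) -> bool:
    """Return True if signature is valid for message under public key (n, e)."""
    k = (n.bit_length() + 7) // 8            # modulus length in bytes
    s = int.from_bytes(signature, "big")
    if len(signature) > k or s >= n:
        return False
    recovered = pow(s, e, n).to_bytes(k, "big")
    try:
        expected = emsa_pkcs1_v15_sha256(message, k)
    except ValueError:
        return False
    # Compare the whole padded block, not just the trailing hash.
    return hmac.compare_digest(recovered, expected)
```

A recipient would call rsa_sha256_verify with the signed bytes, the decoded me:sig value, and the key retrieved in step 2, and accept the salmon only if it returns True.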

The flow can get more complicated, especially if the aggregator is not also providing identity services for the user.

As a convenience, anyone can run a salmon validator service that does step 3 as a public service. Anyone who is willing to trust the salmon validator service can use it. So in the simplest possible case, depending on a validator service and not using OAuth to verify the sending service, the flow can be:

1. aggregator-example.com: "Here is a salmon, authored and signed by 'acct:johndoe@aggregator-example.com'; please accept it."
2. Recipient: "Hey validator, does this salmon check out?" Takes salmon and POSTs to its favorite validator, https://salmon-validator.example.com/.
3. Validator: "Yup, looks good!" Returns HTTP 200 to recipient.
4. Recipient: Returns HTTP 200 to aggregator-example.com.
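
The delegated flow above could look roughly like this on the recipient side. This is a sketch under assumptions: the validator is taken to accept the envelope as the POST body and to answer 2xx when the salmon checks out, as in the flow above; the Content-Type value is copied from the earlier example; and the choice of 400 and 503 for the failure cases is ours.

```python
import urllib.error
import urllib.request

# Validator URL taken from the flow above; any trusted validator would do.
VALIDATOR_URL = "https://salmon-validator.example.com/"


def post_to_validator(envelope: bytes, url: str = VALIDATOR_URL) -> int:
    """POST the envelope to a validator and return its HTTP status code."""
    request = urllib.request.Request(
        url,
        data=envelope,
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx answers arrive as exceptions


def handle_incoming_salmon(envelope: bytes, ask_validator=post_to_validator) -> int:
    """Return the status the recipient sends back to the aggregator."""
    if not envelope:
        return 400                      # nothing to validate
    try:
        validator_status = ask_validator(envelope)
    except OSError:
        return 503                      # validator unreachable (assumed mapping)
    if 200 <= validator_status < 300:
        return 200                      # validator vouches for the salmon
    return 400                          # validator did not accept it
```

Passing the validator call in as a parameter keeps the recipient logic testable without network access and makes it easy to swap the validator service for local signature verification.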

This requires only that the recipient understand the data format and have an HTTPS library. The service is simply a convenience, not a central mechanism; the actual validation is always done via the public key signature contained in the salmon element, using the mechanism described in the Magic Signatures specification. Thus recipients do not need to depend on the validator service.

The Salmon validation step is intended as a first line of defense that lets other reputation and rate-limiting mechanisms kick in. That is, it allows recipients to assign a fixed quota to authors, aggregators, and IdPs, and to block those who exceed reasonable limits; it allows them to build up reputations for all three, to whitelist and blacklist, and to federate as needed; and it allows third parties to double-check the results as well (if a source simply makes up salmon comments, they will not validate; if it tampers with comments, they can be correlated via IDs and the tampering detected and exposed).
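
One possible shape for such a quota is a per-identity counter over a time window, applied only after validation has established who the salmon comes from. The class below is an illustrative sketch; its name, the limit, and the window length are assumptions, and the same structure could be keyed by author, sending service, or IdP.

```python
import time
from collections import defaultdict, deque


class SalmonQuota:
    """Allow at most `limit` accepted salmon per key within `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.events = defaultdict(deque)   # key -> timestamps of accepted salmon

    def allow(self, key):
        now = self.clock()
        timestamps = self.events[key]
        # Forget events that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False                   # over quota: block or defer
        timestamps.append(now)
        return True


quota = SalmonQuota(limit=100, window_seconds=3600)
print(quota.allow("acct:johndoe@aggregator-example.com"))
```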

For flexibility and interoperability, salmon may be modified in reasonable ways before republishing. For example, truncating salmon to fit within a service's length limit, translating character set encodings, and even translating into another language would all be reasonable. In all of these cases, the me:provenance element will still contain the original data.

Generating Signatures

Signatures are generated per the Magic Signatures specification.

Activity Streams

Salmon is intended to be Activity Stream-compatible. Salmon endpoints should be able to accept Activity Stream activities as well as generic Atom-formatted comments; in fact, a generic Atom-formatted comment is also a valid activity in the current AS spec. Liking, rating, and linking to content all contribute to conversations. Note that an endpoint may not understand or accept all types of activities, and we would like to have a way for an endpoint to declare up front what kinds of activities it is prepared to accept (if for no other reason than to avoid bothering end users with checkboxes that won't do anything).

Other Formats

We believe there should be an alternative JSON format for salmon, and hope that we can simply adopt the Activity Streams JSON format.

Salmon supports RSS streams, and could also allow POSTing of an RSS formatted salmon. The exact details need to be worked out, but a natural representation would be:

POST /salmon-endpoint HTTP/1.1
Host: example.org
Content-Type: text/xml

<item>
    <description>Yes, but what about the llamas?</description>
    <pubDate>Wed, 03 Jun 2009 09:39:21 GMT</pubDate>
    <guid>tag:aggregator-example.com,2009:cmt-441071406174557701</guid>
    ...xpost:source, atom:author, sal:signature as above...
</item>

References

(See references in the specification documents.)

www.hixie.ch/specs/pingback/pingback

Author(s)

John Panzer (jpanzer@google.com, www.abstractioneer.org)
