Pub/Sub at the edge with Fanout

IMPORTANT: The content on this page uses the following versions of Compute SDKs: Rust SDK: 0.9.1 (current is 0.10.0, see changes), JavaScript SDK: 3.1.1 (current is 3.13.1, see changes)

Fanout is a publish/subscribe message broker operating at the Fastly edge, powered by Pushpin. This makes it easy to build and scale real-time/streaming applications which push data instantly to browsers, mobile apps, servers, and other devices.

Without Fanout, clients make short-lived HTTP requests to Fastly, and Fastly responds with a resource from cache or an origin server. When a request is passed to Fanout, it keeps that connection open indefinitely, assigns the connection to one or more channels, and waits for data to be published to those channels. Origin servers or Fastly services may use an API to publish data to a channel, and Fanout will distribute that data to all the clients which have a subscription to that channel.

Any regular HTTP request may be upgraded into an event-driven response using a transport such as Server-Sent Events or long polling. If the request from the client is a WebSocket request, the resulting stream is bidirectional.

This architecture has some nice advantages over more proprietary streaming services:

Fanout is integrated into your Fastly service, so any pre-processing you apply to requests to your Fastly service (e.g. for authentication) can be applied to streaming requests too.
Any HTTP client or server can be used - including serverless / functions-as-a-service.
Any HTTP response can be turned into an event stream (e.g. a progressive JPEG, a log file, or an API endpoint that performs asynchronous operations).
No need to use a separate domain for streaming data.

Quick start

To use Fanout, you need a paid Fastly account containing a compatible Compute service. A free trial can be enabled in the 'Fanout' section of the service configuration options in the web interface.

There are many different ways of using Fanout, but using a fully featured starter kit is a good option if you'd like to quickly see what it can do.

Create a project from template

To create a Compute project with Fanout pre-configured, use fastly compute init:

$ fastly compute init --from=https://github.com/fastly/compute-starter-kit-rust-fanout

You can then compile and publish it to a live Fastly service using fastly compute publish:

$ fastly compute publish
Create new service: [y/N] y

Domain: [some-funky-words.edgecompute.app]
Backend (hostname or IP address, or leave blank to stop adding backends):

✓ Creating domain 'some-funky-words.edgecompute.app'...
✓ Uploading package...
✓ Activating version...

SUCCESS: Deployed package (service 0eBOC1x5Q0HHadAlpeKbvt, version 1)

Add a backend

Fanout communicates with a backend server to get instructions on what to do with each new connection. To make it easier to get started, the starter kit is configured to allow the Compute service to act as the backend. To make use of this, add a backend called self to your service that directs requests back to the service itself:

$ fastly backend create --name self -s {SERVICE_ID} --address {PUBLIC_DOMAIN} --port 443 --version latest --autoclone
$ fastly service-version activate --version latest -s {SERVICE_ID}

The {SERVICE_ID} and {PUBLIC_DOMAIN} should be replaced by the values shown in the output from the publish step.

IMPORTANT: If you use the web interface or API to create the backend, make sure to set a host header override if your server uses name-based hosting. Learn more.

Enable Fanout

Fanout is an optional upgrade to Fastly service plans. If you have not yet purchased access, contact sales, or start a free trial by enabling the toggle in the web interface on any Compute service.

If your Fastly account has full access to Fanout, it can be enabled on an individual service in the web interface or by enabling the fanout product using the product enablement API.

Authenticating

To make the API calls to perform the publishing actions described in the section below, you'll need a Fastly API Token that has the global scope for your service.

Test the service

The starter kit project is set up to create long-lived connections for requests to /stream/sse, /stream/plain, /stream/websocket, and /stream/long-poll. It uses itself as the stream origin to start the streams, and subscribes all clients to two channels called test and foo. It also returns "Hello world" as an ordinary HTTP response if you send a request to the root path /.

You can now test the starter kit using any of the supported transports:

Server-Sent Events
Long polling
WebSockets

In one terminal window, make an HTTP request for /stream/sse:

$ curl -i "https://some-funky-words.edgecompute.app/stream/sse"
HTTP/2 200
content-type: text/event-stream
x-served-by: cache-lhr7380-LHR
date: Tue, 23 Aug 2022 12:48:05 GMT

You'll see output such as the above but you won't return to the shell prompt. Now, in another terminal window, run:

$ curl -H "Fastly-Key: {YOUR_FASTLY_TOKEN}" -d '{"items":[{"channel":"test","formats":{"http-stream":{"content": "event: message\ndata: {\"text\": \"hello world\"}\n\n"}}}]}' https://api.fastly.com/service/{SERVICE_ID}/publish/

The published data includes an http-stream representation of your data, which Fastly can use for SSE connections. The event you published appears on your curl output:

event: message
data: {"text": "hello world"}

You can continue to publish more events, and they will be appended to the event stream response.
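The publish payload used in the curl command above can also be assembled programmatically. The following is a minimal sketch in JavaScript; the helper names `formatSseEvent` and `buildSsePublishBody` are illustrative and are not part of any Fastly SDK.

```javascript
// Hypothetical helper: serialize one event in text/event-stream format.
function formatSseEvent(eventName, data) {
  // Each SSE event is terminated by a blank line, hence the trailing "\n\n".
  return `event: ${eventName}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical helper: wrap the SSE text in the publish data model.
function buildSsePublishBody(channel, eventName, data) {
  return JSON.stringify({
    items: [
      {
        channel,
        formats: { "http-stream": { content: formatSseEvent(eventName, data) } },
      },
    ],
  });
}

// Build the body for the "hello world" event on the "test" channel.
const body = buildSsePublishBody("test", "message", { text: "hello world" });
console.log(body);
```

The resulting string can be sent as the request body of the publish call in place of the hand-written JSON.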

The pattern created by the starter kit is well suited to use cases where you'll know at the edge what channels the client should be subscribed to, and means your origin only deals with publishing events, rather than also negotiating the setup of streams.

Next steps

Now you have an operational Fanout message broker operating on a Fastly service. Consider how you might want to modify this setup to suit your needs:

Learn more about subscribing, including examples of the front-end JavaScript code you need to interact with streams.
Learn more about publishing, including simple libraries that can abstract the complexity of message formatting for you.
If you only need one kind of transport (e.g., WebSockets, and not SSE), feel free to remove the code that enables the other transports.
If you prefer to have your origin server do the stream setup, then most of the edge code is no longer needed. See Connection setup below.
If you intend to use the new service in production, you'll want to add at least one origin server, a domain, and consider how you want Fastly to cache your non-streamed content.

Connection setup

Fanout connections are created by explicitly calling the appropriate handoff method in your preferred Compute language SDK. Fanout then queries a nominated origin server to find out what to do with the request. It's up to the origin to tell Fanout to treat the request as a stream and to provide a list of channels that client should subscribe to (this origin can also be a Compute service - it can even be the exact same service).

Workflow for connection setup

You can decide which kinds of requests to upgrade to Fanout (step 3 in the workflow above), and the stream origin can decide which channels to subscribe the client to.

What to upgrade

You should apply URL path constraints to the requests you upgrade to Fanout. Upgrading a request that isn't intended to be a stream may still work, because Fanout will relay that request to origin, and if the response is not WebSocket-over-HTTP or GRIP, Fanout will simply relay it back to the end user and close the connection. However, passing all requests through Fanout is not recommended, for a number of reasons:

Requests passed to Fanout cannot be modified before upgrading: Fanout will use the request as presented by the client.
Responses from origin will not interact with the Fastly cache, so content not intended to be streamed will not be cached.
Responses from origin will not be accessible within the Compute program, they will be delivered directly to the client by Fanout.

As a result it usually makes sense to upgrade requests only when they target a known path or set of paths on which you want to stream responses:

Rust

use fastly::experimental::RequestUpgradeWebsocket;
use fastly::{Error, Request};

fn main() -> Result<(), Error> {
    let req = Request::from_client();

    // Hand stream requests off to Fanout and stop processing here
    if req.get_path().starts_with("/stream/") {
        return Ok(req.handoff_fanout("stream_backend")?);
    }

    // Forward all non-stream requests to the primary backend
    let be_resp = req.send("primary_backend");

    Ok(be_resp?.send_to_client())
}

If the backend you use for Fanout is outside of Fastly, that's all you need to do in your Compute code to integrate Fanout.

Responding to Fanout requests

Fanout communicates with backends by forwarding client requests and interpreting instructions in the response formatted using Generic Realtime Intermediary Protocol (GRIP). When a client request is upgraded, Fanout will forward the request to the backend specified in the handoff. The backend response can tell Fanout how to handle the connection lifecycle, using GRIP instructions.

Server-Sent Events

Fanout forwards regular HTTP requests to the backend unmodified. The backend must use GRIP headers to indicate to Fanout to treat the connection as a Server-Sent-Events stream:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Grip-Hold: stream
Grip-Channel: mychannel
Grip-Channel: anotherchannel

The GRIP headers that are relevant to initiating SSE streams are:

Grip-Hold: Set to stream to tell Fanout to deliver the headers of the response to the client immediately, and then deliver messages as they are published to subscribed channels.
Grip-Channel: A channel to subscribe the request to. Multiple Grip-Channel headers may be specified in the response, to subscribe the request to multiple channels.
Grip-Keep-Alive: Data to be sent to the client after a certain amount of inactivity. The timeout parameter specifies the length of time a request must be idle before the keep-alive data is sent (default 55 seconds). The format parameter specifies the format of the keep-alive data. Allowed values are raw, cstring, and base64 (default raw). For example, if a newline character should be sent to the client after 20 seconds of inactivity, the following header could be used: Grip-Keep-Alive: \n; format=cstring; timeout=20.
Messages to be published to a client in a Grip-Hold: stream state must have an http-stream format available. Learn more about publishing.
Compliant Server-sent events clients (such as the EventSource API built into web browsers) will send a Last-Event-ID header with new connection requests. If you care about ensuring clients do not miss events during reconnects, consider parsing this header and including missed events in the initial response along with the Grip-Hold header, allowing subsequent events provided via the publishing API to be appended by Fanout later to the same response.
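The headers and the Last-Event-ID replay described above can be combined in one backend response. The following is a minimal sketch, not a definitive implementation: the helper names, the shape of the stored events (`id` and `data` fields), and the choice to replay nothing when the ID is unknown are all assumptions.

```javascript
// Hypothetical helper: build the response a backend returns to Fanout to
// start an SSE stream. `missedEvents` is assumed to come from your own store.
function buildSseHoldResponse(channels, missedEvents = []) {
  const headers = [
    ["Content-Type", "text/event-stream"],
    ["Grip-Hold", "stream"],
    // Send a newline after 20 seconds of inactivity (cstring format).
    ["Grip-Keep-Alive", "\\n; format=cstring; timeout=20"],
  ];
  // Emit one Grip-Channel header per channel.
  for (const channel of channels) {
    headers.push(["Grip-Channel", channel]);
  }
  // Initial body: events the client missed while it was disconnected.
  const body = missedEvents
    .map((e) => `id: ${e.id}\ndata: ${JSON.stringify(e.data)}\n\n`)
    .join("");
  return { status: 200, headers, body };
}

// Hypothetical helper: select the events stored after `lastEventId`.
// Returns an empty list when the ID is absent or unknown (an assumption).
function eventsAfter(storedEvents, lastEventId) {
  const index = storedEvents.findIndex((e) => e.id === lastEventId);
  return index === -1 ? [] : storedEvents.slice(index + 1);
}

// Example: the client reconnects with "Last-Event-ID: 1".
const storedEvents = [
  { id: "1", data: { n: 1 } },
  { id: "2", data: { n: 2 } },
  { id: "3", data: { n: 3 } },
];
const response = buildSseHoldResponse(
  ["mychannel", "anotherchannel"],
  eventsAfter(storedEvents, "1")
);
console.log(response.headers, response.body);
```

Headers are kept as a list of name/value pairs rather than an object so that Grip-Channel can appear more than once.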

Validating GRIP requests

If the backend is running on a public server, then it's a good idea to validate that the request is coming from Fastly. The Grip-Sig header value can be used to do this. Grip-Sig is provided as a JSON Web Token, and tokens signed by Fastly can be validated using the following public key. If the token cannot be fully verified for any reason, including expiration, then the backend should behave as if the header wasn't present.

This is the public key we use for signing GRIP requests:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAECKo5A1ebyFcnmVV8SE5On+8G81Jy
BjSvcrx4VLetWCjuDAmppTo3xM/zz763COTCgHfp/6lPdCyYjjqc+GM7sw==
-----END PUBLIC KEY-----

Many language ecosystems provide libraries for validating JWTs. If your backend uses JavaScript, for example, the jsonwebtoken module can be used:

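import * as jwt from 'jsonwebtoken';
const FANOUT_PUBLIC_KEY = `-----BEGIN PUB...`;

// Assuming ExpressJS or similar
app.get('/chat-stream/:user', (req, res) => {
  let fromFanout = false;
  try {
    // verify() throws if the token is missing, invalid, or expired
    jwt.verify(req.header('Grip-Sig'), FANOUT_PUBLIC_KEY);
    fromFanout = true;
  } catch (err) {
    // Behave as if the Grip-Sig header wasn't present
  }
});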

NOTE: If your backend is running on JavaScript on Fastly Compute, the runtime does not support PEM-formatted keys at this time. Use the equivalent JSON Web Key (JWK) format instead:

const FANOUT_PUBLIC_KEY = {
  "kty": "EC",
  "crv": "P-256",
  "x": "CKo5A1ebyFcnmVV8SE5On-8G81JyBjSvcrx4VLetWCg",
  "y": "7gwJqaU6N8TP88--twjkwoB36f-pT3QsmI46nPhjO7M"
};

You can import this key and then use it with a library such as jose to validate a JWT:

import * as jose from 'jose';

const cryptoKey = await crypto.subtle.importKey(
  'jwk', FANOUT_PUBLIC_KEY, { name: 'ECDSA', namedCurve: 'P-256' },
  false, ['verify']);
const result = await jose.jwtVerify(req.headers.get('Grip-Sig'), cryptoKey);

Using Fastly as a Fanout backend

The Rust-based quick start above uses a single Compute service as both a normal Fastly service that receives end-user traffic, and also as the backend used by Fanout to negotiate streams. This is achieved by adding a backend called self to the Compute service using the public domain of the service, and then passing that backend name in the handoff_fanout call. You could also use another, different Compute service as the Fanout backend.

Depending on your use case, it may make sense to use Fastly as the Fanout backend, or it may be better to use the same origin server you use for non-streaming requests. For example:

Consider using Fastly to provide the Fanout backend if:
- you have a small number of channels
- the request from the client specifies the channels they want to subscribe to
- the server publishing messages doesn't need to know how many subscribers there are
Consider using your own origin server as the Fanout backend if:
- it's not possible to know ahead of time whether a request will turn into a stream or not
- you need to know within your origin infrastructure how many subscribers there are
- it's important that clients don't miss messages that are published while the client is reconnecting
- you want to apply an authentication check to stream requests (and the authentication layer is implemented at your origin)
- you are using long-polling

When Fanout relays requests to the backend, the path is preserved, and a Grip-Sig header is added. The path therefore remains a good way to identify requests related to streaming endpoints, and the Grip-Sig header can be used to differentiate between requests from a client and requests relayed from Fanout:

Rust

use fastly::experimental::RequestUpgradeWebsocket;
use fastly::{Error, Request};

fn main() -> Result<(), Error> {
    let req = Request::from_client();

    // Request is a stream request
    if req.get_path().starts_with("/stream/") {
        return Ok(if req.get_header_str("Grip-Sig").is_some() {
            // Request is from Fanout
            handle_fanout(req, "test").send_to_client()
        } else {
            // Not from Fanout, hand it off to Fanout to manage
            req.handoff_fanout("self")?
        });
    }

    // Forward all non-stream requests to the primary backend
    let be_resp = req.send("primary_backend");

    Ok(be_resp?.send_to_client())
}

The handle_fanout function invoked in the example above should return a GRIP HTTP response or a WebSockets-over-HTTP response. The Fanout starter kit contains an example implementation.
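The routing decision made in the Rust example can be expressed as a small pure function, which is easy to unit test independently of any SDK. This is a sketch in JavaScript: the function name, the returned labels, and the assumption that `headers` is a plain object with lowercase names are all illustrative.

```javascript
// Hypothetical pure function: decide what to do with an incoming request.
// `headers` is assumed to be a plain object keyed by lowercase header name.
function routeRequest(path, headers) {
  if (!path.startsWith("/stream/")) {
    // Not a streaming path: handle as ordinary traffic.
    return "forward-to-primary-backend";
  }
  // Fanout adds a Grip-Sig header when it relays a request to the backend.
  return "grip-sig" in headers ? "respond-with-grip" : "handoff-to-fanout";
}

console.log(routeRequest("/stream/sse", {}));
console.log(routeRequest("/stream/sse", { "grip-sig": "token" }));
console.log(routeRequest("/index.html", {}));
```

Checking only for the presence of Grip-Sig distinguishes relayed requests from client requests, but it does not prove the request came from Fastly. Validate the token as described in Validating GRIP requests if that matters for your service.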

Subscribing

Fanout is designed to allow push messaging to integrate seamlessly into your domain. When clients make HTTP requests or WebSocket connections to Fastly, what happens next depends on the instructions provided by your backend. These instructions can include subscribing the client to one or more channels.

For HTTP-based transports (such as Server-sent events and long polling), this is typically done with response headers. For example:

Grip-Hold: stream
Grip-Channel: mychannel

For the WebSocket transport, this is done by sending GRIP control messages as part of a WebSockets-over-HTTP response. For example:

c:{"type": "subscribe", "channel": "mychannel"}

It's important to understand that clients don't assert their own subscriptions. Clients make arbitrary HTTP requests or send arbitrary WebSocket messages, and it is your backend that determines whether or not clients should be subscribed to anything. Your channel schema remains private between Fanout and your backend server, and in fact clients may not even be aware that publish-subscribe activities are occurring.

HINT: You can still include channel names in your client requests if you want to. A path such as /stream/departure-KR4N81 to get a real time stream of departure status for a flight booking, for example, is passing the name of the desired channel in the path. The backend could extract this token from the path and pass it back to Fanout in the GRIP channel subscribe instruction.
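As a rough illustration of the hint above, a backend could derive the channel from the path and then produce the matching GRIP instruction for either transport. The helper names and the assumed booking reference format (six uppercase letters or digits) are hypothetical.

```javascript
// Hypothetical helper: derive a channel name from a path such as
// /stream/departure-KR4N81. Returns null for anything unexpected, so the
// client cannot subscribe to arbitrary channels.
function channelFromPath(path) {
  const match = /^\/stream\/(departure-[A-Z0-9]{6})$/.exec(path);
  return match ? match[1] : null;
}

// GRIP response headers for HTTP-based transports.
function subscribeHeaders(channel) {
  return [
    ["Grip-Hold", "stream"],
    ["Grip-Channel", channel],
  ];
}

// GRIP control message for the WebSocket transport.
function subscribeControlMessage(channel) {
  return "c:" + JSON.stringify({ type: "subscribe", channel });
}

const channel = channelFromPath("/stream/departure-KR4N81");
console.log(subscribeHeaders(channel));
console.log(subscribeControlMessage(channel));
```

Validating the path against a strict pattern keeps the decision about subscriptions with the backend, even though the channel name originates in the client request.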

If your client is a web browser, you will use JavaScript to initiate streaming requests to the backend:

Server-Sent Events

Web browsers have built in support for Server-sent events via the EventSource API:

const evtSource = new EventSource('/stream/sse');
const eventList = document.querySelector('ul');

evtSource.onmessage = (event) => {
  const newElement = document.createElement("li");
  newElement.textContent = `message: ${event.data}`;
  eventList.appendChild(newElement);
};

If your SSE events include an id: property, the EventSource will add a Last-Event-ID header to each request, which can be used to deliver missed messages when a new stream begins.

The EventSource API predates the fetch standard, so lacks some flexibility and observability. Consider the fetchEventSource library if you need to overcome these limitations.

Publishing

Messages are published to Fanout channels using the publishing API. To publish events, send an HTTP POST request to https://api.fastly.com/service/{SERVICE_ID}/publish/. You'll need to authenticate with a Fastly API Token that has the global scope for your service.

Messages can also be delivered during connection setup (often to provide events that the client missed while not connected), and in response to inbound WebSocket messages. Events delivered in this way go to the client making the request (or sending the inbound WebSocket message), and do not use pub/sub channel subscriptions.

IMPORTANT: Unlike other Fastly APIs, the publishing endpoint requires a trailing slash: publish/.

Publish requests include the messages to be published in a JSON data model:

Property	Type	Description
`items`	Array	A list of messages to publish
`└─ [i]`	Object	Each member of the array is a single message
`   └─ id`	String	A string identifier for the message. See de-duplication.
`   └─ prev-id`	String	Identifier of the previous message that was published to the channel. See sequencing.
`   └─ channel`	String	The name of the Fanout channel to which to publish the message. One channel per message.
`   └─ formats`	Object	A set of representations of the message, suitable for different transports.
`      └─ ws-message`	Object	A message representation suitable for delivery to WebSockets clients.
`         └─ content`	String	Content of the WebSocket message.
`         └─ content-bin`	String	Base-64 encoded content of the WebSocket message (use instead of `content` if the message is not a string).
`         └─ action`	String	A publish action.
`      └─ http-stream`	Object	A message representation suitable for delivery to Server-Sent Events clients.
`         └─ content`	String	Content of the SSE message. Must be compatible with the `text/event-stream` format.
`         └─ content-bin`	String	Base-64 encoded content of the SSE message (use instead of `content` if the message is not a string).
`         └─ action`	String	A publish action.
`      └─ http-response`	Object	A message representation suitable for delivery to long-polling clients.
`         └─ action`	String	A publish action.
`         └─ code`	Number	HTTP status code to apply to the response.
`         └─ reason`	String	Informational label for HTTP status code (delivered only over HTTP/1.1)
`         └─ headers`	Object	A key-value map of headers to set on the response.
`         └─ body`	String	Complete body of the HTTP response to deliver.
`         └─ body-bin`	String	Base-64 encoded body content (use instead of `body` if the body is not a string).

Minimally, a publish request must contain one message in at least one format, with the content property (for http-stream or ws-message) or the body property (for http-response) specified. An example of a valid publish payload is:

{
  "items": [
    {
      "channel": "test",
      "formats": {
        "ws-message": {
          "content": "hello"
        }
      }
    }
  ]
}

This can be sent using curl as shown:

$ curl -H "Fastly-Key: {YOUR_FASTLY_TOKEN}" -d '{"items":[{"channel":"test","formats":{"ws-message":{"content":"hello"}}}]}' https://api.fastly.com/service/{SERVICE_ID}/publish/
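The same request can be made from JavaScript. The sketch below separates building the request from sending it, so the first part can be tested without network access. The function names are illustrative, and the sketch assumes a runtime with a global `fetch()`. Like the curl command, it does not set a JSON Content-Type header; whether the endpoint requires one is not covered here.

```javascript
// Hypothetical helper: build the URL and fetch() options for a publish call.
function buildPublishRequest(serviceId, apiToken, items) {
  return {
    // The trailing slash after "publish" is required.
    url: `https://api.fastly.com/service/${encodeURIComponent(serviceId)}/publish/`,
    options: {
      method: "POST",
      headers: { "Fastly-Key": apiToken },
      body: JSON.stringify({ items }),
    },
  };
}

// Hypothetical wrapper: send the request and fail loudly on a non-2xx status.
async function publish(serviceId, apiToken, items) {
  const { url, options } = buildPublishRequest(serviceId, apiToken, items);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Publish failed with status ${response.status}`);
  }
}

// Build (but do not send) the request for the payload shown above.
const request = buildPublishRequest("SERVICE_ID", "YOUR_FASTLY_TOKEN", [
  { channel: "test", formats: { "ws-message": { content: "hello" } } },
]);
console.log(request.url);
```

Calling `publish(...)` with real credentials performs the same operation as the curl command.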

WARNING: If you are migrating to Fastly Fanout from self hosted Pushpin or fanout.io, you may be using a GRIP library in your server application. These libraries currently are not compatible with Fastly Fanout.

Publish actions

Published items can optionally specify one of three actions:

send: The included content should be delivered to subscribers. This is the default if unspecified.
hint: The content to be delivered to subscribers must be externally retrieved. No content is included in the published item.
close: The request or connection associated with the subscription should be ended/closed.
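To illustrate where the action sits in the data model, the following sketch builds a published item for a given action. The helper name is hypothetical, and the sketch only covers the http-stream and ws-message formats, which carry their payload in `content`.

```javascript
// Hypothetical helper: build one published item with an explicit action.
// Only covers formats whose payload lives in the `content` property.
function buildActionItem(channel, format, action, content) {
  if (!["http-stream", "ws-message"].includes(format)) {
    throw new Error(`Unsupported format in this sketch: ${format}`);
  }
  if (!["send", "hint", "close"].includes(action)) {
    throw new Error(`Unknown publish action: ${action}`);
  }
  const representation = { action };
  // Only "send" carries content; "hint" and "close" include none.
  if (action === "send") {
    representation.content = content;
  }
  return { channel, formats: { [format]: representation } };
}

// Ask Fanout to end the SSE streams subscribed to the "test" channel.
const closeItem = buildActionItem("test", "http-stream", "close");
console.log(JSON.stringify({ items: [closeItem] }));
```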

Sequencing

If Fastly receives a message with a prev-id that doesn't match the id of an earlier message, then we will buffer it until we receive a message whose id matches the value, at which point both messages will be delivered in the right order. If the expected message is never received, the buffered message will eventually be delivered anyway (around 5-10 seconds later).
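A publisher can take advantage of this by chaining each message to the one before it. The sketch below keeps a per-channel counter in memory; the helper name and the use of incrementing integers as IDs are assumptions, and any scheme that produces unique IDs should work equally well.

```javascript
// Hypothetical helper: returns a function that stamps each message for one
// channel with an `id` and the `prev-id` of the message published before it.
function createSequencer(channel) {
  let counter = 0;
  let previousId = null;
  return function next(formats) {
    counter += 1;
    const item = { id: String(counter), channel, formats };
    // The first message has no predecessor, so it carries no prev-id.
    if (previousId !== null) {
      item["prev-id"] = previousId;
    }
    previousId = item.id;
    return item;
  };
}

const nextItem = createSequencer("test");
const first = nextItem({ "ws-message": { content: "one" } });
const second = nextItem({ "ws-message": { content: "two" } });
console.log(first, second);
```

If `second` reaches Fastly before `first`, its prev-id identifies the message it should follow.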

De-duplication

If Fastly receives a message with an id that we've already seen recently (within the last few seconds), the publish action will be accepted but no message will be created. This happens even if the message content is different from any prior messages which had the same id.

This feature is typically useful if you have an architecture with redundant publish paths. For example, you could have two publisher processes handling event triggers and have both send each message for high availability. Fastly would receive every message twice, but only process each message once. If one of the publishers fails, messages would still be received from the other.
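For redundant publishers to benefit from de-duplication, each must derive the same id for the same event. One way to do this, sketched below, is to compute the id from properties of the triggering event rather than generating it randomly. The event fields (`source`, `sequence`, `payload`) and helper names are hypothetical.

```javascript
// Hypothetical helper: derive a message id from the triggering event, so that
// every redundant publisher produces the same id for the same event.
function messageIdFor(event) {
  return `${event.source}:${event.sequence}`;
}

// Hypothetical helper: build the item a publisher would send for an event.
function buildItemForEvent(event, channel) {
  return {
    id: messageIdFor(event),
    channel,
    formats: {
      "http-stream": { content: `data: ${JSON.stringify(event.payload)}\n\n` },
    },
  };
}

// Two independent publisher processes handling the same event trigger.
const trigger = { source: "orders", sequence: 42, payload: { status: "shipped" } };
const fromPublisherA = buildItemForEvent(trigger, "test");
const fromPublisherB = buildItemForEvent(trigger, "test");
console.log(fromPublisherA.id === fromPublisherB.id);
```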

Limits

By default, messages are limited to 32,767 bytes for the “content” portion of the format being published. For the normal HTTP and WebSocket transports, the content size is the number of HTTP body bytes or WebSocket message bytes (TEXT frames converted to UTF-8).
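A publisher can check string content against the default limit before sending it. The sketch below measures the UTF-8 encoded length, since characters outside ASCII occupy more than one byte. The constant and function names are illustrative.

```javascript
// Default limit stated above for the content portion of a format.
const MAX_CONTENT_BYTES = 32767;

// Hypothetical helper: size of a string once encoded as UTF-8.
function contentByteLength(content) {
  return new TextEncoder().encode(content).length;
}

// Hypothetical helper: true if the content fits within the default limit.
function fitsDefaultLimit(content) {
  return contentByteLength(content) <= MAX_CONTENT_BYTES;
}

console.log(fitsDefaultLimit("hello"));
console.log(contentByteLength("é"));
```

Checking `content.length` alone would undercount, because JavaScript string length is measured in UTF-16 code units rather than bytes.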

Inbound WebSockets messages

Unlike HTTP-based push messaging (e.g. server-sent events), WebSockets is bidirectional. When clients send messages to Fastly over an already-established WebSocket, Fanout will make a WebSockets-over-HTTP request to the Fanout backend, with a TEXT or BINARY segment containing the message from the client.

POST /stream/path HTTP/1.1
Sec-WebSocket-Extensions: grip
Content-Type: application/websocket-events
Accept: application/websocket-events

TEXT 16\r\n
Hello from the client!\r\n

The response from the backend may include TEXT or BINARY segments, which will be delivered to the client that sent the message (disregarding the channel-based pub/sub brokering). TEXT segments may also include GRIP control messages to instruct Fanout to modify the client stream, for example to change which channels it subscribes to.

HTTP/1.1 200 OK
Content-Type: application/websocket-events

TEXT 0C\r\n
You said Hi!\r\n
TEXT 45\r\n
c:{"type": "subscribe", "channel": "additional-channel-subscription"}\r\n

The starter kit for Fanout includes an example of handling inbound WebSockets messages by echoing the content of the message back to the client.
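The framing used in the request and response bodies above is an event type, followed by the content size in hexadecimal, then the content, with each part terminated by CRLF. The following is a minimal sketch of an encoder and decoder for that framing. The function names are illustrative, and the sketch treats all content as UTF-8 text, so it is not suitable for BINARY events as written.

```javascript
const wsEncoder = new TextEncoder();
const wsDecoder = new TextDecoder();

// Encode events such as { type: "TEXT", content: "hi" } into a body string.
// The size is the content length in bytes, written in hexadecimal.
function encodeWebSocketEvents(events) {
  return events
    .map(({ type, content }) => {
      if (content === undefined) return `${type}\r\n`;
      const size = wsEncoder.encode(content).length.toString(16).toUpperCase();
      return `${type} ${size}\r\n${content}\r\n`;
    })
    .join("");
}

// Decode a body (as a Uint8Array) into a list of events.
function decodeWebSocketEvents(bytes) {
  const events = [];
  let pos = 0;
  while (pos < bytes.length) {
    // Find the CRLF that ends the event header line.
    let end = pos;
    while (end + 1 < bytes.length && !(bytes[end] === 13 && bytes[end + 1] === 10)) {
      end += 1;
    }
    if (end + 1 >= bytes.length) throw new Error("Unterminated event header");
    const [type, hexSize] = wsDecoder.decode(bytes.subarray(pos, end)).split(" ");
    pos = end + 2;
    if (hexSize === undefined) {
      // Events without content have no size field.
      events.push({ type });
      continue;
    }
    const size = parseInt(hexSize, 16);
    events.push({ type, content: wsDecoder.decode(bytes.subarray(pos, pos + size)) });
    pos += size + 2; // Skip the content and its trailing CRLF.
  }
  return events;
}

const encoded = encodeWebSocketEvents([{ type: "TEXT", content: "Hello from the client!" }]);
console.log(JSON.stringify(encoded));
console.log(decodeWebSocketEvents(wsEncoder.encode(encoded)));
```

Working on bytes rather than string indices in the decoder matters because the size field counts bytes, which differs from the character count for non-ASCII text.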

Libraries and SDKs

Libraries created to make it easier to interact with servers using GRIP exist for many languages, and make initializing streams and publishing easier in your preferred software platform or framework.

Python: gripcontrol, django-grip, django-eventstream
Ruby: gripcontrol, rails-grip
PHP: fanout/grip, laravel-grip
JavaScript: @fanoutio/grip, @fanoutio/serve-grip, @fanoutio/eventstream
Go: go-gripcontrol
Java: java-gripcontrol
.NET: grip.net

Best practices

To get the most out of using Fanout on Fastly, consider the following tips:

Avoid stateful protocol designs: for example, keeping a client's last-received message position on the server instead of on the client. These patterns will work, but they are hard to reason about. It's best if the client asserts its own state.
Don't keep track of connections on the server: It is rarely important to know about connections, rather than users. If you're implementing presence detection, it's better to do it at the user/device level, using heartbeats that are independent of connections.
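As an illustration of heartbeat-based presence, the sketch below records the last time each user was heard from and treats a user as online if that time falls within a timeout. The class name, the in-memory Map, and the 30 second default are assumptions; a production service would more likely keep this state in a shared data store.

```javascript
// Hypothetical in-memory presence tracker keyed by user ID, not by connection.
class PresenceTracker {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map();
  }

  // Record a heartbeat from a user. `now` is injectable for testing.
  heartbeat(userId, now = Date.now()) {
    this.lastSeen.set(userId, now);
  }

  // A user is online if a heartbeat arrived within the timeout window.
  isOnline(userId, now = Date.now()) {
    const seen = this.lastSeen.get(userId);
    return seen !== undefined && now - seen <= this.timeoutMs;
  }
}

const presence = new PresenceTracker(1000);
presence.heartbeat("alice", 5000);
console.log(presence.isOnline("alice", 5500));
console.log(presence.isOnline("alice", 7000));
```

Because the tracker is keyed by user, a client that reconnects or holds several connections at once does not affect the result.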
