This post originally appeared on https://dedouss.is
In the opening part of this series we outlined the basics of Socket.IO and discussed the importance of documenting Socket.IO APIs. Now it’s time to bring AsyncAPI into play.
In this post we’re going to cover:
- A modelling exercise, in which Socket.IO semantics are mapped to AsyncAPI structures
- A tutorial involving the creation of an AsyncAPI specification given an existing Socket.IO API
- Asynction, a Socket.IO server framework driven by the AsyncAPI specification
Modelling the Socket.IO protocol using AsyncAPI
Don’t let the title of this section intimidate you. This modelling exercise ended up being relatively straightforward and I think it makes a great example of how AsyncAPI was designed to fit any event-driven protocol. If you are not interested in the thought process behind this exercise, you may jump straight to the Summary paragraph of this section, which presents the solution.
I will approach this problem by traversing the AsyncAPI object structure, attempting to map each of the objects to a semantic of the Socket.IO client API.
The root object of the specification is the AsyncAPI Object. The fields of this object that require special attention are channels and servers.
Channels
The Channels Object is a map structure that relates a channel path (relative URI) to a Channel Item Object.
channels:
  /: {} # Channel Item Object
  /admin: {} # Channel Item Object
Channels are addressable components where messages/events flow through. The specification suggests that a server may support multiple channel instances enabling an application to separate its concerns. This sounds very much like the definition of the Socket.IO namespace. Namespaces are indeed addressable components that follow the relative URI convention. Since Socket.IO supports multiplexing, a client may emit messages to multiple namespaces over a single shared connection. However, it could also force a separate connection per namespace (using the forceNew option). Thus, a Socket.IO namespace could either be a virtual or physical channel.
Given that connections are established on the namespace level, the Channel Item Object is the only object of the specification that MAY include bindings. For a Socket.IO API, the Channel Bindings Object should only contain the ws field, in which one can specify the handshake context (HTTP headers and query params) that a client should provide when connecting to that particular channel/namespace.
channels:
  /:
    publish: {} # Operation object - Ignore this for now
    subscribe: {} # Operation object - Ignore this for now
    bindings:
      ws:
        query:
          type: object
          properties:
            token:
              type: string
          required: [token]
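To make the binding above a little more tangible, here is a minimal sketch of how a server (or a test harness) might check a client's handshake query against the ws binding. The missingQueryParams helper is hypothetical and only looks at the required list; a real implementation would run the whole query schema through a JSON Schema validator.

```javascript
// Hypothetical helper: returns the required query params that are absent
// from the handshake. Only the `required` keyword is checked here.
function missingQueryParams(channelBindings, handshakeQuery) {
  const querySchema = (channelBindings.ws && channelBindings.ws.query) || {};
  const required = querySchema.required || [];
  return required.filter((param) => !(param in handshakeQuery));
}

// Same structure as the bindings of the `/` channel shown above.
const bindings = {
  ws: {
    query: {
      type: "object",
      properties: { token: { type: "string" } },
      required: ["token"],
    },
  },
};

// A handshake that carries the token has nothing missing.
console.log(missingQueryParams(bindings, { token: "admin" }));
// A handshake without it is reported as missing `token`.
console.log(missingQueryParams(bindings, {}));
```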
Since a single connection (and thus binding) is going to be used across multiple channels, there is no need to repeat the same bindings object under each channel/namespace. We can introduce the convention of always including bindings under the main (/) namespace but omitting them under the custom ones. At this point I would also like to propose the following bonus semantic: If a custom namespace includes bindings, then the client should always force a new connection when connecting to it.
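The proposed convention could be applied by a client roughly as follows. This is only a sketch: connectionOptions is a hypothetical helper, and the /chat namespace is made up for illustration. The forceNew option itself is the one mentioned earlier.

```javascript
// Hypothetical helper implementing the convention proposed above:
// a custom namespace that declares its own bindings gets a dedicated
// connection (forceNew), everything else shares the multiplexed one.
function connectionOptions(channels) {
  const plan = {};
  for (const [namespace, channelItem] of Object.entries(channels)) {
    const isMain = namespace === "/";
    const hasBindings = Boolean(channelItem && channelItem.bindings);
    plan[namespace] = { forceNew: !isMain && hasBindings };
  }
  return plan;
}

const plan = connectionOptions({
  "/": { bindings: { ws: {} } }, // main namespace: shared connection
  "/admin": { bindings: { ws: {} } }, // custom namespace with bindings
  "/chat": {}, // made-up custom namespace without bindings
});
console.log(plan);
```

Each entry of the resulting plan could then be passed as the options argument when connecting to the respective namespace.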
You have probably noticed that I chose to stick to the WebSockets Channel Binding as the only possible binding that a Socket.IO API may define. One could ask why not use an HTTP Channel Binding object alongside the WebSockets one, since the protocol could also be implemented via HTTP long-polling. There are 2 answers to this question:
- The current latest version of the AsyncAPI bindings specifications does not allow HTTP bindings to be defined at the channel level.
- The HTTP long-polling implementation of Socket.IO is essentially a pseudo WebSocket. It is implemented in such a way to resemble the WebSocket implementation. The same HTTP headers and query params are sent to the server no matter the transport mechanism.
Hence, it is safe to use the ws bindings even for the HTTP long-polling fallback. However, in an ideal world, we would have AsyncAPI supporting Socket.IO bindings through an explicit socketio field. In fact, I have created a GitHub issue to pitch this proposal.
Along with bindings, the Channel Item Object includes the publish and subscribe fields, in which one defines the operations that a namespace supports. The publish Operation Object lists all the possible events that the client may emit (socket.emit), while the subscribe operation defines the events that the client may listen to (socket.on).
A Socket.IO event can be expressed using the Message Object, where the name field describes the eventName and the payload field describes the schema of the args that the client passes as part of the socket.emit invocation: socket.emit(eventName[, …args][, ack]). For subscribe events, payload defines the structure of the arguments that the event handler callback expects: socket.on(eventName, (...args) => {}).
The structure of the payload value depends on the number of arguments expected:
Scenario | Sender-side code | Payload value structure |
---|---|---|
No args expected | socket.emit("hello") | n/a (the payload field should be omitted) |
Single arg expected | socket.emit("hello", {foo: "bar"}) | Any type other than tuple |
Multiple args expected | socket.emit("hello", {foo: "bar"}, 1) | Tuple type |
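To illustrate the three scenarios, the sketch below builds a Message Object out of the arguments of an emit call. Both helpers are hypothetical, and the schema inference is deliberately coarse (it only looks at the top-level type of JSON-like values). I am assuming that a tuple is expressed in the JSON Schema draft-07 style, i.e. an array type whose items field lists one schema per position.

```javascript
// Hypothetical helper: infers a very coarse schema from a sample JSON-like value.
function inferSchema(value) {
  if (value === null) return { type: "null" };
  if (Array.isArray(value)) return { type: "array" };
  if (Number.isInteger(value)) return { type: "integer" };
  return { type: typeof value }; // "string", "number", "boolean" or "object"
}

// Hypothetical helper: builds a Message Object from the arguments of a
// socket.emit(eventName, ...args) call, following the table above.
function toMessageObject(eventName, ...args) {
  const message = { name: eventName };
  if (args.length === 1) {
    // Single arg: the payload is the schema of that arg.
    message.payload = inferSchema(args[0]);
  } else if (args.length > 1) {
    // Multiple args: a tuple, with one schema per positional argument.
    message.payload = { type: "array", items: args.map(inferSchema) };
  }
  return message; // no payload field at all when there are no args
}

const noArgs = toMessageObject("hello");
const singleArg = toMessageObject("hello", { foo: "bar" });
const multipleArgs = toMessageObject("hello", { foo: "bar" }, 1);
console.log(JSON.stringify([noArgs, singleArg, multipleArgs], null, 2));
```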
To account for multiple events (Message Objects) per namespace, the message field of each Operation Object allows the oneOf array structure. For example, in the message of the publish operation of the /admin namespace, the oneOf array lists all the available eventName and args payload pairs that a client can pass to the adminNamespace.emit call:
channels:
  /admin:
    publish:
      message:
        oneOf:
          - $ref: "#/components/messages/MessageOne"
          - $ref: "#/components/messages/MessageTwo"
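A consumer of the spec could walk this structure to find out which events an operation supports. The following is a rough sketch with two hypothetical helpers (resolveRef and eventNames). It only handles local references, and the event names under components are made up, since the snippet above does not define MessageOne and MessageTwo.

```javascript
// Hypothetical helper: resolves a local "#/a/b/c" reference inside the spec.
function resolveRef(spec, ref) {
  return ref
    .replace(/^#\//, "")
    .split("/")
    .reduce((node, key) => node[key], spec);
}

// Hypothetical helper: lists the event names of one operation of a namespace,
// whether the message field uses `oneOf` or holds a single Message Object.
function eventNames(spec, namespace, operation) {
  const message = spec.channels[namespace][operation].message;
  const candidates = message.oneOf || [message];
  return candidates.map((m) => (m.$ref ? resolveRef(spec, m.$ref) : m).name);
}

const spec = {
  channels: {
    "/admin": {
      publish: {
        message: {
          oneOf: [
            { $ref: "#/components/messages/MessageOne" },
            { $ref: "#/components/messages/MessageTwo" },
          ],
        },
      },
    },
  },
  components: {
    messages: {
      // Made-up event names, for illustration only.
      MessageOne: { name: "message one" },
      MessageTwo: { name: "message two" },
    },
  },
};

const adminPublishEvents = eventNames(spec, "/admin", "publish");
console.log(adminPublishEvents);
```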
Now, let’s move on to the acknowledgement semantics of the protocol. The basic unit of information in the Socket.IO protocol is the packet. There are 7 distinct packet types. The payloads of the publish and subscribe Message Objects described above correspond to the EVENT and BINARY_EVENT packet types. These are essentially the packets that are transmitted when the Socket.IO sender invokes the emit API function of the Socket.IO library (regardless of implementation). In turn, the Socket.IO event receiver handles the received event using the on API function of the Socket.IO library. As part of the on handler, the receiver may choose to return an acknowledgement of the received message. This acknowledgement is conveyed back to the sender via the ACK and BINARY_ACK packet types. The ack data is passed as input to the callback that the message sender has provided through the emit invocation.
In order to express the above semantics, the Message Object (eventName and args payload pair) should be linked to an optional acknowledgement object. Since the specification in its current form does not support such a structure, I am proposing the following Specification Extension:
- Message Objects MAY include the x-ack field. The value of this field SHOULD be a Message Ack Object.
- Components Object MAY include the x-messageAcks field. The value of this field should be of type: Map[string, Message Ack Object | Reference Object].
Message Ack Object
Field Name | Type | Description |
---|---|---|
args | Schema Object | Schema of the arguments that are passed as input to the acknowledgement callback function. In the case of multiple arguments, use the array type to express the tuple. |
In the case of a publish message, the x-ack field informs the client that it should expect an acknowledgement from the server, and that this acknowledgement should adhere to the agreed schema. Likewise, for subscribe messages the x-ack field encourages the client to send a structured acknowledgement for each message it receives.
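As a rough illustration of the proposed extension, the sketch below checks the arguments of an acknowledgement callback against the x-ack field of a Message Object. Everything here is hypothetical: the ackMatches helper only compares the top-level type keyword, the hello message is made up, and fakeServerHandler merely stands in for a real Socket.IO handler.

```javascript
// Hypothetical helper: checks ack callback arguments against the `args`
// schema of an x-ack object. Only the top-level `type` keyword is inspected.
function ackMatches(message, ackArgs) {
  const ack = message["x-ack"];
  if (!ack) return ackArgs.length === 0; // no ack documented, none expected
  // Multiple ack arguments are treated as a tuple (array type).
  const value = ackArgs.length === 1 ? ackArgs[0] : ackArgs;
  let actualType = typeof value;
  if (value === null) actualType = "null";
  if (Array.isArray(value)) actualType = "array";
  return [].concat(ack.args.type).includes(actualType);
}

// Made-up Message Object that documents an object-shaped acknowledgement.
const helloMessage = {
  name: "hello",
  payload: { type: "string" },
  "x-ack": { args: { type: "object" } },
};

// Stand-in for the receiving side: it acknowledges through the callback,
// which plays the role of the `ack` argument of socket.emit.
function fakeServerHandler(data, cb) {
  cb({ error: null });
}

let receivedAck;
fakeServerHandler("world", (...ackArgs) => {
  receivedAck = ackArgs;
});
console.log(ackMatches(helloMessage, receivedAck));
```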
Servers
The Servers Object is – surprise surprise – a map of Server Objects. Each Server Object contains a url field from which the client may infer the custom path to the Socket.IO server. This custom path should then be provided via the path option upon the initialisation of the Socket.IO connection manager, alongside the url arg. The protocol field of the Server Object is also required, and specifies the scheme part of that url arg. Its value should equal any of the ws, wss, http or https protocols. For a Socket.IO client, it does not really matter whether the scheme is http or ws, due to the upgrade mechanism. Thus, for Socket.IO APIs, the only purpose of the protocol field is to indicate the use (or absence) of SSL.
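In code, a client could derive its connection arguments from a Server Object along these lines. This is a sketch under assumptions: toClientArgs is a hypothetical helper, example.com is a placeholder host, and the url field is assumed not to include a scheme.

```javascript
// Hypothetical helper: derives the url arg and the path option of the
// Socket.IO client from an AsyncAPI Server Object.
function toClientArgs(serverObject) {
  const parsed = new URL(`${serverObject.protocol}://${serverObject.url}`);
  return {
    url: `${parsed.protocol}//${parsed.host}`, // scheme and host only
    path: parsed.pathname, // custom path to the Socket.IO server
    secure: ["wss", "https"].includes(serverObject.protocol), // use of TLS
  };
}

const clientArgs = toClientArgs({
  url: "example.com/socket.io", // placeholder host
  protocol: "wss",
});
console.log(clientArgs);
```

The url and path values could then be handed over to the client as io(url, { path }).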
Summary
We made it to the end of the modelling exercise, the outcome of which is the following table, relating Socket.IO semantics to AsyncAPI structures.
Socket.IO | AsyncAPI |
---|---|
Namespace | Channel (described through the Channel Item Object) |
IO options | WebSockets Channel Binding |
namespaceSocket.emit(eventName[, …args][, ack]) | Operation Object defined under the publish field of a Channel Item Object. The available eventName & args pairs for this emit invocation are listed under the message field, through the oneOf array structure. |
namespaceSocket.on(eventName, callback) | Operation Object defined under the subscribe field of a Channel Item Object. The available eventName & callback argument pairs for this on invocation are listed under the message field, through the oneOf array structure. |
Event | Message (described through the Message Object) |
eventName | The name field of the Message Object |
Event args | The payload field of the Message Object |
ack | The x-ack field of the Message Object. Requires an extension of the specification. The field may be populated for both publish and subscribe messages. |
Custom path (path option) | The url field of the Server Object |
Use of TLS (regardless of transport mechanism) | The protocol field of the Server Object |
In practice
With the modelling exercise out of the way, I’m now going to guide you through the process of creating an AsyncAPI spec from scratch given an existing Socket.IO API. For the purposes of this simple tutorial, let’s use this minimal chat application, which is one of the get-started demos featured on the Socket.IO website.
Below is the source of our Socket.IO server:
// Setup basic express server
const express = require("express");
const app = express();
const path = require("path");
const server = require("http").createServer(app);
const io = require("socket.io")(server);
const port = process.env.PORT || 3000;

server.listen(port, () => {
  console.log("Server listening at port %d", port);
});

// Chatroom
let numUsers = 0;

io.on("connection", (socket) => {
  let addedUser = false;

  // when the client emits 'new message', this listens and executes
  socket.on("new message", (data) => {
    // we tell the client to execute 'new message'
    socket.broadcast.emit("new message", {
      username: socket.username,
      message: data,
    });
  });

  // when the client emits 'add user', this listens and executes
  socket.on("add user", (username, cb) => {
    if (addedUser) {
      cb({ error: "User is already added" });
      return;
    }

    // we store the username in the socket session for this client
    socket.username = username;
    ++numUsers;
    addedUser = true;
    socket.emit("login", {
      numUsers: numUsers,
    });
    // echo globally (all clients) that a person has connected
    socket.broadcast.emit("user joined", {
      username: socket.username,
      numUsers: numUsers,
    });
    cb({ error: null });
  });

  // when the client emits 'typing', we broadcast it to others
  socket.on("typing", () => {
    socket.broadcast.emit("typing", {
      username: socket.username,
    });
  });

  // when the client emits 'stop typing', we broadcast it to others
  socket.on("stop typing", () => {
    socket.broadcast.emit("stop typing", {
      username: socket.username,
    });
  });

  // when the user disconnects.. perform this
  socket.on("disconnect", () => {
    if (addedUser) {
      --numUsers;

      // echo globally that this client has left
      socket.broadcast.emit("user left", {
        username: socket.username,
        numUsers: numUsers,
      });
    }
  });
});

// Admin

io.of("/admin").on("connection", (socket) => {
  let token = socket.handshake.query.token;
  if (token !== "admin") socket.disconnect();

  socket.emit("server metric", {
    name: "CPU_COUNT",
    value: require("os").cpus().length,
  });
});
I’ve slightly tweaked the original source located at https://github.com/socketio/socket.io/tree/master/examples/chat to include acknowledgments and bindings, so that I can showcase the full spectrum of the AsyncAPI specification.
Let’s start by defining the version of the specification as well as the info object which provides metadata about the service:
asyncapi: 2.2.0

info:
  title: Socket.IO chat service
  version: 1.0.0
  description: |
    This is one of the get-started demos listed in the socket.io website: https://socket.io/demos/chat/
Next comes the servers section, where one should provide connectivity information for all the instances of their service. In the case of our simple chat application, there is only one demo server accessible at socketio-chat-h9jt.herokuapp.com:
servers:
  demo:
    url: socketio-chat-h9jt.herokuapp.com/socket.io
    protocol: wss
Things get a bit more interesting when it comes to channels. Skimming through the server code we find 2 namespace instances (default and /admin), which means that the channel mapping should consist of 2 entries:
channels:
  /: {}
  /admin: {}
Within each namespace connection block, there are multiple socket.on and socket.emit references. For each unique reference, we need to append a Message Object under the publish and subscribe operations respectively:
channels:
  /:
    publish:
      message:
        oneOf:
          - $ref: "#/components/messages/NewMessage"
          - $ref: "#/components/messages/Typing"
          - $ref: "#/components/messages/StopTyping"
          - $ref: "#/components/messages/AddUser"
    subscribe:
      message:
        oneOf:
          - $ref: "#/components/messages/NewMessageReceived"
          - $ref: "#/components/messages/UserTyping"
          - $ref: "#/components/messages/UserStopTyping"
          - $ref: "#/components/messages/UserJoined"
          - $ref: "#/components/messages/UserLeft"
          - $ref: "#/components/messages/LogIn"
  /admin:
    subscribe:
      message: # No need to use `oneOf` since there is only a single event
        $ref: "#/components/messages/ServerMetric"
From the server code, we can also see that the connection handler of the admin namespace applies some very sophisticated authorization based on the token query parameter. The spec should hence document that the API requires the presence of a valid token query param upon the handshake:
channels:
  /:
    publish:
      # ...
    subscribe:
      # ...
  /admin:
    subscribe:
      # ...
    bindings:
      $ref: "#/components/channelBindings/AuthenticatedWsBindings"
Putting everything together into a single document:
asyncapi: 2.2.0

info:
  title: Socket.IO chat demo service
  version: 1.0.0
  description: |
    This is one of the get-started demos presented in the socket.io website: https://socket.io/demos/chat/

servers:
  demo:
    url: socketio-chat-h9jt.herokuapp.com/socket.io
    protocol: wss

channels:
  /:
    publish:
      message:
        oneOf:
          - $ref: "#/components/messages/NewMessage"
          - $ref: "#/components/messages/Typing"
          - $ref: "#/components/messages/StopTyping"
          - $ref: "#/components/messages/AddUser"
    subscribe:
      message:
        oneOf:
          - $ref: "#/components/messages/NewMessageReceived"
          - $ref: "#/components/messages/UserTyping"
          - $ref: "#/components/messages/UserStopTyping"
          - $ref: "#/components/messages/UserJoined"
          - $ref: "#/components/messages/UserLeft"
          - $ref: "#/components/messages/LogIn"
  /admin:
    subscribe:
      message: # No need to use `oneOf` since there is only a single event
        $ref: "#/components/messages/ServerMetric"
    bindings:
      $ref: "#/components/channelBindings/AuthenticatedWsBindings"

components:
  messages:
    NewMessage:
      name: new message
      payload:
        type: string
    Typing:
      name: typing
    StopTyping:
      name: stop typing
    AddUser:
      name: add user
      payload:
        type: string
      x-ack: # Documents that this event is always acknowledged by the receiver
        args:
          type: object
          properties:
            error:
              type: [string, "null"]
    NewMessageReceived:
      name: new message
      payload:
        type: object
        properties:
          username:
            type: string
          message:
            type: string
    UserTyping:
      name: typing
      payload:
        type: object
        properties:
          username:
            type: string
    UserStopTyping:
      name: stop typing
      payload:
        type: object
        properties:
          username:
            type: string
    UserJoined:
      name: user joined
      payload:
        type: object
        properties:
          username:
            type: string
          numUsers:
            type: integer
    UserLeft:
      name: user left
      payload:
        type: object
        properties:
          username:
            type: string
          numUsers:
            type: integer
    LogIn:
      name: login
      payload:
        type: object
        properties:
          numUsers:
            type: integer
    ServerMetric:
      name: server metric
      payload:
        type: object
        properties:
          name:
            type: string
          value:
            type: number

  channelBindings:
    AuthenticatedWsBindings:
      ws:
        query:
          type: object
          properties:
            token:
              type: string
          required: [token]
The modified server source code is available at https://github.com/dedoussis/asyncapi-socket.io-example, along with the above AsyncAPI spec, which can be viewed using the AsyncAPI playground.
Note that there is no point in documenting the reserved events since all Socket.IO APIs support these by default.
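The mapping from handler names to operations, including the omission of reserved events, can be sketched as follows. The helpers are hypothetical, the messages are inlined instead of referenced through $ref to keep the example short, and the list of reserved event names is an assumption that should be checked against the documentation of the Socket.IO version in use.

```javascript
// Assumed list of reserved Socket.IO event names (verify for your version).
const RESERVED_EVENTS = new Set([
  "connect",
  "connect_error",
  "disconnect",
  "disconnecting",
  "newListener",
  "removeListener",
]);

// Hypothetical helper: wraps event names in an Operation Object, using
// `oneOf` only when there is more than one message.
function toOperation(eventNames) {
  const messages = eventNames
    .filter((name) => !RESERVED_EVENTS.has(name))
    .map((name) => ({ name }));
  if (messages.length === 0) return undefined;
  return { message: messages.length === 1 ? messages[0] : { oneOf: messages } };
}

// Server-side socket.on handlers become `publish` messages, while
// server-side socket.emit calls become `subscribe` messages.
function toChannelItem(onEvents, emitEvents) {
  const item = {};
  const publish = toOperation(onEvents);
  const subscribe = toOperation(emitEvents);
  if (publish) item.publish = publish;
  if (subscribe) item.subscribe = subscribe;
  return item;
}

const mainChannel = toChannelItem(
  ["new message", "add user", "typing", "stop typing", "disconnect"],
  ["new message", "login", "user joined", "typing", "stop typing", "user left"]
);
const adminChannel = toChannelItem([], ["server metric"]);
console.log(JSON.stringify({ "/": mainChannel, "/admin": adminChannel }, null, 2));
```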
Asynction
In parallel to this exercise, I have been developing Asynction, a Socket.IO Python framework that is driven by the AsyncAPI specification. Asynction is built on top of Flask-SocketIO and inspired by Connexion. It guarantees that your API will work in accordance with its documentation. In essence, Asynction is to AsyncAPI and Flask-SocketIO what Connexion is to OpenAPI and Flask.
In this example, I forked the minimal chat application that we documented above and re-implemented the server in Python, using Asynction. Be mindful of the x-handler and x-handlers extensions that have been introduced to relate AsyncAPI entities (such as message or channel objects) to Python callables (event handlers).
You may find extensive documentation of Asynction at: https://asynction.dedouss.is
The framework is still at a beta stage, so please get in touch before using it in a production setup.
Any piece of feedback would be much appreciated.
The end
For any questions, comments, or corrections, feel free to reach out to me at dimitrios@dedouss.is.
A special shout out to derberq, alequetzalli, and the wider AsyncAPI community for being particularly helpful and responsive.
Photo by Matt Howard on Unsplash