
Multiplay QoS (Quality of Service)


This documentation is now deprecated. If you are using Matchmaker Self Serve through UDash, please use the documentation here.


Multiplay provides a quality of service (QoS) protocol to dynamically determine the available regions in which a client would expect to get the best connection quality for their online session. This document provides an overview of the QoS protocol components, how to use each component, and the best practices to consider when writing a custom implementation of the client.

The QoS protocol has two main components: Discovery and QoS.

  • Discovery allows the client to determine at runtime the regions that are currently active to test for connection quality.
  • QoS allows the client to test for connection quality to each of the available regions.

This document assumes familiarity with terms and concepts outlined by Multiplay, such as fleets, locations, and regions. For more information, see

Discovery service

The Discovery service provides a way for the client to determine at runtime the regions that are currently available to host or join an online session. It performs this action by providing a REST endpoint to query for currently available QoS servers for a given Multiplay fleet.

The request endpoint supports a GET request at the following URLs:


[fleet_id] is replaced with the actual fleet ID that is provided by Multiplay for your title. The request body is empty.

Response types

The possible HTTP response types from the request are listed in the following sections.

200 OK

The request was successful, but the generated ETag did not match the value in the If-None-Match header.

  • Response Content Type: application/json
  • Response Body: List of all QoS servers in fleet regions associated with the provided fleet in a JSON array called “servers”

Each server in the array contains the following properties:

Property | Type | Description
location_id | Integer | Multiplay location ID
region_id | String | Multiplay region ID
ipv4 | String | QoS server IPv4 address in dotted-quad format (empty string if not present)
ipv6 | String | QoS server IPv6 address in RFC 4291 colon-delimited format (empty string if not present)
port | Integer | Port on which the IPv4 and/or IPv6 server listens for requests

Each server includes an IPv4 address, an IPv6 address, or both. The port is the same for either IP version. If a server does not have an address for a particular IP version, that field is an empty string, as shown in the following example:

{
  "servers": [
    {
      "location_id": 123,
      "region_id": "4f7d1d1a-a565-40b4-955a-ff0257d7ed3b",
      "ipv4": "",
      "ipv6": "",
      "port": 9000
    },
    {
      "location_id": 456,
      "region_id": "22bf10c2-2565-4e75-848f-d2df25210896",
      "ipv4": "",
      "ipv6": "2606:4700:4700::1001",
      "port": 9000
    }
  ]
}

304 Not Modified

The request was successful, and the generated or cached ETag matches the If-None-Match header. See the ETag support documentation.

  • Response Type: N/A
  • Response Body: None

403 Forbidden

The request was not authenticated.

  • Response Type: text/plain
  • Response Body: “access denied for REMOTE_ADDR” (where REMOTE_ADDR is the network address of the caller)

404 Not Found

The given fleet_id does not exist.

  • Response Content Type: application/json
  • Response Body: JSON object with details about the error

The response contains the following properties:

Property | Type | Description
success | Boolean | Legacy: indicates whether the operation succeeded (always false)
error | Boolean | Legacy: indicates whether the operation had an error (always true)
error_code | Integer | Multiplay-defined error code (generally -1)
error_message | String | Information on the reason for the error
messages | Array | Multiplay-defined array of status messages (none currently defined)

The following code snippet shows an example of a 404 Not Found error.

{
  "success": false,
  "error": true,
  "error_code": -1,
  "error_message": "fleet does not exist",
  "messages": []
}

500 Internal Server Error

A server-side error has occurred.

  • Response Content Type: application/json
  • Response Body: Similar to the response for 404 Not Found, but with an error_message that contains information about the reason for the internal error.
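
The error responses above can be reduced to a single readable reason on the client. The following is a minimal sketch in Python, not an official client: `describe_error` is a hypothetical helper name, and it assumes the 404 and 500 bodies are JSON objects carrying `error_message` while the 403 body is plain text, as described in this section.

```python
import json


def describe_error(status: int, body: bytes) -> str:
    """Return a readable reason for a non-success Discovery response."""
    text = body.decode("utf-8", errors="replace")
    if status in (404, 500):
        try:
            # Only error_message is read; the legacy fields are ignored.
            return json.loads(text).get("error_message", text)
        except (ValueError, AttributeError):
            return text  # body was not the expected JSON object
    return text  # for example, the text/plain body of a 403
```

Falling back to the raw body keeps the helper usable even if a proxy or gateway returns an unexpected payload.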

ETag support

A successful (2xx) Discovery service response includes a standard HTTP entity tag in the ETag response header. If this tag is provided in a subsequent request in the If-None-Match header, and there are no changes, the server responds with HTTP status 304 (not modified) and an empty body. This indicates that the list has not changed since the last request. In this scenario, you should use the previously cached results on the client.
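
The conditional request described above could be sketched as follows using Python's standard `urllib`. This is an illustration under assumptions: the Discovery URL is supplied by the caller (it is not reproduced here), the 200 body is assumed to be a JSON object with a "servers" array, and `DiscoveryCache`, `build_request`, `apply_response`, and `discover` are hypothetical names.

```python
import json
import urllib.error
import urllib.request
from dataclasses import dataclass, field


@dataclass
class DiscoveryCache:
    """Last known server list and the ETag that came with it."""
    etag: str = ""
    servers: list = field(default_factory=list)


def build_request(url: str, cache: DiscoveryCache) -> urllib.request.Request:
    """Build the GET request, adding If-None-Match when an ETag is cached."""
    request = urllib.request.Request(url, method="GET")
    if cache.etag:
        request.add_header("If-None-Match", cache.etag)
    return request


def apply_response(cache: DiscoveryCache, status: int, etag, body: bytes) -> list:
    """Update the cache from a response and return the server list to use."""
    if status == 304:
        return cache.servers  # list unchanged: reuse the cached results
    if status == 200:
        cache.servers = json.loads(body)["servers"]
        cache.etag = etag or ""
        return cache.servers
    raise RuntimeError(f"discovery failed with HTTP status {status}")


def discover(url: str, cache: DiscoveryCache) -> list:
    """Perform one Discovery request against a caller-supplied URL."""
    try:
        with urllib.request.urlopen(build_request(url, cache), timeout=5) as resp:
            return apply_response(cache, resp.status, resp.headers.get("ETag"), resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 304 as well as for 4xx/5xx statuses.
        return apply_response(cache, err.code, err.headers.get("ETag"), b"")
```

Separating `apply_response` from the network call keeps the caching rule (reuse on 304, replace on 200) easy to exercise without a live server.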

Discovery best practices

  • The Discovery service should be contacted once before the first QoS check after running the game. Cache the results locally, and use the provided ETag in subsequent requests whenever possible. This process can improve response times, reduce bytes over the wire, and reduce the load on the server when no changes are made to the servers list.
  • Discovery should be performed no more often than once every 20 minutes.
  • There are no duplicate entries in the “servers” array in the response. However, note that the same QoS server IP might serve multiple overlapping regions. Accordingly, you can isolate each unique server to be contacted only once, and then use the results for each region that uses that server. This process can reduce the amount of time that is spent sending requests and waiting for responses.
  • If the game client only supports IPv4 connections, contact the IPv4 QoS servers. Conversely, if the game client only supports IPv6, contact the IPv6 QoS servers. If the game client is IP version-agnostic, then you can contact either or both versions of the QoS servers.
  • The 4xx and 5xx responses contain legacy properties in the JSON object that should not be relied upon. Instead, only focus on the HTTP status code and the error_message field.
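
The advice about contacting each unique server only once, for the IP version the client supports, could be sketched like this. It is a minimal illustration in Python: `unique_endpoints` is a hypothetical helper, and the input is assumed to be the parsed "servers" array from a 200 response.

```python
from collections import defaultdict


def unique_endpoints(servers: list, ip_version: str = "ipv4") -> dict:
    """Map each unique (address, port) to the region IDs it serves.

    ip_version is "ipv4" or "ipv6"; servers without that address are skipped.
    """
    regions_by_endpoint = defaultdict(list)
    for server in servers:
        address = server.get(ip_version, "")
        if not address:
            continue  # empty string means this IP version is not available
        endpoint = (address, server["port"])
        if server["region_id"] not in regions_by_endpoint[endpoint]:
            regions_by_endpoint[endpoint].append(server["region_id"])
    return dict(regions_by_endpoint)
```

A client would then run one QoS check per key and copy the result to every region ID listed under that key.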

QoS server

Use the QoS server to determine connection quality to a specific dedicated server region. Connection quality is defined as a combination of network latency and packet loss. The QoS server works with a very simple UDP protocol. Developers define almost all of the data that is included with the request, and the response includes an exact copy of the data that is sent. For example, you can include timestamps, sequence counters, and unique identifiers in the request with the goal of computing latency, packet loss, and detecting duplicate packets on the response.

You can also directly send the QoS request to a QoS server that is identified by the Discovery service at the provided IP address and port.

No authentication is currently required for sending QoS request packets. Sending any valid request should generate a response, assuming that the server has capacity, that the client is not banned, and that neither the request nor the response is lost in transit.

QoS Request

The QoS Request packet is sent from the client to the QoS server. The payload (after the IP and UDP headers) is defined as shown in the following example:

Field | Size | Value | Description
Type | 1 byte | 0x59 | Magic value that identifies the packet as a valid QoS Request Packet.
VerAndFlow | 1 byte | [0x00-0xF0] | Upper 4 bits are reserved for the Version. Version starts at zero and increments by one for each version of the packet format (allows for a maximum of 16 versions). The packet format documented here is version 0000b. The lower 4 bits are reserved for the flow control. For a QoS Request, the flow control must always be set to 0000b.
Title | varies | varies | The title of the game requesting QoS. The first byte of the title is the length of the title block, including the length byte. So the title “A” would have a length byte of 2: one for the length, and one for the letter “A”. The title itself is an array of UTF-8-encoded bytes with no NULL termination for the string. For example, the title “ワオ” would be encoded as [0x07, 0xe3, 0x83, 0xaf, 0xe3, 0x82, 0xaa], where 0x07 is the length.
Custom | varies | varies | Custom data that is echoed back to the client in the response.

Note that the payload (not including the IP and UDP headers) cannot exceed 1500 bytes. For more information on packet sizes, see QoS best practices.
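
The request layout above can be assembled in a few lines. The following Python sketch is illustrative only: `encode_title` and `build_request` are hypothetical names, and the 255-byte title limit is an inference from the one-byte length field rather than a documented rule.

```python
QOS_REQUEST_TYPE = 0x59
MAX_PAYLOAD = 1500


def encode_title(title: str) -> bytes:
    """Length-prefixed UTF-8 title; the length byte counts itself."""
    encoded = title.encode("utf-8")
    if len(encoded) + 1 > 255:
        raise ValueError("title does not fit in a one-byte length")
    return bytes([len(encoded) + 1]) + encoded


def build_request(title: str, custom: bytes = b"", version: int = 0) -> bytes:
    """Assemble Type, VerAndFlow, Title, and Custom into one payload."""
    if not 0 <= version <= 15:
        raise ValueError("version must fit in 4 bits")
    ver_and_flow = version << 4  # lower 4 bits (flow control) stay 0000b
    payload = bytes([QOS_REQUEST_TYPE, ver_and_flow]) + encode_title(title) + custom
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 1500 bytes")
    return payload
```

For example, `encode_title("A")` yields the two bytes 0x02 0x41, matching the length rule described in the Title row.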

The following example shows how a developer might use the custom data:

Field | Size | Value | Description
Sequence | 1 byte | [0x00-0xFF] | This value monotonically increases from 0 for each QoS request packet that is generated in a particular QoS check session. A session covers all of the packets that are issued for a single QoS (latency + packet loss) check with the same identifier.
Identifier | 2 bytes | varies | A unique value that is used for the duration of the QoS check. Each request packet in a session uses the same identifier.
Timestamp | 8 bytes | varies | The number of milliseconds from the epoch on the client for when the packet was crafted. It is echoed in the response so the client can determine end-to-end latency to roughly millisecond accuracy.

A developer can either use this example format for their custom data, or can use some other custom format. Another option is to pad the packet out to a larger size to prefer servers on routes that do not discard large UDP packets if the game requires them, being mindful of the 1500 byte limit.
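
The example custom data layout (Sequence, Identifier, Timestamp) maps directly onto Python's `struct` module. This is a sketch, not a required format: `pack_custom` and `unpack_custom` are hypothetical names, and little-endian byte order is an arbitrary choice, since the server echoes the bytes unchanged and only the client interprets them.

```python
import struct

# Sequence (1 byte), Identifier (2 bytes), Timestamp (8 bytes), no padding.
CUSTOM_FORMAT = "<BHQ"


def pack_custom(sequence: int, identifier: int, timestamp_ms: int, pad_to: int = 0) -> bytes:
    """Pack the example fields, optionally zero-padding to pad_to bytes."""
    data = struct.pack(CUSTOM_FORMAT, sequence & 0xFF, identifier & 0xFFFF, timestamp_ms)
    return data.ljust(pad_to, b"\x00")


def unpack_custom(data: bytes) -> tuple:
    """Read (sequence, identifier, timestamp_ms) back from echoed custom data."""
    return struct.unpack_from(CUSTOM_FORMAT, data)
```

The `pad_to` argument covers the padding option mentioned above; the caller remains responsible for keeping the whole payload within the 1500 byte limit.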

QoS Response

The QoS Response packet is a mostly byte-for-byte copy of the payload in the QoS Request packet with the magic value set to the QoS Response packet type, the title removed, and any flow control data set. Because the data in the QoS packet is only useful to the client that sent the request, echoing the data received back to the client allows the client to compute the overall QoS benchmark.

The payload (not including the IP and UDP headers) is between 2 and 1500 bytes.

Field | Size | Value | Description
Type | 1 byte | 0x95 | Magic value that identifies the packet as a valid QoS Response Packet.
VerAndFlow | 1 byte | [0x00-0xFF] | Upper 4 bits are reserved for the Version. Version starts at zero and increments by one for each version of the packet format (allows for a maximum of 16 versions). The packet format documented here is version 0000b. The lower 4 bits are reserved for the flow control. 0000b indicates no flow control. 0nnnb indicates that the client should voluntarily back off for nnnb units (1-7), where each unit is 2 minutes. For example, 0010b is a 4 minute back-off. 1nnnb indicates that the client has been banned, with the ban length growing in 2 minute units: 1000b is a 2 minute ban, 1001b is a 4 minute ban, and each further increment adds 2 minutes. In both scenarios, the client should add a reasonable buffer to the total time to ensure edge cases are not an issue.
Custom | varies | varies | Custom data from the request that is echoed back to the client in the response.
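
Splitting a response into its fields follows directly from the layout above. The following Python sketch uses hypothetical names (`QosResponse`, `parse_response`) and treats anything that does not start with the 0x95 magic value as invalid.

```python
from typing import NamedTuple

QOS_RESPONSE_TYPE = 0x95


class QosResponse(NamedTuple):
    version: int       # upper 4 bits of VerAndFlow
    flow_control: int  # lower 4 bits of VerAndFlow
    custom: bytes      # echoed custom data


def parse_response(payload: bytes) -> QosResponse:
    """Validate the header and split a QoS Response payload into fields."""
    if len(payload) < 2:
        raise ValueError("response is shorter than the 2-byte header")
    if payload[0] != QOS_RESPONSE_TYPE:
        raise ValueError("not a QoS Response packet")
    ver_and_flow = payload[1]
    return QosResponse(ver_and_flow >> 4, ver_and_flow & 0x0F, payload[2:])
```

Because the title is removed in the response, the custom data begins immediately after the two header bytes.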

Flow control

Flow control is defined as instructions from the server to the client to voluntarily back-off from sending requests, or to inform the client that they have temporarily been restricted from receiving responses.

The QoS protocol was designed to never require the developer to manage the byte order of data. Header data is all byte-based, and custom data is echoed in the same byte-order in which it was sent. Accordingly, instead of flow control using a potentially multi-byte value for the amount of time to back-off, the contract is that each unit of flow control represents 2 minutes of time, and the server indicates how many units to apply. The client should pad this amount with a reasonable buffer of time (for example, 15-30 seconds) to account for latency and processing time on the server.

A voluntary back-off is the server asking the client to stop sending requests for a certain amount of time. The server still responds to the client during a voluntary back-off. However, continuing to send requests to a server that has asked for a voluntary back-off might result in the client being temporarily banned. Note that the voluntary back-off is currently unused, so all non-zero flow control is of the 1nnnb variety, which indicates a temporary ban for the client.

During a ban, all QoS requests from the banned client go unanswered. The ban could be enforced because the server thinks the client is sending too many packets in a short amount of time (see QoS best practices), or because the server has exceeded capacity and is enforcing an algorithm to reduce usage to get back under capacity. In this scenario, the client was chosen at random to be turned away for a time. Note that the ban is server-specific, and being temporarily banned from one QoS server does not prevent the client from contacting other QoS servers.
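
Turning the 4-bit flow control value into a wait time could look like the sketch below. It is an interpretation under assumptions: `flow_control_delay` is a hypothetical name, the default 30 second buffer is taken from the suggested 15-30 second range, and the ban length follows the examples in the QoS Response section (1000b is 2 minutes, 1001b is 4 minutes).

```python
UNIT_SECONDS = 120  # each flow control unit represents 2 minutes


def flow_control_delay(flow_control: int, buffer_seconds: int = 30) -> tuple:
    """Return (kind, seconds_to_wait) for a 4-bit flow control value."""
    if flow_control == 0:
        return ("none", 0)
    units = flow_control & 0b0111
    if flow_control & 0b1000:
        # Ban: following the documented examples, 1000b is one unit (2 minutes).
        return ("ban", (units + 1) * UNIT_SECONDS + buffer_seconds)
    # Voluntary back-off: 0nnnb asks for nnn units.
    return ("back-off", units * UNIT_SECONDS + buffer_seconds)
```

Since a ban is server-specific, a client would typically store the resulting deadline per QoS server address rather than globally.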

QoS check best practices

  • A single QoS check should involve sending between 10 and 20 requests to a QoS server and then waiting for responses. The requests can be batched and sent in succession without first waiting for a response. The time to wait for responses depends on how latency-tolerant the game is and how much time is dedicated to checking QoS in the game flow.
  • Instead of checking one server at a time, developers can send requests to several servers at once and then wait for all responses from all servers. Note that contacting too many servers simultaneously might introduce latency when waiting for the responses to be read off of the sockets.
  • While the maximum payload length of a request is 1500 bytes (not including the IP and UDP headers), it is generally not advisable to send such large UDP packets. A UDP datagram that is subjected to fragmentation and reassembly has a much higher chance of being discarded by intermediate routers along the path.
  • Consider padding the QoS request with enough data to approximate the size of the game data packets to help identify servers on routes that are discarding larger UDP packets.
  • QoS can be refreshed at periodic intervals when outside of an online game session, but there should be a minimum of 3 minutes between automatic checks. When in an online game session, QoS checks must be stopped to avoid unnecessary network traffic on the client and load on the QoS servers.
  • Refrain from sending other network traffic when checking QoS. The time spent processing other network traffic might skew results or add latency.
  • Rechecking QoS inside of the recommended minimum interval of 3 minutes is acceptable under specific circumstances, such as changing networks from cellular to Wi-Fi or Ethernet. If this is happening so frequently that most or all QoS checks are happening inside of the minimum interval, consider waiting until the underlying network becomes more stable before you resume performing QoS checks to avoid getting temporarily banned by the server.
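
The timing rules in this list (no checks during a session, at least 3 minutes between automatic checks, an exception for network changes) could be gathered into one decision function. This Python sketch uses a hypothetical name, `should_check_qos`, and expects the caller to supply timestamps in seconds from any consistent clock.

```python
from typing import Optional

MIN_CHECK_INTERVAL_SECONDS = 3 * 60


def should_check_qos(now: float, last_check: Optional[float], in_session: bool,
                     network_changed: bool = False) -> bool:
    """Decide whether an automatic QoS check is allowed right now."""
    if in_session:
        return False  # checks must stop during an online game session
    if last_check is None or network_changed:
        return True   # first check, or the network changed (for example, cellular to Wi-Fi)
    return now - last_check >= MIN_CHECK_INTERVAL_SECONDS
```

A client that sees `network_changed` set on most calls may prefer to delay checks until the network stabilizes, as the last item above suggests.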


Identifying packet loss

To identify packet loss, send a static number of requests and then count the number of responses. If fewer responses arrive than the number of sent requests, some packets were lost.
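
As a small illustration, packet loss can be expressed as a fraction of the requests sent. The function name `packet_loss` is hypothetical, and `received` is assumed to count only unique, non-stale responses.

```python
def packet_loss(sent: int, received: int) -> float:
    """Fraction of requests that got no response (0.0 means no loss)."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return max(sent - received, 0) / sent
```

For example, 10 requests with 8 responses gives a loss of 0.2.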

Computing latency

To compute latency, include a timestamp in your request packets, and then read that timestamp in each response packet and compare it against the current time. The difference is the computed latency.
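
The latency calculation could be sketched as follows in Python. The names `now_ms`, `latency_ms`, and `average_latency_ms` are hypothetical, and averaging the samples is one possible way to summarize a check rather than a documented requirement.

```python
import time


def now_ms() -> int:
    """Current client time in milliseconds since the epoch."""
    return time.time_ns() // 1_000_000


def latency_ms(echoed_timestamp_ms: int, received_ms: int) -> int:
    """Round-trip time: arrival time minus the timestamp echoed by the server."""
    return received_ms - echoed_timestamp_ms


def average_latency_ms(samples: list) -> float:
    """Mean of the per-response latencies collected during one check."""
    if not samples:
        raise ValueError("no responses were received")
    return sum(samples) / len(samples)
```

The client would call `now_ms()` once when crafting each request and again when the matching response is read from the socket.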

Identifying duplicate responses

To identify duplicate responses, include a unique value (for example, a sequence byte) in each of your request packets. When recording responses, check the unique values against those that are used in the requests. If a response contains a sequence that has already been accounted for, it is a duplicate and can be discarded without counting against the response totals or latency.
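
Duplicate filtering can be done with a set of sequence values already seen. In this Python sketch, `record_response` is a hypothetical name, and sequences are assumed to run from 0 to `sent_count - 1` within one check.

```python
def record_response(seen: set, sequence: int, sent_count: int) -> bool:
    """Return True if the response should be counted, False if it should be discarded."""
    if not 0 <= sequence < sent_count:
        return False  # sequence was never sent in this check
    if sequence in seen:
        return False  # duplicate: already accounted for
    seen.add(sequence)
    return True
```

After the wait period ends, `len(seen)` is the number of unique responses to use when computing packet loss.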

Identifying stale responses

To identify stale responses, include a static value in each set of requests that are sent to a single server. When reading responses, if that static value is different, then it is a stale response from another QoS check. Rotate the value each time you initiate a new check with a new set of requests.
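
One way to rotate the static value is to draw a fresh 2-byte identifier for each check. This Python sketch uses hypothetical names (`new_identifier`, `is_stale`) and a random identifier, which is one possible rotation scheme among several.

```python
import secrets
from typing import Optional


def new_identifier(previous: Optional[int] = None) -> int:
    """Pick a 2-byte identifier that differs from the previous check's value."""
    while True:
        candidate = secrets.randbelow(0x10000)
        if candidate != previous:
            return candidate


def is_stale(response_identifier: int, current_identifier: int) -> bool:
    """A response is stale when it carries an identifier from another check."""
    return response_identifier != current_identifier
```

Stale responses are dropped before duplicate detection and latency are evaluated, so they never affect the current check's totals.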

Example QoS flow

A simplified QoS flow might look like the following example:

[Figure: simplified QoS flow example]

The flow starts discovery as soon as the player progresses to the online menu or mode after they start the game (or any time after the network stack has been initialized). After discovery is performed, a QoS check is performed. When in the online game setup, QoS is rechecked periodically if enough time has elapsed since the last check. The QoS results are used in the online game setup to determine the regions in which to attempt to create or join a game. If using the Multiplay matchmaking service with Multiplay dedicated hosting, the QoS results can be attached to the matchmaking ticket, and the developer-configured matchmaking logic makes decisions for the player based on those results. For more information, see the Matchmaker beta user guide.

When the game is over and the player re-enters the online menu or mode, QoS discovery can be skipped because it was already performed, and the flow can continue directly to checking QoS. If a developer chooses to check Discovery again instead of proceeding directly to QoS (after the recommended interval of time has passed since the last Discovery check), they should use the provided ETag from the previous check.