Networking

Overview

Torque was designed from the ground up to offer robust client/server networked simulations. Performance over the internet drove the design of the networking model. Torque attempts to deal with three fundamental problems of network simulation programming - limited bandwidth, packet loss and latency. For a more detailed, if somewhat outdated, description of the Torque network architecture, see "The Tribes II Engine Networking Model" paper by Tim Gift and Mark Frohnmayer and the accompanying PowerPoint slides in the Torque documentation area on GarageGames.com.

An instance of Torque can be set up as a dedicated server, a client, or both a client and a server. If the game is a client AND a server, it still behaves as a client connected to a server - instead of using the network, however, the NetConnection object has a short-circuit link to another NetConnection object in the same application instance.

Bandwidth is a problem because, in the large, open environments that Torque allows and with the large number of clients that a game may support (depending on the amount of data sent per client, game world complexity, and available bandwidth), many different objects can be moving and updating at once. Torque uses three main strategies to make the most of the available bandwidth. First, it prioritizes data, sending updates for what is most "important" to a client more frequently than updates for less important data. Second, it sends only data that is necessary: using the BitStream class, only the minimum number of bits needed for a given piece of data is sent, and when an object's state changes, Torque sends only the part of the state that changed. Last, Torque caches common strings (NetStringTable) and data (SimDataBlock) so that they only need to be transmitted once.

Packet loss is a problem because the information in lost data packets must somehow be retransmitted, yet in many cases the data in the dropped packet, if resent directly, will be stale by the time it gets to the client. For example, suppose that packet 1 contains a position update for a player and packet 2 contains a more recent position update for that same player. If packet 1 is dropped but packet 2 makes it across, the engine shouldn't resend the data that was in packet 1 - it is older than the version the client has already received. In order to minimize data that gets resent unnecessarily, the engine classifies data into four groups:

  • Unguaranteed Data (NetEvent) - if this data is lost, don't re-transmit it. An example of this type of data could be real-time voice traffic - by the time it is resent subsequent voice segments will already have played.
  • Guaranteed Data (NetEvent) - if this data is lost, resend it. Chat messages, messages for players joining and leaving the game and mission end messages are all examples of guaranteed data.
  • Most-Recent State Data (NetObject) - Only the most current version of the data is important - if an update is lost, send the current state, unless it has been sent already.
  • Guaranteed Quickest Data (Move) - critical data that must get through as soon as possible.

Latency is a problem in the simulation because the network delay in information transfer (which, for modems, can be a quarter of a second or more) makes the client's view of the world perpetually out of sync with the server. Twitch FPS games, for which Torque was initially designed, require instant control response in order to feel anything but sluggish. Also, fast-moving objects can be difficult for players with high latency to hit. In order to solve these problems Torque employs several strategies:

  • Interpolation is used to smoothly move an object from where the client thinks it is to where the server says it is.
  • Extrapolation is used to guess where the object is going based on its state and rules of movement.
  • Prediction is used to form an educated guess about where an object is going based on rules of movement and client input.
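The first two strategies can be illustrated with a minimal sketch. It is written in Python for brevity and is not Torque code; the function names and the linear motion model are assumptions made for this example.

```python
def extrapolate(position, velocity, dt):
    """Guess where an object will be after dt seconds of straight-line motion."""
    return tuple(p + v * dt for p, v in zip(position, velocity))


def interpolate(client_pos, server_pos, fraction):
    """Move a fraction (0..1) of the way from the client's position to the server's."""
    return tuple(c + (s - c) * fraction for c, s in zip(client_pos, server_pos))


# The client has no fresh update, so it extrapolates from the last known state.
guess = extrapolate((0.0, 0.0), (10.0, 0.0), 0.1)

# A server update then arrives; the client closes half the gap this frame
# instead of snapping the object to the new position.
corrected = interpolate(guess, (2.0, 0.0), 0.5)
```

Closing only part of the gap each frame trades a short period of inaccuracy for motion that looks smooth to the player.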

The network architecture is layered: at the bottom is the platform layer, above that the notify protocol layer, followed by the NetConnection object and event management layer. The following sections explain how each layer addresses some or all of the fundamental network simulation problems.

Platform Networking Layer (TCP/UDP)

The platform library provides the interface between the game engine and the OS dependent network functionality. The platform library's Net interface contains functions for opening reliable and unreliable communication sockets, converting between string and numeric network addresses and sending and receiving data.

The main entry points are:

  • Net::openPort() opens an unreliable socket, of which only one is allowed per application instance.
  • Net::sendto() sends an unreliable datagram to the specified NetAddress.
  • Net::openListenPort() opens a reliable socket for incoming TCP connections.
  • Net::openConnectTo() begins the process of asynchronously connecting to a remote TCP socket.
  • Net::sendtoSocket() sends data over an established TCP connection.
  • Net::process() processes the platform network layer, possibly generating network related events that are then posted into the simulation via GameInterface::processEvent().

Torque also has some good debugging capabilities. The DEBUG_NET and DEBUG_LOG() macros are used to control debug output from the networking code. A full explanation of this functionality is beyond the scope of this overview; however, searching the source for instances of DEBUG_LOG should get you on the way to understanding this part of Torque.

Connection Protocol

Connection Negotiation

The negotiation of a game network connection is not actually part of the network class tree in Torque. Instead, a set of functions declared in engine/game/netDispatch.cc performs this service. The function DemoGame::processPacketReceiveEvent() is the main dispatch function for incoming network packets.

The first step of the connection process is the console function connect(), which initiates a connection attempt by sending a connect challenge request packet to the server from sendConnectChallengeRequest().

The server, in handleConnectChallengeRequest(), may issue the client a connect challenge response, which the client will process in handleConnectChallengeResponse(). The client will in turn issue a connect request (sendConnectRequest()) with the challenge information it received from the server. The server processes this message in handleConnectRequest(). If the server decides to accept the request, it sends a connect accept (sendConnectAccept()) back to the client and constructs a NetConnection object on the server to handle that client. The client, in handleConnectAccept(), creates a complementary NetConnection object to manage the client side of the connection. The dispatchCheckTimeouts() function periodically checks whether a connection request or challenge has been waiting too long and reissues the request if it has.
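The challenge step of this exchange can be sketched from the server's side as follows. This is an illustrative Python model, not the netDispatch code: the class, the use of a random 32-bit number as the "challenge information", and the address strings are all assumptions.

```python
import secrets


class HandshakeServer:
    """Toy model of the server side of the challenge/request/accept exchange."""

    def __init__(self):
        self.pending = {}         # client address -> challenge value issued
        self.connections = set()  # addresses with an accepted connection

    def handle_connect_challenge_request(self, addr):
        # Issue a challenge the client must echo back in its connect request.
        challenge = secrets.randbits(32)
        self.pending[addr] = challenge
        return challenge

    def handle_connect_request(self, addr, challenge):
        # Accept only if this address was challenged and echoed the same value.
        if addr not in self.pending or self.pending[addr] != challenge:
            return False
        del self.pending[addr]
        self.connections.add(addr)
        return True


server = HandshakeServer()
addr = "10.0.0.5:28000"
challenge = server.handle_connect_challenge_request(addr)
accepted = server.handle_connect_request(addr, challenge)

# A request from an address that never received a challenge is refused.
spoofed = server.handle_connect_request("10.0.0.9:28000", challenge)
```

Requiring the client to echo a value sent to its claimed address means the server allocates a connection object only for clients that can actually receive packets at that address.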

Once We're Connected...

Once a connection has been established, the function of the ConnectionProtocol class is to provide a common low-level mechanism for supporting the delivery of the four fundamental types of network data in Torque. The ConnectionProtocol abstract base class implements a sliding window connected message stream over an unreliable transport (UDP). Rather than supporting guaranteed messages directly, the ConnectionProtocol class implements a notify protocol. Each packet sent is prepended with a message header containing tracking information, including which packets the other end of the connection has received and which were dropped in transit. When a ConnectionProtocol instance determines that a packet it sent has been either received or dropped, it calls ConnectionProtocol::handleNotify(). Notifies are always delivered in the order packets were sent - so for every packet sent through a ConnectionProtocol object, a notification of successful (ack) or unsuccessful (nack) delivery will eventually be executed.
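A minimal sketch of the notify idea, in Python rather than engine code: the sender remembers which packets are in flight and, when tracking information arrives from the peer, fires one notify per packet in send order. The class name and the use of a Python set to stand in for the header's tracking information are assumptions; the real header encoding is not described here.

```python
from collections import deque


class NotifySender:
    """Tracks sent packets and reports each one as acked or nacked, in order."""

    def __init__(self):
        self.next_seq = 1
        self.in_flight = deque()  # sequence numbers awaiting a notify, oldest first
        self.notifies = []        # (sequence number, delivered?) in firing order

    def send_packet(self):
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight.append(seq)
        return seq

    def handle_header(self, highest_received, received):
        # Every packet up to highest_received is now known to be either
        # received or dropped, so its notify can fire.
        while self.in_flight and self.in_flight[0] <= highest_received:
            seq = self.in_flight.popleft()
            self.notifies.append((seq, seq in received))


sender = NotifySender()
for _ in range(4):
    sender.send_packet()

# The peer reports that it has seen packets 1 and 3; packet 2 was lost,
# and nothing is known yet about packet 4.
sender.handle_header(3, {1, 3})
```

Packet 4 stays in flight until a later header covers it, which is what keeps the notifies in send order.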

Because the base network protocol exports the inherently unreliable nature of the network to the simulation, at a higher level Torque can directly support different types of data guarantee: for unguaranteed data, if it is nacked, there is no need to resend it. For guaranteed data, if it is nacked, the engine queues it up for resend (NetConnection::eventPacketDropped()). If the data is most recent state data and the packet is nacked and that object's state hasn't been subsequently changed and resent, queue the data up for resend (NetConnection::ghostPacketDropped()). If the data is set for quickest possible delivery, continue sending the data with every packet until a packet containing the data is acked (GameConnection::readPacket()).
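The four policies above reduce to a small decision made when a nack arrives. The following Python sketch only restates that decision; the enum, the function name, and the returned strings are hypothetical and do not correspond to engine identifiers.

```python
from enum import Enum, auto


class Delivery(Enum):
    UNGUARANTEED = auto()
    GUARANTEED = auto()
    MOST_RECENT_STATE = auto()
    GUARANTEED_QUICKEST = auto()


def on_packet_dropped(delivery, resent_since=False):
    """Return what to do with data that was in a nacked packet."""
    if delivery is Delivery.UNGUARANTEED:
        return "discard"
    if delivery is Delivery.GUARANTEED:
        return "queue for resend"
    if delivery is Delivery.MOST_RECENT_STATE:
        # Newer state already sent makes the lost update irrelevant.
        return "nothing to do" if resent_since else "queue current state"
    # Quickest-delivery data rides along in every packet until one is acked.
    return "already in next packet"


voice = on_packet_dropped(Delivery.UNGUARANTEED)
chat = on_packet_dropped(Delivery.GUARANTEED)
stale = on_packet_dropped(Delivery.MOST_RECENT_STATE, resent_since=True)
fresh = on_packet_dropped(Delivery.MOST_RECENT_STATE, resent_since=False)
move = on_packet_dropped(Delivery.GUARANTEED_QUICKEST)
```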

NetConnection

The NetConnection class derives from both SimGroup and ConnectionProtocol, and is responsible for managing the data streaming between client and server. The NetEvent class encapsulates the guaranteed and unguaranteed message delivery types, and the ghost management portion of the NetConnection class handles state updates of world objects from server to client. The game-specific subclass of NetConnection in the Torque example is GameConnection, which handles transmission of game-specific data such as player moves.

The NetConnection class sends packets of a fixed size in a regular stream between the client and server. When a message is posted for transmission, it is aggregated with other messages and sent based on the packet rate and packet size settings for that connection.

The BitStream

BitStream is a utility class used to pack data for transmission. BitStream has methods for reading and writing variable-sized integers, floats, vectors, Huffman-coded strings and bits.
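To show why bit-level packing saves bandwidth, here is a small Python sketch of a bit writer and reader. It is not the BitStream API: the class and method names are invented for the example, and a list of bits is used instead of a packed byte buffer to keep the code short.

```python
class BitWriter:
    """Appends values using only as many bits as the caller asks for."""

    def __init__(self):
        self.bits = []

    def write_flag(self, value):
        self.bits.append(1 if value else 0)

    def write_int(self, value, bit_count):
        if not 0 <= value < (1 << bit_count):
            raise ValueError("value does not fit in the requested bit count")
        for i in range(bit_count):  # least significant bit first
            self.bits.append((value >> i) & 1)


class BitReader:
    """Reads values back in exactly the order and widths they were written."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read_flag(self):
        bit = self.bits[self.pos]
        self.pos += 1
        return bool(bit)

    def read_int(self, bit_count):
        value = 0
        for i in range(bit_count):
            value |= self.bits[self.pos] << i
            self.pos += 1
        return value


writer = BitWriter()
writer.write_flag(True)   # 1 bit
writer.write_int(75, 7)   # a value in 0..100 fits in 7 bits
writer.write_int(5, 3)    # a value in 0..7 fits in 3 bits
bits_used = len(writer.bits)

reader = BitReader(writer.bits)
decoded = (reader.read_flag(), reader.read_int(7), reader.read_int(3))
```

The three fields take 11 bits here, where byte-aligned fields would take at least 24.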

When a NetConnection instance determines it is ready to send a packet across the network (NetConnection::checkPacketSend()), it allocates a BitStream and calls NetConnection::writePacket() with the stream. When a packet is received it is processed through the corresponding NetConnection::readPacket() function.

Network Events

The NetEvent class provides a foundation for guaranteed, guaranteed ordered and unguaranteed message transmission. NetEvent uses the same class instance creation mechanism as the console, but rather than instantiating by name, NetEvents use a class ID, which is assigned when the console initializes.

If the pack and unpack methods of an event or object don't match in terms of what they write into and read from the stream, serious network errors can occur. The client and server should gracefully disconnect in these cases, but the errors themselves can be very difficult to track down. If the DEBUG_NET macro is defined, a special key will be written into the packet stream after each event and object update, and the system will assert immediately when it detects that this problem has occurred.
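The effect of such a key can be demonstrated with a short Python sketch. Nothing here is engine code: the key value, the event layout, and the function names are assumptions chosen to make the failure visible.

```python
import struct

DEBUG_KEY = 0xBEEF  # hypothetical marker value written after each event


class StreamMismatch(Exception):
    """Raised when the reader is no longer aligned with what the writer wrote."""


def pack_event(player_id, team):
    # Writer: 16-bit player id, 8-bit team, then the debug key.
    return struct.pack("<HB", player_id, team) + struct.pack("<H", DEBUG_KEY)


def check_key(data, offset):
    (key,) = struct.unpack_from("<H", data, offset)
    if key != DEBUG_KEY:
        raise StreamMismatch(f"bad key at byte {offset}")
    return offset + 2


def unpack_event(data, offset):
    player_id, team = struct.unpack_from("<HB", data, offset)
    return (player_id, team), check_key(data, offset + 3)


def unpack_event_buggy(data, offset):
    # Bug: reads the team as 16 bits although pack_event wrote only 8.
    player_id, team = struct.unpack_from("<HH", data, offset)
    return (player_id, team), check_key(data, offset + 4)


stream = pack_event(7, 1) + pack_event(8, 2)

first, offset = unpack_event(stream, 0)
second, _ = unpack_event(stream, offset)

try:
    unpack_event_buggy(stream, 0)
    detected = False
except StreamMismatch:
    detected = True
```

Without the key, the buggy reader would silently return a wrong team value and misread everything after it; with the key, the mismatch is caught at the first event that goes wrong.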

Network Ghosts and Scoping

The NetObject class is a derivative of SimObject that can replicate (ghost) itself across a network connection. All world object classes are subclassed from NetObject (the superclass of SceneObject). In order to best utilize the available bandwidth, the NetConnection attempts to determine which objects are "interesting" to each client - and among those objects, which ones are most important. If an object is interesting to a client it is said to be "in scope" - for example, a visible enemy to a player in a first person shooter would be in scope.

Each NetConnection object maintains a scoping object - responsible for determining which objects are in scope for that client. Before the NetConnection writes ghost update information into each packet in NetConnection::ghostWritePacket(), it calls the scope object's onCameraScopeQuery() function which performs two services: first, it determines which objects are "in scope" for that client and calls NetConnection::objectInScope for each object on that client. Second, the onCameraScopeQuery() call fills in the CameraScopeQuery structure which is then used to determine the priority of object updates.

The default NetObject::onCameraScopeQuery() function scopes everything in the world, but the Torque game example overrides this in ShapeBase::onCameraScopeQuery(). ShapeBase calls the server SceneGraph::scopeScene() function to traverse the scene from the client's point of view and scope all potentially visible objects. Each scoped object that needs to be updated is then prioritized based on the return value from the NetObject::getUpdatePriority() function, which by default returns a constant value. This function is overridden in ShapeBase::getUpdatePriority() to take into account the object's distance from the camera, its velocity perpendicular to the view vector, and other factors.
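As a rough illustration of prioritized updates, the Python sketch below ranks scoped objects and keeps only as many as fit in a packet. The weighting is invented for the example: it uses distance plus a count of skipped updates, which is an assumed stand-in for the "other factors" mentioned above, and it ignores velocity entirely.

```python
import math


def update_priority(camera_pos, obj_pos, updates_skipped):
    """Higher is more urgent: near objects and long-ignored objects win."""
    distance = math.dist(camera_pos, obj_pos)
    nearness = 1.0 / (1.0 + distance)
    return nearness + 0.1 * updates_skipped


def choose_updates(camera_pos, objects, budget):
    """Return the names of the `budget` highest-priority objects."""
    ranked = sorted(
        objects,
        key=lambda o: update_priority(camera_pos, o["pos"], o["skipped"]),
        reverse=True,
    )
    return [o["name"] for o in ranked[:budget]]


objects = [
    {"name": "far_tree", "pos": (100.0, 0.0, 0.0), "skipped": 0},
    {"name": "near_enemy", "pos": (3.0, 4.0, 0.0), "skipped": 0},
    {"name": "starved_door", "pos": (50.0, 0.0, 0.0), "skipped": 5},
]

# Only two object updates fit in this packet.
chosen = choose_updates((0.0, 0.0, 0.0), objects, 2)
```

Letting skipped updates raise the priority keeps distant objects from being starved indefinitely by closer ones.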

Rather than always sending the full state of the object each time it is updated across the network, Torque supports sending only the portions of the object's state that have changed. To facilitate this, each NetObject can specify up to 32 independent sub-states that can be modified individually. For example, a player object might have a movement state, detailing its position and velocity, a damage state, detailing its damage level and hit locations, and an animation state, signifying what animation, if any, the player is performing.

Each state data group is assigned a bit position in the class. When an object's state changes, the object notifies the network system with the NetObject::setMaskBits function. When the object is to be written into a packet in NetObject::packUpdate, the object's current state mask is passed in. The object's state mask is NOT written into the packet directly - it is the responsibility of the pack function to accurately encode which states are updated.

Initially an object's state mask is set to all 1's - signifying that all the object's states need to be updated.
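The mask mechanism can be sketched in Python as follows. The class, the three mask constants, and the list-of-tuples "packet" are assumptions made for illustration; in the engine the pack function writes into a BitStream instead.

```python
MOVE_MASK = 1 << 0
DAMAGE_MASK = 1 << 1
ANIM_MASK = 1 << 2
ALL_MASK = 0xFFFFFFFF  # up to 32 independent state groups


class GhostedPlayer:
    def __init__(self):
        self.position = (0.0, 0.0, 0.0)
        self.damage = 0.0
        self.animation = "idle"
        self.dirty = ALL_MASK  # a new object must send every state once

    def set_mask_bits(self, bits):
        self.dirty |= bits

    def pack_update(self, mask):
        """Write one flag per state group, followed by the data if the flag is set."""
        groups = (
            (MOVE_MASK, self.position),
            (DAMAGE_MASK, self.damage),
            (ANIM_MASK, self.animation),
        )
        return [(True, value) if mask & bit else (False, None) for bit, value in groups]


player = GhostedPlayer()

# First update: every state group is written.
initial = player.pack_update(player.dirty)
player.dirty = 0

# Later only the damage level changes, so only that group is written.
player.damage = 0.4
player.set_mask_bits(DAMAGE_MASK)
delta = player.pack_update(player.dirty)
```

The mask itself never appears in the output; the per-group flags written by the pack function are what tell the reader which states follow.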

GameConnection, Moves and the Control Object

GameConnection is the game-specific subclass of NetConnection. Applications can subclass NetConnection to directly write and read data from packets, as well as hook into the notify mechanism. The NetConnection::allocNotify() function is called at the beginning of a packet write and is used to allocate a NetConnection::PacketNotify structure. This structure is used to store information about the data written into the network packet. When the packet is either acked or nacked, this notify structure is passed into the NetConnection::handleNotify() function. Subclasses of NetConnection can subclass the PacketNotify structure and override the allocNotify method to add custom data to the packet tracking record.

The GameConnection in the Torque example introduces the concept of the control object. The control object is simply the object that the client associated with that network connection controls. By default in the example the control object is an instance of the Player class, but can also be an instance of Camera (when editing the mission, for example).

The Torque example uses a model in which the server is the authoritative master of the simulation. To prevent clients from cheating, the server simulates all player moves and then tells each client where its player is in the world. This model, while secure, can have problems - if the network latency is high, the round-trip time can give the player a very noticeable sense of movement lag. To correct this problem, the example uses a form of prediction - it simulates the movement of the control object on both the client and the server. This way the client doesn't need to wait for round-trip verification of its moves - the client's position needs to be forcefully changed only when a force acts on the control object on the server that doesn't exist on the client.

To support this, all control objects (derived from ShapeBase) must supply a writePacketData() and readPacketData() function that send enough data to accurately simulate the object on the client. These functions are only called for the current control object, and only when the server can determine that the client's simulation is somehow out of sync with the server. This usually occurs if the client is affected by a force not present on the server (like an interpolating object) or if the server object is affected by a server-only force (such as the impulse from an explosion).

The Move structure is a 32 millisecond snapshot of player input, containing x, y, and z positional and rotational changes as well as trigger state changes. As time passes in the simulation, moves are collected (how many depends on how much time has passed) and applied to the current control object on the client. The same moves are then packed and sent to the server in GameConnection::writePacket(), for processing on the server's version of the control object.
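How elapsed time turns into a number of moves can be sketched like this in Python. The collector class, the carrying over of leftover milliseconds, and the dictionary used as a move are assumptions for the example, not the engine's data structures.

```python
TICK_MS = 32  # each move covers one 32 millisecond slice of input


class MoveCollector:
    def __init__(self):
        self.leftover_ms = 0
        self.pending = []  # moves collected locally and still to be sent

    def advance(self, elapsed_ms, current_input):
        """Collect one move per whole tick contained in the elapsed time."""
        self.leftover_ms += elapsed_ms
        new_moves = []
        while self.leftover_ms >= TICK_MS:
            self.leftover_ms -= TICK_MS
            new_moves.append(dict(current_input))
        self.pending.extend(new_moves)
        return new_moves


collector = MoveCollector()
forward = {"x": 0.0, "y": 1.0, "z": 0.0, "yaw": 0.0, "trigger": False}

# A slow 100 ms frame yields three moves and leaves 4 ms for the next frame.
first_frame = collector.advance(100, forward)

# A 30 ms frame plus the 4 ms carried over yields one more move.
second_frame = collector.advance(30, forward)
```

Because client and server both apply the same fixed-length moves, they can arrive at the same result for the control object even when their frame rates differ.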

Datablocks

Datablocks (i.e., subclasses of SimDataBlock) are used in the network system to store common instance data for objects. For example, a datablock may store animation data, model information, physical movement properties, etc., all of which are shared across a set of common objects. All declared datablocks are sent to clients upon connection as guaranteed events (SimDataBlockEvent), and can then be referenced and sent as part of the initial ghost update. An advantage of datablocks is that they are declared only on the server, so mods to the game can be created without forcing the client to download any script data.

NetStringTable

The NetStringTable class manages string data across connections. Every tagged string in the console (a string enclosed by single quotes) will be sent across a connection only a single time. Every subsequent time that string is sent, an integer tag is substituted for the actual string data. Strings like player names can be added with the addTaggedString console function and removed with the removeTaggedString console function.
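The send-once behavior can be modeled in a few lines of Python. This is a sketch for a single connection with invented names; it does not reproduce the NetStringTable interface or handle removal of strings.

```python
class StringTagTable:
    """Assigns an integer tag per string and sends the text only the first time."""

    def __init__(self):
        self.tags = {}     # string -> integer tag
        self.sent = set()  # tags this connection has already received

    def add_tagged_string(self, text):
        return self.tags.setdefault(text, len(self.tags) + 1)

    def encode(self, text):
        """Return (tag, text) on first use and (tag, None) on every later use."""
        tag = self.add_tagged_string(text)
        if tag in self.sent:
            return (tag, None)
        self.sent.add(tag)
        return (tag, text)


table = StringTagTable()
first = table.encode("Welcome to the server")
second = table.encode("Welcome to the server")
other = table.encode("Mission complete")
```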

Network Console Commands

There are two remote procedure call network console commands - commandToServer and commandToClient. The commandToServer function takes the form: commandToServer(functionNameTag, arg1, arg2, arg3, ... ), where functionNameTag is some string tag. This call is converted into a RemoteCommandEvent and sent across to the server. Once there, the server calls the local script function serverCmdXXX(clientId, arg1, arg2, arg3, ... ), where XXX is the text of the string tag. The commandToClient function takes the form: commandToClient(clientId, functionNameTag, arg1, arg2, arg3, ... ), where the clientId argument is the object id of the connection object to send to.

The commandTo* functions perform string argument substitution automatically using the in-string % modifier. For example:

// %client is assumed to hold the object id of the target client's connection.
commandToClient(%client,
                'EchoMessage',
                'This %1 guy is super %2',
                'Got Milk?',
                'slow at writing documentation');

is executed on the client as:

function clientCmdEchoMessage(%message, %a1, %a2, %a3, %a4)
{
    // tagged strings must be detagged in order to be displayed.
    echo(detag(%message));
    echo("a1 = " @ detag(%a1));
    echo("a2 = " @ detag(%a2));
    echo("a3 = " @ detag(%a3));
    echo("a4 = " @ detag(%a4));
}

and would echo:

This Got Milk? guy is super slow at writing documentation
a1 = Got Milk?
a2 = slow at writing documentation
a3 =
a4 =

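The substitution number (after the %) is relative, not absolute: %n refers to the argument n positions after the argument in which it appears. Substituted arguments are themselves expanded, so substitutions can nest: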

// %client is assumed to hold the object id of the target client's connection.
commandToClient(%client,
                'EchoMessage',
                '%1 is a good %2 for %3',
                '%1 the good %2',
                'Role Model',
                'SuperDood %1',
                'the dude of super');

would echo:

Role Model the good SuperDood the dude of super is a good Role Model for SuperDood the dude of super
a1 = Role Model the good SuperDood the dude of super
a2 = Role Model
a3 = SuperDood the dude of super
a4 = the dude of super

This functionality is especially useful for status and game messages coming from the server, because each text message compresses into just a small array of tag identifiers.
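The relative substitution rule can be checked with a short Python model. It reproduces the behavior shown in the examples above and is not the engine's implementation; the function name and the treatment of out-of-range references as empty strings are assumptions.

```python
import re


def expand(args, index=0):
    """Expand args[index], where %n means the argument n positions later."""

    def substitute(match):
        target = index + int(match.group(1))
        # References past the end of the argument list expand to nothing.
        return expand(args, target) if target < len(args) else ""

    return re.sub(r"%([1-9])", substitute, args[index])


args = [
    "%1 is a good %2 for %3",
    "%1 the good %2",
    "Role Model",
    "SuperDood %1",
    "the dude of super",
]

message = expand(args)        # the fully expanded message
a1 = expand(args, 1)          # what the client sees as its first argument
```

Because every reference points forward in the argument list, the expansion always terminates.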