1. Variant Server Overview
1.1. Code Variation Management (CVM)
Software application development is accelerating. Many leading teams release new code continuously, with each independently deployable code delta released as soon as it’s ready, unbundled from other such deltas, sometimes multiple times per second. In such a dynamic operational environment, code variations play an instrumental role in de-risking the SDLC. A code variation exists when an application must provide one or more alternate code paths intended to co-exist with an existing code path. Several use cases call for the instrumentation of code variations; they are described below.
1.1.1. Online Controlled Experiments
In an online controlled experiment, the candidate user experience(s) are validated against the existing experience in the form of a randomized controlled trial. In such experiments, the existing experience serves as control and the candidate experience(s) as treatment(s). User traffic is split randomly (though not necessarily equally) between all the experiences, so that any observed difference between the experiences with respect to some metric can be interpreted as being caused by the difference in treatment. The experiment runs for as long as it takes for the measurements to reach statistical significance, meaning that enough traffic has passed through the experiences to provide a degree of confidence that the observed difference is unlikely to be attributable to chance alone.
For example, you may want to run an experiment to find out the optimal order amount which entitles your customer to free shipping. In such an experiment you offer several experiences, each promoting a different minimal order amount and target your user traffic to these experiences randomly. As your customers pass through these experiences you can compare the revenue lift your offer of free shipping has generated.
Note that in online controlled experiments session targeting must be random if correlation is to be interpreted as causation, because randomness is a natural control for everything other than the difference in user experience. Refer to Appendix A for further details on statistical analysis of Variant experiments.
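A random but unequal traffic split can be sketched as a weighted draw. The class and method names below are hypothetical and are not part of any Variant API; the sketch only illustrates how a uniform random number can be mapped to an experience in proportion to its weight.

```java
import java.util.Random;

// Hypothetical helper, not part of any Variant API.
final class WeightedSplit {

    /**
     * Maps a uniform draw u in [0, 1) to an experience index,
     * with probability proportional to each experience's weight.
     */
    static int pick(double[] weights, double u) {
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        double threshold = u * total;
        double cumulative = 0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (threshold < cumulative) {
                return i;
            }
        }
        return weights.length - 1; // guards against floating-point round-off
    }

    public static void main(String[] args) {
        double[] weights = {1, 3}; // control : treatment = 1 : 3
        int[] counts = new int[weights.length];
        Random random = new Random();
        for (int i = 0; i < 10_000; i++) {
            counts[pick(weights, random.nextDouble())]++;
        }
        System.out.println("control=" + counts[0] + ", treatment=" + counts[1]);
    }
}
```

Because the draw does not depend on anything about the user, the split stays random even when the weights are unequal.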
1.1.2. Managed Feature Roll-Outs
The other use case for code variations is feature flags. They refer to a software delivery practice, where a new product feature is rolled out gradually to a carefully controlled group of customers before it is made generally available. Whenever you roll out a new product feature, a feature flag enables you to first publish it to a limited population of users, while sending all others into the stable existing experience. If all goes well, you gradually increase traffic into the new code path until you reach full production, at which point the existing code path can be discarded. But if a defect is discovered, the new feature can be temporarily toggled off until the problem is fixed.
In contrast with online controlled experiments, feature flags are likely to use deterministic targeting rules for user traffic. For example, you may want to start by admitting users into the new code path by their Zip code, or customers by their organization ID.
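A deterministic targeting rule of this kind might look like the following sketch. It is an illustration only: the class name, the method name, and the Zip codes are hypothetical and do not come from any Variant API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical rule, not part of any Variant API.
final class ZipCodeRollout {

    private final Set<String> enabledZipCodes;

    ZipCodeRollout(String... zipCodes) {
        this.enabledZipCodes = new HashSet<>(Arrays.asList(zipCodes));
    }

    /** Returns true if a user with this Zip code should see the new code path. */
    boolean isEnabledFor(String zipCode) {
        return zipCode != null && enabledZipCodes.contains(zipCode);
    }

    public static void main(String[] args) {
        ZipCodeRollout rollout = new ZipCodeRollout("94105", "10001");
        System.out.println(rollout.isEnabledFor("94105"));
        System.out.println(rollout.isEnabledFor("60601"));
    }
}
```

Widening the rollout then amounts to adding Zip codes to the set; the same user always gets the same answer for a given set.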
1.2. Key Features
1.2.1. Client-Server Architecture
Variant CVM Server is deployed on the same network as the host application(s), either on premises or in the cloud, facilitating low network latency and proximity to operational data. The server manages variation schemata, which contain the code variation metadata. Each variation schema is a JSON-superset human readable file containing complete definitions of all related code variations. A single server instance can manage an unlimited number of variation schemata.
Each component of the host application that needs to participate in a code variation communicates with Variant server via a native client library. At the time of this writing, the following Variant client libraries are available:
Java | Fully functional Variant client with complete support for all Variant server functionality. Any component of the host application written in Java or another JVM language can integrate with this client. Several higher-level adapters are also available to take advantage of a particular interactive framework, e.g. the servlet adapter. |
JavaScript | Partial Variant client supporting remote trace events from a Web browser. |
In addition to the client API, Variant server also exposes a server-side Extension API, used to extend the server’s default behavior with custom semantics.
1.2.2. Separation of Instrumentation and Implementation
At the core of Variant’s philosophy is the idea of strict separation between variation instrumentation and experience implementation. Variation schemata enable developers to define variations as abstract ideas about the behavior of the host application, leaving the implementation of that behavior out, i.e. up to the application developer.
The application developer uses familiar tools to implement new application behavior, unconcerned with how it will be instrumented as code variations. Variant server, on the other hand, handles the complexity of managing code variations, as defined in variation schemata, hiding enormous amounts of complexity from the application developer.
This clean separation dramatically reduces the amount of client code the application developer must write (and, in most cases, remove), in order to instrument code variations.
1.2.3. Distributed Session Management
Variant maintains its own user sessions, instead of relying on the native sessions, maintained by the host application. Variant user sessions are distributed, with Variant server managing their shared state. Any component of the host application, connected to a Variant server, can get a hold of a user session by its ID. Variant guarantees a consistent view of the session’s shared state to all concurrent clients.
1.2.4. Concurrent Variations
Variation concurrency refers to those cases when different code variations affect one or more of the same code segments. Concurrent code variations are more likely than it may first seem because of the Pareto principle, which, as applied to interactive computer applications, states that your users spend 80% of their time on 20% of your application’s code. These few high-contention code paths will be instrumented by multiple concurrent experiments and feature rollouts, and Variant gives you a cogent abstraction to manage this concurrency.
1.2.5. Targeting and Qualification Durability
When a user session is first qualified and targeted for an experience, it is typically desirable that the user continues to see the same experience for the remainder of the session and, likely, even on the return visit. Variant supports three durability scopes: state, session and variation, which are declared in the variation’s schema definition. A variation’s targeting durability is declared independently from its qualification durability, so they do not have to be the same. For example, a user’s eligibility for an experiment, e.g. one related to a promotion, may vary from visit to visit. But whenever she is qualified, the experiment designer typically wants her to see the same experience.
State scoped durability means that the outcome of qualification or targeting is not reused. A variation with state-scoped qualification durability will be re-qualified for each state request, and a variation with state-scoped targeting durability, if qualified, will be re-targeted for each state request.
Session scoped durability means that the outcome of qualification or targeting is reused for the duration of this user session. A variation with session-scoped qualification durability will be qualified once per session and reused for the duration of this session, but will be re-qualified in a different session. A variation with session-scoped targeting durability will be targeted once per session and reused for the duration of this session, but will be re-targeted in a different session.
Variation scoped durability means that the outcome of qualification or targeting is reused for the entire lifespan of the variation, which is essentially forever. A variation with variation-scoped qualification durability will be qualified once, and the qualification decision will be reused for as long as this variation is defined in the schema. A variation with variation-scoped targeting durability will be targeted once, and the targeting decision will be reused for as long as the variation is defined in the schema.
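The three scopes can be modeled as three different cache lifetimes for a targeting (or qualification) decision. The sketch below is a simplified, hypothetical model and not Variant’s implementation: the class name, the method signature, and the choice to key variation-scoped decisions by a user ID are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of durability scopes; not Variant's implementation.
final class DurabilitySketch {

    enum Scope { STATE, SESSION, VARIATION }

    // Decisions reused within one session, keyed by session ID and variation.
    private final Map<String, String> bySession = new HashMap<>();
    // Decisions reused across sessions, keyed by user ID and variation (an assumption).
    private final Map<String, String> byUser = new HashMap<>();

    /**
     * Returns the targeted experience for one state request, calling
     * the supplied targeter only when the scope does not allow reuse.
     */
    String target(Scope scope, String userId, String sessionId,
                  String variation, Supplier<String> targeter) {
        switch (scope) {
            case SESSION:
                return bySession.computeIfAbsent(sessionId + "/" + variation, k -> targeter.get());
            case VARIATION:
                return byUser.computeIfAbsent(userId + "/" + variation, k -> targeter.get());
            default: // STATE: nothing is reused
                return targeter.get();
        }
    }
}
```

With state scope the targeter runs on every request, with session scope once per session, and with variation scope once per user for as long as the cached decision is kept.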
1.2.6. Extensibility
Variant CVM Server’s default behavior can be extended via the server-side Extension API. It supports creation and configuration of user code which runs in the server’s address space, augmenting the server’s default behavior with custom semantics. ExtAPI exposes two principal extension mechanisms: lifecycle hooks and trace event flushers. Lifecycle hooks are listeners for various lifecycle events raised by Variant server, such as the session qualification or session targeting events. They are configured in the variation schema and are made available to the server’s JVM at run time via the /ext directory. Lifecycle hooks can be chained to help you modularize and reuse your code.
Event flushers handle the terminal ingestion of Variant trace events. A few standard event flushers, intended for saving trace events in popular databases, such as PostgreSQL and MySQL, come with Variant server, but you may want to create your own, suitable for your operating environment — it is simply a matter of implementing a Java interface.
2. Variant Architecture
2.1. Overview
Variant CVM Server handles all the work related to managing code variations. Host applications access it via native client libraries suitable for their language.
Variant server is deployed on the network local to the host application(s) and the operational database, facilitating low network latency and real-time integration with the host application’s operational data. This architecture is particularly attractive to modern distributed applications which are comprised of multiple service components. Each component communicates with Variant server independently, with the server responsible for the maintenance of shared session state.
The following diagram presents a high-level overview of the different components of Variant software platform:

Figure 1. Variant Architecture.
2.2. Server Configuration
Variant server is configured using the Lightbend Config library. At startup, Variant server looks for configuration in the file conf/variant.conf. If it is found, its contents override the defaults. You may further override this configuration at run time by providing an alternative configuration file, or override an individual config key in a JVM system property.
Refer to the Variant CVM Server Reference for further information.
2.3. Integration With the Host Application
A host application communicates with Variant server via a native client library, supplied by Variant and suitable for the application’s language. Variant release 0.10 ships with a fully functional Java client and a partial JavaScript client, suitable for deployment to Web browsers.
2.3.1. Variant Java Client
Variant Java client is consumable by any host application running on a Java Virtual Machine (JVM) release 8 or later. It makes no assumptions about host application’s other technology details, which makes it universally applicable to any interactive JVM host application, e.g. an HTTP server or an IVR call center. This flexibility, inevitably, comes at the expense of some runtime environment dependencies, which had to be abstracted out and surfaced in the client’s API, such as a mechanism to track Variant session ID. These dependencies have to be provided at runtime.
Most JVM Web applications are written on top of a Web framework, like Java Servlets or Play!. Such applications should take advantage of the available Servlet adapter for Variant Java client or Play! adapter for Variant Java client. (Get in touch if you want to contribute a different adapter.) These adapters wrap the Variant Java client API with a functionally equivalent API, which rewrites environment-dependent method signatures in terms of particular framework classes, such as javax.servlet.http.HttpServletRequest and play.mvc.Http.Request, and provide framework-specific implementations of all environment-dependent objects.
Refer to the Variant Java Client User Guide for further information.
2.3.2. Variant JavaScript Client
Variant.js supports triggering of trace events from a Web browser environment. For more information, refer to the Variant JavaScript Client User Guide.
3. Code Variation Model (CVM)
3.1. Interactive Application as a Finite State Machine
The only assumption Variant makes about the host application is that it is interactive, i.e. responds to real time user input. Its control flow cycles between two states, as illustrated in Figure 2 below: the processing state, where the application reads, processes and responds to user’s input, and the interface state, where the application waits for it.

Figure 2. An interactive application is a two-state state machine.
Irrespective of the user interface mechanism, the host application pauses in an interface state while waiting for user’s response. Interface states render some response from the application and provide the means for the user to respond to that response. Depending on the type of the host application, an interface state may be manifested as a computer desktop window (desktop application, e.g. MFC), an HTML page (Web application), an activity (Android mobile app), a phone menu (an IVR application), an XML document (RESTful API), etc. These details are not relevant to the CVM.
A user experience then can be thought of as a traversal of a set of interface states, e.g. transitions from one Web page or one telephone menu to the next. The order of traversal is not important to the CVM and is left up to the host application.
3.2. Code Variations
Suppose now that some interface state(s) exist in more than one variation: the base state and one or more state variants, which the host application may choose from, in place of the base state. The control user experience then is one that traverses the base states, while a variant user experience is one that traverses variant states.
The control experience and one or more related variant experiences form a code variation. A feature toggle or an online experiment are both examples of code variations: they comprise the control experience (the current code path) and one or more variant experiences (new code paths), which will co-exist for a time.
Whenever the host application is in a processing state, it must decide what next interface state to present to the user, as shown in Figure 3 below:

Figure 3. Processing state of the host application without code variation (A); and with code variation (B).
In the regular, uninstrumented case (A), the application simply figures out the next state based on the user’s input, carries out requisite computations, renders the state to the user, and pauses. However, if the next state is instrumented by one or more code variations (B), the host application has a set of additional state variants it can choose from. It is exactly this task of figuring out the particular state variant that the host application delegates to Variant server, just like it delegates the task of storing data on disk to a database server.
The step of the processing state where the host application turns to Variant for targeting, followed by the step where the host application carries out the computations needed for rendering the targeted state variant, is called a state request. A Variant session is, in a nutshell, a succession of state requests plus the common state, preserved between the requests.
3.3. CVM: a Domain Model for Code Variations
Code Variation Model (CVM) is a domain model for code variations. It offers a formal framework for defining code variations and for reasoning about them. Its key practical benefit is that it provides a way to externalize the metadata for a set of related code variations in human-readable documents called variation schemata. Schemata are managed centrally by Variant server, externally to the host application(s).
This, in turn, enables these two important benefits:
- Separation of instrumentation from implementation. Variation schemata enable developers to define variations as abstract ideas about the behavior of the host application, leaving the implementation of that behavior out and up to the application developer. The application developer uses familiar tools to implement new application behavior, unconcerned with how it will be instrumented as code variations. Variant server, on the other hand, handles the complexity of managing code variations, as defined in variation schemata, hiding enormous amounts of complexity from the application developer.
- Separation of lifecycles. Externalization of variation metadata out of the host application and onto Variant server leads to a very low compute overhead on the host application. All the actual overhead associated with instrumentation of code variations, such as computations, persistence of targeting and qualification information, and dealing with trace event back-pressure is handled by Variant server.
3.4. Variation Schemata
3.4.1. Metadata for Code Variations
Each variation schema is a human readable file containing complete definitions of all related code variations expressed with JSON-superset grammar. A single Variant server can manage any number of such schema files, located in the server’s /schemata directory. This section introduces CVM’s concepts by example. For a complete reference, refer to the Variant CVM Server Reference.
The two top-level entities in CVM are states and variations. The states represent host application’s interface states. They are a rather opaque concept to CVM: all it needs to know about a state is its name and an optional set of state parameters. The state parameters are simple key/value pairs of strings, whose meaning is external to CVM and only meaningful to the host application.
The variations, on the other hand, are complex structures entirely managed by CVM. At a minimum, a variation must have:
- Name
- Exactly one control experience (typically mapped to the existing code path) and at least one variant experience, mapped to an alternate code path.
- A list of on-state value objects, one for each state instrumented by this variation.
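These minimum requirements can be expressed as a small validity check. The sketch below is hypothetical and not part of any Variant API; in particular, it assumes that the list of on-state objects must be non-empty, and it represents each on-state object by its state name only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical validator for the minimum requirements listed above;
// not part of any Variant API.
final class VariationCheck {

    static final class Experience {
        final String name;
        final boolean isControl;

        Experience(String name, boolean isControl) {
            this.name = name;
            this.isControl = isControl;
        }
    }

    /** Returns true if the variation meets the three minimum requirements. */
    static boolean isValid(String name, List<Experience> experiences, List<String> onStateRefs) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        long controls = experiences.stream().filter(e -> e.isControl).count();
        long variants = experiences.size() - controls;
        // Assumption: a variation must instrument at least one state.
        return controls == 1 && variants >= 1 && !onStateRefs.isEmpty();
    }

    public static void main(String[] args) {
        List<Experience> experiences = Arrays.asList(
                new Experience("existing", true),
                new Experience("variant", false));
        System.out.println(isValid("variation1", experiences, Arrays.asList("state1")));
    }
}
```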
3.4.2. Minimal Valid Variation Schema
A minimal valid variation schema consists of a single state, instrumented by a single variation with one variant experience, like the one in Listing 1 below, where we model a feature rollout that adds Recaptcha to the existing password reset page.
// A very simple variation schema with a single state,
// instrumented by a single variation.
{
'meta':{
'name':'MinimalSchema'
},
'states':[{'name':'passwordResetPage'}],
'variations':[
{
'name':'RecaptchaOnPasswordReset',
'experiences':[
{'name':'noRecaptcha', 'isControl':true},
{'name':'withRecaptcha'}
],
'onStates':[{'stateRef':'passwordResetPage'}]
}
]
}
Listing 1. A minimal variation schema with one state and one variation.
The meta section contains the schema’s name by which it is known to the connecting clients. The states clause contains the sole state definition, representing the password reset page on which the new feature is expressed. The variations clause contains the sole variation RecaptchaOnPasswordReset with two experiences, noRecaptcha and withRecaptcha.
3.4.3. Minimal Practical Variation Schema
In the next example, borrowed verbatim from the Variant Demo Application, we introduce a number of important new concepts:
- Concurrent variations.
- Variations spanning multiple states.
- Lifecycle hooks.
/*
* Variant Java client + servlet adapter demo application.
* Demonstrates instrumentation of an experiment and a concurrent feature toggle.
* See https://github.com/getvariant/variant-java-demo for details.
*
* Copyright © 2015-2018 Variant, Inc. All Rights Reserved.
*/
{
'meta':{
'name':'petclinic',
'comment':'Variant schema for the Pet Clinic demo application'
},
'states':[
{'name':'vets'},
{'name':'newVisit'}
],
'variations':[
/*
* Vet's hourly rate feature toggle on the vets page only.
* Demonstrates lazy instrumentation.
*/
{
'name':'VetsHourlyRateFeature',
'experiences':[
{
'name':'existing',
'weight':1,
'isControl':true
},
{
'name':'rateColumn',
'weight':3
}
],
'onStates':[
{'stateRef':'vets'}
]
},
/*
* The Schedule-a-Visit Experiment on 2 pages.
* Demonstrate eager instrumentation and conjoint variation concurrency.
*/
{
'name':'ScheduleVisitTest',
'conjointVariationRefs':['VetsHourlyRateFeature'],
'experiences':[
{
'name':'noLink',
'weight':1,
'isControl':true
},
{
'name':'withLink',
'weight':3
}
],
'onStates':[
{'stateRef':'vets'},
{'stateRef':'newVisit'}
],
'hooks': [
{
// Disqualify blacklisted users.
'class':'com.variant.extapi.std.demo.UserQualifyingHook',
'init': {'blackList':['Nikita Krushchev']}
}
]
}
]
}
Listing 2. The variation schema of the Variant Demo Application.
This schema contains two variations: 1) the feature flag VetsHourlyRateFeature exposes an early release of the new feature on the vets page; and 2) the experiment ScheduleVisitTest on pages vets and newVisit. The new feature adds the hourly rate column to the vets table, and the experiment verifies the hypothesized lift in new appointment bookings due to the new Schedule visit link to the newVisit page, also on the vets table.
The ScheduleVisitTest experiment has a lifecycle hook disqualifying black-listed customers from this test. Variant server posts this hook whenever it must determine if a new user session qualifies for this experiment. Lifecycle hooks are managed via the server-side Extension API. For more information, refer to Section 5.1.
The fact that both variations in the petclinic schema are instrumented on the vets page makes them concurrent. By default, Variant assumes concurrent variations to be disjoint, which is to say that only one of them can be in a non-control variant. This convenient default shields application developers, working on new features, from needing to know what other developers are doing. If two independent features happen to overlap, Variant server will ensure that a user session is never targeted for both of them.
However, this default behavior can be overridden. If the two developers join forces and provide the code path which implements both features simultaneously, then they can direct Variant server to treat these two code variations as conjointly concurrent. This is accomplished by providing the conjointVariationRefs schema property, as in the schema in Listing 2 above. Concurrent variations are considered in detail in Section 3.8.
3.5. State Variants
Whenever a state is instrumented by a variation, this instrumentation constitutes an obligation, on the part of the host application, to provide an implementation of any experience defined by the variation. In practical terms this means that the host application must provide a code path for each state variant defined by the schema either explicitly or implicitly.
By default, Variant can infer state variants from the onStates properties. For example, in Listing 2 above, the onStates properties define the many-to-many mapping between variations and states instrumented by them. This mapping is sufficient for Variant server to infer all state variants for each state. However, there are cases when the application developer wants to override one or more of these default state variants, as discussed in the following sections.
3.6. State Parameters
Code Variation Model makes no assumptions about technology or semantics of the host application. It is equally applicable to Web, native mobile, IVR or any other interactive applications. To help applications enrich a variation schema with application-specific state, CVM provides state parameters: simple key/value pairs of strings, whose meaning is entirely up to the host application.
State parameters can be specified either at the state or at the state variant level, as illustrated in Listing 3 below.
{
...
'states':[
...
{
'name':'state1',
/*
* State parameters, specified at the state level,
* provide the base values for all variants of this state.
*/
'parameters': {
'key1':'value1',
'key2':'value2'
}
},
...
],
'variations':[
{
'name':'variation1',
'experiences':[
{
'name':'existing',
'isControl':true
},
{
'name':'variant'
}
],
'onStates':[
{
'stateRef':'state1',
'variants': [
{
'experienceRef':'variant',
/*
* State parameters, specified at the state variant level,
* at runtime override the like-keyed base values within
* the scope of the enclosing state variant.
*/
'parameters': {
'key2':'value2 in state variant',
'key3':'value3 in state variant'
}
},
// Other state variants
]
},
// Other states
]
},
// Other variations
]
}
Listing 3. State parameters at the state and state variant levels.
State parameters specified at the state level provide the base values, which have the global scope. At runtime, these parameters are available to the host application within the scope of each state variant deriving from this state, across all variations.
State parameters specified at the state variant level have the scope of that state variant only and within that scope override the like-named base parameter values. In other words, at runtime, these client calls will return the following results:
stateRequest.getResolvedStateParameters().get("key1"); // "value1"
stateRequest.getResolvedStateParameters().get("key2"); // "value2 in state variant"
stateRequest.getResolvedStateParameters().get("key3"); // "value3 in state variant"
This mechanism of state parameter overrides is a convenient way for the developer to introduce application state into the schema at both global and local scopes.
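The override rule can be modeled as a simple map merge, where variant-level entries win over state-level entries with the same key. The sketch below is a hypothetical illustration of that rule using the values from Listing 3; the class and method names are not part of the Variant client API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of state parameter resolution; not the Variant client API.
final class StateParameters {

    /** Starts from the state-level base values and applies variant-level overrides. */
    static Map<String, String> resolve(Map<String, String> base,
                                       Map<String, String> variantOverrides) {
        Map<String, String> resolved = new HashMap<>(base);
        resolved.putAll(variantOverrides);
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("key1", "value1");
        base.put("key2", "value2");

        Map<String, String> overrides = new HashMap<>();
        overrides.put("key2", "value2 in state variant");
        overrides.put("key3", "value3 in state variant");

        System.out.println(resolve(base, overrides));
    }
}
```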
3.7. Phantom Instrumentation
Typically, all experiences in a variation will instrument the same set of states. But there are use cases where this assumption does not hold. For example, you may want to split a busy page in two, or to consolidate two sparse pages into one, as illustrated in Figure 4 below.

Figure 4. Phantom Instrumentation with a phantom state in the control experience (A) and in a variant experience (B).
The type of instrumentation where a state is instrumented by some, but not all experiences in a variation is referred to as phantom instrumentation. Whenever a state variant is undefined in some experience, it is referred to as a phantom state variant in that experience. A phantom state variant constitutes an obligation on the part of the host application not to enter this state if it is in the experience where this variant is phantom. For example, in Figure 4.A above, if a session targeted to the control experience attempts to enter state S2, Variant will throw a runtime exception.
The next section introduces an example of phantom instrumentation.
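The obligation not to enter a phantom state can be modeled as a guard that rejects the forbidden combination of state and experience. The sketch below is hypothetical: the class name, the experience names, and the use of IllegalStateException as a stand-in for the runtime exception are assumptions, not Variant’s actual behavior.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard modeling phantom state variants; not Variant's implementation.
final class PhantomCheck {

    private final Set<String> phantoms = new HashSet<>();

    /** Declares that the given state has no variant in the given experience. */
    void declarePhantom(String state, String experience) {
        phantoms.add(state + "/" + experience);
    }

    /** Throws if a session targeted to the experience tries to enter a phantom state. */
    void enter(String state, String experience) {
        if (phantoms.contains(state + "/" + experience)) {
            throw new IllegalStateException(
                    "State " + state + " is phantom in experience " + experience);
        }
    }
}
```

For the case in Figure 4.A, declaring S2 as phantom in the control experience makes any attempt by a control session to enter S2 fail, while sessions in the variant experience pass through.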
3.8. Variation Concurrency
3.8.1. Motivation
If two variations instrument no states in common, they are referred to as serial variations, meaning that a user session can only traverse them one at a time. Conversely, whenever two variations instrument one or more states in common, they are referred to as concurrent variations because a user session may be traversing them concurrently. Variant Code Variation Model offers full support for variation concurrency; any possible interleaving of two concurrent variations can be defined in the variation schema.
In Figure 5 below, the Blue and the Green variations are serial, but the Red variation is concurrent with both Blue and Green.

Figure 5. Concurrent experiments. Blue and Green variations are serial, while Red is concurrent with both Blue and Green. The grey boxes denote control states, while the colored ones denote state variants.
When a user session targets a state that is instrumented by two or more variations, there is a state variant space of possible experience combinations from which a state variant can be chosen. For example, in Figure 5 above, state S2 is instrumented by Blue and Red variations. Blue has two experiences (control and one variant) and Red has three (control and two variants), so the complete variant space of the state S2 has 2 x 3 = 6 cells:

Figure 6. Variant space of the state S2 has one control, three proper, and two hybrid state variants.
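The cell counts in Figure 6 can be reproduced by enumerating the Cartesian product of the experiences of all variations instrumenting the state. In the sketch below, which is an illustration and not Variant code, each experience is represented by its index within its variation and index 0 is assumed to denote control.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical enumeration of a state's variant space; not Variant code.
final class VariantSpace {

    /**
     * Returns every cell of the variant space. Each cell holds one experience
     * index per variation; index 0 stands for the control experience.
     */
    static List<int[]> cells(int[] experienceCounts) {
        List<int[]> result = new ArrayList<>();
        result.add(new int[0]);
        for (int count : experienceCounts) {
            List<int[]> next = new ArrayList<>();
            for (int[] prefix : result) {
                for (int e = 0; e < count; e++) {
                    int[] cell = Arrays.copyOf(prefix, prefix.length + 1);
                    cell[prefix.length] = e;
                    next.add(cell);
                }
            }
            result = next;
        }
        return result;
    }

    /** Classifies a cell by the number of variations that are off control. */
    static String classify(int[] cell) {
        int nonControl = 0;
        for (int e : cell) {
            if (e != 0) {
                nonControl++;
            }
        }
        if (nonControl == 0) {
            return "control";
        }
        return nonControl == 1 ? "proper" : "hybrid";
    }

    public static void main(String[] args) {
        // Blue has 2 experiences, Red has 3.
        for (int[] cell : cells(new int[] {2, 3})) {
            System.out.println(Arrays.toString(cell) + " -> " + classify(cell));
        }
    }
}
```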
The relationship of concurrence between two variations V1 and V2 has the following properties:
- Symmetric: If variation V1 is concurrent with variation V2, then V2 is concurrent with V1.
- Not Reflexive: a variation cannot be concurrent with itself.
- Not Transitive: If V1 is concurrent with V2 and V2 is concurrent with V3, then V1 and V3 need not be concurrent.
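The definition of concurrence, and its lack of transitivity, can be checked with a few lines of code. The sketch below is an illustration rather than Variant code; it takes the sets of instrumented states for the Blue, Red and Green variations from the Tricolor schema in Listing 4.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical check of the concurrence relation; not Variant code.
final class Concurrency {

    /** Two distinct variations are concurrent if they instrument a common state. */
    static boolean areConcurrent(Set<String> statesOfV1, Set<String> statesOfV2) {
        return !Collections.disjoint(statesOfV1, statesOfV2);
    }

    public static void main(String[] args) {
        Set<String> blue = new HashSet<>(Arrays.asList("S1", "S2"));
        Set<String> red = new HashSet<>(Arrays.asList("S2", "S3"));
        Set<String> green = new HashSet<>(Arrays.asList("S3", "S4"));

        System.out.println("Blue/Red:   " + areConcurrent(blue, red));
        System.out.println("Red/Green:  " + areConcurrent(red, green));
        System.out.println("Blue/Green: " + areConcurrent(blue, green));
    }
}
```

Blue is concurrent with Red and Red with Green, yet Blue and Green share no state, which is exactly the non-transitive case.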
Variant server supports two runtime strategies for managing concurrent variations: a simplified, pseudo-serial strategy, called disjoint concurrency and the more powerful conjoint concurrency, as discussed in the next two sections.
3.8.2. Disjoint Concurrency
First, let’s consider a pseudo-serial execution, when the two variations are traversed in isolation. To support the Blue variation by itself, the application developer needs to implement the S2blue experience. Similarly, to support the Red variation in isolation, (probably some other) developer needs to implement its two variant experiences S21red and S22red. This is a perfectly acceptable scenario, so long as no user session ends up targeted to variant experiences in both variations. If that were to happen, the host application would have no code path implementing both S2blue and S21red state variants at once.
This type of constrained concurrency is referred to as disjoint concurrency and is the default behavior. Unless instructed otherwise (as described in the next section), Variant will not target a user session to two variant experiences in two concurrent variations. This default makes sense: application developers should not have to communicate with each other simply because they work on overlapping features.
The price of this convenience is the potential starvation of downstream variations of user traffic, which is frequently acceptable.
3.8.3. Conjoint Concurrency
The unconstrained concurrency mode, where a session’s ability to participate in Red variation is not constrained by its participation in Blue variation, and vice versa, is referred to as conjoint concurrency. To instrument two conjointly concurrent variations, the application developer has to do the following:
- Implement all hybrid experiences, e.g. the two hybrid state variants shaded in two colors in Figure 6 above.
- Tell Variant to treat the two variations as conjoint by using the conjointVariationRefs schema property, as we did in the petclinic schema in Listing 2.
Listing 4 below is the complete variation schema for the Blue, Red and Green variations from Figure 5 above. To illustrate both concurrency modes, Red and Blue variations are defined as conjoint and Green and Red variations as disjoint.
{
  'meta':{
    'name':'Tricolor',
    'comment':'Schema for Red, Green, Blue variations on Figure 5'
  },
  'states':[{'name':'S1'}, {'name':'S2'}, {'name':'S3'}, {'name':'S4'}],
  'variations':[
    {
      'name':'Blue',
      'experiences':[
        {'name':'grey', 'isControl':true},
        {'name':'blue'}
      ],
      'onStates':[{'stateRef':'S1'}, {'stateRef':'S2'}]
    },
    {
      'name':'Red',
      'conjointVariationRefs':['Blue'], // Conjointly concurrent with Blue
      'experiences':[
        {'name':'grey', 'isControl':true},
        {'name':'red1'},
        {'name':'red2'}
      ],
      'onStates':[{'stateRef':'S2'}, {'stateRef':'S3'}]
    },
    {
      'name':'Green', // Serial with Blue and disjointly concurrent with Red
      'experiences':[
        {'name':'grey', 'isControl':true},
        {'name':'green'}
      ],
      'onStates':[
        {'stateRef':'S3'},
        {
          'stateRef':'S4',
          'variants':[
            {
              // Explicit phantom variant definition.
              'isPhantom': true,
              'experienceRef': 'grey'
            }
          ]
        }
      ]
    }
  ]
}
Listing 4. The Tricolor variation schema of concurrent tests from Figure 5.
Note the explicit state variant for the Green variation’s control experience on state S4. It is needed in order to declare it as phantom, to account for the fact that there is no control state variant, i.e. that a user session is not allowed to target for S4 if it has already been targeted to the control experience in the Green variation.
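As an illustration only, the following self-contained Java sketch classifies pairs of variations from Listing 4 as serial, disjoint or conjoint. It is not Variant's implementation: the VariationDef record and the modeOf method are hypothetical names, and the sketch assumes that a conjointVariationRefs entry applies in both directions and that two variations which share no state are serial.

```java
import java.util.Collections;
import java.util.Set;

class ConcurrencySketch {
    /** Hypothetical, simplified view of a variation: name, instrumented states, conjoint refs. */
    record VariationDef(String name, Set<String> states, Set<String> conjointRefs) {}

    enum Mode { SERIAL, DISJOINT, CONJOINT }

    /** Classifies the relationship between two variations. */
    static Mode modeOf(VariationDef a, VariationDef b) {
        // Variations that share no state never meet in one state request.
        if (Collections.disjoint(a.states(), b.states())) return Mode.SERIAL;
        // Assumption: a conjoint reference declared on either side is enough.
        boolean conjoint = a.conjointRefs().contains(b.name())
                || b.conjointRefs().contains(a.name());
        return conjoint ? Mode.CONJOINT : Mode.DISJOINT;
    }

    public static void main(String[] args) {
        VariationDef blue = new VariationDef("Blue", Set.of("S1", "S2"), Set.of());
        VariationDef red = new VariationDef("Red", Set.of("S2", "S3"), Set.of("Blue"));
        VariationDef green = new VariationDef("Green", Set.of("S3", "S4"), Set.of());
        System.out.println("Blue/Red: " + modeOf(blue, red));
        System.out.println("Red/Green: " + modeOf(red, green));
        System.out.println("Blue/Green: " + modeOf(blue, green));
    }
}
```

Under these assumptions, only a DISJOINT pair restricts targeting: a session already in a variant experience of one variation would be kept in the control experience of the other.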
4. Variant Runtime
4.1. The Lifecycle of a State Request
As already explained, the Code Variation Model treats interactive applications as finite state machines. Each user session traverses some state graph, whose nodes are the interface states: the points where the host application pauses for user input. In the real world, state nodes can be traditional HTML pages, Angular views, IVR menus, Android activities, etc.
Whenever the host application is about to return a particular interface state to the user session, it must determine whether this state exists in more than one variant (i.e. is instrumented by any code variations) and, if so, which of these variants to return. Both of these tasks are accomplished by the Session.targetForState(state) method. It returns the StateRequest object, which may be further examined for the list of live experiences in all variations instrumenting this state.
Thus, a Variant session can be thought of as a succession of consecutive state requests, united into a single user experience. At runtime, the session must be created before any state targeting can happen, so we consider it first in the next section, followed by a closer look at the state request.
4.1.1. Variant Session
In order to communicate with Variant server, the host application must connect to it and create a Variant session as follows:
// Arguments are environment-dependent
Session variantSession = variantConnection.getOrCreateSession(...);
The arguments to the getOrCreateSession() method are environment-dependent and are discussed in detail in the Java Client User Guide.
Variant sessions provide:
- A way to identify a user across multiple state requests;
- Storage for the session state that must be preserved between state requests;
- Metadata isolation context.
Variant server acts as the centralized session repository, accessible to any Variant client by the session ID. All clients sharing a session are guaranteed a consistent view of the session state. Sessions are expired after a configurable period of inactivity.
Variant hides any changes to variation schema from active sessions, which continue to see the variation metadata as it was at the time when the sessions were created. This isolation guarantee is critical in protecting user sessions from (potentially fatal) inconsistencies. For example, if a variation is taken offline, or one of its variant experiences is dropped, existing sessions, currently traversing this variation, would be thrown out of their experiences, if this change were visible.
Note that Variant sessions are completely separate from the host application’s own native sessions. Variant sessions are configured independently and do not require that the host application even have any native notion of a session.
4.1.2. State Request
Whenever the host application is about to serve a user session a particular interface state, potentially instrumented by one or more code variations, it consults Variant server for the targeting information by calling the Session.targetForState(state) method, which returns the StateRequest object.
Continuing with our Tricolor schema from Listing 4, this is how a Variant session gets targeted for the state S2:
// Obtain the state from the variation schema.
State s2 = variantSession.getSchema().getState("S2").orElseThrow(
    () -> new RuntimeException("State S2 is not in schema!"));
// Target the current session for the state.
StateRequest variantStateRequest = variantSession.targetForState(s2);
Much of the complexity that Variant server hides from the application developer is inside the targetForState(state) method. Indeed, for each variation instrumented on the given state, Variant server must perform the following steps:

Figure 7. Qualification and targeting of a session.
The StateRequest object has methods that the host application can call to find out to which experience in a particular variation the session has been targeted. For example, to find out to which experience in the Red variation the session has been targeted:
// Obtain the variation from the variation schema.
Variation redVar = variantSession.getSchema().getVariation("Red").orElseThrow(
    () -> new RuntimeException("Variation Red is not in schema!"));
// Find the experience in Red to which this session has been targeted.
Variation.Experience redVarExp = variantStateRequest.getLiveExperience(redVar).orElseThrow(
    () -> new RuntimeException("No live experience in variation Red!"));
if (redVarExp == redVar.getExperience("grey").get()) {
    // Do control experience "grey"
}
else if (redVarExp == redVar.getExperience("red1").get()) {
    // Do experience "red1"
}
else if (redVarExp == redVar.getExperience("red2").get()) {
    // Do experience "red2"
}
else {
    throw new RuntimeException("Don't know what to do for experience " + redVarExp);
}
4.2. Session Qualification
4.2.1. How Variant Qualifies Sessions
Qualification is a distinct idea from targeting. Suppose, for example, that a newspaper wants to test promotional rates, offered on its website. This promotion cannot be combined with another promotion, so the traffic coming from other promotional offers must be disqualified from the experiment.
Whenever Variant determines that the calling session’s qualification for a particular variation must be (re)established, it raises the VariationQualificationLifecycleEvent lifecycle event, which posts eligible lifecycle hooks. If none were defined or none returned a usable result, the default built-in qualification hook is posted, which unconditionally qualifies all sessions for all variations. For more information on lifecycle hooks, refer to Section 5.1.
If the session is qualified for a variation, Variant proceeds to the targeting step, discussed in the next section. If the session is disqualified, it is assigned to the control experience, but not targeted for it. The difference between being assigned to the control experience outside of a variation and being targeted to the control experience in the variation is that:
- The set returned by the StateRequest.getLiveExperiences() method doesn’t contain an entry for the disqualified variation;
- No trace events are triggered on behalf of disqualified variations, neither explicit nor implicit.
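The qualify-then-target sequence described above can be sketched in plain Java as follows. This is a simplified model, not Variant's code: the liveExperiences helper, its qualifier and targeter parameters, and the use of plain strings for variations and experiences are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class QualifyThenTargetSketch {
    /**
     * For each variation instrumented on the requested state, qualify first and
     * target only if qualified. Returns variation name -> live experience name.
     */
    static Map<String, String> liveExperiences(
            List<String> variationsOnState,
            Predicate<String> qualifier,
            Function<String, String> targeter) {
        Map<String, String> live = new LinkedHashMap<>();
        for (String variation : variationsOnState) {
            if (!qualifier.test(variation)) {
                // Disqualified: the session sees the control experience, but no
                // live experience is recorded for this variation.
                continue;
            }
            live.put(variation, targeter.apply(variation));
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, String> live = liveExperiences(
                List.of("Blue", "Red"),
                v -> !v.equals("Red"),   // pretend the session is disqualified from Red
                v -> v.toLowerCase());   // pretend targeting always picks the variant
        System.out.println(live);        // prints {Blue=blue}
    }
}
```

The point of the sketch is the asymmetry: a disqualified variation leaves no entry in the result, whereas a session targeted to control would still have one.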
4.2.2. Qualification Longevity
Once a session has been (dis)qualified for a variation, the natural question is how long this qualification decision should remain in effect before it is reëvaluated. Variant supports three longevity scopes: request, session and variation, which correspond to these three qualification guarantees:
Unstable qualification. Each session is re-qualified for each state request, so its qualification outcome may change midstream. While this may be desirable behavior in some cases, application developers must account for this possibility.
Stable qualification. Each session is qualified only once for each code variation it traverses, and this qualification outcome stays in effect until the session expires. This is the default qualification longevity.
Durable qualification. Once a recognized user is (dis)qualified for a particular variation, this qualification decision stays in effect until this variation is removed from the schema.
Qualification longevity is defined in the variation schema on a per-variation basis as follows:
{
  'meta':{
    'name':'Tricolor',
    'comment':'The revised tri-color schema with qualification longevity'
  },
  'states':[...],
  'variations':[
    {
      'name':'Blue',
      'qualification':'unstable', // Or 'stable' or 'durable'
      ...
    },
    ...
  ]
}
Listing 5. The Tricolor variation schema with different levels of qualification longevity.
Unstable and stable qualification are provided by Variant automatically: all you have to do is to define it in the schema. (If you don’t define any, Variant will default to stable qualification.) Durable qualification, however, requires that the host application identify each session with a unique key, such as a user ID, by which the user can be recognized between sessions:
// Get or create session.
Session ssn = connection.getOrCreateSession(...).withIdentity(userId);
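To make the three longevity scopes concrete, here is a minimal, self-contained Java sketch of how a qualification decision might be cached per scope. It is an assumption-laden model rather than Variant's implementation: the QualificationLongevitySketch class, its qualify method and the string keys are hypothetical, and the removal of a variation from the schema (which ends a durable decision) is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

class QualificationLongevitySketch {
    enum Longevity { UNSTABLE, STABLE, DURABLE }

    // Session-scoped decisions, keyed by session ID and variation name.
    private final Map<String, Boolean> bySession = new HashMap<>();
    // User-scoped decisions, keyed by user ID and variation name.
    private final Map<String, Boolean> byUser = new HashMap<>();

    /** Runs the qualification hook only when the scope requires a fresh decision. */
    boolean qualify(Longevity longevity, String sessionId, String userId,
                    String variation, BooleanSupplier hook) {
        switch (longevity) {
            case STABLE:
                return bySession.computeIfAbsent(
                        sessionId + "/" + variation, k -> hook.getAsBoolean());
            case DURABLE:
                return byUser.computeIfAbsent(
                        userId + "/" + variation, k -> hook.getAsBoolean());
            default:
                // UNSTABLE: re-qualify on every state request.
                return hook.getAsBoolean();
        }
    }

    public static void main(String[] args) {
        QualificationLongevitySketch q = new QualificationLongevitySketch();
        int[] calls = {0};
        BooleanSupplier hook = () -> { calls[0]++; return true; };
        q.qualify(Longevity.STABLE, "s1", "u1", "Blue", hook);
        q.qualify(Longevity.STABLE, "s1", "u1", "Blue", hook); // reuses the decision
        q.qualify(Longevity.STABLE, "s2", "u1", "Blue", hook); // new session: re-qualified
        System.out.println("Hook invocations: " + calls[0]);
    }
}
```

The durable branch shows why an identity is required: without a key that outlives the session, there is nothing to look the earlier decision up by.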
4.3. Session Targeting
4.3.1. How Variant Targets Sessions
After a session has been qualified for a code variation, Variant must target it to some experience in that variation. Even in the serial case, when the requested state is instrumented by only one variation, the targeting algorithm is complex. As will be discussed in the next section, a targeting decision is subject to the same longevity rules as qualification. Consequently, there may already be existing targeting information. The server will make its best attempt to honor it, though this may not always be possible.
If the requested state is phantom in any of the experiences, these experiences must be excluded from the set of possible targets. If, however, there is already a targeting decision to be honored and the requested state is phantom in that experience, the session’s attempt to target for this state is a user error. The complexity of the targeting algorithm grows dramatically for concurrent variations.
Whenever, inside the targetForState(state) method, Variant determines that the calling session must be targeted for a particular variation, it raises the VariationTargetingLifecycleEvent lifecycle event, which posts eligible lifecycle hooks. If none were defined or none returned a usable result, the default built-in targeting hook is posted, which targets randomly, according to the weights provided in the schema, e.g.:
{
  ...
  'variations':[
    {
      'name':'Blue',
      'experiences':[
        {
          'name':'grey',
          'isControl':true,
          'weight': 9 // Random weight
        },
        {
          'name':'blue',
          'weight': 1 // Random weight
        }
      ],
      'onStates':[{'stateRef':'S1'}, {'stateRef':'S2'}]
    },
    ...
  ]
}
For more information on lifecycle hooks, refer to Section 5.1.
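Weighted random targeting of the kind performed by the default hook can be sketched as below. This is one plausible way to do it, not necessarily how Variant does: the pick method, its phantom parameter and the externally supplied roll are illustrative assumptions. Passing the random number in, rather than generating it inside, keeps the sketch deterministic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class WeightedTargetingSketch {
    /**
     * Picks an experience with probability proportional to its weight, skipping
     * experiences in which the requested state is phantom.
     *
     * @param roll a uniform random number in [0, 1), e.g. from Random.nextDouble()
     */
    static String pick(Map<String, Integer> weights, Set<String> phantom, double roll) {
        int total = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            if (!phantom.contains(e.getKey())) total += e.getValue();
        }
        if (total <= 0) throw new IllegalStateException("No targetable experience");
        double threshold = roll * total;
        int cumulative = 0;
        String last = null;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            if (phantom.contains(e.getKey())) continue;
            cumulative += e.getValue();
            last = e.getKey();
            if (threshold < cumulative) return last;
        }
        return last; // Only reachable if roll >= 1; kept as a safety net.
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("grey", 9);
        weights.put("blue", 1);
        System.out.println(pick(weights, Set.of(), 0.5));       // grey
        System.out.println(pick(weights, Set.of(), 0.95));      // blue
        System.out.println(pick(weights, Set.of("grey"), 0.5)); // blue
    }
}
```

With weights of 9 and 1, roughly nine in ten sessions land in grey; excluding grey as phantom leaves blue as the only possible target.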
4.3.2. Targeting Longevity
The longevity of a targeting decision is subject to the same rules as that of qualification, already considered in Section 4.2.2. You can specify one of three longevity levels: unstable, stable, or durable, which correspond to the three longevity scopes: request, session and variation.
Unstable targeting means that the session is re-targeted for each state request. In other words, each time the host application calls Session.targetForState(), all pre-existing targeting information is discarded. While this behavior may be desirable in some cases, application developers must account for the possibility of a session being retargeted mid-stream.
Stable targeting means that a session is (re)targeted exactly once, when it first requests a state instrumented by a particular variation. This targeting decision stays in effect until the session expires. This is the default targeting longevity. It guarantees each user session a consistent experience, but a return user may see a different experience.
Finally, durable targeting means that once a user has been targeted for a particular experience, this decision persists between sessions: a recognized return user will see the same experience, so long as the variation is defined in the schema. Note that this guarantee is subject to certain conditions, as described in the next section.
Targeting longevity is defined in the variation schema on a per-variation basis as follows:
{
  'meta':{
    'name':'Tricolor',
    'comment':'The revised tri-color schema with targeting longevity'
  },
  'states':[...],
  'variations':[
    {
      'name':'Blue',
      'targeting':'unstable', // Or 'stable' or 'durable'
      ...
    },
    ...
  ]
}
Listing 6. The Tricolor variation schema with different levels of targeting longevity.
Unstable and stable targeting are provided by Variant automatically: all you have to do is to define it in the schema. (If you don’t define any, Variant will default to stable targeting.) Durable targeting, however, requires that the host application identify each session with a unique key, such as a user ID, by which the user can be recognized between sessions:
// Get or create session.
Session ssn = connection.getOrCreateSession(...).withIdentity(userId);
4.3.3. Metadata Modifications
Stable targeting is guaranteed by Variant unconditionally, because Variant sessions are isolated from any schema changes. Variant hides any changes to variation schema from active sessions, which continue to see the variation metadata as it was at the time when the sessions were created.
However, because the variation schema may have changed between a user’s two consecutive sessions, durable targeting cannot be guaranteed unconditionally. Consider the following scenario:
- Your schema contains two conjointly concurrent variations, both defined with durable targeting;
- Some user has traversed these variations and was randomly targeted to variant experiences in both;
- A bug was discovered in the hybrid experience and you’ve changed concurrency to disjoint;
- The same user visits again. The user’s targeting information is no longer consistent with the schema and must be revised.
When cases like this arise, Variant will discard the least recently used targeting decision.
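The following Java sketch illustrates one way such a reconciliation could work. It is a guess at the mechanics, not Variant's algorithm: the Decision record, the reconcile method and the conflict predicate are hypothetical, and the greedy strategy of keeping more recently used decisions first is an assumption for the case of more than two conflicting decisions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiPredicate;

class TargetingReconcileSketch {
    /** Hypothetical stored targeting decision with its last-use timestamp. */
    record Decision(String variation, String experience, long lastUsedMillis) {}

    /**
     * Keeps the most recently used decisions and discards any older decision
     * that conflicts with one already kept under the current schema.
     */
    static List<Decision> reconcile(List<Decision> decisions,
                                    BiPredicate<Decision, Decision> conflicts) {
        List<Decision> byRecency = new ArrayList<>(decisions);
        // Most recently used first, so older decisions are the ones discarded.
        byRecency.sort(Comparator.comparingLong(Decision::lastUsedMillis).reversed());
        List<Decision> kept = new ArrayList<>();
        for (Decision candidate : byRecency) {
            boolean clash = kept.stream().anyMatch(d -> conflicts.test(d, candidate));
            if (!clash) kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Decision> stored = List.of(
                new Decision("Blue", "blue", 100L),
                new Decision("Red", "red1", 200L));
        // After the schema change Blue and Red are disjoint: a session may not
        // be in a variant (non-control) experience in both.
        BiPredicate<Decision, Decision> disjoint = (a, b) ->
                !a.variation().equals(b.variation())
                        && !a.experience().equals("grey")
                        && !b.experience().equals("grey");
        System.out.println(reconcile(stored, disjoint));
    }
}
```

In this example the older Blue decision is dropped and the Red decision survives, so the returning user would be re-targeted in Blue subject to the new disjoint constraint.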
4.4. Schema Management
When Variant server starts, it looks for variation schema files in the schemata directory and attempts to deploy them sequentially. A schema file must contain exactly one uniquely named Variant schema. There is no requirement that the schema file name match that of the schema it contains, though it is recommended that you name each schema file similarly to the schema therein.
For each schema file in the schemata directory, Variant server takes these steps:
- Parse the schema file. Any messages emitted by the parser are written to the server log file.
- Deploy if no parse errors. If any parser errors were encountered, Variant server skips this schema file. Otherwise, provided no already deployed schema has the same name, Variant will deploy this schema.
To (re)deploy a variation schema on a running Variant server, simply place the (updated) schema file in the schemata directory. A running server detects the new (or updated) file and attempts to deploy the schema from it by following these steps:
- Parse the schema. Any messages emitted by the parser are written to the server log file.
- Deploy if no parse errors. If any parser errors were encountered, Variant server skips this schema file. Otherwise, Variant will attempt to deploy this schema, subject to the following conditions:
  - If no currently deployed schema has the same name as this schema, this schema is deployed.
  - If a currently deployed schema has the same name as this schema, their respective file names must also be the same. In that case, the currently deployed schema is undeployed and the new one is deployed in its place.
To undeploy a currently deployed schema, simply remove the corresponding schema file.
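The deployment rules above can be summarized as a small decision procedure. The Java sketch below is a model of those rules only, with hypothetical names (SchemaDeploySketch, onSchemaFile, Outcome). The text does not say what happens when a same-named schema arrives in a differently named file, so rejecting it is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class SchemaDeploySketch {
    enum Outcome { SKIPPED_PARSE_ERRORS, DEPLOYED, REPLACED, REJECTED_NAME_CONFLICT }

    // Deployed schema name -> name of the file it was deployed from.
    private final Map<String, String> fileBySchemaName = new HashMap<>();

    /** Called when a new or updated schema file is detected. */
    Outcome onSchemaFile(String fileName, String schemaName, boolean hasParseErrors) {
        if (hasParseErrors) return Outcome.SKIPPED_PARSE_ERRORS;
        String deployedFile = fileBySchemaName.get(schemaName);
        if (deployedFile == null) {
            fileBySchemaName.put(schemaName, fileName);
            return Outcome.DEPLOYED;
        }
        if (deployedFile.equals(fileName)) {
            // Same schema name from the same file: replace the old generation.
            return Outcome.REPLACED;
        }
        // Assumption: a same-named schema from a different file is refused.
        return Outcome.REJECTED_NAME_CONFLICT;
    }

    /** Called when a schema file is removed from the directory. */
    void onFileRemoved(String fileName) {
        fileBySchemaName.values().removeIf(fileName::equals);
    }

    public static void main(String[] args) {
        SchemaDeploySketch server = new SchemaDeploySketch();
        System.out.println(server.onSchemaFile("tricolor.schema", "Tricolor", false));
        System.out.println(server.onSchemaFile("tricolor.schema", "Tricolor", false));
        System.out.println(server.onSchemaFile("copy.schema", "Tricolor", false));
        server.onFileRemoved("tricolor.schema");
        System.out.println(server.onSchemaFile("copy.schema", "Tricolor", false));
    }
}
```

The file names used in main are made up for the example.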
Whenever a schema is undeployed, Variant server will hold on to its in-memory representation until all active sessions connected to it naturally expire. All new sessions are created against the currently deployed generation, if any. This session draining isolates active sessions from schema updates, which is instrumental to Variant’s ability to provide stable qualification and targeting. In practice this means that, for instance, you can shut off a feature flag without worrying about disrupting active users who are already in the experience.
4.5. Trace Event Logging
Variant trace events are generated by user traffic, as it flows through Variant variations, with the purpose of subsequent analysis by a downstream process. Trace events can be triggered implicitly, by Variant, or explicitly by the host application. In either case, the host application can attach attributes to these events, to aid in the downstream analysis.
The only implicit trace event is the state visited event. It is created at the start of the state request (see Figure 3) and triggered when the StateRequest is committed or failed. This gives the host application a chance to attach custom attributes to the event. For example, if the host application caught an exception, it may wish to set the status of the event to error, and add the name of the class that threw the exception. This information can be used downstream to exclude this session from the statistical analysis (if this is an experiment), or to shut off the variation (if this is a feature flag).
Explicit trace events are triggered by calling the Session.triggerTraceEvent() method.
Trace events are egested onto external storage via Trace Event Flushers which are part of the Extension API, discussed next.
5. Extending Variant Server
Variant CVM Server’s default behavior can be extended via the server-side Extension API. It supports creation and configuration of user code which runs in the server’s address space, augmenting the server’s default behavior with custom semantics. ExtAPI exposes two principal extension mechanisms: lifecycle hooks and trace event flushers. They are configured in the variation schema and made available to the server’s JVM at run time via the /ext directory.
Refer to the Variant CVM Server Reference for further details on configuring ExtAPI.
5.1. Lifecycle Hooks
The ScheduleVisitTest from Listing 2 above defined a lifecycle hook class UserQualifyingHook, which disqualifies blacklisted users from the experiment. Here’s the relevant section from Listing 2:
...
'hooks': [
  {
    // Disqualify blacklisted users.
    'class':'com.variant.extapi.std.demo.UserQualifyingHook',
    'init': {'blackList':['Nikita Krushchev']}
  }
]
...
Lifecycle event hooks are callback methods, executed by Variant server when corresponding lifecycle events are raised. For example, when a user session must be qualified or targeted for a particular variation, two corresponding lifecycle events are raised: the session qualification event and the session targeting event. If you have defined custom hooks for these events, Variant will post them by calling their post() method.
Lifecycle hooks provide a way to extend Variant server’s default behavior with application-specific semantics. They are executed in the server process’s address space and are highly reusable modules encapsulating common semantics and having their own lifecycle, independent of that of the host application.
Depending on where a hook is defined in the schema, it may have the global (or meta) scope, a state scope or a variation scope. Global hooks are defined in the meta section and apply to all states and all variations in this schema. A state-scoped hook only applies to the state with which it is defined, and a variation-scoped hook applies only to the variation with which it is defined.
In any scope, any number of hooks can be defined. If more than one lifecycle hook is eligible to be posted by a lifecycle event at runtime, they form a hook chain. More locally defined hooks are posted before the global ones on the chain, and within a scope hooks are posted in ordinal order. The hooks are posted serially, until a hook’s post() method returns a non-empty Optional. If no custom hooks have been defined for a lifecycle event, or all returned an empty Optional, the default built-in hook for the event is posted, which is guaranteed to return a usable value.
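The hook chain semantics just described can be modeled in a few lines of Java. The sketch below does not use the Variant ExtAPI: hooks are represented as plain java.util.function.Function instances, and the post helper is a hypothetical name. The caller is assumed to have ordered the chain already, with more local hooks first.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class HookChainSketch {
    /**
     * Posts hooks in chain order until one returns a non-empty Optional;
     * otherwise falls back to the default hook, which always returns a value.
     */
    static <E, R> R post(E event,
                         List<Function<E, Optional<R>>> chain,
                         Function<E, R> defaultHook) {
        for (Function<E, Optional<R>> hook : chain) {
            Optional<R> result = hook.apply(event);
            if (result.isPresent()) return result.get();
        }
        return defaultHook.apply(event);
    }

    public static void main(String[] args) {
        // A variation-scoped hook that disqualifies one blocked user and
        // abstains (empty Optional) for everyone else.
        Function<String, Optional<Boolean>> blockList = user ->
                user.equals("blocked-user") ? Optional.of(Boolean.FALSE) : Optional.empty();
        // The default qualification hook qualifies everybody.
        Function<String, Boolean> qualifyAll = user -> Boolean.TRUE;

        System.out.println(post("blocked-user", List.of(blockList), qualifyAll)); // false
        System.out.println(post("someone-else", List.of(blockList), qualifyAll)); // true
    }
}
```

Returning an empty Optional is how a hook declines to decide, which lets several narrowly focused hooks coexist on one chain.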
For more information, refer to the Variant Server Reference.
5.2. Trace Event Flushers
Event flushers handle the terminal ingestion of Variant trace events. A typical event flusher writes them to a persistent storage mechanism, such as an external database or event stream. Whenever a trace event is triggered, whether implicitly by Variant server or explicitly by user code, it is picked up by Variant’s asynchronous event writer, where it is held in a memory buffer until a dedicated flusher thread becomes available. There is one event writer per Variant server, shared by all schemata. The event writer groups trace events by the schema that produced them and turns them over to the appropriate event flusher by calling its flush() method.
The size of the trace event buffer passed to the flush() method is configured by the variant.event.writer.flush.size server config property, whose value refers to the number of trace events held in a single flush buffer. The overall size of the event writer cache is configured by the variant.event.writer.flush.buffers server config property, whose value refers to the total number of flush buffers available to the event writer. The larger the number of flush buffers, the better the event writer is able to cope with bursts of trace events, but at the price of additional memory footprint.
Whenever the event writer is not keeping up with the event load, it will discard new events (with an error message to the server log) until a flush buffer becomes available.
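The interplay between the flush size, the number of flush buffers and event discarding can be illustrated with the following Java sketch. It is a deliberately simplified model with hypothetical names (EventWriterSketch, write, takeFullBuffer), not Variant's event writer: trace events are plain strings, and the flusher thread is replaced by explicit calls to takeFullBuffer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class EventWriterSketch {
    private final int flushSize;   // events per flush buffer
    private final int maxBuffers;  // total number of flush buffers
    private final Deque<List<String>> full = new ArrayDeque<>();
    private List<String> current = new ArrayList<>();
    private int discarded = 0;

    EventWriterSketch(int flushSize, int maxBuffers) {
        this.flushSize = flushSize;
        this.maxBuffers = maxBuffers;
    }

    /** Buffers the event; returns false if it had to be discarded. */
    synchronized boolean write(String event) {
        if (current == null) {          // every buffer is full and awaiting a flusher
            discarded++;
            return false;
        }
        current.add(event);
        if (current.size() == flushSize) {
            full.addLast(current);
            current = full.size() < maxBuffers ? new ArrayList<>() : null;
        }
        return true;
    }

    /** Called by a flusher: hands over one full buffer, freeing it for reuse. */
    synchronized List<String> takeFullBuffer() {
        List<String> buffer = full.pollFirst();
        if (buffer != null && current == null) current = new ArrayList<>();
        return buffer;
    }

    synchronized int discarded() { return discarded; }

    public static void main(String[] args) {
        EventWriterSketch writer = new EventWriterSketch(2, 2);
        for (int i = 1; i <= 5; i++) {
            System.out.println("e" + i + " accepted: " + writer.write("e" + i));
        }
        System.out.println("flushed: " + writer.takeFullBuffer());
        System.out.println("e6 accepted: " + writer.write("e6"));
    }
}
```

With two buffers of two events each, the fifth event arrives while both buffers are full and is dropped; once a flusher takes a buffer, writing resumes.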
A few ready-made event flushers, intended for saving trace events in popular databases such as PostgreSQL and MySQL, are included in the standard extension library that ships with Variant server. These can be configured and used out of the box.
It is also straightforward to create a custom event flusher by implementing the TraceEventFlusher interface. See Variant Server Reference Guide for more information.
Appendix A Analyzing Variant Controlled Experiments
A.1. Trace Event Data Aggregation
Each Variant experiment is designed with particular target metric(s) in mind. But regardless of the target metric(s), the starting data point is always a time series of trace events, such as the page visited event, which must be aggregated into a time series of measurements, such as revenue as a function of the number of users through the experiment. The details of this aggregation step depend entirely on the persistence mechanism you’ve chosen for your trace events. If your flusher inserts them into a relational database, you will likely use SQL. A distributed data processing framework, like Apache Hadoop, can also be successfully deployed for persistence and aggregation of Variant trace events.
A.2. Statistical Analysis
The goal of an experiment is to:
- Discover if there is a difference between control and variant experience(s) with respect to the target metric of interest;
- Assess how certain we can be that this difference is not just random noise.
The latter can be accomplished with some well-known mathematical formulas developed in the field of statistical hypothesis testing. The fundamental idea there is to develop a procedure that will enable the researcher to make a claim about the entire population with a given degree of certainty, based on a set of sample observations. Refer to the Statistical Analysis of Variant Experiments white paper for more information.
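As a small worked illustration, the sketch below applies one common technique from hypothesis testing, the pooled two-proportion z-test, to a conversion-rate metric. The choice of this test, the class and method names and the sample numbers are assumptions made for the example; the white paper referenced above may use different methods.

```java
class TwoProportionZTestSketch {
    /**
     * Pooled two-proportion z statistic for conversions x1 out of n1 (control)
     * versus x2 out of n2 (treatment).
     */
    static double zStatistic(long x1, long n1, long x2, long n2) {
        double p1 = (double) x1 / n1;
        double p2 = (double) x2 / n2;
        double pooled = (double) (x1 + x2) / (n1 + n2);
        double stdErr = Math.sqrt(pooled * (1 - pooled) * (1.0 / n1 + 1.0 / n2));
        return (p2 - p1) / stdErr;
    }

    /** Two-sided test at the 5% level: |z| must exceed the critical value 1.96. */
    static boolean significantAt5Percent(double z) {
        return Math.abs(z) > 1.96;
    }

    public static void main(String[] args) {
        // Made-up numbers: 200 of 2000 control sessions converted (10%),
        // 260 of 2000 treatment sessions converted (13%).
        double z = zStatistic(200, 2000, 260, 2000);
        System.out.printf("z = %.3f, significant at 5%%: %b%n", z, significantAt5Percent(z));
    }
}
```

For these sample counts the statistic comes out just under 3, which exceeds 1.96, so under the test's assumptions the observed lift would be unlikely to be due to chance alone.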