Introduction

traffic-splitter is a component that directs HTTP traffic to the appropriate upstream, depending on which criteria the request matches.

How does it really work? It’s not magic…

[traffic-splitter architecture diagram]

Getting started

There are two ways to use this package, through the CLI or through the API, and both need to be provided with a configuration.

Write your own configuration or start from the sample in the “Ready to go configuration” section at the end of this page.

CLI

// in case config is a js file wrap it in:
module.exports = {}

The provided configuration can be a JSON or a JS file.

API

const TrafficSplitter = require('traffic-splitter')
const splitter = new TrafficSplitter(/*your configuration*/)
splitter.start()

Import the package.

Create an instance providing a configuration object.

Start it!

And BOOM, splitter is running!

Example

[
  {
    "name": "mindera.com 0-50",
    "enabled": true,
    "criteria": {
      "in": {
        "bucket": [{ "min": 0,  "max": 50 }]
      },
      "out": {
         "cookie": [{ "name": "keepme", "value": "out" }]
      }
    },
    "upstream": {
      "type": "serveSecure",
      "options": { "host": "www.mindera.com", "port": 443, "headers": { "host": "www.mindera.com" } }
    }
  },
  {
    "name": "npmjs.com 51-100",
    "enabled": true,
    "criteria": {
      "in": {
        "bucket": [{ "min": 51,  "max": 100 }]
      },
      "out": {}
    },
    "upstream": {
      "type": "serveSecure",
      "options": { "host": "www.npmjs.com", "port": 443, "headers": { "host": "www.npmjs.com" } }
    }
  }
]

This upstreams example splits the traffic between mindera.com and npmjs.com, with each one handling roughly 50% of the traffic.

The “in” object inside the criteria contains all the rules (don’t worry, you’ll learn all about them in a few moments) that must be satisfied for that upstream to be selected. The “out” object works the same way, except that a match eliminates that upstream from being chosen.

For instance, if the user has a cookie named “keepme” with the value “out”, they will never be served by the mindera upstream. Why? Because the mindera upstream has a matching cookie rule inside its “out” object.

Also, upstreams can be enabled or disabled by changing the “enabled” property.
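
The selection logic described above can be sketched roughly as follows. This is only an illustration and not the splitter's actual implementation: the helper names (`rules`, `allMatch`, `isExcluded`, `selectUpstream`) and the shape of the `request` object are made up for the example, only simplified `bucket` and `cookie` rules are handled, and it assumes that an empty “out” object never excludes an upstream.

```javascript
// Simplified rule evaluators: each rule is an array and one matching element is enough.
const rules = {
  bucket: (ranges, request) =>
    ranges.some(({ min, max }) => request.bucket >= min && request.bucket <= max),
  cookie: (cookies, request) =>
    cookies.some(({ name, value }) => request.cookies[name] === value)
}

// True when every rule in the section matches (an empty section matches trivially).
const allMatch = (section = {}, request) =>
  Object.keys(section).every(name => rules[name](section[name], request))

// Assumption: an "out" section only excludes when it has rules and all of them match.
const isExcluded = (section = {}, request) =>
  Object.keys(section).length > 0 && allMatch(section, request)

// Returns the first enabled upstream whose "in" rules match and whose "out" rules do not.
const selectUpstream = (upstreams, request) =>
  upstreams.find(({ enabled, criteria }) =>
    enabled && allMatch(criteria.in, request) && !isExcluded(criteria.out, request))

const upstreams = [
  {
    name: 'mindera.com 0-50',
    enabled: true,
    criteria: {
      in: { bucket: [{ min: 0, max: 50 }] },
      out: { cookie: [{ name: 'keepme', value: 'out' }] }
    }
  },
  {
    name: 'npmjs.com 51-100',
    enabled: true,
    criteria: { in: { bucket: [{ min: 51, max: 100 }] }, out: {} }
  }
]

const pick = request => (selectUpstream(upstreams, request) || {}).name

console.log(pick({ bucket: 10, cookies: {} }))                // mindera.com 0-50
console.log(pick({ bucket: 10, cookies: { keepme: 'out' } })) // undefined
console.log(pick({ bucket: 80, cookies: {} }))                // npmjs.com 51-100
```

Note how the “keepme” cookie removes the mindera upstream from the candidates, and since the request's bucket does not fit the other upstream, nothing is selected.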

“Wow.. this is the real thing! What other type of upstreams are available?”

Just scroll down a bit and you’ll find out :p

Upstream types

Serve

{
  "type": "serve",
  "options": {
    "host": "www.sapo.pt",
    "port": 80,
    "timeout": 6000,
    "headers": {
      "host": "www.sapo.pt"
    },
    "rewrite": {
      "expression": "(.*)",
      "to": "/noticias/"
    },
    "cookies": {
      "flavour": { "maxAge": 60000, "domain": "localhost", "path": "/", "value": "chocolate" }
    }
  }
}

It proxies the traffic to a non-certified host (plain HTTP).

Host and port must be defined.

Optional configuration

timeout - sets the proxy timeout

headers - adds headers to be sent upstream, overriding existing ones with the same name, meaning that these headers will be sent to the server the request is being proxied to

rewrite - rewrites the path of the request using a regular expression “expression” and a replacement string “to”. In the given example, whatever path the user requests, the upstream always receives a request for the “/noticias/” path

cookies - adds cookies to be sent both upstream and downstream, overriding existing ones with the same name, meaning that these cookies will be sent to the server the request is being proxied to and will also be included in the response (potentially overriding a cookie set by the upstream server)
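
To get a feel for the rewrite option, here is a minimal sketch. The `rewritePath` helper is hypothetical, and it assumes the rewrite behaves like JavaScript's `String.prototype.replace` with a regular expression, which may differ from what the splitter does internally.

```javascript
// The "rewrite" option from the example above.
const rewrite = { expression: '(.*)', to: '/noticias/' }

// Hypothetical helper: applies a rewrite option to a request path.
const rewritePath = (path, { expression, to }) =>
  path.replace(new RegExp(expression), to)

// Whatever path is requested, the rewritten path is always "/noticias/".
console.log(rewritePath('/desporto/futebol', rewrite)) // /noticias/
console.log(rewritePath('/', rewrite))                 // /noticias/

// Under the same assumption, capture groups could be reused in the replacement.
console.log(rewritePath('/old/page', { expression: '^/old/(.*)', to: '/new/$1' })) // /new/page
```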

Serve secure

{
  "type": "serveSecure",
  "options": {
    "host": "www.mindera.com",
    "port": 443
  }
}

It proxies the traffic to a certified host (HTTPS).

Host and port must also be defined.

The optional configuration is the same as for the serve type.

Redirect

{
  "type": "redirect",
  "options": {
    "location": "https://mindera.com/blog",
    "statusCode": 302
  }
}

It responds to the original request with a redirect response containing the given location and status code.

In this case, when a request to localhost is made, the user will be redirected to “https://mindera.com/blog”.

Serve file

{
  "type": "serveFile",
  "options": {
    "path": "etc",
    "file": "serve_me.jpg",
    "download": false,
    "encoding": ""
  }
}

It serves a file. That easy.

Options

path - path of file

file - file name and extension

download - flag indicating whether the file will be shown or downloaded - default is false

encoding - indicates the encoding used to read the file when download is set to true (for example, set “utf8” when the file is a txt) - default is empty

Only path and file properties are required.

Criteria

As shown in the example, criteria has two properties: “in” and “out”.

Those properties are objects containing rules that will be evaluated against the incoming request.

{
  "path": ["/home", "/about", "/blog"]
}

There are quite a few default rules at your disposal, listed below. Most of these rules are arrays that may contain n elements, and only one element needs to match for the condition to be true. Take the given example: a request has only one path, so for the condition to be valid that path needs to match one of the given paths.

Oh, and you can also add your custom rules in a very easy and smooth way! (wait… what?) Yeah, it’s true!

agent

"agent": ["Chrome", "Safari", "Opera"]

Evaluates against the user-agent header.

If no such header is provided, the condition will never be met.

bucket

"bucket": [
  {
    "min": 50,
    "max": 100
  }
]

Bucket selection is one of the most basic forms of traffic splitting. It allows the splitter to send percentages of traffic to different upstreams.

The bucketing system works by issuing a browser identifier cookie (name and lifespan configurable) containing a random sequence of characters (12 by default). The cookie is issued only once, and the splitter then uses it to calculate the bucket (from 0 to 100). This means that once a user is assigned a bucket, they will always be “inside” that bucket (unless cookies are deleted).
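
The idea of a stable bucket derived from a browser identifier can be sketched as below. Both helpers are hypothetical: the splitter's real identifier format and bucket calculation are not documented here, so this sketch simply hashes the identifier with MD5 and maps it onto the 0 to 100 range.

```javascript
const crypto = require('crypto')

// Hypothetical: generates a random browser identifier with the given length.
const createBrowserId = (length = 12) =>
  crypto.randomBytes(length).toString('hex').slice(0, length)

// Hypothetical: maps a browser identifier to a bucket between 0 and 100.
// The same identifier always yields the same bucket.
const bucketFor = browserId => {
  const digest = crypto.createHash('md5').update(browserId).digest()
  return digest.readUInt32BE(0) % 101
}

const bid = createBrowserId()
console.log(bid.length)                        // 12
console.log(bucketFor(bid) === bucketFor(bid)) // true
```

Because the bucket depends only on the cookie value, a returning user keeps landing on the same upstream for as long as the cookie survives.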

cookie

"cookie": [
  {
    "name": "amIallowed",
    "value": "yes"
  }
]

Matches against the request cookies.

Remember that only one cookie needs to be matched for the condition to be valid!

device

Device detection is achieved using the mobile-detect package, which is based on the user-agent header.

"device": [
  {
    "device": "tablet",
    "type": "ipad"
  },
  {
    "device": "desktop",
    "browser": "Firefox",
    "version": {
      "from": 35,
      "to": 37
    }
  }
]

Options

device - “desktop”, “phone”, “tablet”, “mobile”. “mobile” will match both tablets and phones.

type - only relevant when device is set to either phone, tablet or mobile. It provides the type of device.

browser - any browser

version - version of browser with the ability to limit the version by specifying a “from” and “to”. Either from or to can be omitted to leave the lower/upper boundary open.

Check the package documentation for the list of device types and browsers.
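
The open-ended version boundaries can be pictured with a small sketch. The `versionInRange` helper is hypothetical, and it assumes that both boundaries are inclusive, which this page does not state.

```javascript
// Hypothetical helper: checks a browser version against an optional "from"/"to" range.
// A missing boundary is treated as open.
const versionInRange = (version, { from = -Infinity, to = Infinity } = {}) =>
  version >= from && version <= to

console.log(versionInRange(36, { from: 35, to: 37 })) // true
console.log(versionInRange(40, { from: 35, to: 37 })) // false
console.log(versionInRange(40, { from: 35 }))         // true
console.log(versionInRange(10, { to: 37 }))           // true
```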

geoip

"geoip": [
  {
    "country": "US",
    "region": "TX",
    "city": "San Antonio"
  },
  {
    "country": "PT",
    "region": "",
    "city": ""
  }
]

The geoip-lite package allows traffic-splitter to evaluate the location from which the request was made. This package uses the GeoLite database from MaxMind.

You can leave any property empty to get a broader area.

By default, this uses the IP address from the incoming connection but this can be overridden by passing the ?splitterIP= parameter on the query string.
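
Here is a rough sketch of how empty properties broaden the matched area. The `matchesLocation` and `matchesGeoip` helpers and the shape of the `location` object are assumptions for illustration, not the splitter's internals.

```javascript
// Hypothetical helper: an empty or missing property in the rule matches any value.
const matchesLocation = (rule, location) =>
  ['country', 'region', 'city'].every(key => !rule[key] || rule[key] === location[key])

const geoip = [
  { country: 'US', region: 'TX', city: 'San Antonio' },
  { country: 'PT', region: '', city: '' }
]

// Like the other array rules, one matching element is enough.
const matchesGeoip = location => geoip.some(rule => matchesLocation(rule, location))

console.log(matchesGeoip({ country: 'PT', region: '17', city: 'Porto' }))       // true
console.log(matchesGeoip({ country: 'US', region: 'TX', city: 'San Antonio' })) // true
console.log(matchesGeoip({ country: 'US', region: 'TX', city: 'Austin' }))      // false
```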

host

"host": [
  "www.mindera.com",
  "mindera.com"
]

Evaluates against the host header.

If no such header is provided, the condition will never be met.

path

"path": ["/home", "/about", "/blog"]

Validates against the request path.

For better use of this rule check pathRegExp.

visitor

"visitor": true

A user is considered a visitor when the request has no cookies.

and

"and": [
  // insert n rules in here
]

This rule consists of an array of rules and it will only be true if all rules inside it are also true.

Note that this particular rule is slightly redundant as the splitter already matches rules within in and out sections using the AND approach (all rules must match).

or

"or": [
  // insert n rules in here
]

By now we bet you already know how this rule works, right? :)

In case you don’t, this rule also consists of an array of rules and it will be true if at least one of the rules inside it is true.
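
The “and” and “or” rules can be pictured as a small recursive evaluation. This sketch makes a few assumptions that this page does not confirm: each element inside an “and”/“or” array is an object keyed by rule name, and the `evaluators` and `evaluate` helpers are hypothetical stand-ins for the splitter's own rule evaluation.

```javascript
// Hypothetical leaf evaluators: each takes the rule value and the request.
const evaluators = {
  path: (paths, request) => paths.includes(request.path),
  visitor: (expected, request) => (Object.keys(request.cookies).length === 0) === expected
}

// Evaluates a { ruleName: value } object, recursing into "and" / "or".
const evaluate = (rule, request) =>
  Object.entries(rule).every(([name, value]) => {
    if (name === 'and') return value.every(inner => evaluate(inner, request))
    if (name === 'or') return value.some(inner => evaluate(inner, request))
    return evaluators[name](value, request)
  })

const criteria = {
  or: [
    { path: ['/blog'] },
    { and: [{ path: ['/home'] }, { visitor: true }] }
  ]
}

console.log(evaluate(criteria, { path: '/blog', cookies: { bid: 'x' } })) // true
console.log(evaluate(criteria, { path: '/home', cookies: {} }))           // true
console.log(evaluate(criteria, { path: '/home', cookies: { bid: 'x' } })) // false
```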

ruleset

"ruleset": ["myRuleset1", "myRuleset2", "myRuleset3"]

Check the rulesets configuration to improve processing time and to avoid repeating criteria.

(this is pretty awesome btw!)

Configuration

// load from json file:
const config = require('./conf/config.json')

// or declare it normally
const config = {
  // ...
}

Configuration is an object with several properties (listed below) that will define the splitter behaviour.

Before the server starts, the splitter optimizes the configuration by changing some properties and values so that runtime checks are faster, mainly when evaluating rules.

You can see what your optimized configuration looks like with the getConfiguration method.

For a minimal configuration example, see the “Ready to go configuration” section at the end of this page.

api

"api": {
  "serverName": "Traffic Splitter",
  "port": 80,
  "maxConnections": 1024,
  "upstreamKeepAlive": {
    "maxSockets": 1024,
    "maxFreeSockets": 64,
    "timeout": 60000,
    "keepAliveTimeout": 30000
  },
  "emitMetricsInterval": {
    "http": 10000,
    "https": 10000
  },
  "performance": {
    "logSlowRequests": true,
    "slowRequestThreshold": 1000
  }
}

This property is related to the restify instance and agentkeepalive configuration, as well as other values for manipulation of logs and events.

Properties

serverName - restify instance name

port - port for the server: localhost:port

maxConnections - server maximum connections. Default is 256

upstreamKeepAlive - configuration for the agentkeepalive instance

emitMetricsInterval - interval in milliseconds to emit the httpSocketMetrics and httpsSocketMetrics events. Default is 5000

performance - “logSlowRequests” is a flag to log (or not) slow requests and “slowRequestThreshold” is the threshold, in milliseconds, at which a request is considered slow. Example: given the configuration above, a request that takes 900 milliseconds is not a slow request, but one that takes 1000 milliseconds or more is

This is a required property.
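
The slow request check boils down to a comparison like the one below. The `shouldLogSlowRequest` helper is hypothetical and only mirrors the behaviour described for the “performance” settings.

```javascript
// The "performance" settings from the example above.
const settings = { logSlowRequests: true, slowRequestThreshold: 1000 }

// Hypothetical helper: a request is logged as slow when logging is enabled
// and its duration reaches the threshold.
const shouldLogSlowRequest = (durationMs, { logSlowRequests, slowRequestThreshold }) =>
  logSlowRequests && durationMs >= slowRequestThreshold

console.log(shouldLogSlowRequest(900, settings))  // false
console.log(shouldLogSlowRequest(1000, settings)) // true
```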

bunyan

"bunyan": {
  "name": "traffic-splitter",
  "streams": []
}

Splitter uses the bunyan library for logging.

Provide a name for it and 0 or more streams.

If no streams are provided, the splitter adds the stream: { level: 'info', stream: process.stdout }

This is a required property.

browserId

"browserId": {
  "cookie": "bid",
  "maxAge": 315360000,
  "length": 12
}

This property defines the configuration for the browserId cookie emitted by the splitter for bucketing purposes.

This is a required property.

domains

"domains": [
  "www.trafficsplitter.io",
  "www.mindera.com"
]

Provide the list of domains for which you want the splitter to add the CORS headers.

This is an optional property.

cors

"CORS": {
  "headers": [
    "ORIGIN",
    "X_REQUESTED_WITH",
    "X-Requested-with",
    "Content-Type",
    "Accept"
  ],
  "credentials": "true",
  "methods": [
    "GET",
    "POST",
    "DELETE",
    "OPTIONS",
    "PUT"
  ],
  "max-age": "600"
}

Configuration for the CORS headers in case the host belongs to the domains list.

This is an optional property.

pathRegExp

// please remember that this is a global property of the configuration.
// adding it to an upstream has no effect, it will simply be ignored.
// if in doubt, check the configuration sample provided at the end of this page :)

// recommended usage
"pathRegExp": {
  "prefix": "^",
  "sufix": "([/?].*)?$"
}

This property is a plus for the path criteria. The example above is the recommended definition for this property.

The path criteria is turned into a regex when the splitter optimizes the configuration, and this object allows you to set a prefix and a suffix (the “sufix” property) for that regex, applied to every path. Let’s see why this is useful.

Let’s take an example and assume that the path criteria is [“/banana”].

const expression1 = new RegExp("/banana", "i")
const condition1 = request.url.match(expression1)

const expression2 = new RegExp("^" + "/banana" + "([/?].*)?$", "i")
const condition2 = request.url.match(expression2)

Also assume the variables defined above.

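request.url	condition1 (without pathRegExp)	condition2 (with pathRegExp)
/apple	no match	no match
/banana	match	match
/bananas	match	no match
/bananas/1	match	no match
/bananas?id=1	match	no match
/banana/	match	match
/banana/1	match	match
/banana?id=1	match	match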

We believe that the correct behaviour is achieved with the condition2 (with pathRegExp), but do what best suits your needs, really :)
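
The comparison can be reproduced with a short standalone script. It builds the two expressions exactly as in the snippet above and tests each URL against both; the `urls` list and the `results` variable are just scaffolding for the example.

```javascript
const pathRegExp = { prefix: '^', sufix: '([/?].*)?$' }
const path = '/banana'

// condition1: the path on its own, condition2: the path wrapped in prefix and sufix.
const expression1 = new RegExp(path, 'i')
const expression2 = new RegExp(pathRegExp.prefix + path + pathRegExp.sufix, 'i')

const urls = [
  '/apple', '/banana', '/bananas', '/bananas/1',
  '/bananas?id=1', '/banana/', '/banana/1', '/banana?id=1'
]

const results = urls.map(url => ({
  url,
  condition1: expression1.test(url),
  condition2: expression2.test(url)
}))

// Prints one row per URL with both outcomes side by side.
console.table(results)
```

Without the prefix and sufix, “/bananas” and its sub-paths also match “/banana”, which is usually not what you want.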

This is an optional property.

rulesets

"rulesets": {
  "myRuleset1": {
    "cookie": [{ "name": "step", "value": "in" }],
    "path": ["/blog", "/people-and-culture", "/case-studie"],
    "agent": ["Chrome", "Safari"]
  }
}

// inside some upstream
"in": {
  "ruleset": ["myRuleset1"],
  "bucket": [{ "min": 0, "max": 50 }]
}

// inside some other upstream
"in": {
  "ruleset": ["myRuleset1"],
  "bucket": [{ "min": 51, "max": 100 }]
}

This is one hell of a property, our favorite! Let us explain why:

Reason 1: With rulesets you can reduce the time it takes to calculate the appropriate upstream. Awesome, right?!

And how exactly do rulesets reduce the processing time, you ask? Well, each ruleset is evaluated only once per request. If you don’t define rulesets and repeat the same criteria across many upstreams, those criteria will be evaluated n times per request, unnecessarily.

Reason 2: It’s also very useful to avoid repeating criteria across the upstreams.

Picture yourself having two upstreams with 10 criteria each (for the ‘in’ part), and of those 10 only one is different. Quite boring and repetitive, right? Well, just create a ruleset, with whatever name you want, containing the 9 identical criteria. Then, back in each upstream, delete the repeated criteria and add the ruleset.

Reason 3: By now you should be able to guess this one… simple… it makes things simpler! Easier to manage and much better looking! Just check the example above.

Notes:

You can have multiple rulesets for each upstream, and only one needs to match for the ‘in’ or ‘out’ condition to be met.

This is an optional property.
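
The “evaluated only once per request” idea can be sketched as a small per-request cache. Everything here is hypothetical (`createRulesetEvaluator`, `evaluateCriteria` and the simplified path-only ruleset) and only illustrates the caching idea, not the splitter's real code.

```javascript
// Hypothetical: builds a per-request evaluator that remembers ruleset results.
const createRulesetEvaluator = (rulesets, evaluateCriteria, request) => {
  const cache = new Map()
  return name => {
    if (!cache.has(name)) cache.set(name, evaluateCriteria(rulesets[name], request))
    return cache.get(name)
  }
}

// Counts how many times the criteria evaluation actually runs.
let evaluations = 0
const evaluateCriteria = (criteria, request) => {
  evaluations++
  return criteria.path.includes(request.path)
}

const rulesets = { myRuleset1: { path: ['/blog', '/people-and-culture'] } }
const evaluateRuleset = createRulesetEvaluator(rulesets, evaluateCriteria, { path: '/blog' })

// Two upstreams referencing the same ruleset trigger a single evaluation.
console.log(evaluateRuleset('myRuleset1')) // true
console.log(evaluateRuleset('myRuleset1')) // true
console.log(evaluations)                   // 1
```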

upstreams

"upstreams": [
  {
    "name": "mindera <3",
    "enabled": true,
    "criteria": {
      "in": {
        // your criteria to allow access in here
      },
      "out": {
        // your criteria to deny access in here
      }
    },
    "upstream": {
      "type": "serveSecure",
      "options": {
        "host": "www.mindera.com",
        "port": 443,
        "headers": {
          "host": "www.mindera.com"
        }
      }
    }
  }
]

This is the most interesting part :D

It’s this property that contains the array with all the upstreams that you want your splitter to use, just like in the example.

This is a required property.

upstreamsReferences

"upstreamsReferences": {
  "mindera": {
    "type": "serveSecure",
    "options": {
      "host": "www.mindera.com",
      "port": 443,
      "headers": {
        "host": "www.mindera.com"
      }
    }
  }
}

// now, you can do this:
"upstreams": [
  {
    "name": "mindera <3",
    "enabled": true,
    "criteria": { /* ... */ },
    "upstream": "mindera"
  }
]

This property increases the config legibility by simplifying your upstreams.

This is an optional property.
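
Conceptually, a string in the “upstream” property is swapped for the definition stored under that name. The `resolveReferences` helper below is a hypothetical sketch of that lookup, not the splitter's actual optimization step.

```javascript
// Hypothetical: replaces string references with the referenced upstream definition.
const resolveReferences = ({ upstreams, upstreamsReferences = {} }) =>
  upstreams.map(entry =>
    typeof entry.upstream === 'string'
      ? { ...entry, upstream: upstreamsReferences[entry.upstream] }
      : entry)

const config = {
  upstreamsReferences: {
    mindera: {
      type: 'serveSecure',
      options: { host: 'www.mindera.com', port: 443, headers: { host: 'www.mindera.com' } }
    }
  },
  upstreams: [
    { name: 'mindera <3', enabled: true, criteria: { in: {}, out: {} }, upstream: 'mindera' }
  ]
}

const [resolved] = resolveReferences(config)
console.log(resolved.upstream.type)         // serveSecure
console.log(resolved.upstream.options.host) // www.mindera.com
```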

Methods

const TrafficSplitter = require('traffic-splitter')
const splitter = new TrafficSplitter(config)

Splitter provides some methods to help you take more advantage of its features.

Except for isConfigurationValid, which is called on the TrafficSplitter class itself, all the methods are at instance level.

isConfigurationValid

if (!TrafficSplitter.isConfigurationValid(myConfig)) {
  throw new Error('My configuration is invalid!')
}

This method returns a boolean telling you whether your configuration has the necessary objects and properties to start.
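
If you want to know which part of your configuration is missing before calling it, a small pre-check along these lines might help. The `missingProperties` helper is hypothetical: it only looks for the properties this page marks as required and says nothing about what isConfigurationValid checks internally.

```javascript
// Properties that this page marks as required.
const REQUIRED = ['api', 'bunyan', 'browserId', 'upstreams']

// Hypothetical pre-check: lists the required properties that are missing.
const missingProperties = config =>
  REQUIRED.filter(name => config == null || config[name] === undefined)

console.log(missingProperties({ api: {}, bunyan: {}, browserId: {}, upstreams: [] })) // []
console.log(missingProperties({ api: {}, upstreams: [] })) // [ 'bunyan', 'browserId' ]
```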

getConfiguration

const optimizedConfiguration = splitter.getConfiguration()

Splitter provides a method that returns the configuration after the optimizations were made.

getLogger

const log = splitter.getLogger()
log.info('Gotta love this!')

// {"name":"traffic-splitter","hostname":"unknown","pid":13360,"level":30,"msg":"Gotta love this!","time":"2017-06-01T15:56:05.897Z","v":0}

This method is at instance level and it returns its bunyan instance so you can use it.

Bunyan is a simple and fast JSON logging library.

bootstrap

splitter.bootstrap((server, config) => {
  // this code will be executed before the server starts
  // do whatever you want/need in here
})

This method allows you to run some code before the server starts.

It takes a callback with two parameters:

server - restify instance

config - optimized configuration

use

splitter.use((server, config) => (req, res, next) => {
  log.info(`Request URL = ${req.url}`)
  return next()
})

This method allows you to add middleware to the server. Each middleware is executed once per request.

It takes the same parameters as the previous method, but this callback should return another callback, which will be called by the server.

Inside the returned callback, always do “return next()” (or similar) or the request will get stuck in it.
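
The “callback that returns a callback” shape can be tried out without the splitter. In this sketch `use` and `handle` are hypothetical stand-ins for the splitter and its server: the outer function receives the server and configuration once, and the inner function runs for each request.

```javascript
// Hypothetical stand-in for splitter.use: stores the middleware factories.
const factories = []
const use = factory => factories.push(factory)

// Hypothetical stand-in for the server: builds the middlewares and calls them in order.
const handle = (server, config, req, res) => {
  const middlewares = factories.map(factory => factory(server, config))
  let index = 0
  const next = () => {
    const middleware = middlewares[index++]
    if (middleware) middleware(req, res, next)
  }
  next()
}

const seen = []
use((server, config) => (req, res, next) => {
  seen.push(`first ${req.url}`)
  return next()
})
use((server, config) => (req, res, next) => {
  seen.push(`second on port ${config.api.port}`)
  return next()
})

handle({}, { api: { port: 80 } }, { url: '/blog' }, {})
console.log(seen) // [ 'first /blog', 'second on port 80' ]
```

If the first middleware skipped the call to next(), the second one would never run, which is the “stuck request” situation described above.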

addRule

// somewhere in your upstreams criteria configuration:
"in": { // or "out"
  "myCustomRule1": {
    "even": false
  },
  "host": ["myHost.com"]
}

// evaluating custom rule
splitter.addRule('myCustomRule1', (criteria, req) => {
  const currentMinute = new Date().getMinutes()
  const isEven = currentMinute % 2 === 0
  // criteria is the rule object from the configuration, e.g. { "even": false }
  return criteria.even ? isEven : !isEven
})

// overriding splitter rule
splitter.addRule('host', (criteria, req) => {
   return criteria === 'myHost.com'
})

This is maybe the coolest feature of the splitter: it allows you to evaluate your own custom rules in a very simple way.

It also gives you the chance to override the evaluation of any default splitter rule in case you need to adjust it to your needs.

It takes two parameters. The first is the name of your rule (as used in the configuration) and the second is a callback that takes two parameters as well (the first is the rule object and the second is the current request). This callback must return a boolean.

Note: custom rules with no evaluation method, or whose method does not return a boolean, will be ignored.

addExecutor

splitter.addExecutor('redirect', (req, res, next, {config, eventEmitter, log, bidCookieDetails, httpAgent, httpsAgent}) => {
  log.info('my overrided executor :D :D')
  log.warn('I can literally do anything I want here!')
  res.redirect(302, 'https://mindera.com', next)
})

This is just another awesome feature…

Remember the upstream types? Well.. you can add your own! (mind blowing, right?)

Events

const TrafficSplitter = require('traffic-splitter')
const splitter = new TrafficSplitter(config)
const log = splitter.getLogger()

During its life splitter emits several events which you can listen to.
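
If you have never used event listeners in Node.js, the pattern looks like the sketch below. It assumes that splitter.events behaves like a standard Node.js EventEmitter; here a plain emitter stands in for it and the event is emitted by hand, whereas the splitter would emit it on its own.

```javascript
const EventEmitter = require('events')

// Hypothetical stand-in for splitter.events.
const events = new EventEmitter()

const received = []
events.on('rulesProcessing', (duration, selectedUpstream) => {
  received.push(`${selectedUpstream} selected in ${duration} ms`)
})

// Emitted by hand for illustration only.
events.emit('rulesProcessing', 3, 'mindera.com 0-50')
console.log(received) // [ 'mindera.com 0-50 selected in 3 ms' ]
```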

applicationStart

splitter.events.on('applicationStart', () => {
  log.info('Application has started')
})

Emitted when splitter.start() is called, which is after optimizing the given configuration.

serverStart

splitter.events.on('serverStart', () => {
  log.info('Server has started')
})

Triggered after server is configured and started, and also after executing bootstrap functions.

rulesProcessing

splitter.events.on('rulesProcessing', (duration, selectedUpstream) => {
  log.info(`Rules processed in ${duration} milliseconds`)
  log.info(`Selected upstream =  ${selectedUpstream}`)
})

Called each time a request is made.

It tells you the time it took to calculate the upstream (in milliseconds) and also the selected upstream name.

noUpstreamFound

splitter.events.on('noUpstreamFound', (req) => {
  log.info(`No upstream found for request ${req.url}`)
})

Only emitted when no upstream matched the given request.

resFinish

splitter.events.on('resFinish', (req, res, duration) => {
  log.info(`Response finished in ${duration} milliseconds`)
})

Triggered every time a response is sent to the final user.

serving

splitter.events.on('serving', (statusCode, upstream, duration, host, upstreamReq, upstreamRes) => {
  log.info(`${host} took ${duration} milliseconds to respond with ${statusCode} HTTP code`)
})

Called each time an upstream response arrives (only for serve and serveSecure types).

servingError

splitter.events.on('servingError', (err, upstream, duration, upstreamReq) => {
  log.info(`Error while serving upstream '${upstream.name}':`, err)
})

Triggered when an upstream responds with an error.

servingFile

splitter.events.on('servingFile', (upstream, duration) => {
  log.info(`File read and about to be served for upstream '${upstream.name}'`)
})

Emitted when a file has been read and is about to be served.

servingFileError

splitter.events.on('servingFileError', (err, upstream, duration) => {
  log.info(`Error reading file for upstream '${upstream.name}':`, err)
})

Emitted when the splitter fails to read a file.

upstreamException

splitter.events.on('upstreamException', (exception, upstream) => {
  log.info(`Upstream (${upstream.name}) exception: ${exception}`)
})

Emitted when an error is caught while executing an upstream.

httpSocketMetrics

splitter.events.on('httpSocketMetrics', (agentStatus) => {
  log.info('HTTP socket metrics: ', agentStatus)
})

// {"name":"traffic-splitter","hostname":"unknown","pid":14328,"level":30,"msg":"HTTP socket metrics:  { createSocketCount: 0,\n  createSocketErrorCount: 0,\n  closeSocketCount: 0,\n  errorSocketCount: 0,\n  timeoutSocketCount: 0,\n  requestCount: 0,\n  freeSockets: {},\n  sockets: {},\n  requests: {} }","time":"2017-06-03T19:50:33.799Z","v":0}

Triggered every emitMetricsInterval.http milliseconds when at least one serve upstream type is present in the configuration.

It gives you the agentkeepalive instance current status.

httpsSocketMetrics

splitter.events.on('httpsSocketMetrics', (agentStatus) => {
  log.info('HTTPS socket metrics: ', agentStatus)
})

Works the same way httpSocketMetrics does but for the serveSecure upstream type.

redirecting

splitter.events.on('redirecting', (statusCode, upstream, duration) => {
  log.info(`Redirecting to upstream '${upstream.name}' took ${duration} milliseconds with the HTTP code ${statusCode}`)
})

Emitted every time a request is redirected (consequence of the redirect upstream type).

Debugging

Request:
  curl \
    --user-agent "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A5297c Safari/602.1" \
    -G localhost \
    -d splitterIP=195.245.151.210 \
    -d splitterDebug=true \
    -v

Response debugging headers:
  x-splitter-upstream: "debugging"
  x-splitter-geo: "PT.17.Porto"
  x-splitter-device: {"ua":"Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A5297c Safari/602.1","_cache":{"phone":"iPhone","mobile":"iPhone","tablet":null},"maxPhoneWidth":600}
  x-splitter-bucket: 66

Pass the ?splitterDebug=true parameter to obtain debug information in the response headers about the criteria selection.

Provided headers

x-splitter-upstream - selected upstream name

x-splitter-geo* - geoip information

x-splitter-device* - device information

x-splitter-bucket* - assigned bucket

* - only available if selected upstream has the respective rule
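
When scripting against these headers, a small parser such as the one below could be handy. It is a sketch under assumptions: `parseDebugHeaders` is a hypothetical helper, header names are expected in lower case, values are expected without surrounding quotes, and the geo value is assumed to follow the “country.region.city” layout seen in the example.

```javascript
// Hypothetical helper: reads the x-splitter-* debugging headers from a response.
const parseDebugHeaders = headers => {
  const geo = headers['x-splitter-geo']
  const bucket = headers['x-splitter-bucket']
  // Anything after the second dot is kept as part of the city name.
  const [country, region, ...city] = geo ? geo.split('.') : []
  return {
    upstream: headers['x-splitter-upstream'],
    geo: geo ? { country, region, city: city.join('.') } : undefined,
    bucket: bucket === undefined ? undefined : Number(bucket)
  }
}

const info = parseDebugHeaders({
  'x-splitter-upstream': 'debugging',
  'x-splitter-geo': 'PT.17.Porto',
  'x-splitter-bucket': '66'
})

console.log(info.upstream) // debugging
console.log(info.geo)      // { country: 'PT', region: '17', city: 'Porto' }
console.log(info.bucket)   // 66
```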

Release change log

1.2.8 - 08/08/2018

events parameter fixed - fixed servingFile and servingFileError events upstream parameter.
handle file read fail - send 500 http status code when it fails to read the file.

1.2.7 - 02/08/2018

emitting browserid - emit browserid cookie only when a bucket criteria is present.

1.2.6 - 01/08/2018

events created and updated - new events servingFile and servingFileError were created. serving and servingError events now have access to upstream request and response.

1.2.4 - 26/06/2018

referencing upstreams - configuration property to increase config legibility.

1.2.0 - 03/10/2017

serve file upstream type - there is a new upstream type that allows you to serve files!
custom executors - you can now add your own custom executors.

1.1.0 - 16/08/2017

rulesets updated - each ruleset is now only evaluated once per request, improving the time to determine the appropriate upstream.

Ready to go configuration

{
  "api": {
    "serverName": "traffic-splitter",
    "port": 80,
    "maxConnections": 1024,
    "upstreamKeepAlive": {
      "maxSockets": 1024,
      "maxFreeSockets": 64,
      "timeout": 60000,
      "keepAliveTimeout": 30000
    },
    "emitMetricsInterval": {
      "http": 10000,
      "https": 10000
    },
    "performance": {
      "logSlowRequests": true,
      "slowRequestThreshold": 2500
    }
  },
  "bunyan": {
    "name": "traffic-splitter",
    "streams": []
  },
  "browserId": {
    "cookie": "bid",
    "maxAge": 315360000,
    "length": 12
  },
  "pathRegExp": {
    "prefix": "^",
    "sufix": "([/?].*)?$"
  },
  "upstreams": [
    // your upstreams in here
  ]
}

Use this configuration sample as you want.

Add some upstreams as seen in the “Example” section.

After that you’re ready to rock and split!

OR… go here for free samples!