7.12.10. Adjusting the Connector Response to Resource Losses

This section describes how to control the Connector responses in the event of the loss of a required Datasource or all Managers.

7.12.10.1. Adjusting the Connector Response to Datasource Loss

Summary: Whenever no Primary datasource is found, the Connector will reject connection requests.

This feature controls how long the Connector waits for the given type of Datasource to come ONLINE before forcibly disconnecting the client application.

By default, the Connector waits indefinitely for a resource to become available.

There are two parameters involved in this decision-making. They are:

  • waitIfUnavailable (default: true)

    If waitIfUnavailable is true, then the Connector will wait for up to the time period specified by waitIfUnavailableTimeout to make a connection for a given QoS. If the timeout expires, the Connector will disconnect the client application (reject connection attempts and close ongoing connections).

    If waitIfUnavailable is false, the Connector will disconnect the client with an error as soon as a connection for a given QoS cannot be made.

  • waitIfUnavailableTimeout (default: 0, wait indefinitely)

    If waitIfUnavailable is true, the Connector will wait for up to waitIfUnavailableTimeout number of seconds before disconnecting the client. If waitIfUnavailable is false, this parameter is ignored. If this parameter is set to zero (0) seconds, the Connector will wait indefinitely (client connection requests will hang forever).
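The way the two parameters combine can be summarised in a small Python sketch. This is only a model of the rules described above, not Connector code; the names WaitPolicy and datasource_wait_policy are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WaitPolicy:
    """Illustrative result type (hypothetical, not part of the product)."""
    wait: bool                       # True: hold the request; False: reject at once
    max_wait_seconds: Optional[int]  # None means no upper bound on the wait


def datasource_wait_policy(wait_if_unavailable: bool = True,
                           wait_if_unavailable_timeout: int = 0) -> WaitPolicy:
    """Model how waitIfUnavailable and waitIfUnavailableTimeout interact."""
    if not wait_if_unavailable:
        # The timeout is ignored; the client receives an error immediately.
        return WaitPolicy(wait=False, max_wait_seconds=0)
    if wait_if_unavailable_timeout == 0:
        # Zero means "wait indefinitely", which is the documented default.
        return WaitPolicy(wait=True, max_wait_seconds=None)
    # Otherwise wait for at most the configured number of seconds.
    return WaitPolicy(wait=True, max_wait_seconds=wait_if_unavailable_timeout)


if __name__ == "__main__":
    print(datasource_wait_policy())           # defaults: wait with no upper bound
    print(datasource_wait_policy(True, 30))   # wait for at most 30 seconds
    print(datasource_wait_policy(False, 30))  # reject immediately, timeout ignored
```

The defaults of the function mirror the documented defaults, so calling it with no arguments models an unmodified installation.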

waitIfUnavailable is specific to data source availability and is only considered when everything else (the connector and the data service) is online.

This is typically used during a switch or failover, while the primary changes: the client application requests a primary (RW_STRICT QoS), which at some point is unavailable because both the new and old primaries are offline. With waitIfUnavailable=true, the connector waits for the new primary to come online (up to waitIfUnavailableTimeout seconds), allowing seamless failover. If set to false, there will be a period during the switch or failover in which client applications receive errors both when connecting and when reconnecting to retry failed requests.

For example, to immediately reject connections upon Datasource loss, set the following property in your /etc/tungsten/tungsten.ini file:

property=waitIfUnavailable=false

Warning

PLEASE NOTE: this will make switch and failover much less transparent to the application since the connections will error until the new Primary is elected and back online.

Important

Updating these values requires a connector restart (via tpm update) for the changes to be recognized, if they are set after installation.

Important

These entries will NOT work if placed into [defaults]; each service must be configured individually.

7.12.10.2. Adjusting the Connector Response to Manager Loss

Summary: Whenever the Connector loses sight of the managers for a given data service, it will either suspend or reject new connection requests.

waitIfDisabled applies to both:

  1. Whole offline data service: the client application tries to connect to a composite or physical data service that is offline, e.g. during a full site outage where the client application requests access to a local primary without allowing redirection to a remote site.

  2. Connector offline or on hold: typically, when the connector loses connectivity to all managers in the cluster, it first goes ON HOLD, then OFFLINE. In both cases, waitIfDisabled defines what to do: throw an error to the client application, or wait until the network is back and access to a manager is possible. For example, when the connector is isolated from the cluster, setting waitIfDisabled=true will make new connection requests "hang" until either the connector regains network access to a manager or waitIfDisabledTimeout is reached.

By default, the Connector suspends requests indefinitely until Manager communications are re-established.

This feature controls how long the Connector waits during a Manager loss event, and whether it then suspends or rejects client connections.

Here is the decision chain and associated settings for what happens when the connector loses sight of the managers:

  1. Delay for delayBeforeOnHoldIfNoManager seconds (default: 0, i.e. no delay).

  2. Change state to ON HOLD and begin the countdown timer starting from the delayBeforeOfflineIfNoManager value.

    In the ON HOLD state, the connector will hang all new connections and allow existing connections to continue.

  3. When the delayBeforeOfflineIfNoManager timer expires (30 seconds by default), change state to OFFLINE.

    Once OFFLINE, the Connector will break existing connections because there is no authoritative Manager node from the Connector's perspective. Without a Manager link, any change to the cluster configuration will remain invisible to the Connector, potentially leading to writes on a Replica node.

    By default, all new connection requests will hang in the OFFLINE state. If waitIfDisabled is set to false, then the Connector will instead reject all new connections.
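The decision chain above can be modelled as a function of the time elapsed since the Connector lost sight of the managers. The following Python sketch is illustrative only and is not the Connector's implementation. The function names are hypothetical, and the "ONLINE" label with normal connection handling during the initial delay is an assumption, since the text above does not name that phase.

```python
from typing import Dict


def connector_state(seconds_since_manager_loss: float,
                    delay_before_on_hold: float = 0,
                    delay_before_offline: float = 30) -> str:
    """Return the modelled Connector state at a given time after manager loss."""
    if seconds_since_manager_loss < delay_before_on_hold:
        # Step 1: still inside the delayBeforeOnHoldIfNoManager delay.
        # Assumption: the Connector keeps its previous (ONLINE) state here.
        return "ONLINE"
    if seconds_since_manager_loss < delay_before_on_hold + delay_before_offline:
        # Step 2: the delayBeforeOfflineIfNoManager countdown is running.
        return "ON HOLD"
    # Step 3: the countdown has expired.
    return "OFFLINE"


def connection_handling(state: str, wait_if_disabled: bool = True) -> Dict[str, str]:
    """Return how new and existing connections are treated in each state."""
    if state == "ON HOLD":
        return {"new": "hang", "existing": "continue"}
    if state == "OFFLINE":
        new = "hang" if wait_if_disabled else "reject"
        return {"new": new, "existing": "break"}
    # Assumption: normal operation while the initial delay is still running.
    return {"new": "accept", "existing": "continue"}


if __name__ == "__main__":
    for t in (0, 15, 30, 60):
        state = connector_state(t)
        print(t, state, connection_handling(state, wait_if_disabled=False))
```

With the default values, the model goes ON HOLD immediately and OFFLINE 30 seconds later; only in the OFFLINE state does waitIfDisabled change how new connections are treated.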

There are multiple parameters involved in this decision-making. They are:

  • delayBeforeOnHoldIfNoManager (in seconds, default: 0, i.e. no delay)

    When the connector loses sight of the managers, it waits delayBeforeOnHoldIfNoManager seconds before going ON HOLD. The default of 0 means no delay.

  • delayBeforeOfflineIfNoManager (in seconds, default: 30)

    Once ON HOLD, the connector waits delayBeforeOfflineIfNoManager seconds before going OFFLINE. The default is 30 seconds.

  • waitIfDisabled (default: true)

    If the Dataservice is OFFLINE because it is unable to communicate with any Manager, the waitIfDisabled parameter determines whether to suspend connection requests or to reject them. If waitIfDisabled is true (the default), then the Connector will wait indefinitely for manager communications to be re-established. If waitIfDisabled is set to false, the Connector will return an error immediately.

    To check for data service state, use the tungsten-connector/bin/connector cluster-status command. For example:

    shell> connector cluster-status
    Executing Tungsten Connector Service --cluster-status ...
    +--------------+--------------------+-------------+--------------+--------+--------+--------------------------------------+------------------+-----------------+------------------+--------------------+---------------------+
    | Data service | Data service state | Data source | Is composite | Role   | State  | High water                           | Last shun reason | Applied latency | Relative latency | Active connections | Connections created |
    +--------------+--------------------+-------------+--------------+--------+--------+--------------------------------------+------------------+-----------------+------------------+--------------------+---------------------+
    | europe       | OFFLINE            | c1          | false        | master | ONLINE | 0(c1-bin.000002:0000000000000510;-1) | MANUALLY-SHUNNED | 0.0             | 5193.0           | 0                  | 3                   |
    | europe       | OFFLINE            | c2          | false        | slave  | ONLINE | 0(c1-bin.000002:0000000000000510;-1) |                  | 1.0             | 5190.0           | 0                  | 0                   |
    | europe       | OFFLINE            | c3          | false        | slave  | ONLINE | 0(c1-bin.000002:0000000000000510;-1) |                  | 2.0             | 5190.0           | 0                  | 1                   |
    +--------------+--------------------+-------------+--------------+--------+--------+--------------------------------------+------------------+-----------------+------------------+--------------------+---------------------+

For more information, see Connector On-Hold State.
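For monitoring purposes, the table printed by cluster-status can be read programmatically. The Python sketch below parses text in the layout shown above and reports which data services are OFFLINE. It is a minimal example that assumes the pipe-delimited layout stays as shown; the function names are hypothetical, and the sample text is abridged from the output above (fewer columns) to keep the example short.

```python
from typing import Dict, List, Set

# Abridged from the cluster-status output above (some columns omitted).
SAMPLE = """\
Executing Tungsten Connector Service --cluster-status ...
+--------------+--------------------+-------------+--------+--------+
| Data service | Data service state | Data source | Role   | State  |
+--------------+--------------------+-------------+--------+--------+
| europe       | OFFLINE            | c1          | master | ONLINE |
| europe       | OFFLINE            | c2          | slave  | ONLINE |
| europe       | OFFLINE            | c3          | slave  | ONLINE |
+--------------+--------------------+-------------+--------+--------+
"""


def parse_cluster_status(text: str) -> List[Dict[str, str]]:
    """Turn the pipe-delimited table into one dict per data source row."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the banner line and the +----+ borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if not header:
            header = cells  # the first pipe-delimited line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def offline_services(rows: List[Dict[str, str]]) -> Set[str]:
    """Return the names of data services whose state is OFFLINE."""
    return {row["Data service"] for row in rows
            if row.get("Data service state") == "OFFLINE"}


if __name__ == "__main__":
    print(sorted(offline_services(parse_cluster_status(SAMPLE))))
```

In practice the text would come from running the connector cluster-status command, for example through subprocess; that step is left out here so that the sketch does not depend on an installed Connector.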

For example, to decrease the ON HOLD time to 15 seconds, the following property should be added to your /etc/tungsten/tungsten.ini file:

property=delayBeforeOfflineIfNoManager=15

For example, to immediately reject connections upon Manager loss, the following property should be added to your /etc/tungsten/tungsten.ini file:

property=waitIfDisabled=false

Warning

PLEASE NOTE: this will make switch and failover much less transparent to the application since the connections will error until communication with at least one manager has been established and the Connector is back online.

Important

Updating these values requires a connector restart (via tpm update) for the changes to be recognized, if they are set after installation.

Important

These entries will NOT work if placed into [defaults]; each service must be configured individually.
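Because a setting left under [defaults] has no effect, a quick check of the configuration file can catch the mistake early. The Python sketch below scans INI text line by line and lists any of the properties from this section that appear under [defaults]. It is a hypothetical helper, not a Tungsten tool; the service section name [alpha] in the sample is invented for illustration, and the parsing assumes the property=name=value form used in the examples above.

```python
from typing import List

# Property names taken from this section.
WATCHED = {
    "waitIfUnavailable",
    "waitIfUnavailableTimeout",
    "waitIfDisabled",
    "waitIfDisabledTimeout",
    "delayBeforeOnHoldIfNoManager",
    "delayBeforeOfflineIfNoManager",
}

# Sample text; the [alpha] service name is hypothetical.
SAMPLE_INI = """\
[defaults]
property=waitIfDisabled=false

[alpha]
property=delayBeforeOfflineIfNoManager=15
"""


def misplaced_properties(ini_text: str) -> List[str]:
    """Return watched property names that appear under [defaults]."""
    section = None
    found: List[str] = []
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # ignore blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "defaults" and line.startswith("property="):
            # property=name=value: keep only the name part
            name = line[len("property="):].split("=", 1)[0].strip()
            if name in WATCHED:
                found.append(name)
    return found


if __name__ == "__main__":
    print(misplaced_properties(SAMPLE_INI))
```

A plain line scan is used instead of configparser because the property= key can legitimately repeat within one section, which configparser rejects in its default strict mode.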