3.4.4. Resetting a single dataservice

Note

The procedures in this section are designed for the Multi-Site/Active-Active topology ONLY. Do NOT use these procedures with Composite Active/Active Clustering in v6 onwards.

For Composite Active/Active Clustering in version 6.x onwards, please refer to Section 3.3, “Deploying Composite Active/Active Clusters”

Under certain conditions, dataservices in an active/active configuration may drift and/or become inconsistent with the data in another dataservice. If this occurs, you may need to re-provision the data on one or more of the dataservices after first determining the definitive source of the information.

In the following example the west service has been determined to be the definitive copy of the data. To fix the issue, all the datasources in the east service will be reprovisioned from one of the datasources in the west service.

The following is a guide to the steps that should be followed. In the example procedure it is the east service that has failed. It is assumed that the value of executable-prefix has been set to mm and the env.sh script has been executed to configure the environment.

  1. Put the dataservice into MAINTENANCE mode. This ensures that Tungsten Cluster will not attempt to automatically recover the service.

    cctrl [east]> set policy maintenance
  2. On the failed east Tungsten Cluster service, put each Tungsten Connector offline:

    cctrl [east]> router * offline
  3. Reset the Tungsten Replicator service on all servers connected to the failed Tungsten Cluster service. For example, on west{1,2,3} reset the east Tungsten Replicator service:

    shell west> mm_trepctl offline
    shell west> mm_trepctl -service east reset -all -y
  4. Place all Tungsten Replicator services on all servers in the failed Tungsten Cluster service offline:

    shell east> mm_trepctl offline
  5. Next we reprovision the primary node in the failed cluster (east1 in our example) with a manual backup taken from a replica node within the west cluster (west3 in this example).

    Shun the east1 datasource to be restored, and put the replicator service offline, if not already in a failed state, using cctrl:

    cctrl [east]> set force true
    cctrl [east]> datasource east1 shun
    cctrl [east]> replicator east1 offline
  6. Shun the west3 datasource to be backed up, and put the replicator service offline using cctrl:

    cctrl [west]> datasource west3 shun
    cctrl [west]> replicator west3 offline
  7. Stop the mysqld service on both hosts:

    shell> sudo systemctl stop mysqld
  8. Delete the mysqld data directory on east1:

    east1> sudo rm -rf /var/lib/mysql/*
  9. If necessary, ensure the tungsten user can write to the MySQL directory on east1:

    east1> sudo chmod 777 /var/lib/mysql
  10. Use rsync on west3 to send the data files for MySQL to east1:

    west3> rsync -aze ssh /var/lib/mysql/* east1:/var/lib/mysql/

    You should synchronize all locations that contain data. This includes additional folders such as innodb_data_home_dir or innodb_log_group_home_dir. Check the my.cnf file to ensure you have the correct paths.

    Once the files have been copied, the files should be updated to have the correct ownership and permissions so that the Tungsten service can read them.
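    Before running rsync, it can help to enumerate every data location declared in my.cnf so that none is missed. The sketch below is a minimal, self-contained illustration: the option names (datadir, innodb_data_home_dir, innodb_log_group_home_dir) are standard MySQL settings, but the sample file and paths are hypothetical, not taken from this procedure.

    ```shell
    # Hedged sketch: build a sample my.cnf, then extract the directories
    # that would each need to be synchronized to east1. The paths here
    # are illustrative samples only.
    cat > /tmp/sample-my.cnf <<'EOF'
    [mysqld]
    datadir=/var/lib/mysql
    innodb_data_home_dir=/var/lib/mysql-innodb
    innodb_log_group_home_dir=/var/log/mysql-redo
    EOF

    # List every directory that contains data files:
    grep -E '^(datadir|innodb_data_home_dir|innodb_log_group_home_dir)' /tmp/sample-my.cnf \
      | cut -d= -f2
    ```

    Each path printed by the final command would need its own rsync invocation (or inclusion in a combined one) when copying from west3 to east1.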

  11. Recover west3 back to the dataservice (This process will automatically restart MySQL):

    cctrl [west]> datasource west3 recover
  12. Update the ownership and permissions on the data files on east1:

    east1> sudo chown -R mysql:mysql /var/lib/mysql
    east1> sudo chmod 770 /var/lib/mysql
  13. Restart MySQL on east1:

    east1> sudo systemctl start mysqld
  14. Reset the local replication services on east1:

    east1> trepctl offline
    east1> trepctl -service east reset -all -y
    east1> trepctl online
  15. Recover east1 within cctrl:

    cctrl [east]> set force true
    cctrl [east]> datasource east1 welcome
    cctrl [east]> datasource east1 online
  16. Using tprovision, restore the remaining nodes (east{2,3}) in the failed east service from the newly recovered east1 host:

    shell east{2,3}> tprovision -s east1 -m xtrabackup

    Note

    For a full explanation of using tprovision, see The tprovision Script

  17. Place all the Tungsten Replicator services on east{1,2,3} back online:

    shell east> mm_trepctl online
  18. Place all the Tungsten Replicator services on west{1,2,3} back online:

    shell west> mm_trepctl online
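    Before moving on to the connectors, it is worth confirming that each replicator actually reports ONLINE. The sketch below parses the state field the way you might check real output from mm_trepctl status; the sample text is invented for illustration, and only the state field name matches real trepctl output.

    ```shell
    # Hedged sketch: extract the replicator state from status-style output.
    # The sample lines are illustrative, not captured from a live cluster.
    sample='appliedLastSeqno : 45892
    state            : ONLINE
    uptimeSeconds    : 1082.3'
    state=$(printf '%s\n' "$sample" | awk -F':' '/^state/ {gsub(/ /, "", $2); print $2}')
    echo "$state"
    ```

    If any replicator reports something other than ONLINE at this point, resolve that before putting the routers back online in the next step.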
  19. On the recovered east Tungsten Cluster service, put each Tungsten Connector online:

    cctrl [east]> router * online
  20. Set the policy back to AUTOMATIC:

    cctrl> set policy automatic
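As a final sanity check, the ls output within cctrl can be scanned for any datasource that did not come back ONLINE. The sketch below demonstrates the scan against sample output lines; the lines are invented for illustration and not captured from a real cluster.

```shell
# Hedged sketch: count datasources in a non-ONLINE state (e.g. SHUNNED or
# OFFLINE) from ls-style output. Sample lines are illustrative only.
sample='east1(master:ONLINE, progress=45892)
east2(slave:ONLINE, progress=45892)
east3(slave:ONLINE, progress=45892)'
bad=$(printf '%s\n' "$sample" | grep -cE 'SHUNNED|OFFLINE' || true)
echo "non-online datasources: ${bad}"
```

A non-zero count would indicate a datasource that still needs attention before the recovery can be considered complete.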