% Conclusions and future work
\chapter{Conclusions and future work}
\label{chap:conclusions-and-future-work}
This chapter draws some conclusions and describes some of the improvements that could be applied to our High Availability system. These improvements can also serve as a starting point for further research.
\section {Conclusions}
We have shown, thanks to a reproducible example, that Zimbra OSE can be enhanced to be highly available. This system has been built on state-of-the-art open source high availability software such as Pacemaker and Corosync.
The author of this master thesis has learnt how to develop OCF Resource Agents, a skill that can be useful for other HA systems, and how to mirror a partition between two servers with DRBD. Moreover, Zimbra, one of the easiest open source email server solutions, can now be run as an HA system with standard HA software.
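To give an idea of what developing such an agent involves, the sketch below shows the minimal skeleton an OCF Resource Agent follows. This is an illustrative sketch only: the \texttt{myservice} name and its init script are hypothetical placeholders, not one of the agents written for this work.
\begin{verbatim}
#!/bin/sh
# Minimal OCF Resource Agent skeleton (illustrative sketch only;
# "myservice" is a hypothetical service used as a placeholder).
: ${OCF_ROOT=/usr/lib/ocf}
. ${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs

myservice_start() {
    /etc/init.d/myservice start && return $OCF_SUCCESS
    return $OCF_ERR_GENERIC
}
myservice_stop() {
    /etc/init.d/myservice stop && return $OCF_SUCCESS
    return $OCF_ERR_GENERIC
}
myservice_monitor() {
    /etc/init.d/myservice status >/dev/null 2>&1 \
        && return $OCF_SUCCESS || return $OCF_NOT_RUNNING
}

case "$1" in
    start)     myservice_start ;;
    stop)      myservice_stop ;;
    monitor)   myservice_monitor ;;
    meta-data) # a real agent prints its XML meta-data here (omitted)
               exit $OCF_SUCCESS ;;
    *)         exit $OCF_ERR_UNIMPLEMENTED ;;
esac
exit $?
\end{verbatim}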
\section {Future work}
\subsection {OVH Datacentre network handling}
There have been some efforts from the bTactic team to handle OVH Datacentre networking through three OCF scripts named
ClusterOVHFailover, ClusterHostRoute and ClusterDefaultRoute. These scripts make sense in a setup that does not use Virtual Rack + RIPE but only ordinary servers connected through the Internet.
ClusterOVHFailover makes sure an OVH ip-fail-over is moved from one machine to the other, so that the Zimbra server is always served by the correct host.
ClusterHostRoute and ClusterDefaultRoute help set up the OVH networking required by an ip-fail-over, which is not covered by the standard network setup found in the default Pacemaker package.
These scripts can be found inside the BtacticOCF tar.gz file (\cite{BtacticOCF}), which can be downloaded from the bTactic site (\cite{BtacticOrg}).
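As a purely illustrative sketch of how these agents could be plugged into the cluster, the commands below declare them as Pacemaker primitives. The \texttt{ocf:btactic} provider name and the parameter placeholders are assumptions, not the actual BtacticOCF interface; the real parameter list should be taken from each agent's meta-data.
\begin{verbatim}
# Illustrative only: provider name and parameters are assumptions.
# Query each agent's meta-data action for the real parameter list.
crm configure primitive p_ovh_failover ocf:btactic:ClusterOVHFailover \
    params <agent-specific parameters> \
    op monitor interval="60s"
crm configure primitive p_host_route ocf:btactic:ClusterHostRoute \
    params <agent-specific parameters>
crm configure primitive p_default_route ocf:btactic:ClusterDefaultRoute \
    params <agent-specific parameters>
# Keep the networking resources together, started in the listed order.
crm configure group g_ovh_network p_ovh_failover p_host_route \
    p_default_route
\end{verbatim}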
\subsection {\label{subsec:fencing}Fencing}
Fencing is the process of locking resources away from a node whose status is uncertain (\cite{LinuxHAFencing}). The default method of fencing in Pacemaker is stonith. Stonith is a technique for node fencing in which the node that is supposed to have failed (either the primary or the secondary server in our example) is \textit{shot in the head} (\cite{LinuxHAStonith}). That makes sure that the node is actually dead.
In the example described in this work stonith has been disabled. The setup can be improved by enabling it and using one of the available fencing methods.
Once again the bTactic team has developed a stonith script (or fence agent, as Pacemaker calls it) to make sure a failed OVH server does not use a shared resource. The failing node is rebooted into Rescue mode, which is quite similar to booting a computer from a live CD that does not make any changes to the local hard disks. The fence agent is named fence\_ovh.
This script has been included in Red Hat's fence-agents package since release 4.0.2 (\cite{LinuxClusterML201307}).
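A possible configuration, sketched below, enables stonith cluster-wide and declares a fence\_ovh stonith resource. The parameter names and values shown are assumptions for illustration; the real option list should be taken from the agent's meta-data (for instance with \texttt{fence\_ovh -o metadata}).
\begin{verbatim}
# Enable fencing cluster-wide (it was disabled in our example setup).
crm configure property stonith-enabled=true

# Illustrative only: parameter names and values are assumptions; check
# the agent meta-data (fence_ovh -o metadata) for the real options.
crm configure primitive st_ovh stonith:fence_ovh \
    params login="ovh-account" passwd="secret" plug="nsXXXX.ovh.net" \
    op monitor interval="3600s"
\end{verbatim}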
\subsection {\label{subsection:mysql-ha}MySQL HA}
One of the Zimbra components is a MySQL database. This database can be excluded from the DRBD synchronisation and set up as an active/active cluster.
This setup offers better handling of node failures because only the last MySQL queries would be lost. In the DRBD case, all file changes that have not yet been written to the files that make up the MySQL database are lost.
The only drawback of this improvement is that Zimbra OSE upgrades tend to be much more difficult when HA has to be preserved.
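One common way to implement such an active/active MySQL cluster is master-master replication (other options, such as Galera, also exist). The sketch below illustrates the idea under that assumption; host names, credentials and replication coordinates are placeholders, and a real deployment would also need conflict-avoidance measures and a way to direct Zimbra at the surviving node.
\begin{verbatim}
# Illustrative sketch of MySQL master-master replication (all names
# and credentials are placeholders).
# /etc/mysql/my.cnf on node1 (node2 mirrors it with server-id = 2 and
# auto_increment_offset = 2):
[mysqld]
server-id                = 1
log_bin                  = mysql-bin
auto_increment_increment = 2
auto_increment_offset    = 1

# On node1, point replication at node2 (and the symmetric command on
# node2); the binary log file and position must be taken from
# "SHOW MASTER STATUS" on the other node:
mysql -u root -p -e "CHANGE MASTER TO MASTER_HOST='node2', \
  MASTER_USER='repl', MASTER_PASSWORD='replpass', \
  MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4; START SLAVE;"
\end{verbatim}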
\subsection {Project Always ON}
Always ON is a project from Zimbra (\cite{ZimbraProjectAlwaysOn}, September 2013) with a very simple goal: email and collaboration should be \textit{always on} for end users. Its design goals are:
\begin{itemize}
\item Inherently resilient to failure
\item Scaling should be elastic based on workload demands
\item The software can be enhanced without service disruption
\item Efficient usage of commodity hardware resources
\end{itemize}
These design goals imply:
\begin{itemize}
\item No single points of failure in the application components
\item Separating the application code from the data
\item Distributing state information across commodity storage
\item Automatic failover of application and data storage components
\item Automatic load balancing of client requests across the application and data layers
\end{itemize}
Project Always ON has just started and might be implemented as early as Zimbra version 10. It is an improvement over the MySQL HA future work described in subsection~\ref{subsection:mysql-ha} because it not only enforces HA on each of the Zimbra components but also focuses on scaling resources. That means that the more service is needed, the more Zimbra (virtual) machines will be deployed to meet the service requirements.
\subsection {Data loss}
Additional tests can be performed to measure how much data is lost when the node holding the active role goes offline. Some of these tests can be found in \cite{TaerHowtoHAZimbra8}.
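A simple measurement of this kind could be sketched as follows, assuming the \texttt{swaks} SMTP testing tool and placeholder host and mailbox names: send a stream of numbered messages through the cluster address, fail the active node in the middle of the stream, and afterwards count how many numbered messages actually reached the mailbox.
\begin{verbatim}
#!/bin/sh
# Sketch only: send 1000 numbered test messages through the cluster IP.
# Server name and recipient address are placeholders.
i=0
while [ $i -lt 1000 ]; do
    i=$((i + 1))
    swaks --server zimbra.example.com --to test@example.com \
          --header "Subject: dataloss-test-$i" --body "test $i"
done
# While the loop runs, power off the node holding the active role and
# let Pacemaker fail over.  Afterwards, count how many numbered
# messages are present in the mailbox and compare against 1000.
\end{verbatim}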