Option 1 - Separated Servers
As mentioned in my previous post “Horizontally Scaling Drupal”, a very bad experience finally gave me a free hand and a chance to review the options for horizontally scaling some Drupal-based web sites.
The easiest option is to split Apache, the cache servers and MySQL onto separate servers. This option requires at least 3 servers, but the only way to add resources is vertically, by adding RAM & CPU to each server. The second problem is that each component is a single point of failure: if one server crashes, the complete site is down. So with this we are not far from the initial state :-(.
The second option is to allow multiple servers for each component. Since I didn’t know where the site would be hosted or what I would have available (the most important part being the load balancer), the solutions I created required some extra servers for proxying/balancing.
To be able to scale and add multiple web servers I used an Apache load balancer, 2-3 web servers, 2 memcached servers, and a powerful MySQL server (or some kind of MySQL cluster).
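To make the role of the load balancer concrete, here is a minimal Python sketch of round-robin balancing that skips web servers marked as down. It is only an illustration of the idea, not how the Apache load balancer is implemented; the class name and the backend names web1, web2 and web3 are made up for the example.

```python
class RoundRobinBalancer:
    """Toy balancer: hands out backends in turn, skipping ones marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Try every backend at most once, starting at the rotation pointer.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backends left")


# Hypothetical web server names, for illustration only.
balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [balancer.pick() for _ in range(3)]

# Simulate one web server crashing: traffic keeps flowing to the others.
balancer.mark_down("web2")
after_failure = [balancer.pick() for _ in range(4)]

print(first_round)
print(after_failure)
```

Adding a web server here is just one more entry in the list, which is the whole point of scaling this tier horizontally. The balancer itself is still a single machine, though.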
To test the above solution I created a few Ubuntu server VMs on my machine:
- A front-end server running the Apache load balancer
- To start with, 3 separate machines running Apache, each with the APC cache installed.
- Two memcached servers.
- One MySQL server.
- File storage replication
This solution has a few benefits over the basic one. To start with, we can add as many web servers as we want. One important part is to configure the web servers to share files. This was done by using rsync replication and mounting the /var/www/html/mysite/sites folder as a shared one on all servers (I expect Drupal core to be the same on all servers, so I didn't want to share it). I like this solution because the source code lives on each web instance and not on the file storage. This makes it possible to release new “source code” (not database!) versions of Drupal modules on individual instances. Or you can quickly change some lines on a PROD environment for debugging (as long as you block visitor traffic to that web instance, of course ;-)).
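As a rough sketch of the rsync part, the Python snippet below only builds the rsync command lines that would push the sites folder from one web server to the others; it does not run them. The host names web2 and web3 are hypothetical, and the choice of the -a, -z and --delete options is my assumption, not necessarily the setup described above.

```python
SITES_DIR = "/var/www/html/mysite/sites"  # the shared folder from this post


def build_rsync_command(target_host, sites_dir=SITES_DIR):
    """Return the rsync argument list that mirrors sites_dir to target_host."""
    # The trailing slash tells rsync to copy the directory contents,
    # not the directory itself.
    src = sites_dir.rstrip("/") + "/"
    dest = f"{target_host}:{src}"
    # -a: archive mode, -z: compress in transit,
    # --delete: remove files on the target that no longer exist on the source.
    return ["rsync", "-az", "--delete", src, dest]


# Hypothetical target hosts, for illustration only.
commands = [build_rsync_command(host) for host in ("web2", "web3")]
for command in commands:
    print(" ".join(command))
```

Commands like these would typically be run from cron or a deploy script. Note that --delete makes the target an exact mirror, so it should be left out if files can be uploaded on more than one server.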
- Move the caching mechanism to Memcached. Memcached stores all cache data in memory, so the MySQL cache tables are no longer used. Also, all web servers share the same pool of Memcached servers, so there is no need to manually flush a remote cache. The Drupal Memcache module and the Memcached daemons take care of it.
- Because caching has moved to Memcached, the database is no longer under heavy load.
The database server can be a single server or a full MySQL cluster (depending on the amount of available $$$).
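The reason several web servers can share two memcached servers without stepping on each other is that the client decides, per cache key, which server holds that key. Below is a minimal Python sketch of that idea using a simple hash-modulo scheme. The server addresses and cache keys are hypothetical, and real memcached clients use their own (often consistent-hashing) strategies, so treat this as an illustration only.

```python
import hashlib

# Hypothetical memcached pool; 11211 is memcached's default port.
MEMCACHED_SERVERS = ["memcached1:11211", "memcached2:11211"]


def pick_server(key, servers=MEMCACHED_SERVERS):
    """Map a cache key to one server, the same way on every web server."""
    # A stable hash (not Python's built-in hash(), which varies per process)
    # guarantees that all web servers agree on where a key lives.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


# Hypothetical cache keys, for illustration only.
keys = ["cache:node:1", "cache_menu:links", "cache_page:/front"]
placement = {key: pick_server(key) for key in keys}

for key, server in placement.items():
    print(f"{key} -> {server}")
```

Since every web server computes the same placement, a cache entry written by one web server is found by all the others, and clearing it once clears it for everyone.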
File storage replication
Another (possible) improvement to the above solution would be to store all data on NAS file storage.
The NAS storage holds all data in the sites/##YOUR_SITE_NAME##/files directory. Compared with the previous solution, we no longer need to sync data. But there is one disadvantage: if the NAS file storage goes down, none of your files will be served, neither by web-server 1 nor by web-server 2.
As with the previous solution, the problem with this one lies in its single points of failure, such as the single load balancer and possibly a single MySQL server.
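The single-point-of-failure argument can be put into rough numbers. The Python sketch below assumes every server is up 99% of the time and that failures are independent; both are assumptions made purely for illustration, not measurements. Components that must all be up multiply their availabilities, while a tier with several interchangeable servers only fails when all of them fail.

```python
def series(*parts):
    """All parts must be up at once: availabilities multiply."""
    result = 1.0
    for availability in parts:
        result *= availability
    return result


def redundant(availability, copies):
    """Tier is up if at least one of the identical, independent copies is up."""
    return 1.0 - (1.0 - availability) ** copies


A = 0.99  # hypothetical availability of any single server

# Option 1: one Apache server, one cache server, one MySQL server.
option1 = series(A, A, A)

# Option 2 (simplified): one load balancer, three web servers, one MySQL server.
option2 = series(A, redundant(A, 3), A)

print(f"option 1: {option1:.4f}")
print(f"option 2: {option2:.4f}")
```

Under these assumptions the web tier becomes almost perfectly available, but the overall figure can never exceed the product of the load balancer and MySQL availabilities, which is why those two remain the weak spots.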