Until now, we've only set up a 3-node etcd2 cluster in Amazon EC2 and added/removed a node from that cluster. This tutorial focuses less on etcd administration and instead presents a real-world situation in which etcd can help solve a common problem.
Description of the problem
Imagine that you have a load balancer with five machines behind it running a Spring Boot application in production, and that these five machines have access to an etcd cluster (a typical worker scenario when using CoreOS, etcd and fleet in production). Suddenly your team realizes that the application is behaving strangely, although it is not logging any errors. To find out what's going on, the developers need to see the DEBUG level logs for a while. How would you solve this problem? It's easy: you just redeploy the application to the five machines with the logback loggers set to DEBUG level. After the problem is identified and solved, you roll back the logback configuration and deploy the application again.
But… what if (for instance) your deploy pipeline takes many hours, and the application generates so many logs at DEBUG level that, by the time the rollback deploy finishes, the logs will probably have consumed all the disk space on your servers and crashed them?
How could etcd and logback help with this kind of situation?
Externalizing the logback configuration file and making logback aware of log level changes
A solution that fits this problem nicely is the ability to change the log level at runtime instead of at build time. One way to do that is to keep the logback configuration file, logback-spring.xml, outside the application jar, that is, on the file system, and to make the application reload its configuration at runtime whenever that file changes.
Fortunately, logback comes with a nice out-of-the-box feature that scans the configuration file for changes at a fixed interval and reloads it when it has changed. In the logback-spring.xml file, add the scan and scanPeriod attributes to the <configuration> tag:
<configuration scan="true" scanPeriod="30 seconds">
...
</configuration>
This configuration makes logback check the logback-spring.xml file every 30 seconds and reconfigure itself if the file has changed.
Since logback-spring.xml will now live on the file system, the application configuration file, application.properties (or yml), must point to the path where the file will be located, let's say /opt/myapp/config/logback-spring.xml. In application.properties, add the following:
logging.config=/opt/myapp/config/logback-spring.xml
Using etcdctl to put the modified log level logback-spring.xml to the file system at runtime
OK, the application configuration is done, but did you notice that logback-spring.xml is not on the file system yet? And even if it were, how would we easily modify this file on five servers at the same time to change the log levels?
That's where etcd comes in handy. It has built-in watch functionality that allows us to watch a key for changes and execute an action when it changes. That fits our needs perfectly.
One possible solution is to get logback-spring.xml from the master branch of the application's git repository, change the log level from INFO to DEBUG with a simple sed command, and store the modified logback-spring.xml in an etcd key (let's say /myapp/production/logback) using either etcdctl or the etcd REST API. Each of the five servers watches this key, and when the key changes, a simple command writes its new value (the modified logback-spring.xml) to the logback-spring.xml on the file system, where the application is looking for changes.
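The sed and etcdctl steps described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the function names to_debug_level and publish_logback are hypothetical, and the sed pattern assumes that the loggers in your logback-spring.xml declare their level as level="INFO".

```shell
# Hypothetical helper: read a logback configuration on stdin and write it to
# stdout with every level="INFO" attribute switched to level="DEBUG".
# Assumption: levels are declared exactly as level="INFO" in the file.
to_debug_level() {
  sed 's/level="INFO"/level="DEBUG"/g'
}

# Hypothetical helper: store the file given as $1 in the etcd key.
# With the etcd2-era etcdctl, `set` reads the value from stdin when no value
# argument is given.
publish_logback() {
  etcdctl set /myapp/production/logback < "$1"
}

# Example usage (not executed here):
#   to_debug_level < logback-spring.xml > debug-logback-spring.xml
#   publish_logback debug-logback-spring.xml
```

Keeping the transformation separate from the publishing step makes it possible to inspect debug-logback-spring.xml before it reaches the servers.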
The etcdctl syntax to watch a key named my-key and execute an action when it changes is:
$ etcdctl exec-watch my-key -- action-here
Back to our context, we need to use the above syntax to write the modified logback-spring.xml to the file system. On each of the five servers, execute the command below:
$ etcdctl exec-watch /myapp/production/logback -- /bin/bash -c 'mkdir -p /opt/myapp/config && echo "$ETCD_WATCH_VALUE" > /opt/myapp/config/logback-spring.xml'
When a change in the specified key is detected, the variable $ETCD_WATCH_VALUE contains the modified logback-spring.xml, and the command overwrites the old logback-spring.xml on the file system with the new one. The variable is quoted so that the shell keeps the line breaks of the XML and does not try to expand characters such as * or [ ] in it.
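The redirect in the watch action truncates the target file before writing to it, so a logback scan that happens at that exact moment could see a partial file. One way to reduce that risk is to write to a temporary file and rename it into place. The sketch below illustrates this under stated assumptions: write_logback_config is a hypothetical helper, not part of etcdctl, and the 0644 permission assumes the application runs as a different user than the watcher.

```shell
# Hypothetical helper: write the value in $1 to the target path in $2 by
# creating a temporary file in the same directory and renaming it, so that
# readers see either the old file or the complete new one.
write_logback_config() {
  value=$1
  target=$2
  dir=$(dirname "$target")
  mkdir -p "$dir" || return 1
  tmp=$(mktemp "$dir/.logback.XXXXXX") || return 1
  # mktemp creates the file readable by its owner only; relax that here
  # (assumption: the application must be able to read the file).
  chmod 644 "$tmp" || return 1
  printf '%s\n' "$value" > "$tmp" && mv "$tmp" "$target"
}

# Example usage inside the watch action (not executed here):
#   write_logback_config "$ETCD_WATCH_VALUE" /opt/myapp/config/logback-spring.xml
```

The temporary file is created in the same directory as the target so that the rename stays on one file system.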
Of course, there are more automated ways to run the exec-watch command on the five servers than doing it one by one. For instance, if you use CoreOS with fleet, you could write a fleet unit template named logback-log-level@.service like the one below and run a simple fleetctl start logback-log-level@{1..5}.service in the terminal:
[Unit]
After=etcd2.service
Requires=etcd2.service
Description=Logback Log Level Unit

[Service]
Restart=always
RestartSec=10
TimeoutStartSec=0
# start
ExecStart=/usr/bin/etcdctl exec-watch /myapp/production/logback -- /bin/bash -c 'mkdir -p /opt/myapp/config && echo "$ETCD_WATCH_VALUE" > /opt/myapp/config/logback-spring.xml'

[Install]
WantedBy=multi-user.target

[X-Fleet]
MachineMetadata="production=true"
Conflicts=logback-log-level@*
That's it! Now you only have to run etcdctl set /myapp/production/logback < debug-logback-spring.xml, and the application log levels will change from INFO to DEBUG at runtime on all five servers behind the load balancer.