Alfresco logging with Logstash and Kibana. Part 1: The basics

This is the first post in a series about how to integrate Alfresco with Elasticsearch using Logstash and Kibana. The overall goal is to get a better view of the events happening in Alfresco. This first post shows the basic setup of the required components.

Like any system, Alfresco can produce enormous amounts of log data, and it is usually spread out across several application servers (for example one for the repository, one for Share and one for Solr). In a clustered environment there may also be multiple instances of a particular type. Logs are text files with some common patterns: the standard Alfresco logs contain a timestamp, the log level, the name of the logger and of course the message. It is not unusual for the combined log files in a production system to contain tens of thousands of lines, and it can be very hard to get an overview and correlate lines between different log files to see the full context.

To the rescue come Logstash and Kibana! Basically, Logstash collects data from one or several sources, filters it and sends it to an output. In this case the output will be Elasticsearch, and Kibana acts as the web user interface towards the (log) data stored in Elasticsearch. There are several excellent tutorials on how to set up these components. Very briefly, this is how I did it:
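For reference, a typical line in alfresco.log looks something like the following (the exact layout depends on your log4j conversion pattern, so treat this as an illustrative sample rather than actual output):

2014-07-01 10:15:42,123 INFO  [org.alfresco.repo.node.NodeService] [main] Some log message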

Preparations

Download Logstash:

curl -O https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
tar zxvf logstash-1.4.2.tar.gz

Download Elasticsearch

curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.tar.gz
tar zxvf elasticsearch-1.1.1.tar.gz

Download and install Kibana

curl -O https://download.elasticsearch.org/kibana/kibana/kibana-3.1.0.tar.gz
tar zxvf kibana-3.1.0.tar.gz
cp -r kibana-3.1.0 /var/www/kibana

You also need to point Kibana to the Elasticsearch host. Locate config.js inside the kibana directory and change

elasticsearch: "http://"+window.location.hostname+":9200",

to

elasticsearch: "http://yourelasticsearchserver:9200",

Configure Logstash for Alfresco

Now you need to do some basic configuration in Logstash. The configuration file consists of inputs, filters and outputs. We will start with a very simple configuration that just reads a file input (alfresco.log) and forwards it to Elasticsearch. Create server.conf in logstash-1.4.2/ with the following content:

input {
 file {
  path => "/opt/alfresco/tomcat-repo/logs/alfresco.log"
  type => "alfresco" # a type to identify those logs
  start_position => "end"
 }
}
filter {
}
output {
 stdout { }
 elasticsearch {
  cluster => "elasticsearch"
 }
}

This just tells Logstash to forward whatever is written to your alfresco.log (assuming it is located in /opt/alfresco) to Elasticsearch, without any filtering.

Fire it up!

Now you are ready to start up the services. Start with Elasticsearch:

./elasticsearch-1.1.1/bin/elasticsearch

Then Kibana; since I serve it with the nginx web server, I simply restart nginx:

sudo service nginx restart

Finally you start Logstash and point out the configuration file defined above:

logstash-1.4.2/bin/logstash --verbose -f ./logstash-1.4.2/server.conf

After you have started Logstash you should see some logging in Elasticsearch indicating that Logstash has connected. Now start Alfresco and you should see all the Alfresco logging also appear in the Logstash console. Open the Logstash dashboard at http://localhost/index.html#/dashboard/file/logstash.json. It will look something like this:

Kibana screenshot showing the Alfresco repository startup logging


Basic Logstash configuration for Alfresco

So far we have made a generic setup of Logstash and Kibana. Now it’s time for some Alfresco specifics.

1. Multiline filter (really not Alfresco specific but for Java logs in general)

You might have noticed that each row in alfresco.log is interpreted as a separate event in Kibana. This is not suitable, as Java logs often contain line breaks (for example in stack traces). Add a filter to your Logstash configuration that merges the rows of a stack trace into a single event:

filter {
 multiline {
   pattern => "^\s"
   what => "previous"
 } 
}
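Going one step further, you could also parse each event into structured fields using the grok filter. The sketch below assumes the Alfresco log layout described earlier (timestamp, log level, logger name in brackets, thread name in brackets, then the message); adjust the pattern to your actual log4j conversion pattern:

filter {
 multiline {
   pattern => "^\s"
   what => "previous"
 }
 grok {
   # assumed layout: 2014-07-01 10:15:42,123 INFO [some.logger.Name] [thread] message
   match => [ "message", "%{TIMESTAMP_ISO8601:logdate}\s+%{LOGLEVEL:level}\s+\[%{DATA:logger}\]\s+\[%{DATA:thread}\]\s+%{GREEDYDATA:logmessage}" ]
 }
}

With this in place you can filter on the level and logger fields in Kibana instead of free-text searching the whole message.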

2. Configure types for Alfresco, Share and Solr

Another advantage of using Logstash is that you can combine several logs into a single Elasticsearch index. With Alfresco you would probably want to do this for at least the repository, Share and Solr, but it is of course also possible to add system logs, access logs or even Alfresco audit logs to Logstash and analyze everything at once. To add another log, you simply add another file to the input section. Below is an example for all three Alfresco logs:

input {
 file {
  path => "/opt/alfresco/tomcat-repo/logs/alfresco.log"
  type => "alfresco"
  start_position => "end"
 }

 file {
  path => "/opt/alfresco/tomcat-share/logs/share.log"
  type => "share"
  start_position => "end"
 }

 file {
  path => "/opt/alfresco/tomcat-solr/logs/solr.log"
  type => "solr"
  start_position => "end"
 }
}

With this new config the Kibana dashboard will look like this (I also enabled full debug logging in Alfresco to get some more data):

Kibana dashboard with repo, share and solr logging

Note the three different types. Here Alfresco produces the majority of the output since the DEBUG log level is enabled.

I added one query for each of the types defined in my Logstash config. This is just a very simple example of how to filter and search the logs.
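For example, with the type field in place you can enter simple Lucene-style queries in the Kibana query bar (the field names and search terms below assume the configuration above and are only illustrative):

type:alfresco
type:share AND message:ERROR
type:solr AND message:"tracker"

Since we have not parsed the log level into its own field, matching on ERROR here simply searches the message text.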

What is next?

There are numerous ways to configure Logstash for your needs. In the following posts I will explain:

  • How to use the log4j socket appender instead of files to transmit messages to Logstash. It is normally not the best idea to run Logstash on the Alfresco repository server.
  • How to use a json log4j layout to auto parse log messages in Logstash.
  • How to view Alfresco audit data in Kibana using Logstash and Elasticsearch.

2 Responses to Alfresco logging with Logstash and Kibana. Part 1: The basics

  1. That’s amazing.
    I’ll try it!

    You said “It is normally not the best idea to run it on the Alfresco repository server.”
    Does it impact server performance a lot?

    Thank you

  2. carn says:

    It is indeed very easy to set up and get started.

    What I meant by “normally not the best idea to run it on the Alfresco repository server” was that resources on the server may be limited, and it also adds complexity to your production environment, so I would recommend scaling out if possible. And with Logstash that is really simple (I will show how in a coming post).

    The actual load it puts on the server depends on the amount of logging and formatting done. But in any case, one of the great benefits of this approach is that you aggregate all the logs (repo, Share, Solr, db etc.) into one place with a user-friendly UI, and normally these logs reside on different servers anyway – so maybe performance is not the #1 reason for separating log analysis from the Alfresco servers.
