New LogAnalysis with 109x speed

The previous version of SenseLog (which powers our robust LogAnalysis module) processed the log files in full at startup and then watched them for changes. It spent a lot of resources on the dates in the log rows. In that version this was necessary, because SenseLog had to recognize the changes and decide whether it had to do something with them. As a result, processing the log files took longer.

The current version processes only the changes: in the case of delegated logs, SenseLog starts at the end of the file. This way there is no need for extra resources.

What made it faster?

The new version deals only with changes in the log files, so only newly generated logs are processed. Currently it does not handle the content that was already in the files. (This is because we plan to introduce a feature later that will allow you to scan the full contents of the log files using the SenseLog rules.)
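To illustrate the idea of processing only newly appended logs, here is a minimal Python sketch. It is not SenseLog's actual code: the `TailReader` class and its method names are hypothetical, and it assumes that changes can be tracked with a per-file byte offset.

```python
import os


class TailReader:
    """Hypothetical helper: returns only the lines appended since the last call."""

    def __init__(self, path):
        self.path = path
        # Start at the current end of the file, so existing content is skipped.
        self.offset = os.path.getsize(path)

    def read_new_lines(self):
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank (truncated or rotated): start again from the top.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Consume only complete lines; a trailing partial line waits for the next call.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()
```

Each call reads only the bytes after the remembered offset, so the cost depends on how much was appended rather than on the size of the whole file.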

In many rules we managed to simplify the regexes so much that pattern matching became noticeably faster. Besides fixing the regexes of each rule, a rule now matches at most one regex per log row, which also resulted in a significant speedup.
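One way to read "at most one regex match per rule and row" is sketched below in Python. The `Rule` class, its fields, and the example patterns are assumptions made for illustration and are not taken from SenseLog.

```python
import re


class Rule:
    """Hypothetical rule: a name plus a list of regexes compiled once."""

    def __init__(self, name, patterns):
        self.name = name
        # Compile up front instead of on every log row.
        self.patterns = [re.compile(p) for p in patterns]

    def first_match(self, row):
        """Return the first regex match for this row, or None."""
        for pattern in self.patterns:
            match = pattern.search(row)
            if match:
                # Stop at the first hit: the remaining regexes are not tried.
                return match
        return None


# Example usage with made-up patterns.
rule = Rule("ssh-login", [r"Failed password for (\w+)", r"Invalid user (\w+)"])
hit = rule.first_match("sshd[42]: Failed password for root from 10.0.0.1")
```

Stopping at the first hit means that rows matching an early pattern never pay for the later ones.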

The rules now use the Aho-Corasick algorithm, which simplifies and speeds up the search.
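For readers unfamiliar with Aho-Corasick: it builds one automaton from a set of keywords and then finds every occurrence of all of them in a single pass over the text. The Python sketch below is a textbook-style implementation for illustration only; how SenseLog integrates the algorithm into its rules is not described here, and the class name is hypothetical.

```python
from collections import deque


class AhoCorasick:
    """Textbook-style multi-keyword matcher (illustrative sketch)."""

    def __init__(self, keywords):
        # Per node: outgoing transitions, failure link, keywords ending here.
        self.goto = [{}]
        self.fail = [0]
        self.out = [[]]
        for word in keywords:
            self._add(word)
        self._build_failure_links()

    def _add(self, word):
        node = 0
        for ch in word:
            nxt = self.goto[node].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto[node][ch] = nxt
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
            node = nxt
        self.out[node].append(word)

    def _build_failure_links(self):
        # Breadth-first: nodes directly under the root fail back to the root.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                queue.append(child)
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit keywords that end at the failure target.
                self.out[child] = self.out[child] + self.out[self.fail[child]]

    def search(self, text):
        """Return (start_index, keyword) for every keyword occurrence."""
        node = 0
        hits = []
        for i, ch in enumerate(text):
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            for word in self.out[node]:
                hits.append((i - len(word) + 1, word))
        return hits
```

A matcher like this can serve as a cheap pre-filter: only rows that contain at least one keyword need to go through the more expensive regex step.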

We have developed a method that measures how often each log file changes and classifies the files accordingly. The more often a file changes, the more often we check it. If, however, it turns out that a file changes rarely, we check it less often, saving unnecessary steps (and processor time).
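The adaptive checking described above can be sketched as follows. This is a simplified Python illustration, not SenseLog's classification method: the `AdaptiveInterval` class, the halve-and-double policy, and the default bounds are all assumptions.

```python
class AdaptiveInterval:
    """Hypothetical per-file polling interval that adapts to change frequency."""

    def __init__(self, initial=5.0, minimum=1.0, maximum=60.0):
        self.interval = initial  # seconds until the next check
        self.minimum = minimum
        self.maximum = maximum

    def update(self, changed):
        """Adjust the interval after a check and return the new value."""
        if changed:
            # Busy file: check it more often, down to the lower bound.
            self.interval = max(self.minimum, self.interval / 2)
        else:
            # Quiet file: back off, up to the upper bound.
            self.interval = min(self.maximum, self.interval * 2)
        return self.interval
```

With a policy like this, a file that keeps changing converges to the minimum interval, while an idle file drifts towards the maximum and costs almost no processor time.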

How did it happen? 

We put great care into planning: we made the code cleaner so that both current and later extensions can be more transparent and simpler. With fewer classes, we work more efficiently than in the first version of SenseLog, and we have also brought the division of responsibilities between classes up to date. The regexes were optimized to find matches in as few steps as possible. We have been careful not to spend too many resources on class instances. As for pattern matching, we tried to make as few of these function calls as possible, since they are relatively costly operations. Of course, only within reason, because we need pattern matching to find the “wickedness” in the log files.

We use the Object Pool design pattern, as this also supports “cost-effectiveness”.
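As a reminder of what the Object Pool pattern looks like, here is a minimal Python sketch. The `ObjectPool` class, the `max_idle` parameter, and the `bytearray` factory are illustrative assumptions, not SenseLog internals.

```python
class ObjectPool:
    """Minimal object pool: reuse released instances instead of creating new ones."""

    def __init__(self, factory, max_idle=32):
        self._factory = factory    # callable that creates a fresh object
        self._idle = []            # released objects waiting for reuse
        self._max_idle = max_idle  # cap on how many idle objects are kept

    def acquire(self):
        # Reuse an idle object if one exists, otherwise create a new one.
        return self._idle.pop() if self._idle else self._factory()

    def release(self, obj):
        # Keep the object for later reuse unless the pool is already full.
        if len(self._idle) < self._max_idle:
            self._idle.append(obj)


# Example usage: reuse buffers instead of allocating one per log row.
pool = ObjectPool(lambda: bytearray(4096), max_idle=2)
buf = pool.acquire()
pool.release(buf)
same_buf = pool.acquire()
```

The pattern pays off when objects are expensive to create and are needed repeatedly; callers are responsible for resetting an object's state before reusing it.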

We have also made it possible to customize the SenseLog module settings directly from the Dashboard.

Benchmark:

Based on our tests, 1 million logs in 1,000 log files are processed in 13 seconds.
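To put the benchmark figure in perspective, the short Python calculation below derives the throughput from the numbers above. It uses only the stated totals; the variable names are ours, and nothing is assumed about how the logs are distributed across the files.

```python
total_logs = 1_000_000  # logs processed in the benchmark
elapsed_seconds = 13    # measured processing time

# Average throughput and average cost per log.
logs_per_second = total_logs / elapsed_seconds
microseconds_per_log = elapsed_seconds / total_logs * 1_000_000

print(f"{logs_per_second:,.0f} logs per second")
print(f"{microseconds_per_log:.1f} microseconds per log")
```

That works out to roughly 77,000 logs per second, or about 13 microseconds per log on average.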