How to get Logstash Grok Filter to see Newlines and Carriage Returns?

Question

I am trying to parse our log files and send them to Elasticsearch. The problem is that our S3 client injects text into the file that contains carriage returns (\r) instead of newline characters (\n). The file input is configured with '\n' as the delimiter, which is consistent with 99% of the data. When I run Logstash against this data, it misses the last line, which is the one I am really looking for. This is because the file input treats the '\r' characters as normal text and not as line breaks. To get around this, I am trying to use a mutate filter to rewrite the '\r' characters to '\n'. The mutate works, but grok still sees the result as one big line and tags the event with _grokparsefailure.

My 'normal' log file lines are parsed by grok as expected.

Config

input {
     file {
             path => "/home/pa_stg/runs/2015-12-09-cron-1449666001/run.log"
             start_position => "beginning"
             sincedb_path => "/data/logstash/sincedb"
             stat_interval => 300
             type => "spark"
     }
}
filter{
     mutate {
             gsub => ["message", "\r", "
"]
     }
     grok {
             match => {"message" => "\A%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel} %{SYSLOGPROG}%{GREEDYDATA:data}"}
             break_on_match => false
     }
}
output{
     stdout { codec => rubydebug }
}

Input

This sample from the input file illustrates the problem. The ^M characters are how vim displays the '\r' carriage returns ('more' hides most of them). I left the line as is so you can see that both Linux and the file input plugin see the whole thing as a single line of text.

^M[Stage 79:=======>                                               (30 + 8) / 208]^M[Stage 79:============>                                          (49 + 8) / 208]^M[Stage 79:=================>                                     (65 + 8) / 208]^M[Stage 79:=====================>                                 (83 + 8) / 208]^M[Stage 79:===========================>                          (105 + 8) / 208]^M[Stage 79:===============================>                      (122 + 8) / 208]^M[Stage 79:====================================>                 (142 + 8) / 208]^M[Stage 79:=========================================>            (161 + 8) / 208]^M[Stage 79:==============================================>       (180 + 6) / 208]^M[Stage 79:==================================================>   (195 + 3) / 208]^M[Stage 79:=====================================================>(206 + 1) / 208]^M                                                                                ^M^M[Stage 86:==============>                                        (55 + 8) / 208]^M[Stage 86:===================>                                   (75 + 8) / 208]^M[Stage 86:==========================>                           (101 + 8) / 208]^M[Stage 86:===============================>                      (123 + 8) / 208]^M[Stage 86:======================================>               (147 + 8) / 208]^M[Stage 86:============================================>         (173 + 6) / 208]^M[Stage 86:==================================================>   (193 + 3) / 208]^M[Stage 86:=====================================================>(205 + 1) / 208]^M                                                                                ^M^M[Stage 93:===================>                                   (74 + 8) / 208]^M[Stage 93:===========================>                          (104 + 8) / 208]^M[Stage 93:==================================>                   (132 + 8) / 208]^M[Stage 
93:========================================>             (157 + 9) / 208]^M[Stage 93:================================================>     (186 + 6) / 208]^M[Stage 93:=====================================================>(206 + 2) / 208]^M                                                                                ^M15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed
15/12/09 13:04:44 INFO CassandraConnector: Disconnected from Cassandra cluster: int

Output

{
       "message" => "\n[Stage 79:=======>                                               (30 + 8) / 208]\n[Stage 79:============>
                             (49 + 8) / 208]\n[Stage 79:=================>                                     (65 + 8) / 208]\n[Stage 79:===
==================>                                 (83 + 8) / 208]\n[Stage 79:===========================>                          (105 + 8
) / 208]\n[Stage 79:===============================>                      (122 + 8) / 208]\n[Stage 79:====================================>
               (142 + 8) / 208]\n[Stage 79:=========================================>            (161 + 8) / 208]\n[Stage 79:================
==============================>       (180 + 6) / 208]\n[Stage 79:==================================================>   (195 + 3) / 208]\n[St
age 79:=====================================================>(206 + 1) / 208]\n
                  \n\n[Stage 86:==============>                                        (55 + 8) / 208]\n[Stage 86:===================>
                            (75 + 8) / 208]\n[Stage 86:==========================>                           (101 + 8) / 208]\n[Stage 86:====
===========================>                      (123 + 8) / 208]\n[Stage 86:======================================>               (147 + 8)
 / 208]\n[Stage 86:============================================>         (173 + 6) / 208]\n[Stage 86:========================================
==========>   (193 + 3) / 208]\n[Stage 86:=====================================================>(205 + 1) / 208]\n
                                                     \n\n[Stage 93:===================>                                   (74 + 8) / 208]\n[S
tage 93:===========================>                          (104 + 8) / 208]\n[Stage 93:==================================>
   (132 + 8) / 208]\n[Stage 93:========================================>             (157 + 9) / 208]\n[Stage 93:============================
====================>     (186 + 6) / 208]\n[Stage 93:=====================================================>(206 + 2) / 208]\n
                                                                 \n15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor com
pleted",
      "@version" => "1",
    "@timestamp" => "2015-12-09T22:16:52.898Z",
          "host" => "ip-10-252-1-225",
          "path" => "/home/something/pa_stg/runs/2015-12-09-cron-1449666001/run.log",
          "type" => "spark",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

I need grok to parse this line as if it were preceded by a newline ('\n'). Does anyone know how to fix this?

15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed
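
One possible reason the gsub does not help: after the rewrite, the message starts with "\n[Stage 79:", so a pattern anchored with \A followed by %{DATE} has nothing to match at the start of the string. A minimal, untested sketch of an alternative is to delete everything up to and including the last '\r' in the message, so that only the trailing log line reaches grok. It assumes that the mutate gsub pattern is treated as a regular expression in which '.' matches '\r' but not '\n', and that any progress-bar text before the last '\r' can be discarded.

    filter {
         mutate {
                 # Greedy match: removes all text through the LAST carriage return.
                 # Lines that contain no '\r' are left unchanged.
                 gsub => ["message", "^.*\r", ""]
         }
         grok {
                 # Same pattern as in the config above.
                 match => {"message" => "\A%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel} %{SYSLOGPROG}%{GREEDYDATA:data}"}
         }
    }

With the sample input, this should leave "15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed" as the message, which has the same shape as the lines that already parse.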


Answers

I believe what you are looking for might be the multiline filter.

https://www.elastic.co/guide/en/logstash/current/plugins-filters-multiline.html

If I recall correctly, this filter is responsible for deciding whether a log line starts a new event or belongs to an existing one. For example, I am using it to join every line that does not start with a bracketed log level, such as "[INFO]", to the previous line.

    multiline {
            pattern => "^\[%{LOGLEVEL}\]"
            negate => true
            what => "previous"
    }
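
In later Logstash releases the multiline filter was deprecated in favour of the multiline codec, which is attached to the input instead of the filter section. A sketch of the same rule as a codec, assuming the codec plugin is available in the installed version and reusing the path and type from the question:

    input {
         file {
                 path => "/home/pa_stg/runs/2015-12-09-cron-1449666001/run.log"
                 start_position => "beginning"
                 type => "spark"
                 # Lines that do NOT start with a bracketed log level
                 # are appended to the previous event.
                 codec => multiline {
                         pattern => "^\[%{LOGLEVEL}\]"
                         negate => true
                         what => "previous"
                 }
         }
    }

Note that either form joins lines that the input has already split on '\n'; on its own it does not split a line at '\r'.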

I hope that helps

By: pandaadb

