Merge pull request #32 from netantho/averez-doc

Shipping and JSON structure documentation
This commit is contained in:
jeffbryner 2014-04-10 14:29:52 -07:00
Parents 57aa8ab6e0 c274341bdc
Commit 416262e095
16 changed files with 609 additions and 22 deletions

View file

@ -1,26 +1,17 @@
Usage
=====
Sending logs to MozDef
----------------------
Events/Logs are accepted as json over http(s) or over rabbit-mq. Most modern log shippers support json output. MozDef is tested with support for:
* heka ( https://github.com/mozilla-services/heka )
* beaver ( https://github.com/josegonzalez/beaver )
* nxlog ( http://nxlog-ce.sourceforge.net/ )
* logstash ( http://logstash.net/ )
* native python code ( https://github.com/jeffbryner/MozDef/blob/master/lib/mozdef.py or https://github.com/jeffbryner/MozDef/blob/master/test/json2Mozdef.py )
* AWS cloudtrail (via native python)
Web Interface
-------------
MozDef uses the Meteor framework ( https://www.meteor.com/ ) for the web interface and bottle.py for the REST API.
For authentication, MozDef ships with native support for Persona ( https://login.persona.org/about ).
Meteor (the underlying UI framework) also supports many authentication options ( http://docs.meteor.com/#accounts_api ) including Google, GitHub, Twitter, Facebook, OAuth, native accounts, etc.
MozDef uses the `Meteor framework`_ for the web interface and bottle.py for the REST API.
For authentication, MozDef ships with native support for `Persona`_.
Meteor (the underlying UI framework) also supports `many authentication options`_ including Google, GitHub, Twitter, Facebook, OAuth, native accounts, etc.
.. _Meteor framework: https://www.meteor.com/
.. _Persona: https://login.persona.org/about
.. _many authentication options: http://docs.meteor.com/#accounts_api
Events visualizations
*********************
@ -37,3 +28,254 @@ Alerts are generally implemented as Elastic Search searches, or realtime examina
Incident handling
*****************
Sending logs to MozDef
----------------------
Events/Logs are accepted as JSON over HTTP(S), using the POST or PUT method, or over RabbitMQ.
Most modern log shippers support json output. MozDef is tested with support for:
* `heka`_
* `beaver`_
* `nxlog`_
* `logstash`_
* `native python code`_
* `AWS cloudtrail`_ (via native python)
We have `some configuration snippets`_ for these shippers.
.. _heka: https://github.com/mozilla-services/heka
.. _beaver: https://github.com/josegonzalez/beaver
.. _nxlog: http://nxlog-ce.sourceforge.net/
.. _logstash: http://logstash.net/
.. _native python code: https://github.com/gdestuynder/mozdef_lib
.. _AWS cloudtrail: https://aws.amazon.com/cloudtrail/
.. _some configuration snippets: https://github.com/jeffbryner/MozDef/tree/master/examples
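As a rough illustration of the HTTP path described above, the following Python sketch builds a POST request carrying one JSON event. The URL is borrowed from the example shipper configurations and is only a placeholder, and `build_event_request` is a hypothetical helper, not part of MozDef or mozdef_lib.

```python
import json
import urllib.request

# Placeholder endpoint borrowed from the example configs; adjust to your deployment.
MOZDEF_URL = "http://mozdef.example.com:8080/events/"


def build_event_request(event, url=MOZDEF_URL):
    """Return a POST request whose body is the event serialized as JSON."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_event_request({"summary": "joe login failed", "severity": "INFO"})
# Actually sending the event needs a reachable MozDef instance, for example:
#     urllib.request.urlopen(request, timeout=5)
```

Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the wire.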
What should I log?
******************
If your program doesn't log anything, it doesn't exist. If it logs everything that happens, it becomes like the proverbial boy who cried wolf. There is a fine line between logging too little and too much, but here is some guidance on key events that should be logged and in what detail.
+------------------+---------------------------+---------------------------------------+
| Event            | Example                   | Rationale                             |
+==================+===========================+=======================================+
| Authentication   | Failed/Success logins     | Authentication is always an important |
| Events           |                           | event to log as it establishes        |
|                  |                           | traceability for later events and     |
|                  |                           | allows correlation of user actions    |
|                  |                           | across systems.                       |
+------------------+---------------------------+---------------------------------------+
| Authorization    | Failed attempts to        | Once a user is authenticated they     |
| Events           | insert/update/delete a    | usually obtain certain permissions.   |
|                  | record or access a        | Logging when a user's permissions do  |
|                  | section of an application.| not allow them to perform a function  |
|                  |                           | helps troubleshooting and can also    |
|                  |                           | be helpful when investigating         |
|                  |                           | security events.                      |
+------------------+---------------------------+---------------------------------------+
| Account          | Account                   | Adding, removing or changing accounts |
| Lifecycle        | creation/deletion/update  | are often the first steps an attacker |
|                  |                           | performs when entering a system.      |
+------------------+---------------------------+---------------------------------------+
| Password/Key     | Password changed, expired,| If your application takes on the      |
| Events           | reset. Key expired,       | responsibility of storing a user's    |
|                  | changed, reset.           | password (instead of using            |
|                  |                           | centralized LDAP/Persona), it is      |
|                  |                           | important to note changes to a user's|
|                  |                           | credentials or crypto keys.           |
+------------------+---------------------------+---------------------------------------+
| Account          | Account lock, unlock,     | If your application locks out users   |
| Activations      | disable, enable           | after failed login attempts or allows |
|                  |                           | for accounts to be inactivated,       |
|                  |                           | logging these events can assist in    |
|                  |                           | troubleshooting access issues.        |
+------------------+---------------------------+---------------------------------------+
| Application      | Invalid input,            | If your application catches errors    |
| Exceptions       | fatal errors,             | like invalid input attempts on web    |
|                  | known bad things          | forms, failures of key components,    |
|                  |                           | etc., creating a log record when these|
|                  |                           | events occur can help in              |
|                  |                           | troubleshooting and tracking security |
|                  |                           | patterns across applications. Full    |
|                  |                           | stack traces should be avoided however|
|                  |                           | as the signal to noise ratio is       |
|                  |                           | often overwhelming.                   |
|                  |                           |                                       |
|                  |                           | It is also preferable to send a single|
|                  |                           | event rather than a multitude of      |
|                  |                           | events if it is possible for your     |
|                  |                           | application to correlate a significant|
|                  |                           | exception.                            |
|                  |                           |                                       |
|                  |                           | For example, some systems are         |
|                  |                           | notorious for sending a connection    |
|                  |                           | event with source IP, then sending an |
|                  |                           | authentication event with a session ID|
|                  |                           | then later sending an event for       |
|                  |                           | invalid input that doesn't include    |
|                  |                           | source IP or session ID or username.  |
|                  |                           | Correctly correlating these events    |
|                  |                           | across time is much more difficult    |
|                  |                           | than just logging all pieces of       |
|                  |                           | information if it is available.       |
+------------------+---------------------------+---------------------------------------+
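To make the last point of the table concrete, here is a minimal Python sketch of an application emitting one self-contained event for an invalid-input exception instead of three loosely related ones. The function name, the category and tag values, and the `sessionid`, `username` and `fieldname` keys are assumptions chosen for illustration.

```python
def build_invalid_input_event(username, source_ip, session_id, field_name):
    """Build one event carrying all the context needed to investigate it."""
    return {
        "category": "Application Exceptions",
        "severity": "WARNING",
        "summary": "invalid input in {0} from user {1}".format(field_name, username),
        "tags": ["myapp", "input-validation"],
        "details": {
            # Context that some systems scatter across separate events.
            "username": username,
            "sourceipaddress": source_ip,
            "sessionid": session_id,
            "fieldname": field_name,
        },
    }


event = build_invalid_input_event("joe", "10.0.0.5", "abc123", "email")
```

Because the source IP, session ID and username travel together, nobody has to correlate them across time afterwards.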
JSON format
-----------
This section describes the structure of the JSON objects to be sent to MozDef.
Using this standard ensures that developers, admins, etc. configure their applications and systems so that they integrate easily with MozDef.
Background
**********
Mozilla used CEF as a logging standard for compatibility with ArcSight and for standardization across systems. While CEF is an admirable standard, MozDef prefers JSON logging for the following reasons:
* Every development language can create a JSON structure.
* JSON is easily parsed by computers/programs, which are the primary consumers of logs.
* CEF is primarily used by ArcSight, is rarely seen outside that platform, and doesn't offer the extensibility of JSON.
* A wide variety of log shippers (heka, logstash, fluentd, nxlog, beaver) are readily available to meet almost any need to transport logs as JSON.
* JSON is already the standard for cloud platforms, such as Amazon's CloudTrail logging.
Description
***********
As there is no common RFC-style standard for json logs, we prefer the following structure adapted from a combination of the graylog GELF and logstash specifications.
Note all fields are lowercase to avoid one program sending sourceIP, another sending sourceIp, another sending SourceIPAddress, etc.
Since the backend for MozDef is elasticsearch and fields are case-sensitive, this allows for easy compatibility and reduces potential confusion for those attempting to use the data.
MozDef will perform some translation of fields to a common schema, but this is intended to allow the use of heka, nxlog and beaver while retaining compatible logs.
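The lowercase convention can be applied on the sending side before an event leaves the application. The sketch below is one possible approach, not MozDef's own translation code; `lowercase_keys` is a hypothetical helper. It only changes the case of keys and does not rename fields such as `sourceIP` to `sourceipaddress`.

```python
def lowercase_keys(value):
    """Recursively lowercase every dictionary key in a JSON-like structure."""
    if isinstance(value, dict):
        return {str(key).lower(): lowercase_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(item) for item in value]
    # Strings, numbers, booleans and None are returned unchanged.
    return value


normalized = lowercase_keys({"SourceIPAddress": "10.0.0.5", "Details": {"UserName": "joe"}})
```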
Mandatory Fields
****************
+------------+-------------------------------------+-----------------------------------+
| Field      | Purpose                             | Sample Value                      |
+============+=====================================+===================================+
| category   | General category/type of event      | Authentication, Authorization,    |
|            | matching the 'what should I log'    | Account Creation, Shutdown,       |
|            | section above                       | Startup, Account Deletion,        |
|            |                                     | Account Unlock, brointel,         |
|            |                                     | bronotice                         |
+------------+-------------------------------------+-----------------------------------+
| details    | Additional, event-specific fields   | "dn": "john@example.com,o=com,    |
|            | that you would like included with   | dc=example", "facility": "daemon" |
|            | the event. Please completely spell  |                                   |
|            | out a field rather than abbreviate: |                                   |
|            | e.g. sourceipaddress instead of     |                                   |
|            | srcip.                              |                                   |
+------------+-------------------------------------+-----------------------------------+
| hostname   | The fully qualified domain name of  | server1.example.com               |
|            | the host sending the message        |                                   |
+------------+-------------------------------------+-----------------------------------+
| processid  | The PID of the process sending the  | 1234                              |
|            | log                                 |                                   |
+------------+-------------------------------------+-----------------------------------+
|processname | The name of the process sending the | myprogram.py                      |
|            | log                                 |                                   |
+------------+-------------------------------------+-----------------------------------+
| severity   | RFC5424 severity level of the event | INFO                              |
|            | in all caps: DEBUG, INFO, NOTICE,   |                                   |
|            | WARNING, ERROR, CRITICAL, ALERT,    |                                   |
|            | EMERGENCY                           |                                   |
+------------+-------------------------------------+-----------------------------------+
| source     | Source of the event (file name,     | /var/log/syslog/2014.01.02.log    |
|            | system name, component name)        |                                   |
+------------+-------------------------------------+-----------------------------------+
| summary    | Short human-readable version of the | john login attempts over          |
|            | event suitable for IRC, SMS, etc.   | threshold, account locked         |
+------------+-------------------------------------+-----------------------------------+
| tags       | An array or list of any tags you    | vpn, audit                        |
|            | would like applied to the event     |                                   |
|            |                                     | nsm,bro,intel                     |
+------------+-------------------------------------+-----------------------------------+
| timestamp  | Full date plus time timestamp of    | 2014-01-30T19:24:43+00:00         |
|            | the event in ISO format including   |                                   |
|            | the timezone offset                 |                                   |
+------------+-------------------------------------+-----------------------------------+
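A sender can check an event against the mandatory fields before shipping it. The following Python sketch is an assumption about how such a check might look; MozDef itself does not ship this `find_problems` helper, and treating `details` as a JSON object is inferred from the sample values.

```python
MANDATORY_FIELDS = (
    "category", "details", "hostname", "processid", "processname",
    "severity", "source", "summary", "tags", "timestamp",
)
SEVERITIES = {
    "DEBUG", "INFO", "NOTICE", "WARNING",
    "ERROR", "CRITICAL", "ALERT", "EMERGENCY",
}


def find_problems(event):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = ["missing field: " + name for name in MANDATORY_FIELDS if name not in event]
    if "severity" in event and event["severity"] not in SEVERITIES:
        problems.append("unknown severity: {0!r}".format(event["severity"]))
    if "tags" in event and not isinstance(event["tags"], (list, tuple)):
        problems.append("tags should be an array or list")
    if "details" in event and not isinstance(event["details"], dict):
        problems.append("details should be an object")
    return problems
```

Returning a list of problems rather than raising on the first one lets an application log every issue with an event at once.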
Details substructure (optional fields)
**************************************
+----------------------+--------------------------+---------------+------------------------------+
| Field                | Purpose                  | Used In       | Sample Value                 |
+======================+==========================+===============+==============================+
| destinationipaddress | Destination IP of a      | NSM/Bro/Intel | 8.8.8.8                      |
|                      | network flow             |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| destinationport      | Destination port of a    | NSM/Bro/Intel | 80                           |
|                      | network flow             |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| dn                   | Distinguished Name in    | event/ldap    | john@example.org,o=org,      |
|                      | LDAP: the unique ID in   |               | dc=example                   |
|                      | the LDAP hierarchy       |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| filedesc             |                          | NSM/Bro/Intel |                              |
+----------------------+--------------------------+---------------+------------------------------+
| filemimetype         |                          | NSM/Bro/Intel |                              |
+----------------------+--------------------------+---------------+------------------------------+
| fuid                 |                          | NSM/Bro/Intel |                              |
+----------------------+--------------------------+---------------+------------------------------+
| result               | Result of an event,      | event/ldap    | LDAP_SUCCESS                 |
|                      | success or failure       |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| seenindicator        | Intel indicator that     | NSM/Bro/Intel | evil.com/setup.exe           |
|                      | matched as seen by our   |               |                              |
|                      | system                   |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| seenindicator_type   | Type of intel indicator  | NSM/Bro/Intel | Intel::URL                   |
+----------------------+--------------------------+---------------+------------------------------+
| seenwhere            | Where the intel indicator| NSM/Bro/Intel | HTTP::IN_URL                 |
|                      | matched (which protocol, |               |                              |
|                      | which field)             |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| source               | Source of the connection | event/ldap    | Mar 19 15:36:25 ldap1        |
|                      |                          |               | slapd[31031]: conn=6633594   |
|                      |                          |               | fd=49 ACCEPT                 |
|                      |                          |               | from IP=10.54.70.109:23957   |
|                      |                          |               | (IP=0.0.0.0:389)             |
|                      |                          |               |                              |
|                      |                          |               | Mar 19 15:36:25 ldap1        |
|                      |                          |               | slapd[31031]: conn=6633594   |
|                      |                          |               | op=0 BIND                    |
+----------------------+--------------------------+---------------+------------------------------+
| sourceipaddress      | Source IP of a network   | NSM/Bro/Intel | 8.8.8.8                      |
|                      | flow                     |               |                              |
|                      |                          | event/ldap    |                              |
+----------------------+--------------------------+---------------+------------------------------+
| sourceport           | Source port of a network | NSM/Bro/Intel | 42297                        |
|                      | flow                     |               |                              |
+----------------------+--------------------------+---------------+------------------------------+
| sources              | Source feed              | NSM/Bro/Intel | CIF - need-to-know           |
+----------------------+--------------------------+---------------+------------------------------+
| success              | Auth success             | event/ldap    | True                         |
+----------------------+--------------------------+---------------+------------------------------+
| uid                  | Bro connection uid       | NSM/Bro/Intel | CZqhEs40odso1tFNx3           |
+----------------------+--------------------------+---------------+------------------------------+
Examples
********
.. code-block:: javascript

    {
        "timestamp": "2014-02-14T11:48:19.035762739-05:00",
        "hostname": "fedbox",
        "processname": "/tmp/go-build278925522/command-line-arguments/_obj/exe/log_json",
        "processid": 3380,
        "severity": "INFO",
        "summary": "joe login failed",
        "category": "authentication",
        "source": "",
        "tags": [
            "MySystem",
            "Authentication"
        ],
        "details": {
            "user": "joe",
            "task": "access to admin page /admin_secret_radioactiv",
            "result": "10 authentication failures in a row"
        }
    }
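An event shaped like the example above could be assembled in Python roughly as follows. This is a sketch under assumptions: `new_event` is a hypothetical helper, and using `socket.getfqdn()`, `sys.argv[0]` and `os.getpid()` for the host and process fields is one reasonable choice, not something MozDef prescribes.

```python
import os
import socket
import sys
from datetime import datetime, timezone


def new_event(summary, category, severity="INFO", tags=None, details=None):
    """Return a dict holding the mandatory fields for one event."""
    return {
        # ISO timestamp with an explicit UTC offset, e.g. ...+00:00
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "hostname": socket.getfqdn(),
        "processname": sys.argv[0],
        "processid": os.getpid(),
        "severity": severity,
        "summary": summary,
        "category": category,
        "source": "",
        "tags": list(tags or []),
        "details": dict(details or {}),
    }


event = new_event(
    "joe login failed",
    "authentication",
    tags=["MySystem", "Authentication"],
    details={"user": "joe", "result": "10 authentication failures in a row"},
)
```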

View file

@ -0,0 +1,9 @@
# beaver-syslog
This configuration for [beaver](https://github.com/josegonzalez/beaver) ships syslog logs stored in `/var/log/syslog/systems/myhost.example.com/*.log` to MozDef.
To run it:
```
beaver -c config.ini -t stdout
```

View file

@ -0,0 +1,9 @@
[beaver]
format: json
logstash_version: 1
path: /var/log/
http_url: http://mozdef.example.com:8080/events/
[/var/log/syslog/systems/myhost.example.com/*.log]
type = syslog
tags = beaver

View file

@ -0,0 +1,10 @@
# heka-apache
This configuration for [heka](http://hekad.readthedocs.org/en/latest/) ships apache logs stored in `/var/log/syslog/systems/web` to MozDef.
To run it:
```
rm -rf /var/cache/hekad/*
hekad -config=heka.toml
```
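Before deploying, the decoder's regular expression can be tried against sample lines offline. The sketch below is a Python transcription of the `match_regex` from the accompanying heka configuration, written with the literal `[` before the timestamp escaped; Python's `re` module is used here as a stand-in for heka's Go regexp engine, and the sample log line is invented for illustration.

```python
import re

# Python transcription of the decoder's match_regex (assumes "\[" before the timestamp).
APACHE_LINE = re.compile(
    r'^.*?\[(?P<Timestamp>[^\]]+)\] "(?P<Method>[A-Z]+) (?P<Url>[^\s]+)[^"]*" '
    r'(?P<StatusCode>\d+) (?P<RequestSize>\d+) "(?P<Referer>[^"]*)" "(?P<Browser>[^"]*)"'
)


def parse_apache_line(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = APACHE_LINE.match(line)
    return match.groupdict() if match else None


sample = (
    '127.0.0.1 - - [10/Apr/2014:14:29:52 -0700] "GET /index.html HTTP/1.1" '
    '200 2326 "http://example.com/" "Mozilla/5.0"'
)
fields = parse_apache_line(sample)
```

Note that the pattern expects a numeric request size, so lines where apache logs `-` for the size will not match.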

View file

@ -0,0 +1,38 @@
[syslog]
type="LogstreamerInput"
log_directory="/var/log/syslog/systems/web/"
file_match='(?P<Year>\d+)-(?P<Month>\d+)-(?P<Day>\d+)\.log'
priority = ["Year", "Month", "Day"]
oldest_duration="2h"
decoder = "apache_transform_decoder"
[apache_transform_decoder]
type = "PayloadRegexDecoder"
match_regex = '^.*?\[(?P<Timestamp>[^\]]+)\] "(?P<Method>[A-Z]+) (?P<Url>[^\s]+)[^"]*" (?P<StatusCode>\d+) (?P<RequestSize>\d+) "(?P<Referer>[^"]*)" "(?P<Browser>[^"]*)"'
timestamp_layout = "02/Jan/2006:15:04:05 -0700"
[apache_transform_decoder.message_fields]
Type = "ApacheLogfile"
Logger = "apache"
Url|uri = "%Url%"
Method = "%Method%"
Status = "%StatusCode%"
RequestSize|B = "%RequestSize%"
Referer = "%Referer%"
Browser = "%Browser%"
# Start commenting here if you don't want any stdout
[stdout]
type = "LogOutput"
message_matcher = "TRUE"
payload_only = true
# Finish commenting here
[ElasticSearchOutput]
message_matcher = "Type!='heka.all-report'"
cluster = "mozdefqa"
index = "events"
type_name = "event"
server = "http://mozdef.example.com:8080"
format = "clean"
flush_interval = 1000
flush_count = 10

View file

@ -0,0 +1,15 @@
# heka-lua-bro-intel
This configuration for [heka](http://hekad.readthedocs.org/en/latest/) ships intel logs for [Bro](http://bro.org/) stored in `/nsm/bro/spool/manager/intel.log` to MozDef.
Here we use the [Lua Sandbox for heka](http://hekad.readthedocs.org/en/latest/sandbox/index.html) to parse our logs.
These log files have tab-delimited fields and comment lines starting with `#`.
To run it:
```
rm -rf /var/cache/hekad/*
cp -rf brointel.lua /usr/share/hekad
hekad -config=heka.toml
```

View file

@ -0,0 +1,56 @@
require "lpeg"
require "string"

-- Some magic for parsing tab-separated logs
local sep = lpeg.P"\t"
local elem = lpeg.C((1-sep)^0)
grammar = lpeg.Ct(elem * (sep * elem)^0) -- split on tabs, return as table

local msg = {
    Type = "bronotice_log",
    Logger = "nsm",
    Fields = {
        -- Initializing our fields
        ['details.ts'] = nil,
        ['details.uid'] = nil,
        ['details.orig_h'] = nil,
        ['details.orig_p'] = nil,
        ['details.resp_h'] = nil,
        ['details.resp_p'] = nil,
        ['details.proto'] = nil,
        ['details.note'] = nil,
        ['details.msg'] = nil,
        ['details.sub'] = nil,
        summary = nil,
        severity = "NOTICE",
        category = "bronotice",
        tags = "nsm,bro,notice"
    }
}

function process_message()
    local log = read_message("Payload")
    -- Don't take comments
    if string.sub(log, 1, 1) ~= "#" then
        local matches = grammar:match(log)
        if not matches then return -1 end
        -- populating our fields
        msg.Fields['details.ts'] = matches[1]
        msg.Fields['details.uid'] = matches[2]
        msg.Fields['details.sourceipaddress'] = matches[3]
        msg.Fields['details.sourceport'] = matches[4]
        msg.Fields['details.destinationipaddress'] = matches[5]
        msg.Fields['details.destinationport'] = matches[6]
        msg.Fields['details.proto'] = matches[10]
        msg.Fields['details.note'] = matches[11]
        msg.Fields['details.msg'] = matches[12]
        msg.Fields['details.sub'] = matches[13]
        -- Our summary is the concatenation of other fields
        msg.Fields['summary'] = msg.Fields['details.note'] .. " " .. msg.Fields['details.msg'] .. " " .. msg.Fields['details.sub']
        inject_message(msg)
        return 0
    else
        return -1 -- do not send bro comments
    end
end

View file

@ -0,0 +1,27 @@
[brointel]
type = "LogstreamerInput"
log_directory = "/nsm/bro/spool/manager"
file_match = 'intel\.log'
decoder = "brointel_transform_decoder"
[brointel_transform_decoder]
type = "SandboxDecoder"
script_type = "lua"
filename = "brointel.lua"
# Start commenting here if you don't want any stdout
[stdout]
type = "LogOutput"
message_matcher = "FALSE"
#payload_only = true
# Finish commenting here
[ElasticSearchOutput]
message_matcher = "Type!='heka.all-report'"
cluster = "mozdefqa"
index = "events"
type_name = "event"
server = "http://mozdef.example.com:8080"
format = "clean"
flush_interval = 1000
flush_count = 10

View file

@ -0,0 +1,15 @@
# heka-lua-bro-notice
This configuration for [heka](http://hekad.readthedocs.org/en/latest/) ships notice logs for [Bro](http://bro.org/) stored in `/nsm/bro/spool/manager/notice.log` to MozDef.
Here we use the [Lua Sandbox for heka](http://hekad.readthedocs.org/en/latest/sandbox/index.html) to parse our logs.
These log files have tab-delimited fields and comment lines starting with `#`.
To run it:
```
rm -rf /var/cache/hekad/*
cp -rf bronotice.lua /usr/share/hekad
hekad -config=heka.toml
```
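To check the column mapping without running hekad, the parsing approach described above (skip `#` comment lines, split on tabs, pick columns by position) can be mirrored in a few lines of Python. This is an illustrative sketch, not part of the shipped configuration: `parse_notice_line` is a hypothetical name, and it nests the fields under `details` instead of using heka's flat `details.*` field names.

```python
def parse_notice_line(line):
    """Return an event dict for one notice.log line, or None for comments and short lines."""
    if line.startswith("#"):
        return None  # Bro header/comment line
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 13:
        return None  # not enough fields to fill the mapping below
    details = {
        "ts": columns[0],
        "uid": columns[1],
        "sourceipaddress": columns[2],
        "sourceport": columns[3],
        "destinationipaddress": columns[4],
        "destinationport": columns[5],
        "proto": columns[9],
        "note": columns[10],
        "msg": columns[11],
        "sub": columns[12],
    }
    return {
        # The summary is the concatenation of note, msg and sub.
        "summary": " ".join([details["note"], details["msg"], details["sub"]]),
        "severity": "NOTICE",
        "category": "bronotice",
        "tags": "nsm,bro,notice",
        "details": details,
    }
```

The column indexes are zero-based here, so they are one lower than the 1-based indexes a Lua table uses.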

View file

@ -0,0 +1,56 @@
require "lpeg"
require "string"

-- Some magic for parsing tab-separated logs
local sep = lpeg.P"\t"
local elem = lpeg.C((1-sep)^0)
grammar = lpeg.Ct(elem * (sep * elem)^0) -- split on tabs, return as table

local msg = {
    Type = "bronotice_log",
    Logger = "nsm",
    Fields = {
        -- Initializing our fields
        ['details.ts'] = nil,
        ['details.uid'] = nil,
        ['details.orig_h'] = nil,
        ['details.orig_p'] = nil,
        ['details.resp_h'] = nil,
        ['details.resp_p'] = nil,
        ['details.proto'] = nil,
        ['details.note'] = nil,
        ['details.msg'] = nil,
        ['details.sub'] = nil,
        summary = nil,
        severity = "NOTICE",
        category = "bronotice",
        tags = "nsm,bro,notice"
    }
}

function process_message()
    local log = read_message("Payload")
    -- Don't take comments
    if string.sub(log, 1, 1) ~= "#" then
        local matches = grammar:match(log)
        if not matches then return -1 end
        -- populating our fields
        msg.Fields['details.ts'] = matches[1]
        msg.Fields['details.uid'] = matches[2]
        msg.Fields['details.sourceipaddress'] = matches[3]
        msg.Fields['details.sourceport'] = matches[4]
        msg.Fields['details.destinationipaddress'] = matches[5]
        msg.Fields['details.destinationport'] = matches[6]
        msg.Fields['details.proto'] = matches[10]
        msg.Fields['details.note'] = matches[11]
        msg.Fields['details.msg'] = matches[12]
        msg.Fields['details.sub'] = matches[13]
        -- Our summary is the concatenation of other fields
        msg.Fields['summary'] = msg.Fields['details.note'] .. " " .. msg.Fields['details.msg'] .. " " .. msg.Fields['details.sub']
        inject_message(msg)
        return 0
    else
        return -1 -- do not send bro comments
    end
end

View file

@ -0,0 +1,27 @@
[bronotice]
type = "LogstreamerInput"
log_directory = "/nsm/bro/spool/manager"
file_match = 'notice\.log'
decoder = "bronotice_transform_decoder"
[bronotice_transform_decoder]
type = "SandboxDecoder"
script_type = "lua"
filename = "bronotice.lua"
# Start commenting here if you don't want any stdout
[stdout]
type = "LogOutput"
message_matcher = "FALSE"
#payload_only = true
# Finish commenting here
[ElasticSearchOutput]
message_matcher = "Type!='heka.all-report'"
cluster = "mozdefqa"
index = "events"
type_name = "event"
server = "http://mozdef.example.com:8080"
format = "clean"
flush_interval = 1000
flush_count = 10

View file

@ -0,0 +1,10 @@
# heka-syslogng
This configuration for [heka](http://hekad.readthedocs.org/en/latest/) ships syslog-ng logs stored in `/var/log/syslog/systems` to MozDef.
To run it:
```
rm -rf /var/cache/hekad/*
hekad -config=heka.toml
```
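The syslog decoder's regular expression can be exercised offline before starting hekad. The sketch below copies the `match_regex` from the accompanying heka configuration into Python's `re` module, used here as a stand-in for heka's Go regexp engine; `parse_syslog_line` is a hypothetical helper and the sample line is taken from the LDAP example in the documentation.

```python
import re

# Same pattern as the decoder's match_regex, compiled with Python's re module.
SYSLOG_LINE = re.compile(
    r'^(?P<syslogtimestamp>[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+)\s(?P<hostname>.+?)\s'
    r'(?P<program>.+?)(\[(?P<pid>.+?)\])?:\s(?P<summary>.*)'
)


def parse_syslog_line(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


fields = parse_syslog_line("Mar 19 15:36:25 ldap1 slapd[31031]: conn=6633594 op=0 BIND")
```

The `pid` group is optional, so programs that log without a `[pid]` suffix still match and simply yield `None` for that field.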

View file

@ -0,0 +1,40 @@
[systemslogs]
type = "LogstreamerInput"
log_directory = "/var/log/syslog/systems"
file_match = '(?P<System>[^/]+)/(?P<Year>\d+)-(?P<Month>\d+)-(?P<Day>\d+).log'
priority = ["Year","Month","Day"]
oldest_duration = "2h"
decoder = "systemslogs_transform_decoder"
[systemslogs_transform_decoder]
type = "PayloadRegexDecoder"
match_regex = '^(?P<syslogtimestamp>[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+)\s(?P<hostname>.+?)\s(?P<program>.+?)(\[(?P<pid>.+?)\])?:\s(?P<summary>.*)'
[systemslogs_transform_decoder.message_fields]
Type = "syslog"
Logger = "heka"
summary = "%summary%"
Severity = "INFO"
Hostname= "%hostname%"
Timestamp="%syslogtimestamp%"
details.timestamp = "%syslogtimestamp%"
details.hostname = "%hostname%"
details.program = "%program%"
details.processid = "%pid%"
# Start commenting here if you don't want any stdout
[stdout]
type = "LogOutput"
message_matcher = "FALSE"
#payload_only = true
# Finish commenting here
[ElasticSearchOutput]
message_matcher = "Type!='heka.all-report'"
cluster = "mozdefqa"
index = "events"
type_name = "event"
server = "http://mozdef.example.com:8080"
format = "clean"
flush_interval = 1000
flush_count = 10

View file

@ -0,0 +1,10 @@
# nxlog-syslog
This configuration for [nxlog](http://nxlog-ce.sourceforge.net) ships syslog logs stored in `/var/log/*.log` to MozDef.
To run it:
```
cp nxlog.conf /etc/nxlog/
sudo service nxlog restart
```

View file

@ -0,0 +1,23 @@
<Extension syslog>
    Module xm_syslog
</Extension>
<Extension json>
    Module xm_json
</Extension>
CacheDir /tmp/nxlog
PidFile "nxlog.pid"
<Input in>
    Module im_file
    File '/var/log/*.log'
    ReadFromLast TRUE
    Exec parse_syslog(); to_json();
</Input>
<Output outes>
    Module om_http
    URL http://mozdef.example.com:8080/nxlog/
</Output>
<Route httpout>
    Path in => outes
</Route>