Creating a Real-time Analysis Plugin

Getting Started

Creating an analysis plugin consists of three steps:

  1. Writing a message matcher

    The message matcher allows one to select specific data from the data stream.

  2. Writing the analysis code/business logic

    The analysis code allows one to aggregate data, detect anomalies, apply machine learning algorithms, etc.

  3. Writing the output code

    The output code allows one to structure the analysis results in an easy-to-consume format.
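
As a rough sketch of how these three pieces map onto code, using the names introduced in the setup steps: the matcher is a configuration setting, while process_message and timer_event are functions in the plugin itself. The `matched` counter and the "matched" payload name are illustrative only, and inject_payload is supplied by the analysis sandbox, so it is not defined here.

```lua
-- 1. Message matcher: set in the plugin configuration, not in the plugin code.
message_matcher = "Type == 'telemetry'"

-- 2. Analysis code: called once for every message the matcher selects.
local matched = 0                 -- hypothetical counter, for illustration only

function process_message()
    matched = matched + 1
    return 0                      -- 0 = success, -1 = failure
end

-- 3. Output code: called every ticker_interval seconds.
function timer_event()
    -- inject_payload is provided by the sandbox at run time
    inject_payload("txt", "matched", matched)
end
```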

Step by Step Setup

  1. Go to the CEP site: https://pipeline-cep.prod.mozaws.net/

  2. Log in or register using your Google @mozilla.com account

  3. Click on the Plugin Deployment tab

  4. Create a message matcher

    1. Edit the message_matcher variable in the Heka Analysis Plugin Configuration text area. For this example we are selecting all telemetry messages. The full syntax of the message matcher can be found here: http://mozilla-services.github.io/lua_sandbox/util/message_matcher.html

      message_matcher = "Type == 'telemetry'"
      
  5. Test the message matcher

    1. Click the Run Matcher button.

      Your results or error message will appear to the right. You can browse the returned messages to examine their structure and the data they contain; this is very helpful when developing the analysis code but is also useful for data exploration even when not developing a plugin.

  6. Delete the code in the Heka Analysis Plugin text area

  7. Create the Analysis Code (process_message)

    The process_message function is invoked every time a message is matched and should return 0 for success and -1 for failure. Full interface documentation: http://mozilla-services.github.io/lua_sandbox/heka/analysis.html

    1. Here is the minimum implementation; type it into the Heka Analysis Plugin text area:

      function process_message()
          return 0 -- success
      end
      
  8. Create the Output Code (timer_event)

    The timer_event function is invoked every ticker_interval seconds.

    1. Here is the minimum implementation; type it into the Heka Analysis Plugin text area:

      function timer_event()
      end
      
  9. Test the Plugin

    1. Click the Test Plugin button.

      Your results or error message will appear to the right. If an error is output, correct it and test again.

  10. Extend the Code to Perform a Simple Message Count Analysis/Output

    1. Replace the code in the Heka Analysis Plugin text area with the following:

      local cnt = 0
      function process_message()
          cnt = cnt + 1                       -- count the number of messages that matched
          return 0
      end
      
      function timer_event()
          inject_payload("txt", "types", cnt) -- output the count
      end
      
  11. Test the Plugin

    1. Click the Test Plugin button.

      Your results or error message will appear to the right. If an error is output, correct it and test again.

  12. Extend the Code to Perform a More Complex Count by Type Analysis/Output

    1. Replace the code in the Heka Analysis Plugin text area with the following:

      types = {}
      function process_message()
          -- read the docType from the message, if it doesn't exist set it to "unknown"
          local dt = read_message("Fields[docType]") or "unknown"
      
          -- look up the docType in the types hash
          local cnt = types[dt]
          if cnt then
              types[dt] = cnt + 1   -- if the type cnt exists, increment it by one
          else
              types[dt] = 1         -- if the type cnt didn't exist, initialize it to one
          end
          return 0
      end
      
      function timer_event()
          add_to_payload("docType = Count\n")   -- add a header to the output
          for k, v in pairs(types) do           -- iterate over all the key/value pairs (docType/count) in the hash
              add_to_payload(k, " = ", v, "\n") -- add a line to the output
          end
          inject_payload("txt", "types")        -- finalize all the data written to the payload
      end
      
  13. Test the Plugin

    1. Click the Test Plugin button.

      Your results or error message will appear to the right. If an error is output, correct it and test again.

  14. Deploy the plugin

    1. Click the Deploy Plugin button and dismiss the successfully deployed dialog.

  15. View the running plugin

    1. Click the Plugins tab and look for the plugin that was just deployed: {user}.example
    2. Right-click on the plugin to activate the context menu, which allows you to view the source or stop the plugin.

  16. View the plugin output

    1. Click on the Dashboards tab
    2. Click on the Raw Dashboard Output link
    3. Click on the analysis.{user}.example.types.txt link
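
The count-by-type plugin from step 12 can also be exercised with a plain Lua interpreter, away from the CEP site, by stubbing out the sandbox functions it calls. The sketch below is only an approximation: the stubs for read_message, add_to_payload and inject_payload are hypothetical stand-ins written for this example (the real ones are provided by the sandbox), and the fake messages are invented test data, not real telemetry.

```lua
-- Hypothetical local stand-ins for the sandbox API. The real functions are
-- provided by the analysis sandbox; these only mimic enough to run the plugin.
local current_msg = nil       -- the fake message currently being processed
local payload_parts = {}      -- text accumulated by add_to_payload
local injected = {}           -- payloads captured by inject_payload

function read_message(field)
    return current_msg[field]
end

function add_to_payload(...)
    for i = 1, select("#", ...) do
        payload_parts[#payload_parts + 1] = tostring((select(i, ...)))
    end
end

function inject_payload(ptype, name, ...)
    add_to_payload(...)       -- extra arguments are appended to the payload
    injected[#injected + 1] = {
        type = ptype, name = name, payload = table.concat(payload_parts)
    }
    payload_parts = {}
end

-- Plugin code under test (same logic as step 12, written more compactly).
types = {}
function process_message()
    local dt = read_message("Fields[docType]") or "unknown"
    types[dt] = (types[dt] or 0) + 1
    return 0
end

function timer_event()
    add_to_payload("docType = Count\n")
    for k, v in pairs(types) do
        add_to_payload(k, " = ", v, "\n")
    end
    inject_payload("txt", "types")
end

-- Invented test data: three messages with a docType and one without.
local fake_messages = {
    {["Fields[docType]"] = "main"},
    {["Fields[docType]"] = "crash"},
    {["Fields[docType]"] = "main"},
    {},
}

for _, msg in ipairs(fake_messages) do
    current_msg = msg
    assert(process_message() == 0)
end

timer_event()                 -- simulate one ticker_interval elapsing
print(injected[1].payload)
```

The printed payload starts with the header line, followed by one line per docType. The order of those lines is not guaranteed, because pairs does not define an iteration order.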

Where to go from here