<!DOCTYPE html><html><head><meta charset="utf-8"><title></title><style type="text/css">body {
  width: 95%;
  max-width: 70%;
  margin: 20px;
  padding: 0;
  background: #151515 url("../images/bkg.png") 0 0;
  color: #eaeaea;
  font-size: 16px;
  line-height: 1.5em;
  font-family: Monaco, "Bitstream Vera Sans Mono", "Lucida Console", Terminal, monospace;
}

#table-of-contents ul {
    line-height: 1;
}

/* General & 'Reset' Stuff */

.container {
  width: 95%;
  max-width: 1000px;
  margin: 0 auto;
}

section {
  display: block;
  margin: 0 0 20px 0;
}

h1, h2, h3, h4, h5, h6 {
  /*margin: 0 0 20px;*/
  /*margin: 0;*/
}

/* Header, <header>
 *    header   - container
 *       h1       - project name
 *          h2       - project description
 *          */

header {
  background: rgba(0, 0, 0, 0.1);
  width: 100%;
  /*border-bottom: 1px dashed #b5e853;*/
  /*padding: 20px 0;
 *   margin: 0 0 40px 0;*/
  padding: 5px 0;
  margin: 0 0 10px 0;
}

header h1 {
  font-size: 30px;
  line-height: 1.5;
  margin: 0 0 0 -40px;
  font-weight: bold;
  font-family: Monaco, "Bitstream Vera Sans Mono", "Lucida Console", Terminal, monospace;
  /*color: #b5e853;*/
  color: #089d00;
  text-shadow: 0 1px 1px rgba(0, 0, 0, 0.1),
               0 0 5px rgba(181, 232, 83, 0.1),
               0 0 10px rgba(181, 232, 83, 0.1);
  letter-spacing: -1px;
  -webkit-font-smoothing: antialiased;
}

header h1:before {
  content: "./ ";
  font-size: 24px;
}

header h2 {
  font-size: 18px;
  font-weight: 300;
}

/* Main Content
 * */

body {
  width: 100%;
  margin-left: auto;
  margin-right: auto;
  -webkit-font-smoothing: antialiased;
}
section img {
  max-width: 100%
}

h2 a {
  font-weight: bold;
  color: #8AB638;
  line-height: 1.4em;
  font-size: 1.4em;
}
h3 a, h4 a, h5 a, h6 a {
  font-weight: bold;
  color: #934500;
  line-height: 1.4em;
}

h1 {
  font-size: 30px;
}

h2 {
  font-size: 28px;
  border-bottom: 1px dashed #b5e853;
}

h3 {
  font-size: 18px;
}

h4 {
  font-size: 14px;
}

h5 {
  font-size: 12px;
  text-transform: uppercase;
  margin: 0 0 5px 0;
}

h6 {
  font-size: 12px;
  text-transform: uppercase;
  color: #999;
  margin: 0 0 5px 0;
}

dt {
  font-style: italic;
  font-weight: bold;
}
/*
ul li {
  list-style: none;
}
*/
/*
ul li:before {
  content: ">>";
  font-family: Monaco, "Bitstream Vera Sans Mono", "Lucida Console", Terminal, monospace;
  font-size: 13px;
  color: #b5e853;
  margin-left: -37px;
  margin-right: 21px;
  line-height: 16px;
}
*/

blockquote {
  color: #aaa;
  padding-left: 10px;
  border-left: 1px dotted #666;
}


pre {
  background: rgba(0, 0, 0, 0.9);
  border: 1px solid rgba(255, 255, 255, 0.15);
  padding: 10px;
  font-size: 14px;
  /*color: #b5e853;*/
  border-radius: 2px;
  -moz-border-radius: 2px;
  -webkit-border-radius: 2px;
  text-wrap: normal;
  overflow: auto;
  overflow-y: hidden;
}

pre.address {
  margin-bottom: 0 ;
  margin-top: 0 ;
  font: inherit }

pre.literal-block, pre.doctest-block, pre.math, pre.code {
  margin-left: 2em ;
  margin-right: 2em }

code .ln { color: grey; } /* line numbers */
/*code, code { background-color: #eeeeee }*/
code .comment, code .comment, code .c1 { color: #999; }
code .keyword, code .keyword, code .kd, code .kn, code .k, code .o { color: #FC8F3F; font-weight: bold;}
code .nb { color: #c45918;}
code .s {color: #0a77c4;}
code .punctuation, code .p { color: white;}
code .literal.string, code .literal.string { color: #40BF32; }
code .name, code .name.builtin, code .nx { color: white; }
code .deleted, code .deleted { background-color: #DEB0A1}
code .inserted, code .inserted { background-color: #A3D289}

table {
  width: 100%;
  margin: 0 0 20px 0;
}

th {
  text-align: left;
  border-bottom: 1px dashed #b5e853;
  padding: 5px 10px;
}

td {
  padding: 5px 10px;
}

hr {
  height: 0;
  border: 0;
  border-bottom: 1px dashed #b5e853;
  color: #b5e853;
}
/* Links
 *    a, a:hover, a:visited
 *    */

a {
  color: #63c0f5;
  /*text-shadow: 0 0 5px rgba(104, 182, 255, 0.5);*/
  text-decoration: none;
}

cite {
  color: #00FF4A;
}

strong {
  color: #C64216;
}
</style></head><body><h1>Mozilla InvestiGator Concepts &amp; Internal Components</h1><div class="contents" id="table-of-contents"><h2>Table of Contents</h2><ul class="auto-toc"><li><p><a class="reference internal" href="#actions" id="id1">1   Actions</a></p></li><li><p><a class="reference internal" href="#investigation-workflow" id="id2">2   Investigation workflow</a></p></li><li><p><a class="reference internal" href="#access-control-lists" id="id3">3   Access Control Lists</a></p></li><li><p><a class="reference internal" href="#threat-model" id="id4">4   Threat Model</a></p><ul class="auto-toc"><li><p><a class="reference internal" href="#strong-gpg-security-model" id="id5">4.1   Strong GPG security model</a></p></li><li><p><a class="reference internal" href="#infrastructure-resiliency" id="id6">4.2   Infrastructure resiliency</a></p></li><li><p><a class="reference internal" href="#no-port-listening" id="id7">4.3   No port listening</a></p></li><li><p><a class="reference internal" href="#protection-of-connections-to-the-relays" id="id8">4.4   Protection of connections to the relays</a></p></li><li><p><a class="reference internal" href="#randomization-of-the-queue-names" id="id9">4.5   Randomization of the queue names</a></p></li><li><p><a class="reference internal" href="#whitelisting-of-agents" id="id10">4.6   Whitelisting of agents</a></p></li><li><p><a class="reference internal" href="#limit-data-extraction-to-a-minimum" id="id11">4.7   Limit data extraction to a minimum</a></p></li></ul></li></ul></div><p>MIG is a platform to perform investigative surgery on remote endpoints.
It enables investigators to obtain information from large numbers of systems
in parallel, thus accelerating investigation of incidents.</p><p>Besides scalability, MIG is designed to provide strong security primitives:</p><ul><li><p><strong>Access control</strong> is ensured by requiring GPG signatures on all actions. Sensitive
actions can also request signatures from multiple investigators. An attacker
who takes over the central server will be able to read non-sensitive data,
but will not be able to send actions to agents. The GPG keys are securely
kept by their investigators.</p></li><li><p><strong>Privacy</strong> is respected by never retrieving raw data from endpoints. When MIG is
run on laptops or phones, end-users can request reports on the operations
performed on their devices. The 2-man-rule for sensitive actions also prevents
rogue investigators from invading privacy.</p></li><li><p><strong>Reliability</strong> is built in. No component is critical. If an agent crashes, it
will attempt to recover and reconnect to the platform indefinitely. If the
platform crashes, a new platform can be rebuilt rapidly without backups.</p></li></ul><p>MIG favors a model where requesting information from endpoints is fast and
simple. It does not attempt to record everything all the time. Instead, it
assumes that when a piece of information is needed, it will be easy to retrieve it.</p><p>It's an army of Sherlock Holmes, ready to interrogate your network within
milliseconds.</p><p>Terminology:</p><ul><li><p><strong>Investigators</strong>: humans who use clients to investigate things on agents</p></li><li><p><strong>Agent</strong>: a small program that runs on a remote endpoint. It receives commands
from the scheduler through the relays, executes those commands using modules,
and sends the results back to the relays.</p></li><li><p><strong>Module</strong>: a single-feature Go program that performs one task, such as inspecting a file
system, listing connected IP addresses, creating user accounts or adding
firewall rules</p></li><li><p><strong>Scheduler</strong>: a messaging daemon that routes actions and commands to and from
agents.</p></li><li><p><strong>Relay</strong>: a RabbitMQ server that queues messages between schedulers and agents.</p></li><li><p><strong>Database</strong>: a storage backend used by the scheduler and the API</p></li><li><p><strong>API</strong>: a REST API that exposes the MIG platform to clients</p></li><li><p><strong>Client</strong>: a program used by an investigator to interface with MIG (like the
MIG Console, or the action generator)</p></li><li><p><strong>Worker</strong>: a small extension to the scheduler and API that
performs very specific tasks based on events received via the relay.</p></li></ul><p>An investigator uses a client (such as the MIG Console) to communicate with
the API. The API interfaces with the Database and the Scheduler.
When an action is created by an investigator, the API receives it and writes
it into the spool of the scheduler (they share it via NFS). The scheduler picks
it up, creates one command per target agent, and sends those commands to the
relays (running RabbitMQ). Each agent is listening on its own queue on the relay.
The agents execute their commands, and return the results through the same
relays (same exchange, different queues). The scheduler writes the results into
the database, where the investigator can access them through the API.
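</p><p>The fan-out step described above can be pictured with a short Python sketch. It is only an illustration, not MIG's scheduler code: the <cite>Command</cite> class, the <cite>fan_out</cite> function and the queue names are hypothetical, and the real scheduler also deals with the spool, the database and the relays.</p>

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Command:
    """One unit of work addressed to a single agent (hypothetical shape)."""
    id: int
    action_id: int
    agent_queue: str


# Simple in-memory counter standing in for database-assigned command IDs.
_command_ids = count(1)


def fan_out(action_id, target_queues):
    """Create one command per target agent for the given action."""
    return [Command(next(_command_ids), action_id, queue) for queue in target_queues]


# Hypothetical queue names for two Linux agents.
commands = fan_out(1, ["mig.agt.linux.web1.1234", "mig.agt.linux.db1.5678"])
for command in commands:
    print(command)
```

<p>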
The agents also use the relays to send heartbeats at regular intervals, so that
the scheduler always knows how many agents are alive at a given time.</p><p>The end-to-end workflow is:</p><blockquote><pre>{investigator} -https-&gt; {API}        {Scheduler} -amqps-&gt; {Relays} -amqps-&gt; {Agents}
                            \           /
                          sql\         /sql
                             {DATABASE}</pre></blockquote><p>Below is a high-level view of the architecture:</p><img alt=".files/MIG-Arch-Diagram.png" src=".files/MIG-Arch-Diagram.png"><section id="actions"><header><h2><a href="#id1">1   Actions</a></h2></header><p>Actions are JSON files created by investigators to perform tasks on agents.</p><p>For example, an investigator who wants to verify that root passwords are hashed
and salted on Linux systems would use the following action:</p><pre><code class="code json"><span class="punctuation">{</span>
        <span class="name tag">"name"</span><span class="punctuation">:</span> <span class="literal string double">"verify root password storage method"</span><span class="punctuation">,</span>
        <span class="name tag">"target"</span><span class="punctuation">:</span> <span class="literal string double">"agents.queueloc like 'linux.%'"</span><span class="punctuation">,</span>
        <span class="name tag">"threat"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                <span class="name tag">"family"</span><span class="punctuation">:</span> <span class="literal string double">"compliance"</span><span class="punctuation">,</span>
                <span class="name tag">"level"</span><span class="punctuation">:</span> <span class="literal string double">"low"</span><span class="punctuation">,</span>
                <span class="name tag">"ref"</span><span class="punctuation">:</span> <span class="literal string double">"syslowauth3"</span><span class="punctuation">,</span>
                <span class="name tag">"type"</span><span class="punctuation">:</span> <span class="literal string double">"system"</span>
        <span class="punctuation">},</span>
        <span class="name tag">"description"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                <span class="name tag">"author"</span><span class="punctuation">:</span> <span class="literal string double">"Julien Vehent"</span><span class="punctuation">,</span>
                <span class="name tag">"email"</span><span class="punctuation">:</span> <span class="literal string double">"ulfr@mozilla.com"</span><span class="punctuation">,</span>
                <span class="name tag">"revision"</span><span class="punctuation">:</span> <span class="literal number integer">201503121200</span>
        <span class="punctuation">},</span>
        <span class="name tag">"operations"</span><span class="punctuation">:</span> <span class="punctuation">[</span>
                <span class="punctuation">{</span>
                        <span class="name tag">"module"</span><span class="punctuation">:</span> <span class="literal string double">"file"</span><span class="punctuation">,</span>
                        <span class="name tag">"parameters"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                                <span class="name tag">"searches"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                                        <span class="name tag">"root_passwd_hashed_or_disabled"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                                                <span class="name tag">"paths"</span><span class="punctuation">:</span> <span class="punctuation">[</span>
                                                        <span class="literal string double">"/etc/shadow"</span>
                                                <span class="punctuation">],</span>
                                                <span class="name tag">"contents"</span><span class="punctuation">:</span> <span class="punctuation">[</span>
                                                        <span class="literal string double">"root:(\\*|!|\\$(1|2a|5|6)\\$).+"</span>
                                                <span class="punctuation">]</span>
                                        <span class="punctuation">}</span>
                                <span class="punctuation">}</span>
                        <span class="punctuation">}</span>
                <span class="punctuation">}</span>
        <span class="punctuation">],</span>
        <span class="name tag">"syntaxversion"</span><span class="punctuation">:</span> <span class="literal number integer">2</span>
<span class="punctuation">}</span></code></pre><p>The parameters are:</p><ul><li><p><strong>name</strong>: a string that represents the action.</p></li><li><p><strong>target</strong>: a search string used by the scheduler to find agents to run the
action on. The target uses the syntax of a PostgreSQL WHERE condition against
the <a class="reference external" href="data.rst.html#entity-relationship-diagram">agents</a> table of the database. This method allows for complex target
queries, such as running an action against a specific operating system or
against an endpoint that has a given public IP.</p><p>The simplest query that targets all agents is <cite>name like '%'</cite> (the <cite>%</cite>
character is a wildcard in SQL pattern matching). Targeting by OS family can
be done with the <cite>os</cite> field, such as <cite>os='linux'</cite> or <cite>os='darwin'</cite>.</p><p>Combining conditions is also trivial: <cite>version='201409171023+c4d6f50.prod'
and heartbeattime &gt; NOW() - interval '1 minute'</cite> will only target agents that
run a specific version and have sent a heartbeat during the last minute.</p><p>Complex queries are also possible.
For example: imagine an action with ID 1 launched against 10,000 endpoints,
which returned 300 endpoints with positive results. We want to launch action
2 on those 300 endpoints only. It can be accomplished with the following
<cite>target</cite> condition. (note: you can reuse this condition by simply changing
the value of <cite>actionid</cite>)</p></li></ul><pre><code class="code sql"><span class="name">id</span> <span class="keyword">IN</span> <span class="punctuation">(</span><span class="keyword">select</span> <span class="name">agentid</span> <span class="keyword">from</span> <span class="name">commands</span><span class="punctuation">,</span> <span class="name">json_array_elements</span><span class="punctuation">(</span><span class="name">commands</span><span class="punctuation">.</span><span class="name">results</span><span class="punctuation">)</span> <span class="keyword">as</span> <span class="name">r</span> <span class="keyword">where</span> <span class="name">actionid</span><span class="operator">=</span><span class="literal number integer">1</span> <span class="keyword">and</span> <span class="name">r</span><span class="operator">#&gt;&gt;</span><span class="literal string single">'{foundanything}'</span> <span class="operator">=</span> <span class="literal string single">'true'</span><span class="punctuation">)</span></code></pre><ul><li><p><strong>description</strong> and <strong>threat</strong>: additional fields to describe the action</p></li><li><p><strong>operations</strong>: an array of operations, each operation calls a module with a set
of parameters. The parameter syntax is specific to the module.</p></li><li><p><strong>syntaxversion</strong>: an indicator of the action format used; it should be set to 2.</p></li></ul><p>Upon generation, additional fields are appended to the action:</p><ul><li><p><strong>pgpsignatures</strong>: all of the parameters above are concatenated into a string and
signed with the investigator's private GPG key. The signature is part of the
action, and used by agents to verify that an action comes from a trusted
investigator. <cite>PGPSignatures</cite> is an array that contains one or more signatures
from authorized investigators.</p></li><li><p><strong>validfrom</strong> and <strong>expireafter</strong>: two dates that constrain the validity of the
action to a UTC time window.</p></li></ul></section><section id="investigation-workflow"><header><h2><a href="#id2">2   Investigation workflow</a></h2></header><p>The diagram below represents the full workflow from the launch of an action by
an investigator to the retrieval of results from the database. The steps are
explained in the legend of the diagram, and map to various components of MIG.</p><p>Actions are submitted to the API by trusted investigators. PGPSignatures are
verified by the API and each agent prior to running any command.</p><p>View <a class="reference external" href=".files/action_command_flow.svg">full size diagram</a>.</p><object data=".files/action_command_flow.svg" type="image/svg+xml">.files/action_command_flow.svg</object></section><section id="access-control-lists"><header><h2><a href="#id3">3   Access Control Lists</a></h2></header><p>Not all keys can perform all actions. The scheduler, for example, sometimes needs
to issue specific actions to agents (such as during the upgrade protocol) but
shouldn't be able to perform more dangerous actions. This is enforced by
an Access Control List, or ACL, stored on the agents. An ACL describes who can
access what function of which module. It can be used to require multiple
signatures on specific actions, and limit the list of investigators allowed to
perform an action.</p><p>An ACL is composed of permissions, which are JSON documents hardwired into
the agent configuration. In the future, MIG will dynamically ship permissions
to agents.</p><p>Below is an example of a permission for the <cite>filechecker</cite> module:</p><pre><code class="code json"><span class="punctuation">{</span>
    <span class="name tag">"filechecker"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
        <span class="name tag">"minimumweight"</span><span class="punctuation">:</span> <span class="literal number integer">2</span><span class="punctuation">,</span>
        <span class="name tag">"investigators"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
            <span class="name tag">"Bob Kelso"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                <span class="name tag">"fingerprint"</span><span class="punctuation">:</span> <span class="literal string double">"E60892BB9BD..."</span><span class="punctuation">,</span>
                <span class="name tag">"weight"</span><span class="punctuation">:</span> <span class="literal number integer">2</span>
            <span class="punctuation">},</span>
            <span class="name tag">"John Smith"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                <span class="name tag">"fingerprint"</span><span class="punctuation">:</span> <span class="literal string double">"9F759A1A0A3..."</span><span class="punctuation">,</span>
                <span class="name tag">"weight"</span><span class="punctuation">:</span> <span class="literal number integer">1</span>
            <span class="punctuation">}</span>
        <span class="punctuation">}</span>
    <span class="punctuation">}</span>
<span class="punctuation">}</span></code></pre><p><cite>investigators</cite> contains a list of users with their PGP fingerprints, and their
weight, an integer that represents their access level.
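</p><p>As a rough illustration of how these weights are evaluated, the Python sketch below sums the weights of the signers of an action and compares the total with <cite>minimumweight</cite>. The function name <cite>is_authorized</cite> is hypothetical, signature verification is assumed to have happened already, and counting each signer once and ignoring unknown fingerprints are assumptions of this sketch rather than documented agent behavior. The fingerprints are the truncated placeholders from the example above.</p>

```python
def is_authorized(permission, signer_fingerprints):
    """Return True when the summed weight of the signers meets the minimum weight."""
    weight_by_fingerprint = {
        investigator["fingerprint"]: investigator["weight"]
        for investigator in permission["investigators"].values()
    }
    # Assumption: each signer counts once, and unknown fingerprints add nothing.
    total = sum(weight_by_fingerprint.get(fp, 0) for fp in set(signer_fingerprints))
    return total >= permission["minimumweight"]


# The filechecker permission from the example, with its placeholder fingerprints.
filechecker = {
    "minimumweight": 2,
    "investigators": {
        "Bob Kelso": {"fingerprint": "E60892BB9BD...", "weight": 2},
        "John Smith": {"fingerprint": "9F759A1A0A3...", "weight": 1},
    },
}
BOB = "E60892BB9BD..."
JOHN = "9F759A1A0A3..."

print(is_authorized(filechecker, [JOHN]))       # False: weight 1 is below 2
print(is_authorized(filechecker, [JOHN, BOB]))  # True: 1 + 2 = 3
```

<p>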
When an agent receives an action that calls the filechecker module, it will
first verify the signatures of the action, and then validate that the signers
are authorized to perform the action. This is done by summing up the weights of
the signatures, and verifying that they equal or exceed the minimum required
weight.</p><p>Thus, in the example above, investigator John Smith cannot issue a filechecker
action alone. His weight of 1 doesn't satisfy the minimum weight of 2 required
by the filechecker permission. Therefore, John will need to ask investigator Bob
Kelso to sign his action as well. The weights of both investigators are then
added, giving a total of 3, which satisfies the minimum weight of 2.</p><p>This method gives ample flexibility to require multiple signatures on modules,
and ensures that one investigator cannot perform sensitive actions on remote
endpoints without the permission of others.</p><p>The <cite>default</cite> permission can be used as a fallback for all modules. It
has the following syntax:</p><pre><code class="code json"><span class="punctuation">{</span>
        <span class="name tag">"default"</span><span class="punctuation">:</span> <span class="punctuation">{</span>
                <span class="name tag">"minimumweight"</span><span class="punctuation">:</span> <span class="literal number integer">2</span><span class="punctuation">,</span>
                <span class="name tag">"investigators"</span><span class="punctuation">:</span> <span class="punctuation">{</span> <span class="error">...</span> <span class="punctuation">}</span>
        <span class="punctuation">}</span>
<span class="punctuation">}</span></code></pre><p>The <cite>default</cite> permission is overridden by module-specific permissions.</p><p>The ACL is currently applied to modules. In the future, ACLs will offer finer
control to authorize access to specific functions of modules. For example, an
investigator could be authorized to call the <cite>regex</cite> function of the filechecker
module, but only in <cite>/etc</cite>. This functionality is not implemented yet.</p></section><section id="threat-model"><header><h2><a href="#id4">4   Threat Model</a></h2></header><p>Running an agent as root on a large number of endpoints means that Mozilla
InvestiGator is a target of choice to compromise an infrastructure.
Without proper protections, a vulnerability in the agent or in the platform
could lead to a compromise of the endpoints.</p><p>The architectural choices made in MIG reduce the exposure of the endpoints to
compromise. While the risk cannot be reduced to zero, an attacker would
need direct control of an investigator's key material, or root access
on the infrastructure in order to take control of MIG.</p><p>MIG's security controls include:</p><ul><li><p>Strong GPG security model</p></li><li><p>Infrastructure resiliency</p></li><li><p>No port listening</p></li><li><p>Protection of connections to the relays</p></li><li><p>Randomization of the queue names</p></li><li><p>Whitelisting of agents</p></li><li><p>Limit data extraction to a minimum</p></li></ul><section id="strong-gpg-security-model"><header><h3><a href="#id5">4.1   Strong GPG security model</a></h3></header><p>All actions that are passed to the MIG platform and to the agents require
valid GPG signatures from one or more trusted investigators. The public keys of
trusted investigators are hardcoded in the agents, making it almost impossible
to override without root access to the endpoints, or access to an investigator's
private key. The GPG private keys are never seen by the MIG platform (API,
Scheduler, Database or Relays). A compromise of the platform would not lead to
an attacker taking control of the agents and compromising the endpoints.</p></section><section id="infrastructure-resiliency"><header><h3><a href="#id6">4.2   Infrastructure resiliency</a></h3></header><p>One of the design goals of MIG is to make each component as stateless as
possible. The database is used as a primary data store, and the schedulers and
relays keep data in transit in their respective caches. But any of these
components can go down and be rebuilt without compromising the resiliency of
the platform. As a matter of fact, it is strongly recommended to rebuild each
of the platform components from scratch on a regular basis, and only keep the
database as a persistent storage.</p><p>Unlike other systems that require constant network connectivity between the
agents and the platform, MIG is designed to work with intermittent or unreliable
connectivity with the agents. The RabbitMQ relays will cache commands that are
not consumed immediately by offline agents. These agents can connect to the
relay whenever they choose to, and pick up outstanding tasks.</p><p>If the relays go down for any period of time, the agents will attempt to
reconnect at regular intervals, indefinitely. It is trivial to rebuild
a fresh RabbitMQ cluster, even on a new IP space, as long as the FQDN of the
cluster, and the TLS cert/key and credentials of the AMQPS access point
remain the same.</p></section><section id="no-port-listening"><header><h3><a href="#id7">4.3   No port listening</a></h3></header><p>The agents do not accept incoming connections. There is no listening port that
an attacker could use to exploit a vulnerability in the agent. Instead, the
agent connects to the platform by establishing an outbound connection to the
relays. The connection uses TLS, making it theoretically impossible for an
attacker to MITM without access to the PKI and DNS, neither of which is
part of the MIG platform.</p></section><section id="protection-of-connections-to-the-relays"><header><h3><a href="#id8">4.4   Protection of connections to the relays</a></h3></header><p>The RabbitMQ relay of a MIG infrastructure may very well be listening on the
public internet. This is used when MIG agents are distributed into various
environments, as opposed to concentrated on a single network location. RabbitMQ
and Erlang provide a stable network stack, but are not shielded from a network
attack that would take down the cluster. To reduce the exposure of the AMQP
endpoints, the relays use AMQP over TLS and require the agents to present a
client certificate before accepting the connection.</p><p>The client certificate is shared across all the agents. <strong>It is not used as an
authentication mechanism.</strong> Its sole purpose is to limit the exposure of a public
AMQP endpoint. Consider it a network filter.</p><p>Once the TLS connection between the agent and the relay is established, the
agent will present a username and password to open the AMQP connection. Again,
these credentials are shared across all agents, and are not used to authenticate
individual agents. Their role is to assign an ACL to the agent.
The ACL limits the AMQP actions an agent can perform on the cluster.
See <a class="reference external" href="configuration.rst">rabbitmq configuration</a> for more information.</p></section><section id="randomization-of-the-queue-names"><header><h3><a href="#id9">4.5   Randomization of the queue names</a></h3></header><p>The protections above limit the exposure of the AMQP endpoint, but since the
secrets are shared across all agents, the possibility still exists that an
attacker gains access to the secrets, and establishes a connection to the relays.</p><p>Such access would have very limited capabilities. It cannot be used to publish
commands to the agents, because publication is ACL-limited to the scheduler.
It can be used to publish fake results to the scheduler, or listen on the
agent queue for incoming commands.</p><p>Both are made difficult by appending a random number to the name of an agent
queue. An agent queue is named using the following scheme:</p><blockquote><p><cite>mig.agt.&lt;OS family&gt;.&lt;Hostname&gt;.&lt;uid&gt;</cite></p></blockquote><p>The OS and hostname of a given agent are easy to guess, but the uid isn't.
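</p><p>The naming scheme can be sketched in Python as follows. The helper names <cite>new_uid</cite> and <cite>queue_name</cite> are hypothetical, and the exact way the timestamp and the random integer are combined into 64 bits is an assumption of this sketch; the agent's actual layout may differ.</p>

```python
import secrets
import time


def new_uid():
    """Return a 64-bit integer built from a nanosecond timestamp and a random 32-bit integer.

    Assumed layout: the low 32 bits of the timestamp form the high half,
    the random integer forms the low half.
    """
    timestamp_bits = time.time_ns() % 2**32  # keep the low 32 bits of the clock
    random_bits = secrets.randbits(32)       # unpredictable 32-bit integer
    return timestamp_bits * 2**32 + random_bits


def queue_name(os_family, hostname, uid):
    """Format an agent queue name following the mig.agt scheme shown above."""
    return f"mig.agt.{os_family}.{hostname}.{uid}"


# Hypothetical endpoint; the uid would be chosen once, on the agent's first start.
print(queue_name("linux", "server1.example.net", new_uid()))
```

<p>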
The UID is a 64-bit integer composed of nanosecond timestamps and a random
32-bit integer, chosen by the agent on first start. It is specific to an endpoint.</p></section><section id="whitelisting-of-agents"><header><h3><a href="#id10">4.6   Whitelisting of agents</a></h3></header><p>At the moment, MIG does not provide a strong mechanism to authenticate agents.
It is a work in progress, but for now agents are whitelisted in the scheduler
using the queuelocs that are advertised in the heartbeat messages. Spoofing the
queueloc string is difficult, because it contains a random value that is
specific to an endpoint. An attacker would need access to the random value in
order to spoof an agent's identity. This method provides a basic access control
mechanism. The long-term goal is to allow the scheduler to call an external database
to authorize agents. In AWS, the scheduler could call the AWS API to verify that
a given agent does indeed exist in the infrastructure. In a traditional datacenter,
this could be an inventory database.</p></section><section id="limit-data-extraction-to-a-minimum"><header><h3><a href="#id11">4.7   Limit data extraction to a minimum</a></h3></header><p>Agents are not <cite>meant</cite> to retrieve raw data from their endpoints. This is more
of a good practice than a technical limitation. The modules shipped with
the agent are meant to return boolean answers of the type "match" or "no match".</p><p>It could be argued that answering "match" on sensitive requests is similar to
extracting data from the agents. MIG does not solve this issue. It is the
responsibility of the investigators to limit the scope of their queries (i.e., do
not search for a root password by sending an action with the password in the
regex).</p><p>The goal here is to prevent a rogue investigator from dumping a large amount of
data from an endpoint. MIG could trigger a memory dump of a process, but
retrieving that data will require direct access to the endpoint.</p><p>Note that MIG's database keeps records of all actions, commands and results. If
sensitive data were to be collected by MIG, that data would be available in the
database.</p></section></section></body></html>