Merge pull request #115 from mozilla/document-architecture

add docs page for services architecture and troubleshooting
2014-03-05 10:40:17 -08:00 · 76c7835db0
--- a/docs/common_tasks.rst
+++ b/docs/common_tasks.rst
@ -0,0 +1,65 @@
+Common tasks
+============
+
+This is a list of maintenance tasks you may need to perform on a treeherder-service deployment.
+
+Apply a change in the code
+--------------------------
+
+If you changed something in the log parser, you need to run a compilation step:
+
+.. code-block:: bash
+
+   > python setup.py build_ext --inplace
+
+To make the various services aware of a change in the code, you need to restart supervisord:
+
+.. code-block:: bash
+
+   > sudo /etc/init.d/supervisord restart
+
+Add a new repository
+--------------------
+
+To add a new repository, the following steps are needed:
+
+* Append a new datasource to the datasource fixtures file located at treeherder/model/fixtures/repository.json
+* Load the file you edited with the loaddata command:
+
+  .. code-block:: bash
+
+     > python manage.py loaddata repository
+
+* Create a new datasource for the given repository:
+
+  .. code-block:: bash
+
+     > python manage.py init_datasources
+
+* Restart memcached to clear any previously cached datasource:
+
+  .. code-block:: bash
+
+     > sudo /etc/init.d/memcached restart
+
+* Generate a new OAuth credentials file:
+
+  .. code-block:: bash
+
+     > python manage.py export_project_credentials
+
+* Restart all the services through supervisord:
+
+  .. code-block:: bash
+
+     > sudo /etc/init.d/supervisord restart
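+
+The command-line steps above can also be chained in a small script. The following is a minimal sketch, not part of treeherder-service: ``add_repository_commands`` and ``run_all`` are hypothetical helper names, and it assumes the commands are run from the project root after the fixtures file has been edited by hand.
+
+.. code-block:: python
+
+   import subprocess
+
+
+   def add_repository_commands():
+       """Return the documented commands in the order they should run."""
+       return [
+           ["python", "manage.py", "loaddata", "repository"],
+           ["python", "manage.py", "init_datasources"],
+           ["sudo", "/etc/init.d/memcached", "restart"],
+           ["python", "manage.py", "export_project_credentials"],
+           ["sudo", "/etc/init.d/supervisord", "restart"],
+       ]
+
+
+   def run_all(commands, runner=subprocess.check_call):
+       """Run each command in turn; check_call raises on a non-zero exit."""
+       for command in commands:
+           runner(command)
+
+Calling ``run_all(add_repository_commands())`` runs the steps in order and stops at the first failing command, since ``subprocess.check_call`` raises ``CalledProcessError`` on a non-zero exit status.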
+
+
+Restarting varnish
+------------------
+
+You may want to restart varnish after a change in the UI. To do so, type:
+
+.. code-block:: bash
+
+   > sudo /etc/init.d/varnish restart
--- a/docs/conf.py
+++ b/docs/conf.py
@ -16,7 +16,8 @@ import sys, os
 # If extensions (or modules to document with autodoc) are in another directory,
 # add these directories to sys.path here. If the directory is relative to the
 # documentation root, use os.path.abspath to make it absolute, like shown here.
-#sys.path.insert(0, os.path.abspath('.'))
+sys.path.insert(0, os.path.abspath('..'))
+os.environ.setdefault("DJANGO_SETTINGS_MODULE", "treeherder.settings")

 # -- General configuration -----------------------------------------------------

--- a/docs/dataload.rst
+++ b/docs/dataload.rst
@ -21,7 +21,12 @@ Here is a brief description of what each periodic task will do for you:
 *process-objects*
  As the name says, processes job objects from the objectstore to the jobs store.
  Once a job is processed, it becomes available in the restful interface for consumption.
-  See the `architecture diagram`_ for more info
+  See the `dataflow diagram`_ for more info
+
+The following data flow diagram can help you better understand how treeherder uses these tasks:
+
+.. image:: https://cacoo.com/diagrams/870thliGfT89pLZc-B5E80.png
+   :width: 800px

 .. _RelEng buildapi: https://wiki.mozilla.org/ReleaseEngineering/BuildAPI
-.. _architecture diagram: https://cacoo.com/diagrams/870thliGfT89pLZc
+.. _dataflow diagram: https://cacoo.com/diagrams/870thliGfT89pLZc
--- a/docs/deployment.rst
+++ b/docs/deployment.rst
@ -73,3 +73,19 @@ Once puppet has finished, the only thing left to do is to start all the treeherd
 The easiest way to do it is via supervisord.
 A supervisord configuration file is included in the repo under deployment/supervisord/treeherder.conf.

+
+Securing the connection
+-----------------------
+
+To put everything under an SSL connection you may want to use an SSL wrapper like stunnel_. Here is a basic example
+of a stunnel configuration file:
+
+.. code-block:: INI
+
+  cert = /path-to-my-pem-file/credentials.pem
+
+  [https]
+  accept  = 443
+  connect = 80
+
+.. _stunnel: https://www.stunnel.org
--- a/docs/index.rst
+++ b/docs/index.rst
@ -15,6 +15,9 @@ Contents:
   dataload
   deployment
   ui_integration
+   list_of_services
+   common_tasks
+   troubleshooting


 Indices and tables
--- a/docs/list_of_services.rst
+++ b/docs/list_of_services.rst
@ -0,0 +1,54 @@
+Services architecture
+=====================
+
+Running treeherder at full speed requires a number of services to be started. For an overview of all the services, see the diagram below:
+
+.. image:: https://cacoo.com/diagrams/c5uZPojmQdR0QaDp-3D076.png
+
+All the services marked with a yellow background are python scripts that can be found in the bin directory.
+In a typical deployment they are monitored by something like supervisord.
+A description of those services follows.
+
+Gunicorn
+--------
+
+A wsgi server in charge of serving the restful api and the django admin.
+All the requests to this server are proxied through varnish and apache.
+
+Gevent-socketio
+---------------
+
+A gevent-based implementation of a `socket.io`_ server that can send soft-realtime updates to the clients.
+It only serves socketio-related requests, typically namespaced with /socket.io.
+When executing, it consumes messages from rabbitmq using an "events.#" routing key.
+As soon as a new event is detected, it's sent down to the consumers that subscribed to it.
+To separate the socket.io connections from the standard http ones, we use varnish with the following configuration:
+
+.. literalinclude:: ../puppet/files/varnish/default.vcl
+   :lines: 14-26
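+
+As an aside on the "events.#" routing key mentioned above, the following minimal sketch illustrates how AMQP topic patterns are matched. It is neither treeherder nor rabbitmq code: ``topic_matches`` is a hypothetical helper written only for illustration. In a topic pattern, ``*`` stands for exactly one dot-separated word and ``#`` for zero or more words.
+
+.. code-block:: python
+
+   def topic_matches(pattern, routing_key):
+       """Return True if routing_key matches the AMQP topic pattern."""
+       return _match(pattern.split("."), routing_key.split("."))
+
+
+   def _match(pattern_words, key_words):
+       if not pattern_words:
+           # The pattern is exhausted: match only if the key is too.
+           return not key_words
+       head, rest = pattern_words[0], pattern_words[1:]
+       if head == "#":
+           # "#" may absorb zero or more words: try every possible split.
+           return any(_match(rest, key_words[i:])
+                      for i in range(len(key_words) + 1))
+       if not key_words:
+           return False
+       if head == "*" or head == key_words[0]:
+           return _match(rest, key_words[1:])
+       return False
+
+With this helper, a made-up key such as ``events.job.some-branch`` matches ``events.#``, while a key that does not start with ``events`` does not.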
+
+Celery task worker
+------------------
+
+This service executes asynchronous tasks that can be triggered by the celerybeat task scheduler or by another worker.
+In a typical treeherder deployment you will have two different pools of workers:
+
+* a gevent based pool, generally good for I/O bound tasks
+* a pre-fork based pool, generally good for CPU bound tasks
+
+In the bin directory of treeherder-service there's a script to run both of these types of pools.
+
+Celerybeat task scheduler
+-------------------------
+
+A scheduler process in charge of running periodic tasks.
+
+Celerymon task monitor
+----------------------
+
+This process provides an interface to the status of the worker and the running tasks. It can be used to provide such information
+to monitoring tools like munin.
+
+
+
+.. _socket.io: http://socket.io
--- a/docs/troubleshooting.rst
+++ b/docs/troubleshooting.rst
@ -0,0 +1,35 @@
+Troubleshooting
+===============
+
+Using supervisord for development
+---------------------------------
+
+On an Ubuntu machine you can install supervisord with:
+
+.. code-block:: bash
+
+   > sudo apt-get install supervisor
+
+To start supervisord with an arbitrary configuration, you can type:
+
+.. code-block:: bash
+
+   > supervisord -c my_config_file.conf
+
+You can find a supervisord config file inside the deployment/supervisord folder.
+That config file contains a section for each service that you may want to run.
+Feel free to comment out one or more of those sections if you don't need that specific service.
+For example, if you just want to access the restful api or the admin, comment out all of those sections except the one
+related to gunicorn.
+You can stop supervisord (and all the processes it manages) with Ctrl+C.
+Please note that you may need to manually kill the celery worker when it's under heavy load.
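+
+As an alternative to commenting out sections by hand, a trimmed copy of the config can be generated. The following is a minimal sketch assuming Python 3 and the usual supervisord convention of ``[program:NAME]`` sections: ``keep_programs`` is a hypothetical helper, the program names are made up, and any comments in the original file are lost when it is rewritten.
+
+.. code-block:: python
+
+   import configparser
+   import io
+
+
+   def keep_programs(config_text, wanted):
+       """Drop every [program:NAME] section whose NAME is not in wanted."""
+       # RawConfigParser performs no interpolation, so %(...)s values
+       # in the supervisord config are left untouched.
+       parser = configparser.RawConfigParser()
+       parser.read_string(config_text)
+       for section in parser.sections():
+           is_program = section.startswith("program:")
+           if is_program and section.split(":", 1)[1] not in wanted:
+               parser.remove_section(section)
+       output = io.StringIO()
+       parser.write(output)
+       return output.getvalue()
+
+For instance, ``keep_programs(text, {"gunicorn"})`` returns a config that keeps the non-program sections plus a program section named ``gunicorn``, if one exists.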
+
+Where are my log files?
+-----------------------
+
+You can find the log files of the various services under:
+
+* /var/log/celery
+* /var/log/gunicorn
+* /var/log/socketio
+
+You may also want to inspect the main treeherder log file, ~/treeherder-service/treeherder.log.
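+
+To find the most recently written log in each of the directories listed above, something like the following can help. It is a minimal sketch: ``newest_logs`` is a hypothetical helper and the ``*.log`` file pattern is an assumption, since the actual file names depend on how each service is configured.
+
+.. code-block:: python
+
+   import glob
+   import os
+
+   LOG_DIRS = ["/var/log/celery", "/var/log/gunicorn", "/var/log/socketio"]
+
+
+   def newest_logs(log_dirs=LOG_DIRS, pattern="*.log"):
+       """Map each directory to its most recently modified matching file."""
+       newest = {}
+       for directory in log_dirs:
+           candidates = glob.glob(os.path.join(directory, pattern))
+           if candidates:
+               # Directories that are missing or empty are simply skipped.
+               newest[directory] = max(candidates, key=os.path.getmtime)
+       return newest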