Multiverso/binding/lua/docs/API.md

3.4 KiB

Multiverso Torch Binding API

init()

Initialize mutliverso.

This should be called only once before training at the beginning of the whole project.

barrier()

Set a barrier for all workers to wait.

Workers will wait until all workers reach a specific barrier.

shutdown()

Shutdown multiverso.

This should be called only once after finishing training at the end of the whole project.

num_workers()

Return the total number of workers.

worker_id()

Return the id (zero-based index) for current worker.

TableHandler

TableHandler is an interface to sync different kinds of values.

In most cases, you are supposed to sync models (for initialization) and gradients (during training) so as to let multiverso help you manage the models in distributed environments. Currently, two types of TableHandler are supported, namely ArrayTableHandler and MatrixTableHandler.

ArrayTableHandler

ArrayTableHandler is used to sync array-like (one-dimensional) value.

Although our model tends to be a matrix, when using torch.nn package we can get the flattened parameters and gradients with module.getParameters(). So in most cases, we should use ArrayTableHandler instead of MatrixTableHandler we will introduce soon.

ArrayTableHandler:new(size)

Create a ArrayTableHandler for syncing array-like (one-dimensional) value.

The size should be a number equal to the size of value we want to sync.

ArrayTableHandler:add(data)

Add a array-like (one-dimensional) data to the server.

The data should be a torch.Tensor or Lua table. During training process, the data should be the gradients (delta value). The size of data must be equal to the size specified in initialization.

ArrayTableHandler:get()

Get the array-like (one-dimensional) value from the server.

The value we get will be a torch.Tensor. Usually, we are supposed to use Tensor:copy() to assign the value to desired destination.

MatrixTableHandler

MatrixTableHandler is used to sync matrix-like (two-dimensional) value.

MatrixTableHandler:New(num_row, num_col)

Create a MatrixTableHandler for syncing matrix-like (two-dimensional) value.

The num_row should be the number of rows and the num_col should be the number of columns. Both of them should be a number equal to the exact size of value we want to sync.

MatrixTableHandler:add(data, row_ids)

Add a matrix-like (two-dimensional) data to the server.

Same as the clarification in ArrayTableHandler, the data should be a torch.Tensor or Lua table and we should pass the gradients (delta value) not the exact value to it. The row_ids is an optional parameter and it should be an array of 'row_id' numbers when specified. If specified, multiverso will only update the value in specific rows and the size of data should be equal to the size of value we want to update.

MatrixTableHandler:get(row_ids)

Get the matrix-like (two-dimensional) value from the server.

The row_ids is an optional parameter and the interface works the same way as ArrayTableHandler when row_ids is not specified. But when we pass an array of row_id numbers, we will only get the value form specific rows. In this way, we can not do a Tensor:copy() but have to deal with the value manually.