Cross-posted from itbrokeand.ifixit.com because I wrote it.
A while back we released Forker, a small PHP library that enables easy parallel processing. We’ve been using it in production for a couple of months now, so I figured it would be a good candidate for a first post.
Right now, we use it for network I/O operations. In particular, we use it to connect via SSH and execute commands on a bunch of machines in parallel. This has cut the time it takes us to deploy new code to 2-3 seconds, down from the roughly 2-3*n seconds it took before, where n is the number of machines.
Here’s an example of how you can use Forker to run a command on many machines in parallel:
This would produce something along the lines of:
Array
(
    [0] => Array
        (
            [output] => machine1.example.com
            [exitCode] => 0
        )

    [1] => Array
        (
            [output] => machine1.example.com
            [exitCode] => 0
        )

)
Forker uses three functions to provide the functionality behind its one public function. At the lowest level, a protected method called fork() creates a connected socket stream pair, forks the current process, and returns one socket to the parent and one to the child. These sockets are used for one-way communication from the child to the parent.
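To make the socket-pair-plus-fork idea concrete, here is a minimal sketch using PHP's stream_socket_pair() and the pcntl extension. It is not Forker's actual fork() code: the helper name forkWithSocket() and its callback parameter are made up for illustration, and it assumes a CLI PHP 7.1+ build with pcntl enabled.

```php
<?php
// Hypothetical helper for illustration only; not Forker's real fork().
function forkWithSocket(callable $childWork): array
{
    // A connected pair of Unix-domain stream sockets.
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        throw new RuntimeException('stream_socket_pair() failed');
    }
    [$readEnd, $writeEnd] = $pair;

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork() failed');
    }

    if ($pid === 0) {
        // Child: keep only the write end, do the work, then exit.
        fclose($readEnd);
        $childWork($writeEnd);
        fclose($writeEnd);
        exit(0);
    }

    // Parent: keep only the read end.
    fclose($writeEnd);
    return [$pid, $readEnd];
}

[$pid, $socket] = forkWithSocket(function ($stream) {
    fwrite($stream, 'hello from child ' . getmypid());
});

// Reads until the child closes its end of the pair.
$message = stream_get_contents($socket);
fclose($socket);

// Reap the child so it does not linger as a zombie.
pcntl_waitpid($pid, $status);

echo $message, "\n";
```

Closing the unused end in each process is what makes the channel effectively one-way, and it is also what lets the parent see end-of-file once the child is done writing.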
One level up is the mapStream() function. It implements a typical map function over a provided array: for each entry, the callback is passed the array entry and a connected stream. The data written to the stream during each callback is read from the other end by the parent and returned from the original mapStream() call.
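A rough sketch of what a stream-based parallel map can look like is below. The function name mapStreamSketch() and its signature are assumptions for illustration and may not match Forker's mapStream(); it forks one child per array entry and again assumes CLI PHP 7.1+ with pcntl.

```php
<?php
// Hypothetical stand-in for the idea behind mapStream(); not Forker's code.
function mapStreamSketch(array $items, callable $callback): array
{
    $children = [];

    // Start one child per entry; all of them run concurrently.
    foreach ($items as $key => $item) {
        $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
        if ($pair === false) {
            throw new RuntimeException('stream_socket_pair() failed');
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork() failed');
        }

        if ($pid === 0) {
            // Child: run the callback with the entry and the write end.
            fclose($pair[0]);
            $callback($item, $pair[1]);
            fclose($pair[1]);
            exit(0);
        }

        // Parent: close the write end right away so EOF arrives
        // as soon as the child finishes.
        fclose($pair[1]);
        $children[$key] = [$pid, $pair[0]];
    }

    // Collect whatever each child wrote, keyed like the input array.
    $results = [];
    foreach ($children as $key => [$pid, $stream]) {
        $results[$key] = stream_get_contents($stream);
        fclose($stream);
        pcntl_waitpid($pid, $status);
    }

    return $results;
}

$lengths = mapStreamSketch(['a', 'bbb', 'cc'], function ($word, $stream) {
    usleep(100000); // stand-in for slow network I/O
    fwrite($stream, (string) strlen($word));
});

print_r($lengths);
```

Because every child is forked before the parent starts reading, the three simulated 0.1 second waits overlap instead of adding up, which is the same effect we see with our SSH deploys.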
Finally, the public function map() wraps mapStream() and allows the caller to just return a PHP object instead of writing to a stream. Returned objects are serialized and written to the stream, then deserialized before being returned in the resulting array.
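The serialize-then-deserialize step can be sketched on its own, without any forking. The helper names returningToStreaming() and decodeResult() below are hypothetical, and an in-memory stream stands in for the socket; this only illustrates the wrapping idea, not Forker's actual map() implementation.

```php
<?php
// Hypothetical helpers for illustration; not part of Forker.

// Turn a callback that returns a value into one that writes the
// serialized value to a stream.
function returningToStreaming(callable $callback): callable
{
    return function ($item, $stream) use ($callback) {
        fwrite($stream, serialize($callback($item)));
    };
}

// Turn the raw bytes read back from the stream into a PHP value again.
function decodeResult(string $raw)
{
    return unserialize($raw);
}

$streaming = returningToStreaming(function ($host) {
    return ['output' => $host, 'exitCode' => 0];
});

// An in-memory stream plays the role of the child-to-parent socket here.
$stream = fopen('php://memory', 'w+');
$streaming('machine1.example.com', $stream);
rewind($stream);

$result = decodeResult(stream_get_contents($stream));
fclose($stream);

print_r($result);
```

The caller only ever deals with ordinary return values; the stream and the serialization stay hidden inside the wrapper.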
The end result is an easy-to-use interface to forking in PHP: Forker::map().