Introduction to beanstalk-clj
Getting started
This guide assumes that beanstalkd is already installed.
To begin, start the server with:
beanstalkd -VV
(Let’s see what really happens!):
~ % beanstalkd -VV
pid 6034
bind 4 0.0.0.0:11300
To use beanstalk-clj, we have to load the library and set up a connection to an (already running) beanstalkd server.
Inspect beanstalkd verbose log:
accept 6
Basic Operation
Now that we have a connection set up, we can enqueue jobs:
Inspect beanstalkd:
<6 command put
<6 job 1
>6 reply INSERTED 1
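To make the log above more concrete, here is a minimal sketch of how a put command is framed on the wire, following beanstalkd's `protocol.txt`. It is written in Python and is not part of beanstalk-clj; the function name `frame_put` and its default values are assumptions chosen for illustration (the priority and ttr defaults simply mirror the values mentioned elsewhere in this guide).

```python
def frame_put(body: bytes, priority: int = 2**31, delay: int = 0, ttr: int = 120) -> bytes:
    """Build the raw bytes of a beanstalkd `put` command (hypothetical helper).

    Wire format per protocol.txt: put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
    """
    if not isinstance(body, bytes):
        raise TypeError("job body must be bytes")
    # The header announces the body length so the server knows how much to read.
    header = f"put {priority} {delay} {ttr} {len(body)}\r\n".encode("ascii")
    return header + body + b"\r\n"


frame = frame_put(b"hey!")
print(frame)
```

The server answers such a frame with `INSERTED <id>`, which is the reply shown in the verbose log.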
Or we can request jobs:
Inspect beanstalkd:
<6 command reserve
>6 reply RESERVED 1 7
>6 job 1
Once we are done processing a job, we have to mark it as done; otherwise beanstalkd re-queues the job after its ttr (time-to-run, 120 seconds by default) has elapsed. A job is marked as done by calling delete:
Inspect beanstalkd:
<6 command delete
>6 reply DELETED
If you use a timeout of 0, reserve will immediately return either a job or nil:
Inspect beanstalkd:
<6 command reserve-with-timeout
>6 reply TIMED_OUT
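The two reserve replies seen in the logs so far (`RESERVED <id> <bytes>` and `TIMED_OUT`) map naturally onto "a job or nil". The following Python sketch shows one way a client could interpret the reply line; it is not beanstalk-clj code, and the name `parse_reserve_reply` is an assumption made for this example. Other replies the protocol defines, such as `DEADLINE_SOON`, are deliberately treated as errors here.

```python
from typing import Optional, Tuple


def parse_reserve_reply(line: bytes) -> Optional[Tuple[int, int]]:
    """Interpret the first line of a reserve reply (hypothetical helper).

    Returns (job_id, body_size) for RESERVED, None for TIMED_OUT.
    """
    words = line.strip().split()
    if len(words) == 3 and words[0] == b"RESERVED":
        # RESERVED <id> <bytes>: the job body of <bytes> bytes follows on the wire.
        return int(words[1]), int(words[2])
    if words == [b"TIMED_OUT"]:
        return None
    raise ValueError(f"unexpected reply: {line!r}")


print(parse_reserve_reply(b"RESERVED 1 7\r\n"))
print(parse_reserve_reply(b"TIMED_OUT\r\n"))
```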
Note that beanstalk-clj requires job bodies to be strings; any other type throws an exception:
There is no restriction on what characters you can put in a job body, so it can be used to hold arbitrary binary data:
- If you want to send an image, just put the image data as a string;
- If you want to send a dictionary, just put the JSON- or protobuf-encoded string.
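As an illustration of the second point, the sketch below round-trips a dictionary through a JSON string, which is the form a job body would take. It is plain Python rather than beanstalk-clj code; `encode_body`, `decode_body`, and the sample payload are hypothetical names chosen for this example.

```python
import json


def encode_body(data: dict) -> str:
    """Serialize a dictionary to a JSON string suitable as a job body."""
    # sort_keys makes the encoded string deterministic across runs.
    return json.dumps(data, sort_keys=True)


def decode_body(body: str) -> dict:
    """Recover the dictionary on the consumer side."""
    return json.loads(body)


payload = {"task": "resize", "width": 640}
body = encode_body(payload)   # this string is what the producer would put
restored = decode_body(body)  # this is what the consumer sees after reserve
print(body)
```

The producer and the consumer only need to agree on the encoding; beanstalkd itself treats the body as opaque bytes.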
A more idiomatic Clojure interface would look something like the following:
Inspect beanstalkd:
<7 command use
>7 reply USING my-tube
<7 command put
<7 job 4
>7 reply INSERTED 4
close 7
accept 7
<7 command watch
>7 reply WATCHING 2
<7 command reserve
>7 reply RESERVED 4 5
>7 job 4
<7 command delete
>7 reply DELETED
close 7
Now let’s save some typing with a little macro:
Tube Management
A single beanstalkd server can provide many different queues, called tubes. Use list-tubes to see all available tubes:
A beanstalkd client chooses one tube into which its jobs are put. This is the tube “used” by the client. To see what tube you are currently using:
Unless told otherwise, a client uses the default tube.
If you want to use a different tube:
If you decide to use a tube that does not yet exist, the tube is automatically created by beanstalkd:
Of course, you can always switch back to the default tube. Tubes that no client is using or watching vanish automatically:
Further, a beanstalkd client can choose many tubes to reserve jobs from. These tubes are watched by the client. To see what tubes you are currently watching:
To watch an additional tube:
As before, tubes that do not yet exist are created automatically once you start watching them.
To stop watching a tube:
You can’t watch zero tubes. So if you try to ignore the last tube you are watching, this is silently ignored.
To recap: each beanstalkd client manages two separate concerns: which tube newly created jobs are put into, and which tube(s) jobs are reserved from. Accordingly, there are two separate sets of functions for these concerns:
- use and using affect where jobs are put;
- watch and watching control where jobs are reserved from.
Note that these concerns are fully orthogonal: for example, when you use a tube, it is not automatically watched. Neither does watching a tube affect the tube you are using.
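The rules of this section can be summarized in a small model of the per-connection state that beanstalkd keeps. This is a Python sketch for illustration only, not beanstalk-clj code; the class `TubeState` and its method names are assumptions. It returns the number of watched tubes from watch and ignore, in the spirit of the `WATCHING 2` reply seen in the earlier log.

```python
class TubeState:
    """Toy model of one client's tube state (hypothetical, for illustration)."""

    def __init__(self):
        self.used = "default"        # where put sends new jobs
        self.watched = {"default"}   # where reserve takes jobs from

    def use(self, tube: str) -> str:
        # Changing the used tube never touches the watch list.
        self.used = tube
        return self.used

    def watch(self, tube: str) -> int:
        self.watched.add(tube)
        return len(self.watched)

    def ignore(self, tube: str) -> int:
        # The last watched tube cannot be ignored; the request has no effect.
        if self.watched != {tube}:
            self.watched.discard(tube)
        return len(self.watched)


state = TubeState()
state.use("my-tube")     # used changes, watched stays {"default"}
state.watch("my-tube")   # now watching two tubes
state.ignore("default")  # back to one watched tube
state.ignore("my-tube")  # no effect: it is the last watched tube
print(state.used, sorted(state.watched))
```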
Statistics
Beanstalkd accumulates various statistics at the server, tube, and job level. Statistical details for a job can only be retrieved during the job’s lifecycle. If you try to access job stats after the job was deleted, you will get a command-failure exception:
Let’s have a look at some numbers for the default tube:
Finally, let’s go into server-level statistics:
Advanced Operation
In “Basic Operation” above, we discussed the typical lifecycle of a job:
   put            reserve               delete
  -----> [READY] ---------> [RESERVED] --------> *poof*
(This picture was taken from beanstalkd's protocol documentation. It is
originally contained in `protocol.txt`, part of the beanstalkd
distribution.)
But besides ready and reserved, a job can also be delayed or buried.
Along with those states come a few transitions, so the full picture looks like
the following:
   put with delay               release with delay
  ----------------> [DELAYED] <------------.
                        |                   |
                        | (time passes)     |
                        |                   |
   put                  v     reserve       |       delete
  -----------------> [READY] ---------> [RESERVED] --------> *poof*
                       ^  ^                |  |
                       |   \  release      |  |
                       |    `-------------'   |
                       |                      |
                       | kick                 |
                       |                      |
                       |       bury           |
                    [BURIED] <---------------'
                       |
                       |  delete
                        `--------> *poof*
(This picture was taken from beanstalkd’s protocol documentation. It is
originally contained in `protocol.txt`, part of the beanstalkd
distribution.)
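The diagram above can also be read as a transition table. The Python sketch below encodes only the arrows drawn in the picture (the real server accepts a few more commands in some states); the state names `new` and `deleted`, and the helpers `step` and `run`, are assumptions introduced for this example and are not part of beanstalk-clj.

```python
# (current state, event) -> next state, one entry per arrow in the diagram.
TRANSITIONS = {
    ("new", "put"): "ready",
    ("new", "put with delay"): "delayed",
    ("delayed", "time passes"): "ready",
    ("ready", "reserve"): "reserved",
    ("reserved", "delete"): "deleted",
    ("reserved", "release"): "ready",
    ("reserved", "release with delay"): "delayed",
    ("reserved", "bury"): "buried",
    ("buried", "kick"): "ready",
    ("buried", "delete"): "deleted",
}


def step(state: str, event: str) -> str:
    """Follow one arrow of the diagram, or fail if no such arrow exists."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events) -> str:
    """Apply a sequence of events to a job that does not exist yet."""
    state = "new"
    for event in events:
        state = step(state, event)
    return state


final = run(["put", "reserve", "bury", "kick", "reserve", "delete"])
print(final)
```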
Now let’s have a practical look at those new possibilities. For a start, we can create a job with a delay. Such a job will only be available for reservation once this delay passes:
Releasing a job puts it back into the tube it came from.
Burying a job puts it aside; it is not available for reservation until it is kicked.
Kicking with a bound moves up to that many buried jobs back to the ready state:
Inspecting jobs
The peek command allows us to inspect jobs without reserving them or modifying their state. Note that peeking does not reserve the job:
If you try to peek at a non-existing job, you’ll get an exception:
You can also use peek-delayed and peek-buried to inspect delayed jobs and buried jobs.
Job Priorities
Without job priorities, beanstalkd operates as a FIFO queue. If the need arises, you can override this behaviour by giving different jobs different priorities. There are three hard facts to know about job priorities:
- Jobs with lower priority numbers are reserved before jobs with higher priority numbers.
- beanstalkd priorities are 32-bit unsigned integers (they range from 0 to 2**32 - 1).
- beanstalkc uses 2**31 as default job priority (beanstalkc.DEFAULT_PRIORITY).
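The ordering rule can be illustrated with a small in-memory model. This Python sketch is not how beanstalkd or beanstalk-clj is implemented; the class `ReadyQueue` is a hypothetical stand-in that orders jobs by priority number first and by insertion order second, and that rejects priorities outside the 32-bit unsigned range.

```python
import heapq
from itertools import count

MAX_PRIORITY = 2**32 - 1
DEFAULT_PRIORITY = 2**31


class ReadyQueue:
    """Toy model of a tube's ready queue (hypothetical, for illustration)."""

    def __init__(self):
        self._heap = []
        self._ids = count(1)  # job ids grow with insertion order

    def put(self, body: str, priority: int = DEFAULT_PRIORITY) -> int:
        if not 0 <= priority <= MAX_PRIORITY:
            raise ValueError("priority must fit in a 32-bit unsigned integer")
        job_id = next(self._ids)
        # Tuples compare by priority first, then by job id (FIFO on ties).
        heapq.heappush(self._heap, (priority, job_id, body))
        return job_id

    def reserve(self):
        """Return (job_id, body) of the next job, or None if the queue is empty."""
        if not self._heap:
            return None
        _priority, job_id, body = heapq.heappop(self._heap)
        return job_id, body


queue = ReadyQueue()
queue.put("normal")
queue.put("urgent", priority=0)
queue.put("also normal")
order = [queue.reserve() for _ in range(3)]
print(order)
```

The job put with priority 0 is reserved first even though it was inserted second; the two default-priority jobs keep their insertion order.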
Fin!
That’s it, for now. We’ve left a few capabilities untouched (touch and time-to-run). But if you’ve really read through all of the above, send me a message and tell me what you think of it. And then go get yourself a treat. You certainly deserve it.