Traffic Server Layer 7 Working Group
My view on the Layer 7 Routing working group meetings in Denver, 25-27 Jul 2018.
Configuration
The YAML schema will be updated to be more generic. The “primary” and “secondary” rings will be removed and replaced with generic “rings”, which will be defined in a specific section. The “strategy” section will be used to create policy that uses the rings. In essence the “ring” section will describe the elements of the CDN and “strategy” will describe the CDN policy for a specific node. To make this easier to deploy, some version of John’s include support for YAML (which does not have this natively) will be needed.
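As a rough illustration of the split between the two sections, here is a minimal sketch of how the loaded configuration might be modeled in memory. Every type and field name is hypothetical, and the idea that a strategy lists rings in preference order is my assumption, not something settled at the meeting.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: names are illustrative, not the real schema.

// A ring describes an element of the CDN: a named group of hosts.
struct Ring {
  std::vector<std::string> hosts;
};

// A strategy describes policy for this node. Assumed here: an ordered
// list of ring names to try.
struct Strategy {
  std::vector<std::string> ring_names;
};

struct L7RConfig {
  std::map<std::string, Ring> rings;          // the "rings" section
  std::map<std::string, Strategy> strategies; // the "strategy" section, keyed by tag
};

// Resolve a strategy tag into the ordered list of candidate hosts.
// Unknown tags or ring names contribute no hosts.
std::vector<std::string> candidate_hosts(const L7RConfig &cfg, const std::string &tag) {
  std::vector<std::string> out;
  auto s = cfg.strategies.find(tag);
  if (s == cfg.strategies.end()) {
    return out;
  }
  for (const auto &name : s->second.ring_names) {
    auto r = cfg.rings.find(name);
    if (r == cfg.rings.end()) {
      continue;
    }
    out.insert(out.end(), r->second.hosts.begin(), r->second.hosts.end());
  }
  return out;
}
```

The point of the separation is that rings can be shared across nodes while each node carries only its own strategies.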
HttpSM Interaction
The L7R configuration files will provide a set of “strategies”, each marked with a unique tag. There is a globally configured default strategy tag. During remap, this can be overridden with a different strategy tag.
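The tag selection rule is small enough to sketch directly. The names `StrategyTags` and `effective_tag` are hypothetical, and the sketch assumes C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>

// Hypothetical holder for the globally configured default strategy tag.
struct StrategyTags {
  std::string global_default;
};

// Remap may supply an override for a transaction; otherwise the global
// default applies.
std::string effective_tag(const StrategyTags &cfg, const std::optional<std::string> &remap_override) {
  return remap_override.value_or(cfg.global_default);
}
```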
When the HttpSM decides it needs to send the request to an upstream target, it instantiates a
“strategizer” [1], which is a run time stateful object based on a specific strategy. That strategy is
selected by the strategy tag.
The strategizer is responsible for providing transaction ready sessions to upstream targets. The
selection of the target and any handling of layer 3 or 4 connection errors is the responsibility of
the strategizer. When the HttpSM receives an open session event (SESSION_READY) it will send the
HTTP request to the upstream. The HttpSM will handle the result or response header for the
transaction, then report it to the strategizer. The strategizer then decides on the appropriate next
action and informs the HttpSM synchronously what that action is. The HttpSM performs its required
tasks and, when it has finished, reports that to the strategizer. Note that all decisions about how
to proceed from the upstream response are made by the strategizer. This includes not only HTTP
responses but also network errors. Even though the session is open at the time the HttpSM receives
it, that does not guarantee the absence of further network problems.
The interaction is split out in this manner to accommodate the variety of synchronous and
asynchronous operations. In particular, if there is an HTTP response from the upstream then the
HttpSM needs to be able to handle it in an appropriate way. However, that depends on what the next
action is, as decided by the strategizer. Because the strategizer’s next action can depend on
potentially long lived asynchronous operations, waiting for such operations to complete is not
feasible. Instead the strategizer must respond synchronously to the response with an action code
that describes what the strategizer will do when the HttpSM has finished its operations. It is
assumed that while the next action may take a long time to perform, determining the next action
should be a fast computation.
The current valid actions are:
DONE
The result is a final result that should be sent on to the user agent.
RETRY
The transaction was a failure of some sort but not a permanent one. The strategizer will prepare another session after the HttpSM has finished cleaning up the current transaction.
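To make the synchronous decision concrete, here is a minimal sketch of the action code and a strategizer that returns it. Only `DONE` and `RETRY` come from the discussion. The `Result` and `Strategizer` types, the `on_result` method, and the retry policy (network errors and 5xx while targets remain) are all assumptions for illustration.

```cpp
#include <cstddef>

// DONE and RETRY are the actions from the notes; everything else is hypothetical.
enum class Action { DONE, RETRY };

// What the HttpSM reports back: either a network error or an HTTP status.
struct Result {
  bool network_error;
  int http_status; // meaningful only when network_error is false
};

class Strategizer {
public:
  explicit Strategizer(std::size_t targets) : remaining_(targets) {}

  // Called synchronously by the HttpSM, so it must be a fast computation.
  // Assumed policy: retry on network errors and 5xx while targets remain.
  Action on_result(const Result &r) {
    bool failed = r.network_error || r.http_status >= 500;
    if (failed && remaining_ > 1) {
      --remaining_;
      return Action::RETRY;
    }
    return Action::DONE;
  }

private:
  std::size_t remaining_; // targets not yet exhausted, including the current one
};
```

Note that `on_result` only decides. Preparing the next session, which may take a long time, happens later, after the HttpSM reports that it has finished.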
This is clearer with specific examples.
First consider the case when the upstream request is successful. It is unacceptable to wait for
strategy completion to send data to the user agent; that flow needs to start as soon as possible.
The HttpSM can no longer know if that is correct without consulting the strategizer, therefore that
must be a synchronous call.
Even in an HTTP failure there is still work for the HttpSM. In this example, for a 404 response
status the strategizer will fail over to a different upstream. This behavior is used to probe
multiple upstream pods for specific content, a 404 indicating the next target should be tried
rather than returning the response to the user agent. The HttpSM must drain any body from the
response but needs to know, while draining, that the body will not be returned to the user agent.
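The 404 probing case can be sketched as follows. This is a toy model, not HttpSM code: `PodResponse`, `ProbeResult`, `decide` and `probe` are hypothetical names, and each pod's response is supplied up front rather than read from a socket.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class Action { DONE, RETRY };

// Hypothetical stand-in for one upstream pod's response.
struct PodResponse {
  int status;
  std::string body;
};

struct ProbeResult {
  int status = 0;                 // 0 means no upstream was available
  std::string sent_to_user_agent; // body forwarded to the user agent
  std::size_t bytes_drained = 0;  // body bytes read and discarded
};

// Assumed policy: a 404 means "try the next pod" while one remains.
Action decide(int status, bool more_pods) {
  return (status == 404 && more_pods) ? Action::RETRY : Action::DONE;
}

ProbeResult probe(const std::vector<PodResponse> &pods) {
  ProbeResult out;
  for (std::size_t i = 0; i < pods.size(); ++i) {
    // The action is known before the body is touched, so the HttpSM
    // knows whether it is draining or forwarding.
    Action action = decide(pods[i].status, i + 1 < pods.size());
    if (action == Action::RETRY) {
      out.bytes_drained += pods[i].body.size(); // drain, do not forward
      continue;
    }
    out.status = pods[i].status;
    out.sent_to_user_agent = pods[i].body;
    break;
  }
  return out;
}
```

If every pod answers 404, the last 404 is treated as final and is returned to the user agent.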
Another view of the activity demonstrates the “co-routine” like nature of the interaction. Both the HttpSM and strategizer perform blocking operations where the other side must wait for an event or signal to indicate the operation has completed. The basic logic is a loop in which the two sides alternate until the strategizer returns DONE.
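The hand-off between the two sides can be written out as a trace. This is only my reading of the flow, flattened into a single function for clarity. In the real design each step would be an event or callback. The function name, the trace strings, and the "5xx means retry" rule are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class Action { DONE, RETRY };

// Records which side acts at each step. `statuses` stands in for the
// upstream responses of successive sessions.
std::vector<std::string> run_transaction(const std::vector<int> &statuses) {
  std::vector<std::string> trace;
  for (std::size_t i = 0; i < statuses.size(); ++i) {
    // Strategizer side: may block on target selection and connect.
    trace.push_back("strategizer: prepare session");
    // HttpSM side: woken by SESSION_READY, talks to the upstream.
    trace.push_back("HttpSM: SESSION_READY, send request, report result");
    bool last = (i + 1 == statuses.size());
    // Synchronous decision. Assumed policy: 5xx retries while targets remain.
    Action action = (statuses[i] >= 500 && !last) ? Action::RETRY : Action::DONE;
    if (action == Action::DONE) {
      trace.push_back("strategizer: DONE");
      trace.push_back("HttpSM: send response to user agent");
      break;
    }
    trace.push_back("strategizer: RETRY");
    // The strategizer waits for this report before preparing the next session.
    trace.push_back("HttpSM: clean up, report finished");
  }
  return trace;
}
```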
The state sequence in the HttpSM is much simplified by this work. In particular, the nemesis of
redirection or other upstream connection failures requiring rolling back the HttpSM state will be
avoided. Instead there is a simpler loop which spans a much smaller set of states.
Next Steps
In my view the key next steps are, first, to get Extendible committed to master. After that
it was agreed the next deliverable would be a plugin or plugins that would emulate the current
manual host status marking using Extendible. This would enable
- Validating the Extendible API.
- Experimenting with external tool to Extendible data mechanisms. That is, how can external data be pushed in to host and IP address records?
- Validating that Extendible data can be used to manipulate upstream selection.
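A toy version of what such a plugin would need to demonstrate is sketched below. It does not use the Extendible API. The `marked_down` member is a plain struct field standing in for a field that would be attached through Extendible, and `HostTable` with its methods is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical host record. `marked_down` stands in for a field a plugin
// would attach through Extendible; the real API is not shown here.
struct HostRecord {
  std::string name;
  bool marked_down;
};

class HostTable {
public:
  void add(const std::string &name) { hosts_.push_back(HostRecord{name, false}); }

  // Entry point for an external tool pushing status into the records.
  // Returns false if the host is unknown.
  bool mark(const std::string &name, bool down) {
    for (auto &h : hosts_) {
      if (h.name == name) {
        h.marked_down = down;
        return true;
      }
    }
    return false;
  }

  // Upstream selection consults the pushed data: first host not marked down.
  // Returns nullptr when every host is marked down.
  const HostRecord *select() const {
    for (const auto &h : hosts_) {
      if (!h.marked_down) {
        return &h;
      }
    }
    return nullptr;
  }

private:
  std::vector<HostRecord> hosts_;
};
```

The three validation goals map onto this roughly as: the field itself (API), `mark` (external data pushed in), and `select` (selection driven by that data).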
I think this will also be the keystone of moving forward with Extendible, as doing things
similar to existing code is much easier than building things ex nihilo. It might even be reasonable,
as a temporary expedient, to have the core use Extendible to create the data for these
purposes and leave the plugin (which will require much more in the way of infrastructure changes)
until later. There is a reasonable chance that the core will end up using Extendible
internally, because it makes the overall code base much more modular.
Open Issues
There are still a few open issues which aren’t fully resolved.
- Setting upstream address
It is a requirement, particularly for transparent networking, to be able to explicitly set the IP address of the upstream, both in the core and using the plugin API call TSHttpTxnSetTargetAddr.
- Self Marking
There must be a descriptor that says “the remapped upstream” so that no explicit upstream hosts need to be defined. This is needed for the default routing situation and for forward proxying.
Visions
Extendible was quite a hit and there was a push to use it in other situations, in particular
with the HttpSM. I pushed back on that because I think updating the HttpSM would be a major change
and therefore we should bake Extendible a bit more to make sure the API is clean and the code
reliable.
With regard to the use of Extendible in HttpSM (something Leif and Vijay were giddy about) I
have always thought this was in the long run a good idea, but changing the HttpSM is always fraught.
I prefer to get some experience with Extendible before making such a major change. Past that
the purpose of this would be to replace the current transaction arguments with Extendible.
An issue with this is remap plugins. These have two properties that make it more difficult to use
Extendible.
- Different plugins per remap rule using different sets of extended data.
- Reloading, which is another ongoing project.
For various reasons (including the nature of the proxy allocators) it is not feasible to have
instances of a class with different extendible schemas. As a result it will probably be necessary to
store a schema per remap rule and dynamically allocate the storage block. This could be justified in
that it is better to allocate once per HttpSM than once per remap plugin that needs per transaction
storage. If this isn’t sufficient it might be necessary to use an IOBufferBlock. Given all the
other allocations that go on in the HttpSM I suspect reducing those will more than compensate for
using Extendible in the HttpSM, especially if combined with MemArena for general allocation.
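The "schema per remap rule, one block per transaction" idea can be sketched without reference to the real Extendible implementation. `Schema` and `TxnStorage` are hypothetical names. The sketch copies values in and out with `memcpy`, so it only handles trivially copyable types and sidesteps alignment.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical sketch; this is not the Extendible API.
// A schema (one per remap rule) records the layout of a storage block.
class Schema {
public:
  // Reserve `size` bytes for a plugin's field and return its offset.
  std::size_t add_field(std::size_t size) {
    std::size_t offset = total_;
    total_ += size;
    return offset;
  }
  std::size_t total_size() const { return total_; }

private:
  std::size_t total_ = 0;
};

// One zero-filled block per transaction, sized by the rule's schema.
// This is a single allocation no matter how many plugins have fields.
class TxnStorage {
public:
  explicit TxnStorage(const Schema &schema) : data_(new unsigned char[schema.total_size()]()) {}

  template <typename T> void write(std::size_t offset, const T &value) {
    static_assert(std::is_trivially_copyable<T>::value, "field must be trivially copyable");
    std::memcpy(data_.get() + offset, &value, sizeof(T));
  }

  template <typename T> T read(std::size_t offset) const {
    static_assert(std::is_trivially_copyable<T>::value, "field must be trivially copyable");
    T value;
    std::memcpy(&value, data_.get() + offset, sizeof(T));
    return value;
  }

private:
  std::unique_ptr<unsigned char[]> data_;
};
```

Each remap plugin would call `add_field` when its rule is loaded and keep the returned offset. Because the schema lives with the rule rather than the class, different rules can carry different sets of fields.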
Such a change to HttpSM will almost certainly require similar changes to HttpClientSession and
AnnotatedConnection. The reasons for this are the same as for the HttpSM, in that each
supports an array of void * which are used by plugins. In these cases the use is for
global plugins and therefore can be done statically rather than dynamically.
Footnotes
[1] This is a stupid name but no one else would provide a better one, which frankly shouldn’t be challenging.