Orphaned Requests Logging
In addition to request tracing and metrics reporting, logging orphaned requests provides additional insight into why an operation might have timed out (or got cancelled for a different reason).
While tracing and metrics can also be consumed through external interfaces, getting information about orphaned requests only works through the built-in mechanisms.
The way it works is that every time a response is in the process of being completed, when the SDK detects that the original caller is not listening anymore (likely because of a timeout), it will send this "orphaned" response to a reporting utility which aggregates all responses and in regular intervals logs them in a specific format.
When you spot a TimeoutException
in your log file, you can look for the output of the OrphanReporter
and correlate the information.
Output Format
The OrphanReporter
raises an OrphansRecordedEvent
which in turn is going to be logged alongside all other SDK logs.
Since orphans usually indicate a state that is not desirable, the log level for those events is WARN
.
By default they will be aggregated and logged every 10 seconds, but the event will be skipped if there are no orphans to report.
This makes sure that the log line will appear close to the corresponding TimeoutException
in the logs, while not spamming the log file if there is nothing to report.
See the next section on how to customize this behavior.
The actual body of the message consists of the text Orphaned requests found
, followed by a compact JSON representation of the aggregated orphans.
The following code snippet shows a prettified version of such a JSON blob:
[
{
"top":[
{
"b":"travel-sample",
"r":"127.0.0.1:11210",
"s":"get",
"c":"9DACF45F00000001/0000000077CB4DAA",
"d":8,
"t":2500,
"i":"0x6af7",
"l":"127.0.0.1:62836"
},
{
"b":"travel-sample",
"r":"127.0.0.1:11210",
"s":"get",
"c":"9DACF45F00000001/0000000077CB4DAA",
"d":8,
"t":2500,
"i":"0x5dcf",
"l":"127.0.0.1:62836"
},
{
"b":"travel-sample",
"r":"127.0.0.1:11210",
"s":"get",
"c":"9DACF45F00000001/0000000077CB4DAA",
"d":15,
"t":2500,
"i":"0x38f",
"l":"127.0.0.1:62836"
}
],
"service":"kv",
"count":3
}
]
The fields are kept compact so that the logs don’t get too big, but since they are abbreviations it is handy to have the following table available for reference:
Property | Description |
---|---|
|
Name of the bucket |
|
Remote hostname if dispatched |
|
Local hostnamr if dispatched |
|
The name/type of the request |
|
The Channel ID to correlate with the server logs |
|
The server duration if present |
|
The configured timeout in milliseconds |
|
The operation ID (i.e. opaque for Key/Value) |
Please note that in future releases this format is planned to change for easier readability, so we do not provide any stability guarantees on the logging output format and it might change between minor versions.
If you want to enable the new output format, for now you have to set the com.couchbase.orphanReporterNewOutputFormat
system property to true
.
Once enabled, the overall new structure looks like this:
{
"<service-a>": {
"total_count": 1234,
"top_requests": [{<entry>}, {<entry>},...]
},
"<service-b>": {
"total_count": 1234,
"top_requests": [{<entry>}, {<entry>},...]
},
}
Each individual request has the following format:
{
"total_duration_us": 1200,
"encode_duration_us": 100,
"last_dispatch_duration_us": 40,
"total_dispatch_duration_us": 40,
"last_server_duration_us": 2,
"timeout_ms": 75000,
"operation_name": "upsert",
"last_local_id": "66388CF5BFCF7522/18CC8791579B567C",
"operation_id": "0x23",
"last_local_socket": "10.211.55.3:52450",
"last_remote_socket": "10.112.180.101:11210"
}
Property | Description |
---|---|
|
The duration of the orphaned request. |
|
The duration of the encode span, if present. |
|
The duration of the last dispatch span if present. |
|
The duration of all dispatch spans, summed up. |
|
The server duration attribute of the last dispatch span, if present. |
|
The name of the outer request span, with “cb.” prefix removed. |
|
The local_id from the last dispatch span, if present. |
|
The operation_id from the outer request span, if present. |
|
The local_address from the last dispatch span, if present. |
|
The remote_address from the last dispatch span, if present. |
|
The operation timeout in milliseconds. |
If a field is not available, it will not be included in the output.
Configuration
The orphan logger can be configured through the OrphanReporterConfig
.
The following properties can be configured:
Property | Default | Description |
---|---|---|
|
10 seconds |
The interval where found orphans are emitted. |
|
10 |
The number of samples to store per service. |
|
1024 |
Maximum buffer size of orphans to store to pick up for the reporter. |
In addition to those properties, if you want to try out the new logging format you can set the com.couchbase.orphanReporterNewOutputFormat
system property to true
.