This is long. Sorry. Bear with me. Feel free to tell me "you're wrong" about any part of this.
## Background
We're rolling out a deployment of Jitsi (awesome stuff!). We want to be able to troubleshoot audio/video quality problems post-mortem. In other words, we want users to be able to report problems after the conference is over, so that we as engineers can look at detailed statistics, determine what the problem was, and hopefully craft a solution.
## Possible sources of statistics
* Statistics reported to InfluxDB after each channel is closed (i.e. the `channel_expired` series). -- This is good for spotting big / general problems, but it's difficult to reconstruct what actually happened. For instance, we want to be able to see things like packet loss _over time_. We hope this can distinguish between a continually crappy network and a network that was fine and then at some point got measurably worse.
* RTCP data broadcast to participants in the conference. -- This isn't currently preserved anywhere, but if we were to save off a pcap file (or similar) of the RTCP data stream, we should be able to reconstruct most, if not all, of the information we need from that.
We're currently investigating adding support to Jitsi-videobridge for saving off the complete set of RTCP packets for a conference.
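To make "packet loss over time" concrete, here is a rough sketch of how the loss fields could be pulled out of a single saved RTCP sender or receiver report, following the packet layout in RFC 3550. The class and field names (`RtcpLossReader`, `LossSample`) are made up for illustration and are not part of jitsi-videobridge. It assumes one already-split SR or RR packet rather than a whole compound packet.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: extracts loss figures from one RTCP SR/RR packet (RFC 3550). */
final class RtcpLossReader {
    /** One reception report block, reduced to the loss-related fields. */
    static final class LossSample {
        final long sourceSsrc;     // SSRC of the stream being reported on
        final double fractionLost; // loss since the previous report, 0.0 up to just under 1.0
        final int cumulativeLost;  // packets lost since reception started (may be negative)
        final long jitter;         // interarrival jitter, in RTP timestamp units

        LossSample(long sourceSsrc, double fractionLost, int cumulativeLost, long jitter) {
            this.sourceSsrc = sourceSsrc;
            this.fractionLost = fractionLost;
            this.cumulativeLost = cumulativeLost;
            this.jitter = jitter;
        }
    }

    /** Returns the report blocks of a single SR (PT 200) or RR (PT 201) packet. */
    static List<LossSample> parse(byte[] packet) {
        List<LossSample> samples = new ArrayList<>();
        if (packet.length < 8) {
            return samples; // too short to hold the fixed header
        }
        ByteBuffer buf = ByteBuffer.wrap(packet); // network byte order is the default
        int first = buf.get() & 0xFF;
        int version = first >> 6;
        int reportCount = first & 0x1F;
        int packetType = buf.get() & 0xFF;
        buf.getShort(); // length field, not needed for a pre-split packet
        buf.getInt();   // SSRC of the endpoint that sent this report
        if (version != 2) {
            return samples;
        }
        if (packetType == 200) {
            if (buf.remaining() < 20) {
                return samples;
            }
            buf.position(buf.position() + 20); // skip the 20-byte sender info
        } else if (packetType != 201) {
            return samples; // not a report packet
        }
        for (int i = 0; i < reportCount && buf.remaining() >= 24; i++) {
            long ssrc = buf.getInt() & 0xFFFFFFFFL;
            int lossWord = buf.getInt();
            double fraction = ((lossWord >>> 24) & 0xFF) / 256.0;
            int cumulative = (lossWord << 8) >> 8; // sign-extend the 24-bit counter
            buf.getInt(); // extended highest sequence number received
            long jitter = buf.getInt() & 0xFFFFFFFFL;
            buf.getInt(); // LSR
            buf.getInt(); // DLSR
            samples.add(new LossSample(ssrc, fraction, cumulative, jitter));
        }
        return samples;
    }
}
```

Pairing each `LossSample` with the time its packet was captured would give the per-stream loss curve we're after.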
## Possible solutions
* Add another hidden participant to the conference, à la JiCoFo, that records these statistics. As I understand it, all participants should generally still get all the RTCP data, so this should work. (Correct me if that's not the case.)
* Record the statistics client-side (via browser APIs) and report them separately to another server. This has the downside of duplicating data that, as I understand it, is already reported to the videobridge via RTCP.
* Inject code into jitsi-videobridge itself to save off the RTCP data to a file.
We've tentatively gone down that third path. Two questions:
* Is this of general interest, and thus likely to be accepted as a PR; OR
* Is there a good way of doing this without modifying jitsi-videobridge itself?
## Implementation
What follows is *very* hacky, largely untested, and not at all final.
So far, I've enabled `BasicBridgeRTCPTerminationStrategy` via configuration, and added a new `DetailedStatsSerializer implements Transformer<RTCPCompoundPacket>`, which I add to the list of RTCP transformers (via `setTransformerChain`).
The `DetailedStatsSerializer` saves all of the RTCP packets it sees to a file, one file per conference. We have a separate system monitoring the directory in question and uploading files to Amazon S3 for storage.
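For discussion, the file-writing half could look roughly like the sketch below. It is not the actual `DetailedStatsSerializer`. `RtcpPacketLog`, the `.rtcplog` extension, and the record layout (8-byte millisecond timestamp, 4-byte length, raw packet bytes) are all assumptions made for illustration; it is a simpler stand-in for a real pcap file. It also assumes the conference ID is safe to use as a file name, and it leaves out how the raw bytes are obtained from the transformer.

```java
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical per-conference packet log; not part of jitsi-videobridge. */
final class RtcpPacketLog implements Closeable {
    private final DataOutputStream out;

    /** Opens (or appends to) one log file per conference inside the given directory. */
    RtcpPacketLog(Path directory, String conferenceId) throws IOException {
        Files.createDirectories(directory);
        Path file = directory.resolve(conferenceId + ".rtcplog");
        out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(file,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.APPEND)));
    }

    /** Appends one record: timestamp (8 bytes), length (4 bytes), then the raw packet. */
    synchronized void append(long timestampMillis, byte[] buf, int offset, int length)
            throws IOException {
        out.writeLong(timestampMillis);
        out.writeInt(length);
        out.write(buf, offset, length);
    }

    /** Flushes buffered records and closes the file so the uploader sees complete data. */
    @Override
    public synchronized void close() throws IOException {
        out.close();
    }
}
```

The transformer would call `append(System.currentTimeMillis(), bytes, 0, bytes.length)` for each packet it sees and `close()` when the conference expires. Storing the capture time with each packet is what lets the analysis side plot loss and jitter over time.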
We'll build a separate analysis tool to summarize and visualize the information when needed.
···
-----
I'm looking for general feedback, of any sort.
* Am I way off track?
* Is there interest from other people in this sort of functionality?
* Are there better ways to get this information?
Sorry for the novel. Thanks for reading.
So. Thoughts?