The data you see from The Gathering 2015 will seem "broken up". This is not because we don't have data from the first day, but because the backend was re-written on day 1/2 and this web app only uses the new API.
NMS was set up on March 30th (Monday). Data started pouring in on the same day.
Ping data is available for the entire event with 1 second resolution. We "lost" data from the 30th because we re-inserted the switches (we have the ping data, but not the mapping between switch ID number and actual switch).
DHCP data is available only for the last detected DHCP ack (no history, except extensive text-based logs).
Uplink status is available for most of the event, but not exposed here. We only expose traffic-based uplink state here, which, again, is based on the new API.
Traffic status was temporarily bugged, but is available from late on day 2.
Temperature data is available from day 2.
Plans are being made to ensure that we don't have gaps like these in the future.
It is also worth mentioning that things like switch positions are not logged historically, so you see the final position on the map.
Outstanding AJAX requests | |
Overflowed AJAX requests |
NMS performance is surprisingly complex. It's split into several parts and dealt with differently.
Poller performance is a matter of efficiently collecting data and is mostly handled in the Perl code (and ensuring we use sensible database schemas).
Backend performance for the GUI is mostly about not killing the database server. We do NOT try to protect against malicious clients directly, since this is a management system and not public-facing, but Varnish is used to cache requests. To be able to do that properly, we need to use absolute time when reviewing past events (so "2015-04-02 17:30:00", not "2 hours ago"). We've also tried to minimize the stupidity in the queries. There's still work to be done here, though, as we need to split up a few large backend requests (port-state.pl).
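To illustrate why absolute time matters for caching, here is a minimal Python sketch (not the actual NMS code). It turns a relative offset into an absolute timestamp before building the request URL, so every client reviewing the same moment asks for the same URL and a cache such as Varnish can answer from memory. The function name, the `now` query parameter, and the rounding to a 5 minute step are assumptions made for the example.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def absolute_url(base_url, now, hours_ago, step_minutes=5):
    """Build a request URL that names an absolute point in time.

    Hypothetical helper: the parameter name "now" and the rounding step
    are assumptions for this sketch, not the real NMS API.
    """
    target = now - timedelta(hours=hours_ago)
    # Round down to the step so that requests made a few seconds apart
    # still map to the same URL (and therefore the same cache entry).
    target -= timedelta(minutes=target.minute % step_minutes,
                        seconds=target.second,
                        microseconds=target.microsecond)
    stamp = target.strftime("%Y-%m-%d %H:%M:%S")
    return base_url + "?" + urlencode({"now": stamp})
```

A URL containing "2 hours ago" would mean something different every second, so a cached response for it would be stale almost immediately. The absolute form stays valid forever, since past data does not change.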
Front-end performance is mostly about drawing things sensibly and not completely bombing the memory usage, and about gracefully handling slow backends. This will affect you. For example, if you are reviewing past events and the DB is struggling, we'll simply skip a backend request if we have too many outstanding requests. That means you may jump from "17:00" to "18:30" instead of going through "17:30" and "18:00" too. This is working as intended. It also means that you can happily spam the forward/backward keyboard bindings to jump 18 hours forward: the explicit requests triggered by each key press will overflow and be skipped, but you'll land at the right time when you let go. There could be a 1 second delay (or more if the backend really struggles), though, since you'll have to rely on the periodic backend requests instead of the explicit ones triggered on hitting a button.
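The skipping behaviour described above can be sketched roughly as follows. This is a Python illustration of the idea, not the frontend's actual code: the class name, method names, and the limit of 3 are assumptions. A request is only started if the number of outstanding requests is below the limit. Otherwise it is dropped and counted as overflowed.

```python
class RequestThrottle:
    """Tracks outstanding requests and drops new ones above a limit.

    Hypothetical sketch: names and the default limit are assumptions.
    """

    def __init__(self, max_outstanding=3):
        self.max_outstanding = max_outstanding
        self.outstanding = 0   # requests sent but not yet answered
        self.overflowed = 0    # requests skipped because we were too busy

    def try_start(self):
        """Return True if a request may be sent now, False if it is skipped."""
        if self.outstanding >= self.max_outstanding:
            self.overflowed += 1
            return False
        self.outstanding += 1
        return True

    def finish(self):
        """Call when a response (or an error) arrives for a request."""
        if self.outstanding > 0:
            self.outstanding -= 1
```

With a slow backend, repeated key presses mostly land in the `overflowed` counter rather than piling more queries onto the database. The next periodic request then fetches whatever point in time the user ended up on.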
Note that the counters on top are updated on a timer, but this timer is set up at the same time as everything else, so it's likely to fire at the same time as the timers that send AJAX requests. As a result, the 'outstanding AJAX requests' counter might show either 3 or 0 almost constantly, depending on which timer happens to fire first. This does NOT mean that NMS has 3 requests outstanding all the time, just that we're checking right after we fire off AJAX requests every time.
NMS also tries to draw efficiently, which is why things are split into different HTML5 canvases. Blur and text are particularly expensive to draw, and there's no reason to re-paint them all the time.
The basic performance experiments are done on TG15 data using a laptop and a VM with 6GB of memory, so it should hold up quite well on "proper" hardware.
Key | Description |
---|---|
? | Toggle navigation bar |
n | Toggle night mode |
1 | View Ping map |
2 | View uplink map |
3 | View temperature map |
4 | View uplink traffic map |
5 | View comment spotter map |
6 | View total switch traffic map |
7 | View Disco map |
h | Step 1 hour back in time |
j | Step 5 minutes back in time |
k | Step 5 minutes forward in time |
l | Step 1 hour forward in time |
p | Toggle playback (1 hour per second) |
r | Return to real time |
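The time-travel keys in the table above amount to adding a fixed offset to the time being viewed. A minimal Python sketch of that mapping follows, under the assumption that "real time" is represented as `None`. The function names are made up for the example, and the sketch does not address stepping forward past the present.

```python
from datetime import datetime, timedelta

# Offsets taken from the key table: h/l step one hour, j/k step five minutes.
STEPS = {
    "h": timedelta(hours=-1),
    "j": timedelta(minutes=-5),
    "k": timedelta(minutes=5),
    "l": timedelta(hours=1),
}


def apply_key(viewed, key, now):
    """Return the new viewed time after a key press.

    `viewed` is a datetime, or None when following real time (an
    assumption of this sketch). Unknown keys leave the state unchanged.
    """
    if key == "r":
        return None  # back to real time
    if key not in STEPS:
        return viewed
    base = viewed if viewed is not None else now
    return base + STEPS[key]


def playback_position(start, elapsed_seconds):
    """Playback advances one hour of event time per second of wall time."""
    return start + timedelta(hours=elapsed_seconds)
```

Because every step produces an absolute timestamp, the result can be passed straight to the backend as a fixed point in time.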
Some features do not have time travel support (comment spotting and DHCP map at the moment). We also lack compatible SNMP data for the first day or so, so you'll only have ping data for the first day of TG15.
It could take some time to load a specific point in time for the first time. See "About performance" under the help menu for more information.
You can also step backwards and forwards in time, stop and start replay, and go back to real time using keyboard shortcuts. See the help menu for an overview of keyboard shortcuts.
These are internal timers for the NMS frontend. They are provided mainly to debug the frontend. Setting AJAX-triggering counters to ridiculous numbers is not advised (mainly because it causes server load).
Background | |
Linknets | |
Blur | |
Switches | |
Text | |
TextInfo | |
Timestamp |