Monday, July 9, 2012

Monday Morning Monitor Club

In the late 1990s, I held a meeting on Monday mornings. We had a different topic each week, usually focused on the VM monitor system service, but also covering a variety of related performance topics. I would take portions of those meeting topics and roll them into presentations for SHARE called VM Performance Tidbits. At some point, I think we moved the meetings to Tuesday, but continued to call it the Monday Morning Monitor Club (MMMC).

I have fond memories of those times when people gathered to learn and share. We have less time to do that these days, which is a shame. The MMMC included performance people and CP developers, and service folks as well. Seems odd in ways that people would gather on a regular basis to talk about such a focused topic. (It didn't seem to get any hits when I included it in my online dating profile: enjoys hiking and dissecting z/VM monitor records.) And yet, there were faithful members showing up each week.

VM has historically done performance measurement well, thanks to collaboration from the community. The current z/VM monitor system service is fairly elegant, though many of us take it for granted. Within the z/VM Development Lab, the Performance Evaluation Team has a presentation that they give to developers about when/how/why to create monitor data, called What is Good Monitor Data? (yes, the jokers like to swizzle that to What Good is Monitor Data?). Despite the jokes, the topic is taken seriously.

Most of what one needs to know about the performance of a z/VM system is in the monitor data stream. It is a very complete picture of resources used by the virtual machines and the z/VM system itself. In addition, it includes data on delays to virtual machines and their effect on one another. The VM Monitor system service also reaches up and down for additional data. Various interfaces exist for z/VM to reach down into the firmware and hardware to get data, and the appldata interface allows it to reach up into the virtual machine at very low overhead.

The VM Monitor system service is a low overhead data collector. One nice approach it uses is to place the data in memory shared between the Control Program and one or more virtual machines. This allows CP to put the data right in the shared memory and avoid data moves for each reader of the data. The appldata interface mentioned earlier works off a similar principle by allowing CP to have access to guest memory and pull that information from the guest directly.
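
To make the no-copy idea concrete, here is a toy sketch in Python. It is not how CP or the monitor system service is implemented; the class `SharedMonitorBuffer` and its methods are hypothetical names made up for illustration, and a plain `bytearray` stands in for the shared memory. The point is only that the producer writes each record once and every reader gets a read-only window onto the same bytes rather than its own copy.

```python
from typing import Tuple


class SharedMonitorBuffer:
    """Toy stand-in for memory shared by one producer and many readers.

    Hypothetical illustration only; not the z/VM monitor implementation.
    """

    def __init__(self, size: int) -> None:
        self._buf = bytearray(size)  # the "shared" storage
        self._used = 0               # next free offset

    def write_record(self, payload: bytes) -> Tuple[int, int]:
        """Producer side: place a record in the buffer exactly once.

        Returns the (offset, length) a reader needs to locate it.
        """
        end = self._used + len(payload)
        if end > len(self._buf):
            raise BufferError("shared buffer is full")
        offset = self._used
        self._buf[offset:end] = payload
        self._used = end
        return offset, len(payload)

    def view(self, offset: int, length: int) -> memoryview:
        """Reader side: a read-only window onto the same bytes, no copy."""
        return memoryview(self._buf).toreadonly()[offset:offset + length]


# One write, two readers looking at the same storage.
buf = SharedMonitorBuffer(64)
location = buf.write_record(b"sample-1")
reader_a = buf.view(*location)
reader_b = buf.view(*location)
```

Adding a third or fourth reader costs another `view` call, not another copy of the record, which is the property the shared-memory design is after.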

The VM Monitor system service is very flexible in terms of what data you can ask to be collected: which virtual machines, devices, types of data, etc. Whoa. I'm getting carried away on this subject. But do you see why I'd look forward to a Monday Morning?

The key reflection here is that the VM system was built around the idea of capturing this great data to help solve problems and provide information for planning. It's another piece of the VM DNA. The designers realized that the system would have problems and data would be needed to help in the analysis. This was also a key part of a white paper from the SHARE VM Performance Project in the 1980s. Now, they could have taken the approach used elsewhere to deal with these scenarios:
  • Reboot and see if the problem goes away.
  • Buy a bigger/faster machine.
  • "We were unable to recreate that problem, can you send us your private data and 10 TB of disks."
I know I'm biased, but being able to do what we can do with 1 hour of monitor data makes me happy. I know there will always be a field or two missing, and we don't always get first failure data capture. But the idea that we strive for those is important. And it's something that none of the college textbooks touched. I learned it and lived it in VM.
