Chris,
If you are up for some math, the good news is that, if you take advantage of the opportunity, you will soon be able to learn something pretty cool: special relativity (SR). I know you took physics in high school that touched on the subject, but it's clear that you didn't really learn SR at that time. After all, most high school students don't.
Like I said, if you analyze it in any given inertial frame, you get the same result for the age difference.
The various explanations you've encountered that seem so contradictory to you are actually all correct. Relativity is like that: A lot of things that you might expect to be different are equal, while a lot of things that you might expect to be equal are different.
This Wikipedia article should help:
http://en.wikipedia.org/wiki/Twin_paradox
You can tell different "stories" about what happens just by choosing different inertial frames to analyze it in, or indeed by getting more fancy like Einstein did and using accelerated frames, but all observable results are the same for all such choices. You might feel that in reality there must be only one true story, and I'm inclined that way myself, but all the stories that SR tells are experimentally equivalent and that's what's at issue here.
You would need to learn the actual math to *really* make sense of it, not just read explanations. Luckily, the math is not that hard if you can handle algebra and geometry. You can derive it all yourself using Einstein's two postulates (the laws of physics are the same in all inertial reference frames, and the speed of light is the same in all of them too), but I don't recommend you try it without a textbook handy. I remember doing that exercise in school, pen and paper, and yes, I got the standard results.
One main point you are missing involves the relativity of simultaneity:
http://en.wikipedia.org/wiki/Relativity_of_simultaneity
Thus in your Fig. 1, with clock A being distributed over a long distance in the direction of motion, if all parts of it read the same time in the frame of clock A, then in the frame in which clock B is at rest the different parts of clock A all read different times. It is exactly like the train-and-platform example in the Wikipedia article. If clock B is the platform, then from the viewpoint of clock A, different parts of clock B read different times (assuming that clock B is also long). This is how they can BOTH think the OTHER guy's clocks are the ones going slow: When they compare a FIXED part of one clock to the nearest part of the other clock at every moment in time, because of the motion it's a *different* part of the second clock at every moment, and it's kind of like comparing apples and oranges.
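If you want to see the relativity of simultaneity in numbers before working it out by hand, here is a minimal sketch in Python. It applies the standard Lorentz transformation to three events that are simultaneous in one frame but sit at different positions along the direction of motion. The function name `lorentz`, the relative speed of 0.6c, the positions, and the choice of units with c = 1 are all my own assumptions for illustration; they are not taken from your figure.

```python
import math

def lorentz(t, x, v, c=1.0):
    """Transform event (t, x) into a frame moving at velocity v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_prime = gamma * (t - v * x / c**2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime

# Assumed example values: relative speed 0.6c, units where c = 1.
v = 0.6

# Three parts of a long clock, all reading t = 0 in the clock's own frame.
events_in_A = [(0.0, x) for x in (0.0, 1.0, 2.0)]

# Times assigned to those same three events in the other frame.
times_in_B = [lorentz(t, x, v)[0] for t, x in events_in_A]

for (t, x), t_b in zip(events_in_A, times_in_B):
    print(f"x = {x:.1f}, t = {t:.1f} in frame A  ->  t' = {t_b:.3f} in frame B")
```

The three events share one time in the first frame, but the transformed times differ from each other, and the gap grows with the separation along the direction of motion.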
If they try to get around that by comparing only two small point clocks, then they need either to send signals at light speed or less to report the time, which introduces the needed time differences, or to send one clock on a round trip by accelerating it, and that's the "twin paradox". Does the math really work? Get a pen and paper and you can see that it does.
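As a rough check on the pen-and-paper work, the round trip can also be sketched in Python. This is only an idealized version under my own assumptions: the turnaround is instantaneous, the traveling clock moves at 0.6c, the trip lasts 10 time units in the stay-at-home frame, and c = 1. The helper names (`lorentz`, `proper_time`, `age_difference`) are made up for this example. The sketch transforms the departure, turnaround, and reunion events into several different inertial frames and adds up the elapsed proper time along each clock's path.

```python
import math

def lorentz(t, x, u):
    """Transform event (t, x) into a frame moving at velocity u (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return gamma * (t - u * x), gamma * (x - u * t)

def proper_time(events):
    """Sum the proper time along straight segments joining consecutive events."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - dx * dx)
    return total

def age_difference(v, T, u):
    """Stay-at-home age minus traveler age, analyzed in a frame moving at u."""
    home = [(0.0, 0.0), (T, 0.0)]
    traveler = [(0.0, 0.0), (T / 2, v * T / 2), (T, 0.0)]
    home_u = [lorentz(t, x, u) for t, x in home]
    traveler_u = [lorentz(t, x, u) for t, x in traveler]
    return proper_time(home_u) - proper_time(traveler_u)

# Assumed example: traveler speed 0.6c, trip lasting 10 units in the home frame.
frames = (0.0, 0.6, -0.6, 0.3)
differences = [age_difference(0.6, 10.0, u) for u in frames]

for u, d in zip(frames, differences):
    print(f"frame velocity {u:+.1f}: age difference = {d:.6f}")
```

Each frame tells a different story about which leg of the trip carries the slowdown, yet the age difference at the reunion comes out the same in all of them, up to floating-point rounding.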
Sincerely,
Jack