Thanks for your thoughtful comments. Let me try to respond to each in turn. (By the way, I'm sorry for the delay in replying to your email -- I've been on holiday for the last two weeks.)
* Concerning the methodological principle and operationalism. A longer paper would be required to properly answer questions about the kind of scientific realism I'm espousing here. First, let me try to express what I think is the difference between realism and operationalism. Here's an excerpt from a short article I wrote with Lucien Hardy (http://arxiv.org/pdf/1003.5008v1.pdf):
"But operationalism is not enough. Explanations do not bottom out with detectors going 'click'. Rather, the existence of detectors that click is the sort of thing that we can and should look to science to explain. Indeed, science seeks to explain far more than this, such as the existence of human agents to build these detectors, the existence of an earth and a sun to support these agents, and so on to the existence of the universe itself. The only way to meet these challenges is if explanations do not bottom out with complex entities and everyday concepts, but rather with simple entities and abstract concepts. This is the view of the realist. Without adopting some form of realism, it is unclear how one can seek a complete scientific world-view, incorporating not just laboratory physics, but all scientific disciplines, from evolutionary biology to cosmology. It is true of course that all of our evidence will come to us in the form of macroscopically observable phenomena, but we need not and should not restrict ourselves to these concepts when constructing scientific theories. For the realist, then, we need an interpretation of quantum theory."
My other beef with operationalism is just a beef with empiricism in general. The empiricist tradition promotes the notion that "why" questions are somehow inappropriate and that we should only ask "how" questions. That's fundamentally wrong in my view. Science is really about providing good explanations. That's why I think that a mere description of the shadows in the cave is not really good science, while an attempt at explaining their various forms in terms of the 2D projections of a 3D shape is good science. But I haven't said what counts as a good explanation. Hopefully, your realist intuitions align with mine and we can agree that certain accounts constitute better explanations than others. I see my methodological principle as a consequence of more general principles about what counts as a good explanation. For instance, one characteristic of a good explanation is that it should be difficult to vary. This kind of consideration is what makes "it was the will of the gods" a bad explanation. (This account of explanation has been promoted by Deutsch in his book "The Beginning of Infinity".) Any element of an explanation that can be varied without empirical consequence clearly doesn't pass the "difficult to vary" test. So my methodological principle can, I think, be motivated by this deeper principle. I haven't really thought too carefully about trying to formalize the notion of explanation, but hopefully this gives you an idea of what I'm after.
So to summarize, I think that with Plato's cave, colour should be excluded because it has no explanatory role to play. Similarly, the third option of neither shape nor colour should be excluded because it doesn't really provide any sort of explanation at all. I should have made it clearer in my article that the methodological principle is meant to help narrow down the options among approaches that espouse scientific realism. It was not intended as an argument for realism against operationalism. More general sorts of arguments suffice for that.
* Concerning Ptolemy and Copernicus. I'm not sure I have enough knowledge of the historical details of these two models to answer adequately. Nonetheless, from what little science history I've read, I have the impression that they did not make the same predictions, in which case the conditions for the applicability of my methodological principle are not met. But let me consider the case of a Ptolemaic-like model which makes precisely the same predictions as the Copernican model. It is really just a redescription of the Copernican model within a coordinate system that puts the earth at rest at the origin (adding whatever epicycles are required to achieve this). Given the way I've defined it, this model must now make precisely the same predictions as the Copernican model. The conditions of my methodological principle are now met. Because the object one puts at the origin (sun or earth) can be varied without empirical consequence, what the principle asserts is that this difference does not correspond to a physical difference. That seems right to me. The difference between the two models is purely a conventional one having to do with the choice of coordinate system.
* Concerning spin in de Broglie-Bohm. The orientation variable in the BST model is not like the colour property of the objects in Plato's cave because the outcome of a measurement does depend on the value that this variable takes prior to the measurement. Suppose that the Stern-Gerlach setup is such that the interaction between the spin and motional degrees of freedom begins at time t and that the location of the particle is measured at time t'. Although the measurement outcome is independent of the orientation at time t', it is not independent of the orientation at time t. This is because the dynamics of the Bohmian position depends on the Bohmian orientation during the interval between t and t'. So I think it is correct to say that the Bohmian orientation of the particle *does* have an explanatory role to play in the BST model. We would only have a good analogy with the colour variable in Plato's cave if we supplemented Bell's model with a Bohmian orientation that was irrelevant to the dynamics of the Bohmian position and irrelevant to the outcomes of any measurement.
* Concerning the distinction between kinematical and dynamical locality. I didn't mean to suggest that a violation of kinematical locality would be innocuous. I share your intuition that it is hard to even make sense of a theory that presumes a background space-time but which posits objects that have holistic properties, i.e. which don't "live" in space-time. However, unlike you, my view is that holistic properties are bizarre for any composite degrees of freedom, even if the components are not spatially separated (like your spin and colour example). I'm not sure exactly what you mean by "your very explication of 'kinematical locality' really presupposes the very notion in question." Are you arguing that one needs to specify a way of separating the whole into components before one can ask about kinematical locality relative to that notion of separation? If so, I don't agree. My own view is that we should require kinematical locality with respect to *every* decomposition of a system into independent degrees of freedom.
Consider psi-ontic models of quantum theory. One might say that one needs a preferred factorization of the Hilbert space into a tensor product of Hilbert spaces to define a notion of separation and thereby assess whether one has kinematical locality relative to this notion of separation. But actually, regardless of the factorization one uses, one finds that there are entangled states relative to this factorization, so one has a failure of kinematical locality for every choice of factorization. On the other hand, one has kinematical locality for the ontic state space of two classical bits regardless of the factorization. Suppose the "natural" factorization is into two bits, denoted a and b. One can also factorize the state space into a and the parity (a+b) mod 2. Or one can factorize into b and (a+b) mod 2. In all cases, the ontic state space of the composite is the Cartesian product of the ontic state spaces of the components.
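The two-bit case is small enough to check by brute force. Here is a minimal sketch in Python; the names (FACTORIZATIONS, is_cartesian_product) are purely illustrative, and I have added the pair (a, a AND b) only as a contrast case of components that are not independent degrees of freedom.

```python
from itertools import product

BITS = (0, 1)

# All four ontic states of the composite, written in the "natural" factorization.
STATES = list(product(BITS, BITS))

# Each candidate factorization maps a state (a, b) to a pair of component values.
FACTORIZATIONS = {
    "a, b": lambda a, b: (a, b),
    "a, parity": lambda a, b: (a, (a + b) % 2),
    "b, parity": lambda a, b: (b, (a + b) % 2),
    "a, a AND b": lambda a, b: (a, a & b),  # contrast case, not independent
}

def is_cartesian_product(factorize):
    """True if the component values range over the full product {0,1} x {0,1}."""
    images = {factorize(a, b) for a, b in STATES}
    return images == set(product(BITS, BITS))

results = {name: is_cartesian_product(f) for name, f in FACTORIZATIONS.items()}

for name, ok in results.items():
    print(f"{name}: {'Cartesian product' if ok else 'not a Cartesian product'}")
```

The first three factorizations each reach all four pairs of component values, so the composite state space is the Cartesian product of the component state spaces in each case. The contrast case never reaches the pair (0, 1), which is just what it means for those two components not to be independent.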
You also say that "as far as I know no serious scientist ever even dreamed of the idea that there might be physical stuff that doesn't live in ordinary space-time." I'm not sure about that assessment. Many physicists have suggested that space-time is not fundamental, but rather is an approximate and emergent description of something more primitive that does not live in space-time. My own view is that causal structure is more primitive than spatio-temporal structure. What I mean by this is that whereas it is usually assumed that part of the definition of a causal relation is that the cause precede the effect in time, one could rather assume that part of the definition of what it is for one event to precede another in time is that one event is a potential cause of the other. Similarly, for two events to be spatially separated means that they cannot be potential causes of one another.
* Concerning Bell's notion of local beables. I'd like to understand better precisely what Bell had in mind. I've always found it odd that in explaining his assumption of locality, he sometimes draws one diagram and sometimes another. In the first sort he draws space-like separated regions, labelled 1 and 2, and refers to a full specification of local beables in a space-time region 3 that screens off region 2 from the intersection of the backward lightcones of 1 and 2. This is the diagram he uses to describe his general notion of local causality. In the second sort of diagram, he draws a squiggly line extending across both backward lightcones, labels it by lambda and makes the comment that lambda need not be a local beable. This is the diagram he uses to explain the notion of locality applicable in the two-wing experiment. It seems to me that only the second diagram avoids assuming kinematical locality. Indeed, I argue that it transcends the kinematics-dynamics distinction. My sense is that Bell's "standard" definition of local causality, the one that refers to the first diagram, assumes kinematical locality, i.e. exclusively local beables. As you know, the definition is this: "A theory will be said to be locally causal if the probabilities attached to values of local beables in a space-time region 1 are unaltered by specification of values of local beables in a space-like separated region 2, when what happens in the backward light cone of 1 is already sufficiently specified, for example by a full specification of local beables in a spacetime region 3..." If there were some nonlocal beables, then these could not be part of the full specification of local beables in region 3. 
Nonetheless, such nonlocal beables could be correlated both with beables in region 1 and in region 2 so that by specifying the values of beables in region 2, one would update the probabilities attached to values of the nonlocal beables and this would in turn lead one to update the probabilities attached to the values of beables in region 1. So conditioning on region 2 *would* lead to an update in region 1 even though the beables in region 3 are fully specified. Given Bell's discussion of the second sort of diagram, it seems to me that he would like to consider this case as *satisfying* locality, and yet it clearly fails to satisfy his standard definition of local causality. So I conclude that his standard definition of local causality folds in a notion of kinematical locality and it is this which fails in the example provided. It is only the definition of locality that accompanies the second diagram which avoids the assumption. Unfortunately, I don't have my copy of Bell's "La Nouvelle Cuisine" with me at the moment, so I can't review precisely what he says. I suspect that he did not have complete conceptual clarity on these distinctions. Maybe you can convince me otherwise.
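The updating argument above can be checked in a toy model. What follows is only a sketch under assumptions of my own choosing, not anything taken from Bell: region 3 is held fixed and so drops out, a single binary nonlocal beable n is uniformly distributed, and the local beables x1 and x2 each independently agree with n with an assumed probability of 3/4. There is no direct influence between regions 1 and 2 in this model.

```python
from fractions import Fraction

# Hypothetical toy model. Region 3 is fully specified and held fixed, so it is omitted.
# n is a nonlocal beable; x1 and x2 are local beables in regions 1 and 2.
P_N = {0: Fraction(1, 2), 1: Fraction(1, 2)}
AGREE = Fraction(3, 4)  # assumed probability that a local beable agrees with n

def p_local_given_n(x, n):
    """Probability of a local beable value x given the nonlocal beable n."""
    return AGREE if x == n else 1 - AGREE

def joint(x1, x2):
    """P(x1, x2), marginalizing over n; x1 and x2 are independent given n."""
    return sum(P_N[n] * p_local_given_n(x1, n) * p_local_given_n(x2, n) for n in P_N)

def marginal_x1(x1):
    """P(x1) with nothing specified about region 2."""
    return sum(joint(x1, x2) for x2 in (0, 1))

def x1_given_x2(x1, x2):
    """P(x1 | x2): the probability for region 1 after conditioning on region 2."""
    return joint(x1, x2) / sum(joint(v, x2) for v in (0, 1))

print(marginal_x1(1))     # 1/2
print(x1_given_x2(1, 1))  # 5/8
```

Specifying x2 = 1 shifts the probability of x1 = 1 from 1/2 to 5/8 even though the local beables in region 3 are fixed, because learning x2 updates what one believes about n. That is the sense in which the example fails the standard definition while involving no influence from region 2 on region 1.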
* Concerning causal structure versus kinematics and dynamics. It is true that one can convert models with different kinematics and dynamics into different causal diagrams. That's what I did for Newtonian and Hamiltonian mechanics, for instance. My claim is that the nonconventional aspect of causal structure is specified by certain features that are common to these causal diagrams, that is, by an equivalence class of causal diagrams. In the case of Bell's notion of locality for Bell-type experiments, the question of whether the ontic state of the pair of quantum systems factorizes or not simply doesn't arise. We just assign a variable lambda to the pair of quantum systems and specify its causal relations to the other variables. The real advantage of causal diagrams is that we don't need to specify where any given variable "lives" in space-time. Indeed, it is better, I think, to infer spatio-temporal relations from the causal relations. That being said, I think more work needs to be done here to properly answer your question.
>That's all I've got. Despite all of these quibbles, I really enjoyed reading the paper, which definitely made me think about new things in new ways, so thanks!
Thanks again for the comments!