Requiring correctness in software
As a follow-up to my previous post, I’m curious to hear of any example of a protocol or specification in software that satisfies these requirements:
- is widely used on standard desktop systems
- has a well-documented specification
- provably rejects all uses that fail to adhere to any portion of the specification
- is not a security protocol (e.g. SSL or Kerberos)
I’m not so confident as to say that there isn’t any such protocol, but I can’t think of one. I should note that I’m excluding security protocols because for those, accepting any sort of invalid data could be considered a security problem, so anything that handles those can’t be liberal in what it accepts. Security protocols can’t have underspecified areas for the same reason.
To give a few random examples:
- RSS was discussed in the previous post
- HTML is definitely a non-starter
- C and C++ compilers definitely accept code that is in gray areas of the specification
- HTTP servers certainly differ in what they accept, even across the same version of the HTTP specification
Of course, the goal of a C compiler isn’t to be liberal in what it accepts, but you’ll never get all C compiler authors to change their code to accept or reject the same code everywhere and to continue to do so from now till the end of time.
This little exercise isn’t trying to show that RSS readers should accept anything. Instead, I’m trying to show that having all RSS readers reject all invalid feeds is likely to prove impossible to maintain in the real world, if only because different readers will disagree about what an “invalid” feed is.
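To make that disagreement concrete, here is a minimal sketch in Python. Both "readers" and their validity rules are hypothetical, invented purely for illustration; they are not taken from any real RSS reader or from the RSS specification. Each one parses the same feed with the standard library's xml.etree.ElementTree and then applies its own idea of what a valid item looks like.

```python
import xml.etree.ElementTree as ET

# A made-up feed whose single item has a description but no title.
FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><description>No title here</description></item>
</channel></rss>"""


def strict_reader_accepts(text: str) -> bool:
    """Hypothetical reader A: every <item> must carry a <title>."""
    root = ET.fromstring(text)
    return all(item.find("title") is not None for item in root.iter("item"))


def lenient_reader_accepts(text: str) -> bool:
    """Hypothetical reader B: a <title> or a <description> is enough."""
    root = ET.fromstring(text)
    return all(
        item.find("title") is not None or item.find("description") is not None
        for item in root.iter("item")
    )


if __name__ == "__main__":
    print("reader A accepts:", strict_reader_accepts(FEED))
    print("reader B accepts:", lenient_reader_accepts(FEED))
```

The two functions give different answers for the same feed, so "reject all invalid feeds" already means different things to them before either one has a bug.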
Marc Said,
August 25, 2003 @ 10:11 am
XML? It’s a file format rather than a wire protocol but the standard requires that parsers reject documents that are not well formed.
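As a small illustration of Marc's point, here is a sketch in Python using the standard library's xml.etree.ElementTree parser. The helper name is_well_formed and the two sample documents are assumptions made up for this example. It only shows one parser refusing a document that is not well formed; it says nothing about whether every XML parser in the wild behaves the same way.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False if it raises."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser refuses the document instead of guessing at a repair.
        return False
    return True


good = "<feed><title>Example</title></feed>"
bad = "<feed><title>Example</feed>"  # <title> is never closed

if __name__ == "__main__":
    print("good:", is_well_formed(good))
    print("bad:", is_well_formed(bad))
```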
Brooks Moses Said,
August 25, 2003 @ 9:29 pm
In the case of C compilers (well, ok, I’m drawing from experience with Fortran compilers, but I can’t see that C compilers would be any different), the intersection of adherence to the standard and intercompatible code is sort of an interesting case; the standard allows for implementation-specific extensions, and intentionally has cases where the standard does not explicitly say one way or the other whether a bit of code is valid.
But, beyond the “undefined” cases, there is a set of things that the standard explicitly says are to be reported as errors. In my experience on comp.lang.fortran, when a case comes up where a compiler does not report one of these as an error, the usual suggested response is a bug report to the vendor.
Thus, if XML is being treated as a “language” rather than a “protocol”, I think it’s important to make a distinction between cases where the standard allows implementations to vary, and cases where the standard explicitly disallows a given variation. (Even if it’s being considered a “protocol”, I think this is important, but protocols usually have much smaller amounts of standard-allowed variation.)
This also raises the point that the question of what constitutes “non-standard-conforming code” is fuzzy — beyond the obvious requirement that it not violate any of the “compilers must reject this” parts of the specification: is it code that is valid on any possible implementation of the standard? Is it code that is valid on all existing implementations? Is it code that is valid on any possible implementation of the standard plus a common set of standardized extensions? Is it code that is valid on at least one existing implementation? Is it code that is valid on at least one hypothetical implementation? Or somewhere in between?
So I do think that, to some extent, it’s inappropriate to consider compilers to be a “non-starter” on account of their taking wide advantage of the places where their standards allow for variation; their conformance to the standards (and rejection of non-standard-conforming code) ought to be measured only by whether they reject code in the places where the standard explicitly requires them to reject it. And, in my (admittedly limited) experience, most compilers do not necessarily do this perfectly, but their programmers do consider failures in that regard to be grounds for legitimate bug reports, at least in cases where there isn’t a specific compiler override flag that causes the compiler to behave in some nonstandard way. And the existence of such flags, I suppose, probably argues your point.
I don’t know XML well enough to know whether this distinction applies to it or not; I get the impression that XML is a fairly simple specification by itself, and that it passes off a lot of responsibility for details onto the document-level specification or the equivalent. If the “well-formed” requirement that Marc refers to covers everything that would violate the XML standard as such, it does seem that the standard doesn’t have nearly as much room for standard-allowed, implementation-dependent variation as, say, Fortran does.
(And my apologies for having pontificated at such interminable length; I can’t figure out anything to cut….)