Re-evaluating AI fundamentals implies rejecting the status quo, which in AI’s case is computationalism – the idea (theory, dogma, myth) that the mind is an executing computation. If we clear the slate and begin at the front end of perception, what is present there? We accept Searle’s view that the objects emitted by sensors, which he calls symbols, are semantically vacant – they do not indicate what was sensed. But if a computer is to think, then sensory-symbol streams must contain what is needed to build internal representations of external object types (we can assume this as a starting point). So what does a sensory-symbol stream contain? Discrete objects that have values of a property (e.g., the property of shape, where the different shapes are the values). Searle says that’s all. But there is in fact more. Sensory-symbol streams also contain instances of the relationship of contiguity in time between symbol tokens. One value arrives after another at the central system. So (1) the tokens, (2) the values of the property of the tokens, plus (3) the temporal relationship between the tokens must be sufficient (given appropriate algorithms and structures) to create an inner semantics.
The (very) weird theory proposed here says that this relationship of temporal contiguity – one object following another in time as they pass a point or line or through a surface – explains two important problems of AI:
- Perception. How a system could learn about the world by processing the intrinsically meaningless objects emitted by sensors (the severe problem exposed by the Chinese room argument), and
- Generalization. How such a system could have human-like general knowledge (the base problem behind philosophical objections to the computational theory of mind, including the frame problem (McCarthy and Hayes, 1969), the problem of common-sense knowledge (Hubert Dreyfus, 1965), and the problem of combinatorial explosion (James Lighthill, 1973)).
It’s going to be fairly hard to explain this. But the basic ideas to be explained are:
(a) Building a semantic structure. An inner semantic structure can’t be built from (let’s call them) “symbols” alone, but it can be built from sensory symbols plus permanent records of the temporal relation of contiguity between the symbols in the sensory stream (the permanent records being called connections, or for computers, typically pointers), and
(b) Embodying generalization. Programming a computer starts at the detail level and then seeks to achieve generality through sheer quantity of conditionals. This quickly leads to the three problems mentioned above. The relation of temporal contiguity, however, starts at the most general level, then becomes progressively more detailed as the quantity of recorded instances of contiguity increases. This is because the most often repeated is the most general: the most often repeated is stored first, then the next most often repeated (the next most general), and so on.
Hence a system that embodies the principle of temporal contiguity starts at the general and develops detail, whereas the method of using human knowledge to define the causation of the system (by programming it with conditionals) starts at the detail level and seeks generality through quantity – but, because of the quantity of conditionals needed, never truly achieves it.
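As a minimal sketch (Python; the stream, function name, and data values are all hypothetical, chosen only for illustration), storage ordered by frequency of repeated contiguity – most general first – might look like:

```python
from collections import Counter

def contiguity_counts(stream):
    """Count every instance of one data item immediately following another."""
    return Counter(zip(stream, stream[1:]))

# Toy sensory stream: the contiguity A-then-B recurs most often,
# so it is the most general regularity in this stream.
stream = list("ABXABYABZ")
counts = contiguity_counts(stream)

# Store the most often repeated contiguities first (most general),
# then progressively rarer (more detailed) ones.
by_generality = [pair for pair, _ in counts.most_common()]
```

Nothing here encodes any prior knowledge of what A, B, X, Y or Z mean; generality simply falls out of repetition.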
(c) Content of a sensory stream. Assuming computers will think, everything they intrinsically learn about the world is going to come to them from attached sensors. Specifically, it’s going to be embodied in the stream of objects the sensors emit into the internal world and send to the computer, i.e., sensory streams. So what do sensory streams contain? Searle and the Chinese room argument say: just symbols. But in fact they contain three types of “thing”:
- the objects emitted by the sensor (which Searle calls symbols),
- the relationship of temporal contiguity between these emitted objects, and
- repetition of instances of the relationship.
The issue is to explain how an internal semantics could be built from these three things.
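A toy illustration (Python; the shape values are hypothetical stand-ins for sensor output) of pulling the three types of “thing” out of one stream:

```python
from collections import Counter

# A hypothetical sensory-symbol stream (values of the property of shape).
stream = ["circle", "square", "circle", "square", "triangle"]

tokens = list(stream)                          # 1. the objects the sensor emitted
contiguities = list(zip(stream, stream[1:]))   # 2. temporal contiguity between them
repetitions = Counter(contiguities)            # 3. repetition of instances of the relationship
```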
1. Abandon the established wisdom which says:
- intelligence is computational
- computers only compute
“Computation” should be defined or at least the differences between computing and other modes of processing explained.
2. Start at the interface between the machine and the world. The principles that operate at the front end of perception need to be identified, and these will indicate how sensor data is to be processed when it arrives at the machine.
These principles need to be adequately understood then realized in algorithms. If they’re not embodied in the machine software that processes sensor data as it arrives, whatever happens downstream will probably be wrong.
This approach of starting at the machine–world interface contrasts sharply with that adopted by early mainstream AI. For instance, Turing’s most widely read paper, Computing Machinery and Intelligence – the mission statement of AI, the roadmap for future AI research – doesn’t even mention sense perception.
The first proposed principle:
All knowledge gained from experience is reducible to instances of the relation of temporal contiguity.
Because this is counter-intuitive, it will probably be hard to explain convincingly. But in any event, it needs to be justified: arguments must be put as to why it might be right.
3. Discuss the classical problems of AI with a view to determining whether the proposed principle (or principles) solves, avoids, or mitigates any of them, including:
- frame problem
- combinatorial explosion problem
- problem of common-sense knowledge
- problem of generalization
- symbol grounding problem
- Chinese room argument
Since the machine with human-like intelligence will have the same or similar philosophical problems as human intelligence, an attempt should be made to explain why the machine is also confounded by:
- the problem of solipsism
- the problem of universals
- and a few other issues
Principle of temporal contiguity
The principle of temporal contiguity says that knowledge is reducible to instances of one item of sensory data followed in time by another – that knowledge gained from experience is determined by the adjacency in time of items of data emitted from sensors. This is a temporal juxtaposition, as the items are received one after the other at the central system. (Contemporaneity and multiple sensor data streams are discussed separately.)
Associative. We can see that the relation of temporal contiguity is associative in that one term of an instance of the relation (a data item) is associated in time with – is temporally next to – another data item: the other term of the instance of the relation.
Hence if the data items ABC are received in that temporal order by the core machine, A is associated with B (an ordered association – A comes first in time), B is associated with C, AB is associated with C, and A is associated in time with BC. The data items can be atomic (A, B, C) or compound (AB, BC).
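One way to enumerate these ordered associations is to split the stream at each point in time and pair each run ending at the split with each run starting at it (a sketch; this splitting scheme is one possible reading of the example, not a definitive formulation):

```python
def ordered_associations(stream):
    """Every ordered association between temporally contiguous data items,
    atomic or compound: for each split point, pair each contiguous run
    ending at the split with each contiguous run beginning at it."""
    n = len(stream)
    pairs = set()
    for split in range(1, n):
        for start in range(split):
            for end in range(split + 1, n + 1):
                pairs.add(("".join(stream[start:split]),
                           "".join(stream[split:end])))
    return pairs

# For the stream A, B, C this yields exactly the associations in the text:
# A with B, B with C, AB with C, and A with BC.
```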
The principle of temporal contiguity is associative.
(Associative theories of mind, or aspects of the mind, go way back to at least Aristotle.)
Non-teleological. The principle is independent of (cares nothing about) the quality of the terms of the relationship. The principle is the same irrespective of whether the stream is ABC, XYZ, or any other sequence of data items. The properties of the terms are irrelevant. So if the principle is realized in an algorithm, the algorithm will ignore the nature of the data items processed and react only to their juxtaposition in time.
A great example of teleological design is Rodney Brooks’ (supposed) intelligence-without-representation theory, exemplified in his robotic insects. The “insect” has feelers and leg motors. When the feelers detect rough ground they emit a signal (call it A), and on smooth ground, a B. The motor signal X raises the legs half way, and Y raises them all the way up. A human knows all this, so wires the insect as follows: when an A is received by the control unit, it emits a Y to the motors. When a B is received, the unit sends an X to the motors.
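The wiring just described reduces to a conditional (a Python sketch; signal names A, B, X, Y follow the text, and the function name is invented for illustration):

```python
def control_unit(feeler_signal):
    """Teleologically wired: the pairing of inputs to outputs is
    human knowledge about the world baked into conditionals."""
    if feeler_signal == "A":   # rough ground detected
        return "Y"             # raise the legs all the way up
    if feeler_signal == "B":   # smooth ground
        return "X"             # raise the legs half way
    return None
```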
This is teleological. Human knowledge defines the causation of the insect’s behaviour (the pre-existing causation of the hardware is a separate issue). The human knows stuff about the world and wires the insect accordingly. It’s like the divine theory of life on Earth known as intelligent design. God knows stuff about the Universe and, by the exercise of His knowledge, designs all life on Earth. Teleological – design by knowledge.
The theory of evolution is a vastly simpler explanation of life, and says nothing about the particular design of species. It is non-teleological. Same with the principle of temporal contiguity. It says nothing about the content of inner structures built by algorithms that embody the principle.
So programmatically, this is how the principle of temporal contiguity can be realized. The world causes changes in a sensor, and depending on the changes, the sensor dispatches a stream of certain data units to the central processor. A simple associative algorithm processes these data units as they arrive. The algorithm records which are together in time.
So if B follows A in time in the stream from the sensor, this temporal contiguity is recorded (made timeless) within the system: the initial algorithm establishes some sort of permanent storage relationship that takes the place of the temporal relationship that existed as the units arrived at the core system.
If Y follows X, the algorithm treats these data units the same way. In this case, what is stored and related permanently is Y and X, not B and A. So the content of the resulting inner structure is different, but the algorithm that processed the units as they arrived from the sensor is the same. Because the algorithm does not contain within itself any samples of sensor data units (which it would need to contain if it were to react differently to qualitatively different data units), the algorithm is very small and simple.
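A minimal sketch of such an algorithm (Python; names are hypothetical, and a dict of lists stands in for whatever permanent storage a real system would use):

```python
def record_contiguities(stream, store):
    """A 'blind' associative step: permanently record, for each arriving
    data unit, that it followed the previous unit. The code never inspects
    what the units are; it reacts only to their adjacency in time."""
    previous = None
    for unit in stream:
        if previous is not None:
            # the permanent record (a "connection" or "pointer") that
            # replaces the fleeting temporal relationship
            store.setdefault(previous, []).append(unit)
        previous = unit
    return store

structure_ab = record_contiguities(["A", "B"], {})   # {'A': ['B']}
structure_xy = record_contiguities(["X", "Y"], {})   # {'X': ['Y']}
# Different contents, same tiny algorithm: nothing in it names A, B, X or Y.
```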
The principle of temporal contiguity is not teleological.
Because such algorithms do not (cannot) react differently to qualitatively different data units they process, they can be called “blind”. An equivalent in the Chinese room is Chinese ideograms inscribed in invisible ink on the cards that drop from the slot in the door. The ideograms are still there, but the man in the room cannot see them.
In a very simple case in the Chinese room, its ontology could be improved: the man could have pieces of string and a pot of glue, and the rule book could instruct him to glue a piece of string joining each pair of temporally adjacent cards as they fall from the slot, thus making the fleeting adjacency in time permanent in storage.
This sure seems like a trivial algorithm.
Not “reasoning”. A core concept in AI programming is called “reasoning”, or “deductive reasoning”. “Reasoning” in this sense mainly means the execution of conditionals created by humans. Such conditionals typically exemplify the general form: IF INPUT = “A” THEN OUTPUT = “B”. Since this is teleological – a human uses their knowledge of the world to pair A with B (and not, say, with C, D, E, …) – an executing algorithm that realizes the principle of temporal contiguity is not performing deductive reasoning. If you want to equate deductive reasoning with computation, then the principle of temporal contiguity is not computational.
A process that realizes the principle of temporal contiguity, to the extent of that realization, is not a reasoning process.
Syntax. In AI, syntax means (adopting the convenient myth that computers process symbols – shapes that have meanings) symbol shape, and syntactic processing is processing that depends on symbol shape. For example, the processing done by the first Turing machine Turing describes in his 1936 paper.
Adopting, because it makes writing about computers easier, the myth that computers process symbols: the principle of temporal contiguity does not react to the shapes of the sensory symbols it processes. Hence, the principle is not syntactic. And any algorithm that realizes the principle, to the extent of that realization, is not syntactic – even though the algorithm or algorithm snippet is executed in a computer.
For example, in the Chinese room, Chinese ideograms inscribed on cards drop from a slot in the door and onto the floor of the room. If the rule book says “Starting with the first card to drop, put it in basket A, then the next card in basket B, then continue to put alternate cards in baskets A and B”, this rule is non-syntactic. The rule says nothing about the shape of the symbol on the card. But if the rule says, “If the shape on the card is <shape or description of shape goes here> then do <such-and-such>”, then the rule is syntactic.
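The contrast can be sketched in code (Python; both rules, the basket names, and the stand-in ideogram are hypothetical illustrations of the two rule types above):

```python
def non_syntactic_rule(cards):
    """Alternate cards between baskets A and B; never looks at the shapes."""
    baskets = {"A": [], "B": []}
    for i, card in enumerate(cards):
        baskets["A" if i % 2 == 0 else "B"].append(card)
    return baskets

def syntactic_rule(shape_on_card):
    """Reacts to the shape inscribed on the card."""
    if shape_on_card == "中":          # a stand-in for <shape goes here>
        return "do such-and-such"
    return "do nothing"
```

The first function runs identically whatever is printed on the cards; the second contains a sample shape inside itself and branches on it.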
The principle of temporal contiguity is not syntactic.
Semantics. AI has wrongly used the term “semantics” for decades, from at least Marvin Minsky’s (Ed.) widely read 1968 text, Semantic Information Processing, as even he indicates when saying (page 26) “…one cannot help being astonished at how far they did get [the programs described/listed in the book] with their feeble semantic endowment”. In fact they have zero semantic endowment. The book is about syntax, not semantics.
We accept that minds have a semantics (semantic content). The semantics of the principle of temporal contiguity is fascinating. Prima facie there could not possibly be any semantics there. Where the semantics of a computer could come from has been an enduring mystery, and the dire conclusion of the Chinese room argument is that for a fundamental reason computers cannot have a semantics. Hence, the principle, being at the front end of perception, being at the interface between the machine and the world, has to explain the seemingly inexplicable – how a computer could get a semantics from the data stream emitted by sensors.
John Searle. Searle would say that any associative algorithm is syntactic, because computers by definition have a syntax alone, and “There is no way the system can get from the syntax to the semantics” (Minds, Brains and Science, page 34), and “To repeat, a computer has a syntax, but no semantics” (page 33), and “digital computers insofar as they are computers have, by definition, a syntax alone” (page 34). A point Searle has repeated for almost 40 years.
But he is wrong about two things. Firstly, the associative algorithm of interest is not syntactic, as noted above. It ignores the qualities of the things it processes (Searle talks about symbols too, where syntax is symbol shape, so in his terminology we say that the algorithm ignores the shapes of the symbols it processes). So he is wrong when he calls computers purely syntactic devices and says that computer operation “consists of a set of purely formal, syntactical operations” (page 35). The associative algorithm is not syntactical.
That the associative algorithm has no syntax might seem like a huge problem, but in fact it clarifies the situation greatly. With no syntax (not reacting to the syntax of what it processes), if the algorithm is going to facilitate the creation of semantic content inside the machine, then whatever the algorithm does have – syntactic processing not being one of those things – must explain how a computer could get a semantics. However it reacts to what it processes, that reaction must explain computer semantics – even though the reaction is so simple, and even though nothing in a sensory data stream even remotely seems to be intrinsically about the sensed world.
The second thing Searle is wrong about is that computers by definition cannot have a semantics (unless by “computer” he means something other than the things on people’s desks – but Searle does mean the things on people’s desks). The challenge is to understand how an algorithm that processes data non-syntactically and non-teleologically – i.e., that reacts the same way no matter what the shape of the symbol processed, to use Searle’s terminology – could yield semantic content inside the machine. One object of the radical theory is to explain this.
If it is shown that the principle of temporal contiguity explains how a computer could get an intrinsic semantics – that would be truly amazing.
Initially, I’d like to look at the causation between the world and the data items emitted by sensors, and in doing this suggest a simplified sensor. Hopefully the simplification won’t be too much of a problem and will retain the essential elements of sensors generally.
If we are talking about causation then we end up needing to take some sort of position on the question of internal representations. Understanding intentionality – how we know things about the world – has been a core and unresolved theme of philosophy and cognitive science for yonks. Positions have ranged from denying any internal representations that mediate between consciousness and the world (positing instead some sort of direct connection), to symbolic theories, to causal theories, and more.
Here, all I want to talk about is causation. And for a strange reason, I want to posit internal representations, that is, something – some internal structure with content of some sort – that is causally related to elements of the external world, where the relationship is called “represents”, or x-represents-y, where y is something out there and x is something internal. This is only an initial characterization of the idea of causal representation.
But if a causal theory of representation is to be countenanced, then objections to causal theories need to be considered. I’ve just re-discovered a great paper on representation, “Representation: Where Philosophy Goes When It Dies”, by Peter Slezak, which to me nails a lot of important issues.
Specifically, the problem of misrepresentation needs to be addressed, and I’ll try to do that.