Tricyclic Differential Engine - Recursive (TDE-R)

What is wrong with Cognitive Science?

Cognitive Science has some structural problems which it does not seem to be able to escape from. These are-

1. It can't seem to deal directly and effectively with subjective phenomenology eg consciousness.

2. Its progress gets 'road-blocked' by famous opinion-makers (eg Jerry Fodor, David Chalmers) who sometimes reverse their opinion or otherwise change their stance mid-career.

3. It is prosecuted by the very same machine it purports to study, ie the human brain and language. This is a unique situation, one that no other branch of science must suffer under.

4. It has (arguably) had to wait until the middle of the 20th Century, when humans first made an artificial 'thinking machine' (ie a computer).

This last point is predicated on the contentious principle that until humans make something, they cannot understand its purpose. It is as if only by designing and building something themselves can they understand its construction. This principle has found recent resonance in an old argument about whether the ancient Greeks could see the color blue.

Blue was a latecomer among colors used in art and decoration. This is probably due to the practical difficulty of making 'fast' blue dyes. Similarly, humans first made practical water pumps at about the same time as the first accounts of the heart as a pump appeared. There seems to be a pattern here, a pattern which was continued with the computer. Once people struggled with, and overcame, the problems associated with designing a machine capable of moving and categorising realistic amounts of information in real time, they were in a much better position to appreciate the difficulties involved with the evolution of such a machine.

Yet even now, when computer design has reached the state of a 'high art' (eg my iPad), there is an institutional reluctance (and hence, popular ignorance) in acknowledging that what the brain does is, by and large, common-or-garden computation. Several elements of this attitude can be identified-

1. Before we understand something we need to make it. The mind is a computer, a virtual situational model supported by an information processing mechanism. After we have made many computers, and thoroughly understand how they work (eg by reference to the Turing Machine and other canons), it then makes no sense at all to deliberately NOT refer to cybernetic/computer concepts when analysing the mind and brain. In spite of Bernard Baars' crusade for consciousness, this is a trap into which he also falls all too easily. In 'The Timing of the Cognitive Cycle' by Madl, Baars & Franklin, there is little if any direct or indirect use of computer terms or explanatory metaphors. Instead, the paper risks a 'paucity of inputs' type attack by using only neuroscientific and pseudo-clinical ('bottom-up') ideas and arguments.

2. There is a general ignorance in the literature about the teleological (goal-oriented) aspects of computation. If teleology ('end use' or 'design purpose' in common parlance) is deliberately left out of the argument because some consider it too semantically troublesome, general purpose computation will never equal cognition.  The rise of declarative computing (eg you specify the shape of the data that you want, as in XML, not the method of its transformation, as in Java) represents a reversal of this trend. In TDE theory, the teleological aspects of any machine are encoded in the topmost layer of Marr's Trilayer- see Figure below-


3. There is a general ignorance about the true nature of software. What is it? It is, in a very real sense, the very thing that Arthur Koestler meant when he named his famous book, 'The Ghost in the Machine'. It flows like water to 'fill up' RAM, yet is quite 'crystalline' when the individual data elements are examined. However, the most important way of viewing software is as a virtual structure and/or mechanism. It uses a form of structured human speech as an architectural modelling toolkit.

Consider a multi-grip, or other simple hand tool. Its intended purpose is fixed by its 'hard' physical design elements, even though it may be mis-used as a hammer, or a lever. Now imagine we have created a hyper-tool. It consists of two parts- (a) a complete range of physical parts that fit the mechanic's hands and some that fit the workpiece (eg nuts of various sizes), and (b) a complete range of joining brackets, whose intricately machined surface shapes allow almost all of the parts to be assembled with one another, to be 'mixed and matched', to create every possible tool a mechanic could ever want, and some more she hasn't even thought of yet.

Imagine sets of simple machine parts each placed in a bin, next to a range of sets of joining brackets, painted a suitable color to guide the novice user. If two parts 'A' are joined by a bracket 'B', a pliers results. If a bracket 'C' is used, a similar tool, a multi-position pliers, is created, and so on. Roughly speaking, computer data corresponds to the set of parts in bins, while application software corresponds to the joining brackets. This analogy is what people mean when they say a computer is a 'universal tool'. It can create universally applicable models. This failure to embrace teleology is perhaps an accidental, though quite unfortunate, side-effect of history. It has also led to a regrettable confusion between the concepts of modelling (predicting reality) and control (connecting those predictions to a real situation). In the early days of machine computation, the words 'computation' and 'cybernetics' were more-or-less synonymous. This led the famous (late) vision researcher David Marr to use the adjective 'computational' to describe the topmost layer of his famous tri-layer* hierarchy, the layer which encodes the purpose or goal of a device or process. This layer should be called teleological, not computational- even famous people like David Marr get some things wrong occasionally.

4. There seems to be a refusal to let go of outmoded ways of thinking about 'subjective' data representations. These 'bad thought habits' act in the background to prevent AI researchers moving forward. Worse than that, they allow cognitive philosophers like David Chalmers to cast doubt on the whole AI exercise by the mischievous use of little mind gremlins called qualia. These are so defined as to defy easy 'mental definition by analogous visualization', the method most of us use to make sense of neologisms. However, the users of such problematic terms are either unwilling to acknowledge, or unaware of, their violation of several implicit rules of language. Their goal is not our enlightenment, but their own empowerment.

Another example of a counterproductive concept is the Cartesian Theater. It is relatively easy to find references to the 'Cartesian Theater' with its dysfunctional homunculus-inside-a-homunculus imagery. With most of these references, there is no clear label saying 'DO NOT USE - THIS IS WRONG', yet the Cartesian Theater is reprinted again and again, by the magic of internet idiocy. Even Descartes intended it as a paradox.

The correct imagery is something called the 'Situation Image' (see below). There are earlier references to a similar concept called 'image' in the TOTE literature of the 1950s-60s. The solution to this bad thought habit was first found by Jacob von Uexkull, an Estonian semiotician, ie a person who studies information use in natural systems. His (German) name for the SI was 'umwelt', or subjectively perceived surroundings. Ferdinand de Saussure, C.S. Peirce and Jean Piaget all created concepts which, in three different study domains, confirmed von Uexkull's general findings. Somehow, this vital piece of research seems to have been ignored by an entire generation of AI scientists, a point made by William Powers, the most recent proponent of what he calls 'Perceptual Control Theory' (PCT). The reason for this amnesia is an all-too-human one- the goal of (most) computer manufacturers is to make a profit, not find the truth. Artificial Intelligence (AI) has always been a fringe activity for the computer industry, albeit a highly public one. The modern computer is a throw-away consumer appliance first, intellectual prosthetic second.

5. There is, however, a real problem in the mapping between electronic (binary arithmetic and bistable semiconductors) and biological substrates. Do real-life neural nets use binary arithmetic? Almost certainly not. But the basic idea behind the use of binary - that discrete representations are more robust against per-stage noise incursion and subsequent re-amplification - is much harder to ignore. Let's rephrase this question in a more general form- do real-life neural nets use discrete data representations? Almost certainly!
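The robustness argument can be made concrete with a small simulation. What follows is only a sketch under stated assumptions- the chain of 50 stages, the noise level of 0.05 and the function names relay and mean_abs_error are all invented for illustration, and nothing here is a model of real neurons. A one-bit value is passed down a chain of noisy stages, once as a raw analog level and once with each stage snapping the value back to the nearest discrete level before passing it on.

```python
import random


def relay(value, stages, noise, regenerate, rng):
    """Pass a signal level through a chain of noisy stages."""
    for _ in range(stages):
        value += rng.gauss(0.0, noise)  # every stage adds its own noise
        if regenerate:
            # Discrete scheme: snap back to the nearest legal level (0 or 1),
            # so the noise picked up at this stage is not handed on.
            value = 1.0 if value >= 0.5 else 0.0
    return value


def mean_abs_error(regenerate, trials=2000, stages=50, noise=0.05, seed=1):
    """Average distance between what was sent and what arrived."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sent = float(rng.randint(0, 1))
        received = relay(sent, stages, noise, regenerate, rng)
        total += abs(received - sent)
    return total / trials


analog_error = mean_abs_error(regenerate=False)   # noise accumulates
discrete_error = mean_abs_error(regenerate=True)  # noise is removed per stage
print("analog:", analog_error, "discrete:", discrete_error)
```

In the analog chain the per-stage errors add up, while in the discrete chain each stage removes the error it inherited, provided the noise stays well below half the spacing between levels.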

At the heart of the problem (ie the difficulty in accepting the brain as a computer) lie several SUBVERSIVE PRECONCEPTIONS. These are usually based around false dichotomies.

The first one has existed for several hundred years. This is the belief that there is a fundamental difference between dead stuff and living things. In its original form, this was called the Doctrine d'Élan Vital, or in English, "Vitalism"- the belief that living things use and produce one unique set of substances, dead stuff belongs to another group of substances, and never the twain shall meet.

Then in the 19th Century (1828), Friedrich Wöhler added silver cyanate to ammonium chloride- AgNCO + NH4Cl → NH4NCO + AgCl, to form ammonium cyanate, which rearranges on heating into its isomer (NH2)2CO, ie carbamide, or urea. This was the first organic ('vital') compound synthesized from inorganic starting materials, and Wöhler is hence often called the 'father of organic chemistry'.

What has this got to do with AI? The artificial distinction between living/non-living is a 'false dichotomy', just like the artificial (as in 'fake', not 'man-made') distinction between computational and cognitive information processing**.

A few scientists (who should know better), and many science journalists (who shouldn't blithely trust everything they read) believe a similar thing- that somehow cognition, the kind of information processing that the brain does, is qualitatively different from the information processing that computers do. Jerry Fodor, probably the most famous cognitive philosopher in the world, has recently published an article to this effect- that because a computer is an abstract symbol system, ie one whose functional (intentional) operations are independent of any one (hardware) substrate, its computational symbols can never be semantically grounded. In plain English, a computer can never 'know' what its internal representations mean.

This is an enormous, embarrassing clanger (ie 'faux pas' or 'stumble') for a person who has otherwise advanced the field of AI more than most other people alive today. Without posting a full refutation, the error is summarised as follows- when we think of an 'abstract symbol system' we are actually considering the class of such machines, not any one individual machine. Consider a group of machines, which share a common 'software' layer, say a particular flavor of Unix, but all with different hardware platforms (eg one is Intel, one is AMD, etc).

Before the operating system layers are compiled for their target platform, it is true that they are not semantically grounded, but once each source (ie symbolic language form) has been compiled down to its target platform, they are now different data representations. They are now unique to each substrate. Fodor's erroneous conclusion has arisen from two causes- (1) he doesn't seem to do any real, low level computing, of the sort that Turing and Von Neumann would have spent many years getting their hands dirty with, and (2) he hasn't thought the thing through to its (il)logical conclusion.
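A loose illustration of the same point, on a much smaller scale than an operating system- one abstract value acquires different concrete representations once it is committed to a particular platform convention. The sketch uses Python's standard struct module, with byte order standing in (as an assumption made purely for illustration) for the difference between hardware substrates. It is not a model of compilation.

```python
import struct

value = 1  # the abstract, substrate-independent symbol

# The same symbol laid out for two different platform conventions.
little_endian = struct.pack("<I", value)  # least significant byte first
big_endian = struct.pack(">I", value)     # most significant byte first

print(little_endian)  # b'\x01\x00\x00\x00'
print(big_endian)     # b'\x00\x00\x00\x01'

# The byte patterns differ, yet each decodes to the same value when it is
# read by the convention it was written for.
assert little_endian != big_endian
assert struct.unpack("<I", little_endian)[0] == value
assert struct.unpack(">I", big_endian)[0] == value
```

Read with the wrong convention, the same bytes mean something else entirely, which is one sense in which a representation is tied to its substrate.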

If so-called 'abstract' symbol systems can't do cognition, then what exactly is the brain? The proponents of Fodor's and similar views posit that the human brain instead works as a 'sub-symbolic' processor. Again, a little thought will reveal this to be yet another false dichotomy (like living/non-living).

Consider this- all computers must be sub-symbolic at some level, below the data link level. At several intervening points in any computational communication, sequential symbol streams devolve to electrical signal variations in a wire. In fact, sub-symbolic/signal forms exist at lower levels in parallel with symbolic/syntax forms at higher levels. In Artificial Neural Networks, current situation data is stored as analog weights in ANN neurode pseudo-synapses. But just because the data is stored in an analog format doesn't mean that it functions as analog data, in a continuous way. Arguably, it doesn't and it can't, since analog machines must develop internal categorization systems such as taxonomies.

Two different ANNs given the same testing and training sets will yield different sets of internal weight vectors. Indeed, the same ANN applied repetitively to the same test+training data will have a slightly different set of analog numbers describing the pseudo-synapse weight vector components, each time the backprop (or other gradient descent) algorithm is run from a fresh random starting point.
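This is easy to reproduce on a toy scale. The sketch below is an assumption-laden stand-in, not backprop- it trains a single threshold unit on the logical AND function using the classic perceptron learning rule, starting from two different random initial weight sets. The data set, the learning rate and the names train and predict are chosen here for illustration only.

```python
import random

# Training set for logical AND: ((input pair), target output).
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, x):
    """Threshold unit: weights are [w0, w1, bias]."""
    w0, w1, bias = weights
    return 1 if w0 * x[0] + w1 * x[1] + bias > 0 else 0


def train(seed, rate=0.1, max_epochs=1000):
    """Perceptron rule from a seed-dependent random starting point."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in AND_DATA:
            error = target - predict(weights, x)
            if error:
                mistakes += 1
                weights[0] += rate * error * x[0]
                weights[1] += rate * error * x[1]
                weights[2] += rate * error
        if mistakes == 0:  # every example classified correctly
            break
    return weights


weights_a = train(seed=1)
weights_b = train(seed=2)
print(weights_a)
print(weights_b)
```

The two runs end with different analog weight values, yet both units compute exactly the same discrete function of their inputs.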

Let's take a closer look at these internal weight vectors. Each (say the ith) member of a given (say the jth) row of neurodes which form a computational layer in the ANN represents a 'template' or 'filter' vector. Input data vectors are inner-producted in parallel with these template vectors. Actually, the template vectors of each layer form a set of pseudo-symbols, performing the same combinational role as conventional symbols in discrete processing, allowing (n+1)th level symbols to be constructed from nth level symbols. How else could stable, error-free hierarchies be made from analog components?

Those in the first layer (of a multi-layer perceptron, say) are the most basic symbols- let's call them the pseudo-alphabet. Those at the next layer, say the 'hidden' or 'middle' layer in a three layer feedforward network, form the next type of pseudo-symbols- let's call them pseudo-words. You see the pattern - the increasing levels of ANN processing utilise a hierarchy of (supposedly analog, or sub-symbolic) pseudo-symbols, from which emerges a global function equivalent to that of their conventional discrete counterparts.
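The template idea can be sketched in a few lines. Everything in the sketch is invented for illustration- the three template vectors, their names and the two input vectors are not taken from any trained network. Each template is inner-producted with the input, and the strongest response decides which pseudo-symbol the layer reports.

```python
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def layer_response(templates, input_vector):
    """Inner product of the input with every template in the layer."""
    return {name: inner_product(vector, input_vector)
            for name, vector in templates.items()}


def best_match(templates, input_vector):
    """Name of the template that responds most strongly to the input."""
    scores = layer_response(templates, input_vector)
    return max(scores, key=scores.get)


# Hypothetical templates over a four-element input.
TEMPLATES = {
    "horizontal": [1.0, 1.0, 0.0, 0.0],
    "vertical":   [1.0, 0.0, 1.0, 0.0],
    "diagonal":   [1.0, 0.0, 0.0, 1.0],
}

# Two different analog inputs, both noisy versions of the same pattern.
first_input = [0.9, 0.1, 0.8, 0.05]
second_input = [1.1, 0.0, 0.95, 0.1]

print(best_match(TEMPLATES, first_input))   # vertical
print(best_match(TEMPLATES, second_input))  # vertical
```

The inputs are analog and never identical, but the layer's output is one of a small, fixed set of names, which is the sense in which the templates behave as symbols.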

Considered in this way, the real difference between so-called sub-symbolic, and symbolic (a.k.a. linguistic) computational systems is revealed, not as a difference in the way data is represented internally (analog vs discrete data values), but as a difference in when the (pseudo-/)symbols are created.

Consider a human baby. It looks at everything wide-eyed, then it babbles and struggles. It forms its internal data templates (ie sensorimotor 'alphabet', 'lexicon', 'grammar' etc) at the start of its life, in advance of when these values are needed. This is in stark contrast to the conventional ANN, which computes its internal representations when they are needed, ie as it is presented with situations, by off-line use of backpropagation (or similar) methods.

The solution to this problem already exists. It is called ART (Adaptive Resonance Theory), and was invented by Stephen Grossberg. It uses buffer-archive pairing, as discovered by Dyer (2011), and match-based learning. Dyer independently discovered both ART (which he called a Pose Cell Array) and match-based learning (which he called 'alphabetic' or discrete learning). Dyer's work is part of a pan-linguistic idea called the Universal Language Machine.
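For readers who have not met match-based learning, here is a heavily simplified sketch in the style of ART1 fast learning on binary patterns. It is an illustration under stated assumptions, not Grossberg's full dynamics and not the Pose Cell Array- the function name art1_cluster, the vigilance value of 0.6 and the four example patterns are all chosen here. Each input is compared with the stored prototypes. If the best candidate matches closely enough, that prototype is refined; otherwise a new category is created on the spot.

```python
def art1_cluster(patterns, vigilance=0.6, beta=1.0):
    """Simplified ART1-style clustering of non-empty binary patterns."""
    prototypes = []  # each prototype is the set of active bit positions
    labels = []
    for pattern in patterns:
        active = {i for i, bit in enumerate(pattern) if bit}
        # Choice step: rank existing categories by how well they fit.
        order = sorted(
            range(len(prototypes)),
            key=lambda j: len(active & prototypes[j]) / (beta + len(prototypes[j])),
            reverse=True,
        )
        chosen = None
        for j in order:
            # Match step: fraction of the input accounted for by prototype j.
            match = len(active & prototypes[j]) / len(active)
            if match >= vigilance:
                # Resonance: learn by keeping only the shared features.
                prototypes[j] = active & prototypes[j]
                chosen = j
                break
        if chosen is None:
            # No stored category is close enough, so commit a new one.
            prototypes.append(set(active))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes


PATTERNS = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
labels, prototypes = art1_cluster(PATTERNS)
print(labels)      # [0, 0, 1, 1]
print(prototypes)  # [{0, 1}, {4, 5}]
```

The point of the sketch is the timing- categories are created and adjusted one input at a time, as each pattern arrives, with no separate off-line training pass.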

General Approach to the topic taken in the TDE project - a personal view

Like all highly specialised sciences, there is an enormous amount of epistemological context (ie background knowledge) that one must first imbibe, before one is able to even understand the key issues in a public scientific debate, let alone reverse engineer those observations in order to expose their underlying etiology (causation). The reader who wishes to judge a new theory must normally do two things- ascertain whether a new theory is (a) really new, or a reworking of a past, peer-rejected model, and (b) if new, then viable, ie plausible, in view of the pluses and minuses of the two or three best available candidates.

There is one situation in which this patently commonsense approach will not work, and that is the situation in which someone discovers completely new data, and/or produces a radically new analysis of existing data. Such was the situation at the beginning of the 20th Century, when Albert Einstein used Minkowski's equations to posit a completely different shape for space-time. It was not possible to consider his work against the backdrop of the best two or three existing models, or rather, to do so would have erroneously led to a rejection of Einstein's model, since none of the available alternatives was able to explain why gravity was an inverse square law, or why light rays from distant stars which passed near the sun were deviated, as though refracted by a prism (the so-called 'gravitational lens'). A similar argument applies to the other great fin de siècle theory, quantum theory. Only a theory in which energy was quantised into frequency-sized packets could explain why the spectrum of black body radiation falls away at high frequencies instead of rising without limit. Since all other theories of photoelectricity assumed a priori that energy was distributed stochastically, ie with normally distributed particle energies, none of these theories would be of any use in a peer-review situation.

So it is with my theory of mind, named after the TDE, the bio-plausible, recursively occurring Turing Machine which lies at its very heart. Since the basic shape of my theory became clear to me, around mid 2012, I have been sending emails describing my theory and how I discovered it to as many as possible of the people who have contributed publicly to the field of cognitive philosophy, science and engineering. Not altogether unexpectedly, the response has been underwhelming.

My first reaction to others' lack of reaction was that they didn't believe me. After all, I am largely unpublished in the world of peer-reviewed journals, and for all the world knows, could be a liar, a thief or some other kind of intellectual scoundrel of the sort that pops up from time to time in the world of public ideas. The most basic reading of my work, based on my website, and my on-line provenance (eg my public resume.pdf and my Facebook page) is usually enough to demonstrate that I am neither bad nor mad. However, the real worry that remains is that, however much I believe my own conclusions, I have made an unrecognised error and therefore I may be wrong.

There is an upside to being in the same situation as Einstein, however. Since my discovery indicates that the current world opinion in cognitive science is wrong, the reader who wishes to understand my work need not get across the various competing models to my own. Why? Because there is only one theory which presents a complete top-to-bottom physicalist account of human and animal consciousness, and that is mine. No one else has as yet been able to assemble the various pieces of data together into a coherent and cogent account of self, brain and mind. No one.

There have been excellent books on the topic of consciousness, such as Daniel Dennett's 'Consciousness Explained', but its title is, regrettably, a furphy. Dennett does a lot of things in his book, but explaining consciousness (as in providing a physicalist account of it) is not one of them. Getting to the bottom of consciousness means just that- explaining it in sufficient detail that the resultant theory accounts for the objective and subjective things we may not know all about, but certainly may have experienced with our own sense of self-consciousness. Dennett's main focus is a mere sub-problem of the main game, namely explaining Libet's Paradox, which he attempts to do by introducing the so-called 'multiple drafts' version of organic systems' perception.

Finding a solution to Libet's Paradox is indeed an important task, since this infamous experiment appears to poke a gaping hole in the very idea of causality, ie that causes precede effects. By introducing the possibility of unconscious thinking (ie neural functions with no conscious control) the data seems explicable, but the very nature of the test itself precludes this as a viable 'out', since it involves the subject making conscious decisions as to when the time datum starts. Before the subject's intention, no information of any kind exists anywhere, with which an ante-dated procedure can be initiated.

The TDE theory provides an explanation for Libet's data that is as astounding as it is satisfactory. The explanation is due to a specialised neural mechanism, the same mechanism that provides low motor latency between motive and motion and also provides low affect latency between external events and internal reactions. I remember the day that I discovered it- I was playing around with cybernetic (control) circuit diagrams, trying to reduce the latency that is inevitable with hierarchical command structures, such as Tinbergen trees. With roughly six levels of neurons linking sensory input to cortical decision layers in our brains, how is it even possible to control the limbs when quickly changing events rule the current situation?

The answer can be obtained by examining saccades, the zig-zag motion which allows our eyes to keep up with rapid scene changes, without generating an unstable visual experience. It turns out that some researchers have identified so-called 'mental saccades' as a more generic mechanism that operates in other parts of the brain.

However, that is not how I discovered it. I was looking at the design of WWII bomb sights. These clever devices are actually mechanical computers which combine a predictive feedforward step with a corrective feedback step, hence their generic name, predictor-corrector control circuits. These days it is extremely hard to find any examples of this generation of control design, since transistorised (ie digital) chips make decisions in such a short time, as to make the need to balance lagging with leading cycle phases obsolete and old fashioned. However, problems of this type find a way of resurfacing. So it is with the predictor-corrector circuit, which finds new uses in modern robotic dynamics, where it is called a 'Kalman Filter'.
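The predictor-corrector cycle is easiest to see in the one-dimensional Kalman filter. The sketch below is a textbook-style toy, not a reconstruction of a bomb sight or of any neural mechanism- the readings, the noise variances and the names kalman_step and track are all assumptions made for illustration. Each cycle first predicts (feedforward), then corrects the prediction against a new measurement (feedback).

```python
def kalman_step(estimate, variance, measurement,
                process_var=0.01, measurement_var=0.25):
    """One predict-then-correct cycle for a single, roughly constant quantity."""
    # Predictor (feedforward): assume the quantity stays put, but grow the
    # uncertainty to admit that it may have drifted since the last cycle.
    predicted = estimate
    predicted_var = variance + process_var

    # Corrector (feedback): move towards the new measurement by a fraction
    # (the gain) that reflects how little the prediction is trusted.
    gain = predicted_var / (predicted_var + measurement_var)
    corrected = predicted + gain * (measurement - predicted)
    corrected_var = (1.0 - gain) * predicted_var
    return corrected, corrected_var


def track(measurements, initial_estimate=0.0, initial_variance=100.0):
    """Run the cycle over a sequence of measurements."""
    estimate, variance = initial_estimate, initial_variance
    history = []
    for z in measurements:
        estimate, variance = kalman_step(estimate, variance, z)
        history.append(estimate)
    return history, variance


# Made-up noisy readings of a quantity whose true value is 10.
READINGS = [10.3, 9.6, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1]
estimates, final_variance = track(READINGS)
print(estimates[-1], final_variance)
```

Starting from a deliberately poor guess of 0, the estimate jumps close to the first reading and then settles near 10, while the variance shrinks as evidence accumulates.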

Knowing as I do the real answer, Dennett's book seems to be much ado about nothing. However, this would be unfair, because I have relied extensively on his excellent accounts of cognitive science observations to educate myself. Ultimately, the point made by his book may be irrelevant and off-topic (I believe it is) yet he is such a good writer and a fabulous thinker, that I feel it churlish to criticise it so. (Daniel, please can we still be friends?)

Jerry Fodor is another person whom I like to go out of my way to criticise, even though, like Dennett, his penmanship and thought processes have served as invaluable epistemological scaffolding for my own brand of heresy. Fodor started his career by being one of the main proponents of CTM, or Computational Theory of Mind. My TDE theory is a CTM exemplar, therefore, prima facie, I have a lot to thank Fodor for.

However, later in his career, his ideas did a kind of 180 degree turn, and he renounced simple, great ideas like CTM in favor of more nuanced, philosophically murky interpretations. Does Fodor know some dirty little secret about CTM that he is not sharing with the rest of us? Not at all. Rather, he thought he would have seen much more progress in solving the problem of the mind than is actually the case. This kind of disappointment can affect all of us and most likely caused him to have a mid-career crisis of faith which manifested itself as apparent reversals of his prior canons.

* I prefer the term "Marr's Prism" to "Marr's tri-layer hierarchy". It is both more evocative and more economical.

**Why they do this is a mystery- perhaps it enables some scientists to maintain their religion at home while also practising godless blasphemy at work.

------------------------ Copyright 2013 Charles Dyer------------------------