Tuesday, July 02, 2024

messy AI milestones

For me it is VERY useful to have a list of AI milestones with dates. This defines the ball park which is much, much bigger than ChatGPT. It provides a framework which helps inform future focus. The comments I've added are there as a self guide to future research. So, they often do hint at my favourites.

Keep in mind that there are at least four different types of AI: Symbolic, Neural Networks aka Connectionist, Traditional Robots and Behavioural Robotics, as well as hybrids. For some events in the timeline it is easy to map to the AI type but for others it is not so easy.

1943: Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, teamed up to develop a mathematical model of an artificial neuron. In their paper "A Logical Calculus of the Ideas Immanent in Nervous Activity" they declared that:
Because of the “all-or-none” character of nervous activity, neural events and the relations among them can be treated by means of propositional logic. It is found that the behavior of every net can be described in these terms.
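To make that claim concrete for myself, here is a minimal sketch of an "all-or-none" threshold unit wired up as logic gates. The function name `mp_neuron` and the particular weights and thresholds are my own illustrative choices; the 1943 paper used a different notation and treated inhibition differently.

```python
def mp_neuron(inputs, weights, threshold):
    """All-or-none unit: outputs 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def AND(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return mp_neuron([a, b], [1, 1], threshold=2)

def OR(a, b):
    # A single active input is enough to reach a threshold of 1.
    return mp_neuron([a, b], [1, 1], threshold=1)

def NOT(a):
    # An inhibitory (negative) weight: the unit fires only when the input is silent.
    return mp_neuron([a], [-1], threshold=0)

# Print the truth tables for the two-input gates.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
```

Since AND, OR and NOT are enough to build any propositional formula, this is one way to read the statement that neural events can be treated by means of propositional logic.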
1950: Alan Turing publishes “Computing Machinery and Intelligence” (‘The Imitation Game’ later known as the Turing Test)
1952: Arthur Samuel implemented a program that could play checkers against a human opponent

1954: Marvin Minsky submitted his Ph.D. thesis at Princeton, titled Theory of Neural-Analog Reinforcement Systems and its Application to the Brain-Model Problem. Two years later Minsky had abandoned this approach and was a leader in the symbolic approach at Dartmouth.

1956: The Dartmouth Workshop, organised by John McCarthy, coined the term Artificial Intelligence. He said the workshop would explore the hypothesis that "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
The main descriptor for the favoured approach was Symbolist: based on logical reasoning with symbols. Later this approach was often referred to as GOFAI or Good Old Fashioned AI.

Knowledge can be represented by a set of rules, and computer programs can use logic to manipulate that knowledge. Leading symbolists Allen Newell and Herbert Simon argued that if a symbolic system had enough structured facts and premises, the aggregation would eventually produce broad intelligence.

Marvin Minsky, Allen Newell and Herb Simon, together with John McCarthy, set the research agenda for machine intelligence for the next 30 years. All were inspired by earlier work by Alan Turing, Claude Shannon and Norbert Wiener on tree search for playing chess. From this workshop, tree search (for game playing, for proving theorems, for reasoning, for perceptual processes such as vision and speech, and for learning) became the dominant mode of thought.

1957: Connectionists: Frank Rosenblatt invents the perceptron, a system which paves the way for modern neural networks
The connectionists, inspired by biology, worked on "artificial neural networks" that would take in information and make sense of it themselves. The pioneering example was the perceptron, an experimental machine built by the Cornell psychologist Frank Rosenblatt with funding from the U.S. Navy. It had 400 light sensors that together acted as a retina, feeding information to about 1,000 "neurons" that did the processing and produced a single output. In 1958, a New York Times article quoted Rosenblatt as saying that "the machine would be the first device to think as the human brain."
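As a rough sketch of how a single perceptron-style unit "makes sense of" examples by itself, here is the classic error-correction learning rule on a toy two-input problem. This is nothing like the scale of the hardware described above; the function names, the learning rate and the OR data set are my own assumptions for illustration.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum plus bias is positive, otherwise 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Nudge the weights towards the target every time the unit gets an example wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Toy training data: the logical OR of two inputs.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(or_data)
for x, target in or_data:
    print(x, "target:", target, "predicted:", predict(weights, bias, x))
```

A single unit like this can only separate the examples with a straight line, so it learns OR and AND but cannot learn XOR.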

Perceptrons were critiqued as very limited in what they could achieve by the symbolic advocates Minsky & Papert in their book Perceptrons. The Symbolists won this funding battle.

1959: John McCarthy noted the value of commonsense knowledge in his pioneering paper "Programs with Common Sense" [McCarthy1959]

1959: Arthur Samuel published a paper titled “Some Studies in Machine Learning Using the Game of Checkers”, the first time the phrase “Machine Learning” was used. Earlier there had been models of learning machines, but this was a more general concept

1960: Frank Rosenblatt published results from his hardware Mark I Perceptron, a simple model of a single neuron, and tried to formalize what it was learning.

1960: Donald Michie built a machine that could learn to play the game of tic-tac-toe (Noughts and Crosses in British English) from 304 matchboxes, small rectangular boxes which were the containers for matches, and which had an outer cover and a sliding inner box to hold the matches. He put a label on one end of each of these sliding boxes, and carefully filled them with precise numbers of colored beads. With the help of a human operator, mindlessly following some simple rules, he had a machine that could not only play tic-tac-toe but could learn to get better at it.

He called his machine MENACE, for Matchbox Educable Noughts And Crosses Engine, and published a report on it in 1961.

1960s: Symbolic AI in the 1960s was able to successfully simulate the process of high-level reasoning, including logical deduction, algebra, geometry, spatial reasoning and means-ends analysis, all of them in precise English sentences, just like the ones humans used when they reasoned. Many observers, including philosophers, psychologists and the AI researchers themselves became convinced that they had captured the essential features of intelligence. This was not just hubris or speculation -- this was entailed by rationalism. If it was not true, then it brings into question a large part of the entire Western philosophical tradition.

Continental philosophy, which included Nietzsche, Husserl, Heidegger and others, rejected rationalism and argued that our high-level reasoning was limited, prone to error, and that most of our abilities come from our intuitions, our culture, and from our instinctive feel for the situation. Philosophers who were familiar with this tradition, such as Hubert Dreyfus and John Haugeland, were the first to criticize GOFAI (Good Old Fashioned AI) and the assertion that it was sufficient for intelligence.

1963: First PhD about computer vision, by Larry Roberts at MIT

1963: (1985) The philosopher John Haugeland in his 1985 book "Artificial Intelligence: The Very Idea" asked these two questions:
  • Can GOFAI produce human level artificial intelligence in a machine?
  • Is GOFAI the primary method that brains use to display intelligence?
AI founder Herbert A. Simon speculated in 1963 that the answers to both these questions were "yes". His evidence was the performance of programs he had co-written, such as Logic Theorist and the General Problem Solver, and his psychological research on human problem solving.

1966: Joseph Weizenbaum creates the Eliza Chatbot, an early example of natural language processing.
1967: MIT professor Marvin Minsky wrote: "Within a generation...the problem of creating 'artificial intelligence' will be substantially solved."

1968: Origin of Traditional Robotics: an approach to Artificial Intelligence by Donald Pieper, "The Kinematics of Manipulators Under Computer Control", at the Stanford Artificial Intelligence Laboratory (SAIL) in 1968.

1969-71: The classical AI "blocksworld" system SHRDLU, designed by Terry Winograd (mentor to Google founders Larry Page and Sergey Brin), revolved around an internal, updatable cognitive model of the world that represented the software's understanding of the locations and properties of a set of stacked physical objects (Winograd, 1971). SHRDLU carried on a simple dialog (via teletype) with a user, about a small world of objects (the BLOCKS world) shown on an early display screen (DEC-340 attached to a PDP-6 computer)

1979: Hans Moravec builds the Stanford Cart, one of the first autonomous vehicles (outdoor capable)

1980s: Back propagation and multi layer networks used in neural nets (only 2 or 3 layers)

1980s: Rule based Expert Systems, a more heuristic form of logical reasoning with symbols, encoded the knowledge of a particular discipline, such as law or medicine
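To give a flavour of what "encoding knowledge as rules" looks like, here is a minimal sketch of forward chaining, one common way a rule-based system draws conclusions. The function name `forward_chain` and the two toy medical rules are entirely made up for illustration; they are not taken from any real expert system.

```python
def forward_chain(facts, rules):
    """Keep applying if-then rules until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical toy rules: (conditions, conclusion)
rules = [
    (("fever", "cough"), "possible_flu"),
    (("possible_flu", "short_of_breath"), "refer_to_doctor"),
]

result = forward_chain({"fever", "cough", "short_of_breath"}, rules)
print(sorted(result))
```

Note that nothing here learns: the behaviour is fixed by the rules a human wrote, which is why the knowledge has to be gathered from experts and typed in by hand.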

1984: Douglas Lenat (1950-2023) began work on a project he named Cyc that aimed to encode common sense in a machine. Lenat and his team added terms (facts and concepts) to Cyc's ontology and explained the relationships between them via rules. By 2017, the team had 1.5 million terms and 24.5 million rules. Yet Cyc is still nowhere near achieving general intelligence. Doug Lenat made the representation of common-sense knowledge in machine-interpretable form his life's work
Alan Kay's speech at Doug Lenat's memorial

1985: Robotics loop closing (Rodney Brooks, Raja Chatila) – if a robot sees a landmark a second time it can tighten up on uncertainties
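One way to see why a second sighting tightens things up is to treat each sighting as an uncertain estimate and combine them. The sketch below is a heavily simplified, one-dimensional illustration under my own assumptions (independent Gaussian estimates, a made-up function name `fuse`, made-up numbers); real loop closing adjusts a whole map and trajectory, not a single number.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Combine two independent Gaussian estimates of the same quantity."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)        # always smaller than either input variance
    mean = var * (mean_a / var_a + mean_b / var_b)  # weighted towards the more certain estimate
    return mean, var

# First sighting: landmark at 10.0 m with variance 1.0.
# Second sighting after the robot has drifted: 12.0 m with variance 4.0.
mean, var = fuse(10.0, 1.0, 12.0, 4.0)
print(mean, var)
```

The combined variance comes out lower than either of the two original variances, which is the "tightening up" in miniature.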

1985: Origin of behavioural based robotics. Rodney Brooks wrote "A Robust Layered Control System for a Mobile Robot", in 1985, which appeared in a journal in 1986, when it was called the Subsumption Architecture. This later became the behavior-based approach to robotics and eventually through technical innovations by others morphed into behavior trees.

This has led to more than 20 million robots in people’s homes, numerically more robots by far than any other robots ever built, and behavior trees are now underneath the hood of two thirds of the world’s video games, and many physical robots from UAVs to robots in factories.

1986: Marvin Minsky publishes "The Society of Mind". A mind grows out of an accumulation of mindless parts.
1986: David Rumelhart, Geoffrey Hinton, and Ronald Williams published a paper Learning Representations by Back-Propagating Errors, which re-established the neural networks field using a small number of layers of neuron models, each much like the Perceptron model. There was a great flurry of activity for the next decade until most researchers once again abandoned neural networks.

1986: Perhaps the most pivotal work in neural networks in the last 50 years was the multi-volume Parallel Distributed Processing (PDP) by David Rumelhart, James McClelland, and the PDP Research Group, released in 1986 by MIT Press. Chapter 1 lays out a similar hope to that shown by Rosenblatt:
People are smarter than today's computers because the brain employs a basic computational architecture that is more suited to deal with a central aspect of the natural information processing tasks that people are so good at. ...We will introduce a computational framework for modeling cognitive processes that seems… closer than other frameworks to the style of computation as it might be done by the brain.

Rumelhart and McClelland dismissed symbol-manipulation as a marginal phenomenon, “not of the essence of human computation”.
1986: The term Deep Learning was introduced to the machine learning community by Rina Dechter in 1986

1987: Chris Langton instigated the notion of artificial life (Alife) at a workshop in Los Alamos, New Mexico, in 1987. The enterprise was to make living systems without the direct aid of biological structures. The work was inspired largely by John Von Neumann, and his early work on self-reproducing machines in cellular automata.

1988: One of Hinton's postdocs, Yann LeCun, went on to AT&T Bell Laboratories in 1988, where he and a postdoc named Yoshua Bengio used neural nets for optical character recognition; U.S. banks soon adopted the technique for processing checks. Hinton, LeCun, and Bengio eventually won the 2019 Turing Award and are sometimes called the godfathers of deep learning.

Late 1980s: The market for expert systems crashed because they required specialized hardware and couldn't compete with the cheaper desktop computers that were becoming common

1989: “Knowledge discovery in databases” started as an off-shoot of machine learning, with the first Knowledge Discovery and Data Mining workshop taking place at an AI conference in 1989 and helping to coin the term “data mining” in the process

1989: “Fast, Cheap, and Out of Control: A Robot Invasion of the Solar System”, by Rodney Brooks and Anita Flynn, which proposed the idea of small rovers to explore planets, and explicitly Mars, rather than the large ones that were under development at that time

1991: Rodney Brooks published “Intelligence without Reason”. This is both a critique of existing AI being determined by the current state of computers and a suggestion for a better way forward based on emulating insects (behavioural robotics)
1991: Simultaneous Localisation and Mapping (SLAM), Hugh Durrant-Whyte and John Leonard: symbolic systems replaced with geometry with statistical models of uncertainty (used in self-driving cars, navigation and data collection from quadcopter drones, inputs from GPS)

1997: IBM’s Deep Blue defeats world chess champion Garry Kasparov
1997: Soft landing of the Pathfinder mission to Mars. A little later in the afternoon, to hearty cheers, the Sojourner robot rover deployed onto the surface of Mars, the first mobile ambassador from Earth
Early 2000s: New symbolic-reasoning systems emerged, based on algorithms capable of solving a class of problems called 3SAT, along with another advance called simultaneous localization and mapping (SLAM), a technique for building maps incrementally as a robot moves around in the world

2001: Rodney Brooks's company iRobot, on the morning of September 11, sent robots to ground zero in New York City. Those robots scoured nearby evacuated buildings for any injured survivors that might still be trapped inside.

2001-11: Packbot robots from iRobot were deployed in the thousands in Afghanistan and Iraq, searching for nuclear materials in radioactive environments and dealing with roadside bombs by the tens of thousands. By 2011 iRobot had almost ten years of operational experience with thousands of robots in harsh wartime conditions, with a human in the loop giving supervisory commands

2002: iRobot (Rodney Brooks's company) introduced the Roomba
2005: The DARPA (Defense Advanced Research Projects Agency) Grand Challenge was won by Stanford's driverless car, which drove 211 km on an unrehearsed road

2006: Geoffrey Hinton and Ruslan Salakhutdinov published "Reducing the Dimensionality of Data with Neural Networks", where an idea called clamping allowed the layers to be trained incrementally. This made neural networks undead once again, and in the last handful of years this deep learning approach has exploded into practical machine learning

2009: Foundational work on neurosymbolic models is (D’Avila Garcez, Lamb, & Gabbay, 2009), which examined the mappings between symbolic systems and neural networks

2010s: Neural nets learning from massive data sets

2011: A week after the tsunami, on March 18th 2011, when Brooks was still on the board of iRobot, we got word that perhaps our robots could be helpful at Fukushima. We rushed six robots to Japan, donating them, and not worrying about ever getting reimbursed–we knew the robots were on a one way trip. Once they were sent into the reactor buildings they would be too contaminated to ever come back to us. We sent people from iRobot to train TEPCO staff on how to use the robots, and they were soon deployed even before the reactors had all been shut down.

The four smaller robots that iRobot sent, the Packbot 510, weighing 18kg (40 pounds) each with a long arm, were able to open access doors, enter, and send back images. Sometimes they needed to work in pairs so that the one furthest away from the human operators could send back signals via an intermediate robot acting as a wifi relay. The robots were able to send images of analog dials so that the operators could read pressures in certain systems, they were able to send images of pipes to show which ones were still intact, and they were able to send back radiation levels. Satoshi Tadokoro, who sent in some of his robots later in the year to climb over steep rubble piles and up steep stairs that Packbot could not negotiate, said “[I]f they did not have Packbot, the cool shutdown of the plant would have [been] delayed considerably”. The two bigger brothers, both the 710 model, weighing 157kg (346 pounds) with a lifting capacity of 100kg (220 pounds), were used to operate an industrial vacuum cleaner, move debris, and cut through fences so that other specialized robots could access particular work sites.
But the robots we sent to Fukushima were not just remote control machines. They had an Artificial Intelligence (AI) based operating system, known as Aware 2.0, that allowed the robots to build maps, plan optimal paths, right themselves should they tumble down a slope, and to retrace their path when they lost contact with their human operators. This does not sound much like sexy advanced AI, and indeed it is not so advanced compared to what clever videos from corporate research labs appear to show, or painstakingly crafted edge-of-just-possible demonstrations from academic research labs are able to do when things all work as planned. But simple and un-sexy is the nature of the sort of AI we can currently put on robots in real, messy, operational environments.

2011: IBM’s Watson wins Jeopardy

2011-15: Partially in response to the Fukushima disaster the US Defense Advanced Research Projects Agency (DARPA) set up a challenge competition for robots to operate in disaster areas

The competition ran from late 2011 to June 5th and 6th of 2015 when the final competition was held. The robots were semi-autonomous with communications from human operators over a deliberately unreliable and degraded communications link. This short video focuses on the second place team but also shows some of the other teams, and gives a good overview of the state of the art in 2015. For a selection of greatest failures at the competition see this link.

2012: Nvidia noticed the trend and created CUDA, a platform that enabled researchers to use GPUs for general-purpose processing. Among these researchers was a Ph.D. student in Hinton's lab named Alex Krizhevsky, who used CUDA to write the code for a neural network that blew everyone away in the ImageNet competition, which challenged AI researchers to build computer-vision systems that could sort more than 1 million images into 1,000 categories of objects

AlexNet's error rate was 15 percent, compared with the 26 percent error rate of the second-best entry. The neural net owed its runaway victory to GPU power and a "deep" structure of multiple layers containing 650,000 neurons in all.
In the next year's ImageNet competition, almost everyone used neural networks.

2013-18: Speech transliteration systems improve and proliferate – we can talk to our devices

2014: Google program had automatically generated this caption: “A group of young people playing a game of Frisbee”. (reported in a NYT article)
2015: LeCun, Bengio, Hinton (LeCun 2015)
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
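As a self guide to what "using backpropagation to indicate how a machine should change its internal parameters" means, here is a minimal sketch with the smallest network I could think of: one input, one hidden unit, one output, two weights. Everything about it (the names, the tanh hidden unit, the squared-error loss, the learning rate and starting weights) is my own assumption for illustration; real deep nets have many layers and millions of parameters, but the chain-rule idea is the same.

```python
import math

def forward(w1, w2, x):
    h = math.tanh(w1 * x)   # representation in the hidden layer
    y = w2 * h              # output computed from the previous layer's representation
    return h, y

def loss(w1, w2, x, target):
    return (forward(w1, w2, x)[1] - target) ** 2

def gradients(w1, w2, x, target):
    """Apply the chain rule layer by layer, from the output back towards the input."""
    h, y = forward(w1, w2, x)
    dloss_dy = 2 * (y - target)
    dloss_dw2 = dloss_dy * h                  # how the output weight should change
    dloss_dh = dloss_dy * w2                  # error signal passed back to the hidden layer
    dloss_dw1 = dloss_dh * (1 - h * h) * x    # derivative of tanh is 1 - tanh^2
    return dloss_dw1, dloss_dw2

w1, w2 = 0.5, -0.3           # arbitrary starting weights
x, target = 1.0, 0.8         # a single made-up training example
print("loss before:", loss(w1, w2, x, target))
for step in range(200):
    g1, g2 = gradients(w1, w2, x, target)
    w1 -= 0.1 * g1           # step each weight downhill along its gradient
    w2 -= 0.1 * g2
print("loss after:", loss(w1, w2, x, target))
```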

2015: Diffusion models were introduced in 2015 as a method to learn a model that can sample from a highly complex probability distribution. They used techniques from non-equilibrium thermodynamics, especially diffusion. Diffusion models have been commonly used to generate images from text. Still, recent innovations have expanded their use in deep-learning and generative AI for applications like developing drugs, using natural language processing to create more complex images and predicting human choices based on eye tracking.
2016: Google's AlphaGo AI defeated world champion Lee Sedol, with the final score being 4:1.
2017: In one of DeepMind’s most influential papers, “Mastering the game of Go without human knowledge”, the very goal was to dispense with human knowledge altogether, so as to “learn, tabula rasa, superhuman proficiency in challenging domains” (Silver et al., 2017).
(this claim has been disputed by Gary Marcus)

2017-19: New architectures were developed, such as the Transformer (Vaswani et al., 2017), which underlies GPT-2 (Radford et al., 2019)

2018: Behavioural AI:
Blind cheetah robot climbs stairs with obstacles: visit the link then scroll down for the video

2019: Hinton, LeCun, and Bengio won the 2019 Turing Award and are sometimes called the godfathers of deep learning.
2019: The Bitter Lesson by Rich Sutton, one of the founders of reinforcement learning.
The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin…researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.…the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation.
(This analysis is disputed by Gary Marcus in his hybrid essay)

2019: Rubik’s cube solved with a robot hand: video

2020: OpenAI introduces the GPT-3 natural language model, which later spouts bigoted remarks

2021: DALL-E images from text captions

2022: Text to images
Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. It is considered to be a part of the ongoing artificial intelligence boom. It is primarily used to generate detailed images conditioned on text descriptions.

Stable Diffusion is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly, and it can run on most consumer hardware equipped with a modest GPU with at least 4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services.

2022, November: ChatGPT is a chatbot and virtual assistant developed by OpenAI and launched on November 30, 2022. Based on large language models (LLMs), it enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive user prompts and replies are considered at each conversation stage as context.

ChatGPT is credited with starting the AI boom, which has led to ongoing rapid investment in and public attention to the field of artificial intelligence (AI). By January 2023, it had become what was then the fastest-growing consumer software application in history, gaining over 100 million users and contributing to the growth of OpenAI's current valuation of $86 billion.

Wednesday, June 26, 2024

an AI taxonomy

When I dig below the hype and both positive and critical evaluations of ChatGPT what I discover is that AI thought leaders disagree and argue with each other a lot. As well as the disagreements about the reliability of ChatGPT there is also the issue of different types of AI. The ascendancy of Deep Learning in the public consciousness is a relatively recent phenomenon in the 60 plus years history of AI. As part of my research I found the need to clarify a broad bunch of terms that refer to very different approaches. A fundamental part of understanding AI is knowing the different types of AI.

Aside for future article: Moreover, underlying these different approaches originally were different ideas about how the human mind and / or brain works.

What is AI?

Artificial Intelligence is the field of developing computers and robots that are capable of behaving in ways that both mimic and go beyond human capabilities. An AI can do discovery, inference and reasoning just like a human. I’ve underlined robots because they, too, have been very much sidelined in the current ChatGPT hype.

The difference between AI and AGI, which also introduces the terms narrow and weak AI

The phrase “Artificial Intelligence” originated from John McCarthy at the Dartmouth Conference in 1956. In the beginning there was only AI, the goal of building a machine that we couldn’t tell apart from a human. This was Alan Turing’s Imitation Game, presented in his 1950 paper “Computing Machinery and Intelligence” and later known as the Turing Test. The Turing Test has now become outdated but that is a side issue to the present article.

Imitating humans in all or most respects turned out to be very hard to do. Over time, as researchers built machines that could perform a particular sub task that humans could do, such as playing chess or correctly labelling images or self driving cars then those sub goals retained the AI label. So, new labels, AGI (Artificial General Intelligence) and ASI (Artificial Super Intelligence) were invented to renew the original goal of imitating or surpassing all of human intelligence.

An alternative could have been using the terms narrow AI or weak AI. Narrow AI is a good descriptor for one narrow task. Weak AI is a similar term, an AI that implements a limited part of the mind.

What is Machine Learning (ML)?

Some experts refer to their work as Machine Learning. For me this needs clarification because there is overlap in the usage of ML and AI. ML is a subset of AI but in what way?

ML means that the machine is Learning (duh!). It is learning by example. In some cases lots of examples. It improves its performance with training over time. This is a different type of coding to the one I am used to where you write code to do something and that something is predetermined by the code and doesn’t change over time. With machine learning, we can use algorithms that have the ability to learn.

There are lots of these algorithms. See this geeks for geeks page for a comprehensive list. One simple example is linear regression. The machine can take an input of a series of points on a graph and map a straight line of best fit to those points. To do this you would need the data (the points), the linear regression algorithm and some python code to do the work. I provide links to a couple of beginner's hands on AI courses in the reference section below that take you through this process.
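Here is a minimal sketch of that line-of-best-fit idea in plain Python, using the standard least-squares formulas rather than any library. The function name `fit_line` and the four sample points are my own made-up example (the points lie exactly on y = 2x + 1 so the answer is easy to check).

```python
def fit_line(points):
    """Least-squares straight line through (x, y) points: returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The data: points on a graph.
points = [(1, 3), (2, 5), (3, 7), (4, 9)]
slope, intercept = fit_line(points)
print(slope, intercept)  # prints 2.0 1.0
```

The "learning" here is just that the slope and intercept come from the data rather than being typed in by the programmer; give it different points and you get a different line.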

Phrases that go with ML: data, big data (especially for Deep Learning which is a subset of ML), self learning, statistical models, self correction, can only use structured and semi-structured data

I've been searching for a relatively simple example of making Machine Learning (making rather than doing ML) suitable for middle school students. Most people are currently excitedly focused on the doing but my belief is that to understand it deeply you need to make it.

Rodney Brooks's illustration of the early machine learning device by Donald Michie provides an entertaining introduction to the topic. Michie built a machine that could learn to play the game of tic-tac-toe (Noughts and Crosses in British English) from 304 matchboxes, small rectangular boxes which were the containers for matches, and which had an outer cover and a sliding inner box to hold the matches. He put a label on one end of each of these sliding boxes, and carefully filled them with precise numbers of colored beads. With the help of a human operator, mindlessly following some simple rules, he had a machine that could not only play tic-tac-toe but could learn to get better at it. He called his machine MENACE, for Matchbox Educable Noughts And Crosses Engine, and published a report on it in 1961.
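For anyone who wants to make rather than just read, here is a rough sketch of the matchbox idea in Python. It is my own simplification, not Michie's exact design: it makes a fresh box for every board it meets instead of using symmetry to get down to 304 boxes, and the starting bead count and the rewards (+3 for a win, +1 for a draw, -1 for a loss) are assumptions. The names `Menace`, `play_game` and `winner` are made up for this sketch, and the opponent just plays random moves.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

class Menace:
    def __init__(self, initial_beads=3, rng=None):
        self.boxes = {}  # one "matchbox" per board: {move: number of beads}
        self.initial_beads = initial_beads
        self.rng = rng or random.Random()

    def choose(self, board):
        key = "".join(board)
        free = [i for i, c in enumerate(board) if c == " "]
        box = self.boxes.setdefault(key, {m: self.initial_beads for m in free})
        if sum(box.values()) == 0:  # box ran out of beads, so refill it
            for m in free:
                box[m] = self.initial_beads
        moves = list(box)
        # Drawing a bead at random: moves with more beads are picked more often.
        move = self.rng.choices(moves, weights=[box[m] for m in moves])[0]
        return key, move

    def learn(self, history, reward):
        # Add or remove beads for every move MENACE made in this game.
        for key, move in history:
            self.boxes[key][move] = max(0, self.boxes[key][move] + reward)

def play_game(menace, rng):
    board, history, player = [" "] * 9, [], "X"  # MENACE plays X and goes first
    while True:
        if player == "X":
            key, move = menace.choose(board)
            history.append((key, move))
        else:
            move = rng.choice([i for i, c in enumerate(board) if c == " "])
        board[move] = player
        w = winner(board)
        if w or " " not in board:
            menace.learn(history, 3 if w == "X" else (-1 if w == "O" else 1))
            return w
        player = "O" if player == "X" else "X"

rng = random.Random(1)
menace = Menace(rng=rng)
results = [play_game(menace, rng) for _ in range(500)]
print("wins in first 100 games:", results[:100].count("X"))
print("wins in last 100 games:", results[-100:].count("X"))
```

Comparing the two printed counts gives a crude check on whether the beads are shifting towards better moves. I would expect the later count to be higher, but I have not tuned the numbers.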

Brooks references a 1962 Scientific American article by Martin Gardner which illustrated the concepts with a simpler version to play hexapawn, three white chess pawns against three black chess pawns on a three by three chessboard. As in chess, a pawn may move forward one space into an empty square, or capture an enemy pawn by moving diagonally forward one space. If you get a pawn to the last row, you win. You also win if you capture all the enemy's pawns, or if the enemy cannot move.

Impressively, TheGamer has coded MENACE in Scratch, here and puttering has coded hexapawn in Scratch, here.

What AI isn’t Machine Learning?

You can have forms of AI that don’t learn over time. Symbolic AI, Traditional robotics and Behaviour based robotics could all fit this category. They are programmed, they do some human like stuff but don’t change or improve over time without human reprogramming. They are still important but sidelined at the moment due to the LLM (Large Language Models) hype. A little more on this below.

This diagram, found on the web, is missing the two robotic forms of AI and the Neuro-Symbolic hybrids

What is Deep Learning?

To repeat, Machine learning (ML) is a subfield of AI that uses algorithms trained on data to produce adaptable models that can perform a variety of complex tasks.

Deep learning is a subset of machine learning that uses several layers within neural networks to do some of the most complex ML tasks sometimes without any human intervention. But sometimes there is human intervention, in the case of Reinforcement Learning. I won't go into detail about Deep Learning here because it is long as well as smelly. I do provide a link to a detailed version in the Reference.

Four main types of AI

Following Rodney Brooks, there are at least four types of AI. They are, along with approximate start dates:
  1. Symbolic (1956) aka Good Old Fashioned AI (GOFAI). According to Herb Simon, symbols represent the external world; thought consists of expanding, breaking up, and reforming symbol structures; and intelligence is nothing more than the ability to process symbols. This was originally the dominant form of AI (from the 1956 Dartmouth Conference) but the enormous effort over decades by Doug Lenat’s Cyc Project failed, so it has now become marginalised due to the successes of Deep Learning.
  2. Neural networks (1954, 1960, 1969, 1986, 2006) aka Connectionism, which evolved into Deep Learning (lots of different start dates here since it has died and then returned from the dead a few times). Deep Learning gets all the media attention these days.
  3. Traditional robotics (1968)
  4. Behaviour-based robotics (1985) aka embodied or situated AI or insect inspired AI! (my term)

One more important thing. Some authors, notably Gary Marcus, say that Neuro-Symbolic hybrids are the way forward to robust, reliable AI.

Here's my crude Venn diagram of the different types of AI:
ML = Machine Learning; DL = Deep Learning; S = Symbolic AI;
H = neuro-symbolic hybrid; TR = Traditional Robotics;
BR = Behavioural Robotics

To understand the current deficiencies of the AI debate / hype it’s necessary to look at the strengths and weaknesses of these different types. Rodney Brooks does evaluate them, in his 2018 blog referenced below, against these criteria: Composition, Grounding, Spatial, Sentience and Ambiguity

AI development has had a tortured zig, zag history. Another fascinating way to view it is from the influences and underlying belief systems / philosophies of the AI founders and developers. If we build our machines in our own image then what is that image?


Brooks, Rodney. Steps Towards Super Intelligence, 1. How we got here (2018)
Brooks, Rodney. Machine Learning Explained (2017)
Marcus, Gary. The next decade in AI: Four Steps Towards Robust Artificial Intelligence (2020)
Deep Learning (DL) vs Machine Learning (ML): A Comparative Guide
Excellent, wide ranging explanations of ML and DL
The 10 Best Machine Learning Algorithms for Data Science Beginners
Linear Regression to fit points to a straight line is their number one
Your first machine learning project in python step by step
Free introductory hands on course to Deep Learning

Friday, May 10, 2024

short descriptors of different learning theories

Some years ago I organised a wiki called "Learning Evolves". This folded because the hosts, Wikispaces, closed down. At the time I couldn't find an equivalent site (free for educators).

Back then I discovered lots of different learning theories. I was surprised by how many there were. Since then I've often thought of providing succinct descriptions of some of the more important learning theories. This is one way (I stress here, not the only way) to make a start on how we learn.

Given my present confinement (recovering from a busted achilles tendon, which gives me more time for theory) I've decided to do it. This is a rough draft. I'm leaving a lot of stuff out. Probably I will return to this page and do updates from time to time.

In Society of Mind Marvin Minsky said the trick is that there is no trick. "There is no single secret, magic trick to learning; we simply have to learn a large society of different ways to learn". So we need to study a wide variety of learning theories to learn about the wide variety of tricks that different people use to learn. It's a lot of work and takes some time. There is no general theory of learning just as there is no general theory of intelligence. So, because learning theories are fuzzy, slippable, embodied and situated things and not sharp, hard edged purely logical things they do require a lot of study to understand them. It doesn't begin or end with study of learning theory. There is philosophy, history, evolution, artificial intelligence, neuroscience and more.

Enactivism: knowledge stored in the form of motor responses and acquired by the act of "doing". It is a form of cognition inherently tied to actions, as in the handcrafter's way of knowing. It is an intuitive, non-symbolic form of learning.

Instructionism or Behaviourism: Responses that are rewarded tend to be repeated. Educational outcomes can be identified: fact recall, skills and attitudes. Education can be optimised to achieve measurable changes in these desired outcomes

Cognitivism: the mind is a bit like a computer. It has meaningful structures (schemas, representations, symbols) which receive inputs that are processed and produce outputs.

Constructivism: children build or construct their own intellectual structures

Constructionism: to build personal or social meaning with engaging objects controlled by computer code in a language like Logo which evolved into Scratch

Phenomenology: focuses on an individual’s first-hand experiences rather than the abstract experience of others

Social learning: the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance or in collaboration with more capable peers (the zone of proximal development, Vygotsky)

Update (11/5/24):

What is missing here is the need to combine different learning theories in a way which integrates their various strengths and leaves out their weaknesses. The best effort I have seen which does this is Diana Laurillard's Conversational Framework. I have written up a summary of that framework which I feel needs some polishing before publishing.

Monday, April 22, 2024

Missing in Australia: 21st C Maker Ed Jobs

Fab Learn Jobs Board

This Job Board is open to "educational makerspaces around the world" but for some strange reason I never see Australian jobs advertised here. Oh, yes, I am being ironic and rhetorical. IMO an important reason, although not the only one, is that we have a one size, (doesn't) fits all, national standards based curriculum called ACARA.

Why aren’t jobs like this being advertised in Australia? Or, if I have missed them then please show me where they are?

Posted March 21, 2024
Westfield State University:
Research, Innovation, Design and Entrepreneurial (RIDE) Center Coordinator
General Statement of Duties: Full time salaried position
The RIDE Center coordinator will support the Executive Director in managing the equipment, space, and programming. The space includes a design studio and MakerSpace with 3D printers, Adobe and other equipment software, laser cutter, woodworking, sewing, computer programing, vacuum formers, and circuitry tools. They will help to coordinate and contribute to a positive user experience, managing classroom, student, and community visits and activities, helping with scheduling events, communication, student support, preparation of reports/assessments, administrative/office tasks, and other RIDE center needs. They will assist with coordination of student interns, work study, and graduate assistants, and community engagements associated with RIDE centers. They will assist the Executive Director with RIDE expositions, workshops, speaker series, and other events, as well as training students, faculty, staff, and community on equipment and software use within the center.
Posted March 21, 2024
St Mary’s School in Orange County

St. Mary’s School is an independent day school that serves over 700 students, Pre-K through Grade 8, in Aliso Viejo, CA. As the only independent school in Orange County that offers the international baccalaureate (IB) program from primary school through middle school, St. Mary’s is committed to a globally-minded and innovative curriculum that, in many ways, stands alone within the educational landscape in Orange County. St. Mary’s students are prepared not only for the next steps in their educational journey, but also to become courageous, caring, global citizens and enlightened leaders of tomorrow.

This summer, construction will begin on a 28,000 square foot facility which will include a design center comprising five specialty labs and a gallery space. The director of the design center will lead the transition into this new space, oversee the design center and its resources, and collaborate with faculty and academic leadership to fully integrate design thinking with St. Mary’s outstanding IB and design-centered curriculum. The director of the design center will report to the director of technology and innovation, and will bring an expertise in design thinking and a relational approach to leadership to the role. St. Mary’s looks forward to welcoming the director of the design center to start July 1, 2024, or later by mutual agreement.

Posted June 1, 2023
Lab Instructor for The da Vinci Lab (DVL) for Creative Arts & Sciences
St. Stephen’s Episcopal School-Houston is looking for a full-time Lab Instructor for The da Vinci Lab (DVL) for Creative Arts & Sciences (DVL) – a makerspace for students in 1st  to 8th grades. The goal of the program is to offer a creative space for students that inspires collaborative learning and cross-pollination of learning techniques and creative skills. The Lab Instructor teaches maintenance and use of equipment, sets and delivers the yearly curriculum for DVL, and records the ways in which learning and making take place within the space.
Posted June 1, 2023
Maker Space Manager, UC Santa Barbara Library
Responsible for the day-to-day operational management of a new Library service for UCSB students to engage in making activities. Develops opportunities for experiential and project based learning with digital and non-digital creative technologies for varying skill levels. Maintains high levels of customer service in the delivery of Makerspace services. Supervises student assistants in providing peer-to-peer support for project design and creation and ensuring safe use of equipment. The inaugural Makerspace Manager will be an integral part of ensuring a smooth launch of the Makerspace and for informing the development of its service portfolio.
Posted February 28, 2022
Location: Cincinnati, Ohio, U.S.A.
Position Overview:

Seven Hills Middle School seeks an inspiring, high-energy, and passionate teacher to serve as Director for our signature Innovation Lab.  Housed in a specially designed makerspace in our new, state-of-the-art Middle School building, the Innovation Lab program engages students in a series of sequenced projects designed to foster design thinking skills. In an empathy-based approach, students consider the needs or challenges faced by others as they work in project teams to conceptualize, design, prototype, test, fail, iterate, and, in many cases, present their fabrications to authentic audiences.

In preparation for these projects, students learn a series of fabrication and design skills. Sixth-grade students develop basic skills as they work with hand, power, and digital tools on projects that include designing for others. Seventh-grade students dive more deeply into the engineering design process. They explore and develop spatial reasoning, empathy, and creative thinking skills as they take on a series of challenges. An eighth-grade Computer Science elective course teaches students to use loops, variables, functions and conditionals to build efficient and adaptable computer programs. Students also design, build, and program robotic devices. In addition to teaching courses, the Innovation Lab Director supports student-driven projects each day during lunch. All student projects increase in scope, complexity, and sophistication as they acquire new skills, but the basic formula is to help students learn to understand and empathize with challenges faced by others and to use their creativity and imagination to design effective solutions.

Saturday, April 13, 2024

The gears of my childhood, again!

Lessons from the Gear Thinkers

I’ve been rereading Seymour Papert's Mindstorms. I thought I had understood it. But I needed the update. Recently, I’ve been part of a curriculum reform which overall has created waves. This was partly because of leadership errors (a mix of good and bad interventions) and partly because middle class parents complain when Schools depart from traditional structures.

Whilst I was writing my interpretation (here) of “The Gears of My Childhood” (Preface to Mindstorms) I discovered a bunch of other interpretations in Meaningful Making book 3 (free download!). Some of them I thought enhanced my interpretation of the "Gears" article. I’ll quote some extracts. Hopefully, this might encourage some to read the originals, even though my main goal is to clarify my own thinking about what to learn from Seymour’s gears reflection.

Gears of Learning by Ridhi Aggarwal, p. 10
Children should be given the opportunity to explore their questions like babies explore the world around them ...

Children would learn by doing only when they make things that are answers to their own questions. Based on this idea, we started a Question Hour in which children could just share their daily curiosities about anything and everything. They raised questions and discussed possibilities, and then they explored the ideas by making things.
Papert reloaded by Federica Selleri, p. 14
As Papert said, we need to create and take care of the conditions in which the learning process takes place, because the creation of cognitive models is closely linked to the experience associated with them.

Therefore, it is important to pay particular attention to the context in which the experience takes place, and to design it in such a way that it can be about generating ideas and not about running into obstacles. This means thinking about the tools you want students to use, and trying them out for yourself to evaluate their possibilities, but listening to the students’ hypothesis about how things work and supporting their investigations.
What makes a project meaningful? by Lina Cannone, p. 16
I believe that a synergy between teacher and learner must be nurtured. We must abandon pre-planned activities and projects that ignore the participation of the learner. We must give way to the co-planning of activities
Finding my Gear at Twenty-Three by Nadine Abu Tuhaimer, p. 21
After graduation, I realized that my love for tinkering with objects outshined my love for programming,

At 24, I decided to take the “Fab Academy – How to Make Almost Anything” course. This is a six month long intensive program that teaches the principles of digital fabrication

Since then, I’ve been teaching in the Fab Academy program and trying to incorporate what I learned with the different educational programs I run at the Fab Lab where I work, the first Fab Lab in Jordan.
Making means heads and heart, not just hands by Lior Schenk, p. 22
Car child did not become car professional — he became a mathematician. He also became a cyberneticist and renowned learning theorist, responsible for both the 1:1 computing initiatives and the constructionist movements rippling across education to this day.

Gears were, he describes, “both abstract and sensory,” acting as “a transitional object” connecting the formal knowledge of mathematics and the body knowledge of the child.

This notion of knowing — what it means to know something, to learn, to develop knowledge formed the central thesis of Papert’s career. Knowledge is not merely absorbed through cognitive assimilation, but actively constructed through affective components as well. Papert would assert, in other words, that we learn best when we are actively engaged in constructing things in the world. Real, tangible things. Things you can hold, manipulate, and feel in order to make sense of them.

Papert’s successes, as he would ascribe, were not due to interacting with gears as objects — rather due to falling in love with the gears as more than objects, as a conduit across intellectual and emotional worlds.

As Dr. Humberto Maturana said, “Love, allowing the other to be a legitimate other, is the only emotion that expands intelligence.”
Time to Tinker by Lars Beck Johannsen, p. 28
I believe that we need to help our students discover their own gears, and help them channel it into their projects whenever possible. I also believe that it is a teacher’s task to help students develop new gears. Another task is being aware of the way you learn. If something is easy to you, it is natural to believe that it is also easy for everyone else, but that is not the case. We need to help our kids to discover their strengths!

There are a few things that could make this happen. One is knowing your students! Not just on a factual basis but also on a more personal basis. How would you otherwise discover, what makes them tick, what they love, who they are?

I strongly urge all the schools I work with to make time for more project based, constructionist, student-centered learning. The after-school programs, which most kids attend because the parents are working, also need to be a more inspiring place to spend your time. A place to tinker, do what you love, make stuff together with other kids, and have fun!
Between the garage and the electronics workshop, by Mouhamadou Ngom, p. 33
To conclude, I would say that the most important part of learning by doing is careful observation. My secret as a specialist in electro-mechanics is to take careful notes. For example, before disassembling a mechanism, I mark the intersections between the different gears. This is why I ask learners to observe well, to listen well, and to document their work.
Find your unique gear by Xiaoling Zhang, p. 35
Dr. Papert’s experience makes me think that it might be a natural human instinct to love fiddling with objects as a prompt to explore the world around us. By building and playing with things, we are also building the connections between ourselves and the physical world. When it happens frequently and reliably, then it becomes a way of thinking. It makes it easier when we see consistency in the world to believe that there are laws behind seemingly superficial phenomena and to discover even more possibilities.

… every child or every person has their own unique “gear.” But can everyone find their gear? Or can we help them to find something that THEY love and can be applied as a bridge to understand more abstract ideas and the world. It seems that unique gear can’t be cloned or taught, but must be discovered

SUMMING UP, the lesson from the Gear Thinkers:

  • Children should be given the opportunity to explore their questions
  • We must give way to the co-planning of activities
  • Listen to your students; pay attention to detail
  • Be a trail blazer! Set up the first FabLab in your location
  • Knowledge is actively constructed using hands, head and heart
  • Love is essential for optimal knowledge growth (of the objects we work with as well as human-human)
  • Know your students, personally
  • Everyone has to find their own gear. They might need help with this
  • Observe everything carefully

Wednesday, April 10, 2024

My Skinner Moment (updated 2024 reflection)

For a long time I really disliked the whole idea of Skinner's Behaviourism. This was a strong emotional feeling.

I saw behaviourism as drill and practice imposed by an authority figure, a teacher.

I came of age in the late 60s during the anti-Vietnam War movement. A stupendous social change occurred around about 1968. The government introduced their "pull a birthday date marble out of a barrel" military draft bill to send selected 18-year-olds to fight the Viet Cong. We began to question everything … racism, capitalism, imperialism, communism, Ho Chi Minh, Mao Tsetung, political power … everything. My friends were either locked up for 18 months for resisting the draft or went into the underground. There were many citizens quite happy to hide them.

Question everything.

With this backdrop do you think it would be likely that I would support a teaching methodology where the authority (the teacher) promoted relentless drill and practice? No way!

I also saw Skinner's absolute refusal to speculate on what happened inside the brain as a huge copout, as some sort of proof of the sterility of his whole approach.

As a methodology behaviourism seemed to symbolise the main thing that was wrong with School and Education. That it was BORING.

So, when I began teaching Maths and Science I followed authors who promoted creativity. An early interest in Science was 'The Act of Creation' by Arthur Koestler. Later, when computers entered education, I discovered the writings and Constructionist philosophy of Seymour Papert.

This history forms an emotional backdrop to this article. The action happened in 1996-97. When I realised that I had drifted into combining Logo programming and behaviourist methods successfully in my classroom it was a real shock. For a while I was in a state of disbelief.

So I had to write about it and theorise it. I'm still theorising it. For me this event was a difficult self reflection, an accommodation, where my view of the world suddenly crashed in the face of reality. This article covers a lot of ground - behaviourism, constructionism, learning maths, how to use computers in school, School with a capital 'S' (the institution of school and its ingrained ways) and what works for the disadvantaged.

The Disadvantaged school setting:

Paralowie was / is a disadvantaged school in the northern suburbs of Adelaide, Australia (549 school card holders out of approx. 1100 students -- 1996 figures). Although my new composite class was "extended" (representing the top 1/3 in ability at this year level) I didn't think the class was progressing particularly well in the 20 weeks I had taught them up to the end of Term 3, before I started using the Quadratics software. I have already mentioned the poor skills of a substantial number of students when substituting negative numbers, e.g. substitute -2 into -2x^2 + 3x. Others resented the fact that they had been performing in the top half of their previous class but now were performing in the bottom half of their new class. I had several requests from students to return to their previous class because the new work was "too hard". Poor attendance was a problem with about 3 students being away on a good day and up to 8 or 10 being absent on a bad day. Homework effort was poor from many because they had managed to get through with little homework in Years 8 and 9 and at any rate it is not cool to do homework. Moreover, in disadvantaged schools I find that it takes 2 to 3 terms for students to adapt and accept a new teacher and there is a continual behavioural testing out period during this time before things settle down.
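The substitution exercise mentioned in this section (substitute -2 into -2x^2 + 3x) can be checked with a few lines of Python. This is only an illustrative sketch, not the Logo software described in this article; the function name evaluate_quadratic is made up for the example. The point it shows is the order of operations: square the negative number first, then multiply by the coefficient.

```python
def evaluate_quadratic(a, b, c, x):
    """Return a*x^2 + b*x + c (hypothetical helper for illustration).

    The power is applied to x first, so a negative x becomes positive
    when squared, before it is multiplied by the coefficient a.
    """
    return a * x**2 + b * x + c


# Substitute x = -2 into -2x^2 + 3x (so a = -2, b = 3, c = 0):
# -2 * (-2)^2 + 3 * (-2) = -2 * 4 - 6 = -14
print(evaluate_quadratic(-2, 3, 0, -2))
```

A typical slip is to treat (-2)^2 as -4, which gives 8 - 6 = 2 instead of -14.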


(for instance, in the teaching of Quadratics)
A new reflection, rewritten April 2024


During 1996 and 1997 I wrote my own Quadratics drill and practice software in Logo to assist my teaching of the Quadratics topic to a Year 10 Pure Maths class.

The software was very successful in helping the students learn Quadratics (see the companion article, ‘Quadratics Software Evaluation’, for an evaluation of the software).

Paradoxically, I became uneasy about the success of the software, as I came to realise that I was using Behaviourist methods successfully. My uneasiness came from the fact that as a Logo enthusiast I was committed to a Constructionist educational philosophy which is way down the other end of the spectrum of teaching methodologies from where Behaviourism lies. At one point I desperately thought to myself, "I have become Skinner, is there any way out?"

My uneasiness led to further study and reflection of the nature of behaviourism, constructionism and school -- this is the resultant synthesis of my dilemma.

What is Behaviourism and what is it good for?

Behaviourism is the idea that rewards strengthen certain behaviours. That is correct as far as it goes. But behaviourism has never explained how brains learn new ideas. On page 75 of ‘Society of Mind’, Minsky says:-
"Harvard psychologist B.F. Skinner ... recognised that higher animals did indeed exhibit new forms of behaviour, which he called 'operants.' Skinner's experiments confirmed that when a certain operant is followed by a reward, it is likely to reappear more frequently on later occasions. ... this kind of learning has much larger effects if the animal cannot predict when it will be rewarded. ....Skinner's discoveries had a wide influence in psychology and education, but never led to explaining how brains produce new operants ..... Those twin ideas - reward/success and punish/failure - do not explain enough about how people learn to produce the new ideas that enable them to solve difficult problems that could not otherwise be solved ..."

So behaviourist methods, like a computer drill and practice program, may work well for a prepackaged curriculum, which is the norm in senior maths courses. I'll use the teaching of Quadratics at Year 10 level as an example of what I mean.

Does School have a mind of its own? Is quadratics real maths!

Seymour Papert (1993) talks about how School assimilates the computer to do things according to how School has traditionally done them, as though School is an independent organism with its own set of rules, procedures and homeostasis. How does School manage to achieve this, using this case as an example?

  1. By putting Quadratics into the Curriculum. Who ever questions that?
  2. By buying maths textbooks with lots of Quadratics in them. Invariably these textbooks break down the complex topic of quadratics into small parts and then relentlessly drill the students in practising those parts until "understanding" is reached.
  3. By telling students they have to do Pure Maths in Year 11 to obtain certain desired academic and career pathways.
  4. By creating a pre-Pure Maths extended class in Year 10 for the top group to prepare them for the "very important" Year 11 Pure Maths class.

This raises a big question which is hardly ever asked: Is learning Quadratics in this way real maths, anyway? Well, clearly Quadratics is in the Curriculum because it is pregnant with maths skills. There are number skills of substitution and calculation (BEDMAS), there is graphing using the Cartesian co-ordinates, there is looking for the change in patterns as the 'a', 'b' and 'c' values vary. There is derivation of formulas, like Axis of symmetry = -b / (2a). Then we have square roots, unreal numbers, the full quadratic formula ... there is even Halley's comet, parabolic reflectors and chucking a ball up in the air, not to forget "problem solving" ...... what a list. Clearly, no respectable maths teacher or School would take Quadratics out of the Curriculum!!
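To make a couple of the skills in that list concrete, here is a minimal Python sketch of the axis of symmetry, the turning point and the full quadratic formula for y = ax^2 + bx + c. The function names are my own inventions for illustration. I have assumed cmath from the standard library so that a negative value under the square root gives "unreal" (complex) roots rather than an error.

```python
import cmath


def axis_of_symmetry(a, b):
    """x-value of the line of symmetry of y = a*x^2 + b*x + c."""
    return -b / (2 * a)


def turning_point(a, b, c):
    """Vertex of the parabola: substitute the axis of symmetry back in."""
    x = axis_of_symmetry(a, b)
    return x, a * x**2 + b * x + c


def roots(a, b, c):
    """Both solutions of a*x^2 + b*x + c = 0 from the quadratic formula.

    cmath.sqrt accepts a negative discriminant, so the roots come back
    as complex numbers in every case (with zero imaginary part when real).
    """
    root_of_discriminant = cmath.sqrt(b**2 - 4 * a * c)
    return (-b + root_of_discriminant) / (2 * a), (-b - root_of_discriminant) / (2 * a)


# y = x^2 - 4x + 3 crosses the x-axis at 1 and 3, with its axis halfway between
print(axis_of_symmetry(1, -4))
print(turning_point(1, -4, 3))
print(roots(1, -4, 3))

# y = x^2 + 1 never touches the x-axis, so the roots are not real
print(roots(1, 0, 1))
```

Changing the a, b and c values and rerunning is a quick way to see the patterns in how the parabola moves.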

That is the case for Quadratics in the maths curriculum. Are you convinced?

But, is quadratics the sort of maths we really need in schools?

Papert argues (Mindstorms, Ch 2 Mathophobia) that school maths in general, and quadratics in particular, are in schools largely for historical reasons that have now passed us by. School math does not fit well with the natural ways that children learn and so becomes a series of not fun hurdles which become harder and harder to jump.

In Papert’s view, Quadratics became important for School maths because it fitted into pencil and paper technologies which were the best ones available when the traditional Curriculum was formulated.

“As I see it, a major factor that determined what mathematics went into school math was what could be done in the setting of school classrooms with the primitive technology of pencil and paper. For example, children can draw graphs with pencil and paper. So it was decided to let children draw many graphs… As a result every educated person vaguely remembers that y = x^2 is the equation of a parabola. And although most parents have very little idea of why anyone should know this, they become indignant when their children do not. They assume that there must be a profound and objective reason known to those who better understand these things.”
- Mindstorms p. 52

Seymour’s main response to this was to create a Mathland where learning maths fitted more into children’s natural ways of learning. The first thing he put into his Mathland was turtle graphics / Logo. His broader agenda was to invent children’s maths with the following design criteria:

  • appropriability principle … the serious maths of space, movement and repetition is appropriable to children
  • continuity principle … with well established personal knowledge
  • power principle… empower students to create personally meaningful projects
  • principle of cultural resonance …the topic makes sense in a larger social context to children and adults
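As a rough illustration of "the serious maths of space, movement and repetition", here is a small Python sketch of the turtle idea. It is not Logo and it draws nothing: HeadlessTurtle and polygon_path are hypothetical names I have made up, and the turtle only keeps track of where it is and which way it faces. I have assumed the Logo conventions that the turtle starts facing up the screen and that RIGHT turns clockwise.

```python
import math


class HeadlessTurtle:
    """Tracks position and heading only (an illustration, not Logo itself)."""

    def __init__(self):
        self.x = 0.0
        self.y = 0.0
        self.heading = 0.0  # degrees clockwise from "up", assumed as in Logo

    def forward(self, distance):
        radians = math.radians(self.heading)
        self.x += distance * math.sin(radians)
        self.y += distance * math.cos(radians)

    def right(self, angle):
        self.heading = (self.heading + angle) % 360


def polygon_path(sides, length):
    """Repeat 'forward, then turn right' and return the corners visited."""
    turtle = HeadlessTurtle()
    path = [(0.0, 0.0)]
    for _ in range(sides):
        turtle.forward(length)
        turtle.right(360 / sides)
        # Round away tiny floating point errors from sin and cos
        path.append((round(turtle.x, 6), round(turtle.y, 6)))
    return path


# The equivalent of REPEAT 4 [FORWARD 100 RIGHT 90]
square = polygon_path(4, 100)
print(square)
print(square[0] == square[-1])  # does the turtle end where it started?
```

The same procedure with 3 sides or 6 sides gives a triangle or a hexagon, because the turns always add up to 360 degrees.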

I have come to believe that the maths we need in schools involves self directed exploration, creating one's own projects, play and problem finding as well as problem solving. The problem with the current maths learning environment in secondary schools is that it is very strong on teaching maths skills but very weak in creating learning environments where students will come to enjoy maths and become self motivated in learning it.

So there is the case against quadratics. Are you convinced?

Alienation and social sorting:-

Not one student asked me, "Why do we have to do Quadratics?" or "How do they relate to real life?", questions that I would have found very difficult to answer. However, many students did say (and some more than once), "the work in this class is too hard, I want to go back to my other (not extended) class". This put a lot of pressure on me as the teacher. I was trying to set and maintain a higher standard of work to prepare students for Year 11 Pure Maths. But if I pushed too hard I would have students coming to me and asking to be moved out to an "easier" class. The losers in this process were the advanced section of the class who in effect were being held back by the tail. All of these problems were substantially overcome shortly after I introduced my quadratics software.

One of the social functions of Schooling is to condition the clients for their role and social niche in later life. Maths with its traditional emphasis on sacred knowledge (like Quadratics) and marks is particularly well suited for this. I can see these forces at work in the student responses in the previous paragraph. There was a passive acceptance of the right of School to put the Quadratics hurdle in place. The advanced element of the class believed they could jump this hurdle and were comfortable with that. The less skilled and motivated members of the class had strong doubts about their ability to jump the hurdle and tried to organise a soft option. Even though many students at Disadvantaged schools may reject School ("school sux") with varying degrees of hostility it does not seem that there is a significant group consciously rejecting the right of School to make fundamental judgements about their future social niche in life.

My Quadratics software resolved some of these problems for students by making it easier or perhaps more interesting to jump the hurdle. But in the process it begs the question of what School maths ought to look like in the first place.

Student needs and Teacher deeds:-

The software seemed to meet the needs of many students in a Disadvantaged school who want to do well in a preparatory Pure Maths class at Year 10 level. Hurdles were jumped by many who without the software would have failed to jump them.

Although my own teaching mode is constructionist by preference I find that in Disadvantaged schools a fair bit of repetitive drill and skill is required anyway, more so than what is required in a middle class school. Otherwise students simply forget basic concepts. At any rate a balance between constructionist exploration and drill & skill is always required. In My Opinion.

Back then, the leading advocate of Computer Aided Instruction (CAI) in the USA was Patrick Suppes. I was helped by Papert's non dogmatic appreciation of what Suppes was trying to achieve, as expressed in 'The Children's Machine':-

"The concept of CAI, for which Suppes's original work was the seminal model, has been criticised as using the computer as an expensive set of flash cards. Nothing could be further from Suppes's intention than any idea of mere repetitive rote. His theoretical approach had persuaded him that a correct theory of learning would allow the computer to generate, in a way that no set of flash cards could imitate, an optimal sequence of presentations based on the past history of the individual learner. At the same time the children's responses would provide significant data for the further development of the theory of learning. This was serious high science." (164)

Papert goes on to explore his reasons for rejecting Suppes' approach, which is an argument that Relationship is more central than Logic to how our minds develop. See Ch. 8 'Computerists' of 'The Children's Machine' for the full argument. Also I’ve added a footnote on Minsky’s view of the limits of logical thinking.

I then turned to Cynthia Solomon who has documented Suppes' work in greater detail and discovered something that was very interesting. Computer based drill and practice programs (developed to a fine art by Suppes) do work and in particular they work best for disadvantaged students and schools! These programs do not work as well for middle class students! (Solomon, pp. 22 & 27).

I interpret the finding by Suppes, as reported by Solomon, that CAI drill and practice assists the Disadvantaged but not the middle class students in this way:-

  1. Middle class kids would be more likely to do their homework (put in the time at home to generate a significant number of parabolas so that the patterns would start to make sense) and so would not need the quick fix provided by a quadratics software program, so much.
  2. Middle class kids question the system of School but are more likely to stay and perform within it.

Disadvantaged kids are more likely to question the system, reject it and drop out of it, either physically or mentally.

Almost 3 decades later: still trying to resolve this dilemma!

To restate the dilemma -- I didn't like behaviourist approaches but I worked hard to make one work and it worked well!

After this experience I didn’t abandon the Constructionist approach. But I did begin to study other learning theories seriously as well. The list is long so I won’t go into all that here.

My initial response was to take this sort of position: There are different methods of teaching which range along a spectrum from Constructionist to Instructionist. What a good teacher does is walk the walk along this continuum, knowing when to employ each method.

My Skinner moment persisted, as did my Papert moment.

Another way to look at it is that the learning environment rules. At Paralowie I was lucky to have a Principal, Pat Thomson, who understood the benefits of setting up teachers in the classroom environments they wanted. I was set up in a room with old XT computers that no one else wanted and ran Logo from 3.5 inch floppy discs. 1990s nirvana, for me.

Later, when that Principal left, that room was transformed and I was thrown into a different arrangement. In short, I diversified. I had no real choice.

Later still, as a late career thing, I decided to focus on working with Aboriginal students, the most disadvantaged cohort in Australia. With that group I have tried different variations of Direct Instruction. I think the evidence shows that it is needed. This is another can of worms that would take too long to discuss here.

Still later, I have recently discovered Diana Laurillard’s The Conversational Framework (reference) which I think successfully integrates a wide spectrum of learning theories. I will publish on that theory shortly.

Finally, Conrad Wolfram also sees the need for a radical reform of the Maths curriculum ('The Maths Fix' (2020)). The debate goes on.


Laurillard, Diana. The significance of Constructionism as a distinctive pedagogy. Proceedings of the 2020 Constructionism Conference (free download). The University of Dublin, Trinity College Dublin, Ireland
Minsky, Marvin. The Society of Mind. Picador 1987
Papert, Seymour. Mindstorms: Children, Computers and Powerful Ideas (1980)
Papert, Seymour. The Children's Machine: Rethinking School in the Age of the Computer, 1993, Basic Books
Solomon, Cynthia. Computer Environments for Children: A Reflection on Theories of Learning and Education, 1986, The MIT Press, Cambridge, Massachusetts


Minsky (1987) defines logical thinking as follows:-
"The popular but unsound theory that much of human reasoning proceeds in accord with clear cut rules that lead to foolproof conclusions. In my view, we employ logical reasoning only in special forms of adult thought, which are used mainly to summarise what has already been discovered. Most of our ordinary mental work -- that is, our commonsense reasoning -- is based more on 'thinking by analogy' -- that is, applying to our present circumstances our representations of seemingly previous experiences." (329)

Quadratics software evaluation

This was originally written in 1996. I also wrote an accompanying reflection at the time, which I now think needs to be updated. So, I'm republishing this one with my new reflection, which is titled "My Skinner Moment".

Paralowie R12 School
November 1996

I don't like drill and practice but it works, for some things

This year, while teaching a Year 10 maths class, I programmed my own Quadratics software in Logo for student use.

The impact on the class was immediate and positive. Many students in the class had previously been bogged down in substituting negative numbers into quadratic expressions and getting nowhere fast. Suddenly, for them, things began to fall into place. Freed from the requirement of doing many rapid substitutions and calculations (generate table of values, draw graph, then start looking for patterns) they were suddenly able to see the relationship between the 'a', 'b' and 'c' values and the variation in shape of the parabolic curve. Rather than having to concentrate on the computation they could begin to concentrate on the patterns. By the 'a', 'b' and 'c' values I mean the values in this equation:- y = ax² + bx + c and how changing 'a', 'b' and 'c' will affect the parabolic curve.

I was so encouraged by this turnaround that I began to burn the midnight oil adding extra features to my software. This was an interactive process because I was perceiving students' needs in lesson time and changing the software at night to meet those needs.

I hadn't anticipated that so many students in this "extended" class would have major difficulties with "basic" skills that "should" have been mastered in Years 8 and 9. Yet when I presented students with an equation like:-
y = 2x² - x + 3
and asked them to substitute x = -2 into it, then the success rate was not too high! So, one feature I added to my software was a drill and practice substitution into a quadratic equation. Students were given 'a', 'b' and 'c' values and an x value to substitute and required to calculate the value of the function, or the y value.

For example:
y = ax² + bx + c
if a = -1, b = 2, c = 3 and x = -1 then what is y?
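
The substitution drill described above can be sketched in a few lines. The original program was written in Logo and its source is not reproduced here, so what follows is only a minimal Python sketch. The function names and the ranges used for the random values are my assumptions, not details of the 1996 software.

```python
import random

def evaluate_quadratic(a, b, c, x):
    """Return y = a*x^2 + b*x + c for the given values."""
    return a * x**2 + b * x + c

def make_question(rng):
    """Pick small integer values for a, b, c and x (ranges are assumed)."""
    a = rng.choice([n for n in range(-3, 4) if n != 0])  # a = 0 is not a quadratic
    b = rng.randint(-5, 5)
    c = rng.randint(-5, 5)
    x = rng.randint(-3, 3)
    return a, b, c, x

def check_answer(a, b, c, x, answer):
    """True if the student's answer matches the substituted value."""
    return answer == evaluate_quadratic(a, b, c, x)

# Worked example from the text: a = -1, b = 2, c = 3, x = -1
# y = -1*(-1)^2 + 2*(-1) + 3 = -1 - 2 + 3 = 0
print(evaluate_quadratic(-1, 2, 3, -1))  # prints 0
```

The common slip is in the first term: (-1)² is +1, so the term -1 × (-1)² is -1, not +1.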

I found that the software released me from "lecture mode" and I was able to use much more time meeting some urgent needs of individual students while the others were happily occupied with the program. I could spend substantial slabs of time with a handful of students who really did need quite a lot of help. I could feel the mood changing in the class. Equations and parabolic graphs could be generated in seconds rather than many minutes. The students were able to concentrate on the structure of the parabola and how it was affected by changing a, b and c values without being tormented by their low skill level (in quite a few cases) in calculating the substitutions required to draw the curve. I did receive a lot of spontaneous positive feedback from students about the usefulness of the software.

Another thing I noticed was that the more able students in the class quickly mastered the program. They accepted it as a challenge to be quickly mastered and did just that. Then some of them would boast about it, "too easy sir", comments like that.

So, I began to add more advanced features to my program, to extend the advanced element further, to push out the leading edge. How do you find the axis of symmetry in all cases? How do you find the y value at the turning point? How do you find the x intercepts in certain specialised cases? We have not yet got to the stage of doing the full quadratic formula (that is part of the Year 11 Pure Maths course) but with the aid of my software I was fast approaching that point with the advanced element of the class. The leading edge was being extended, visibly.
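
The three questions above have short answers that can be written down directly. This is a hedged Python sketch, not the Logo code: the function names are hypothetical, integer coefficients are assumed, and the only "specialised case" shown for x intercepts is the factored form y = (dx + e)(fx + g), where each bracket is set to zero in turn.

```python
from fractions import Fraction

def axis_of_symmetry(a, b):
    """x = -b / (2a); a and b are assumed to be integers with a != 0."""
    return Fraction(-b, 2 * a)

def turning_point(a, b, c):
    """Return (x, y) at the turning point by substituting the axis of symmetry."""
    x = axis_of_symmetry(a, b)
    return x, a * x**2 + b * x + c

def x_intercepts_factored(d, e, f, g):
    """x intercepts of y = (dx + e)(fx + g): set each bracket to zero."""
    return Fraction(-e, d), Fraction(-g, f)

# y = x^2 - 2x - 3: axis of symmetry x = 1, turning point y = 1 - 2 - 3 = -4
x_tp, y_tp = turning_point(1, -2, -3)
print(x_tp, y_tp)  # prints: 1 -4

# y = (x + 3)(x - 2): x intercepts at x = -3 and x = 2
print(*x_intercepts_factored(1, 3, 1, -2))  # prints: -3 2
```

Fractions are used so that an axis of symmetry such as x = 1/4 stays exact rather than becoming a rounded decimal.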

So my program was catering for the needs of students across the whole ability range. It could do that because I was writing it and rewriting it on a weekly basis. I see that as a major advantage over a commercial product.

Some students were thrown in their pencil and paper work when the quadratic had a large 'b' value and they had mapped out a table of x values from +3 to -3 and the axis of symmetry might lie on the edge or outside of this domain. Lacking any knowledge of the overall structure of the curve (importance of axis of symmetry and turning point) their performance in mapping the correct graph was poor in quite a few cases.

My understanding and appreciation of this problem and other nuances of quadratics increased dramatically in the course of writing the software. For instance, initially I made the program draw the parabola by starting at one end and drawing to the other end. This created all sorts of problems at the limits because as the equation changed so did the limits. The effect was that some of my curves did not even begin to be drawn, I couldn't keep them on the screen. I eventually solved this frustrating problem by starting to draw the curve at the turning point of the parabola, drawing one side to the outer limits, then jumping back to the turning point and drawing the other side. This problem solving process reinforced in my own mind the central importance of axis of symmetry and turning point in the teaching of quadratics. The mechanical plotting of x values between +3 and -3 often just does not work in the case of quadratics with large 'b' values because the axis of symmetry has moved so far to the right or left.
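
The drawing strategy I settled on (start at the turning point, draw one side out to the limits, jump back, draw the other side) can be sketched as follows. This is a Python reconstruction of the idea only. The function name, the step size and the use of y limits to stand in for the screen edges are all assumptions.

```python
def parabola_points(a, b, c, y_min, y_max, step=0.5, max_steps=200):
    """Return (left_branch, right_branch) for y = a*x^2 + b*x + c.

    Each branch is a list of (x, y) points that starts at the turning
    point and moves outward until y leaves the range [y_min, y_max].
    If the turning point itself is off screen, both branches are empty.
    """
    x_tp = -b / (2 * a)  # axis of symmetry

    def branch(direction):
        points = []
        for i in range(max_steps):
            x = x_tp + direction * i * step
            y = a * x**2 + b * x + c
            if not (y_min <= y <= y_max):
                break  # reached the edge of the screen on this side
            points.append((x, y))
        return points

    return branch(-1), branch(+1)

# y = x^2 + 8x has a large 'b' value: the axis of symmetry is x = -4,
# which lies outside a table of x values from -3 to +3.
left, right = parabola_points(1, 8, 0, y_min=-20, y_max=20, step=1)
print(right[0])  # prints (-4.0, -16.0), the turning point
```

Because both branches begin at the turning point, the curve is always centred on the axis of symmetry, whatever the values of 'a', 'b' and 'c'.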

All the signs of a class being turned around from just battling through to success were there to see. Students became more engaged in the tasks and asked many more questions than previously. You could visibly see the confidence of many students increase. They became more animated and more positive in their relationship with mathematics and the teacher. Moreover, I felt that I could set more difficult and challenging questions in the program and subsequent tests than I would otherwise have been able to.

Looking in my marks book I can see that at least 7 students out of 27 have turned their results around from failing badly to pass marks and in some cases highly successful marks. I'll cite some statistics from my marks book to try to convince you, the reader (who wasn't in the room to see the change), that a very significant turnaround did occur. The Quadratics unit was a 6 week block. I did not use the computer software for the first two and a half weeks because I had not finalised it. In those first two and a half weeks I was mainly using lecture, textbook and homework mode. I also used one interesting activity from MCTP (Algebra Walk, pp. 213-18). In the third week I tested the students only on their ability to substitute values into an equation (two quadratics and one straight line) and plot the graph (test 1). The results were poor: the average class mark was 56%. I then introduced the Quadratics software and used it extensively for the next 3 weeks. In week 6 I tested the students twice. For test 2 they had to plot a quadratic again and also make predictions from other quadratic formulae about how altering 'a', 'b' and 'c' values would affect the y intercept, axis of symmetry and whether the curve was upright or upside down. This time the average mark was 82%, a remarkable improvement over the first test.

For the final test (test 3) I offered students a choice: either do a pencil and paper version or a computer version. Nearly all students opted to practise for the test on the computer and 11 out of 27 chose to do their final test on the computer. One interesting aspect of this was that the computer test was set up for mastery learning. If a student got a question wrong they were invited to try again. They couldn't proceed to the next question until they got the previous one correct. Initially I had programmed it differently: if a student gave a wrong answer, they got a "no" message, the problem disappeared and the next question appeared on the screen. However, when I was doing the test myself, I found this feature incredibly annoying. When I got the wrong answer, I didn't have the opportunity to try again or to reflect on my mistake in any way. So I changed it. If the technology makes it easy then it seems silly not to use it.

So, conceptually, the final testing process for students who opted for the computer version was very different. They were being continually informed of their progress score as they went along. If they got a wrong answer they were required to persist until they got it right. In their final score this appeared as a larger denominator. If they did the test and didn't like their progress, they had the option of starting over again if time permitted. The program simply generated different questions (of the same type) each time it was run, so it was no difficulty for me to offer multiple chances for retesting.
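
The mastery loop and its scoring can be sketched like this. It is my Python reading of the description above, not the original Logo: I am assuming that the final score was reported as questions answered correctly over total attempts, which is how I interpret "a larger denominator". The names `run_mastery_test` and `get_answer` are hypothetical.

```python
def run_mastery_test(questions, get_answer):
    """Run a mastery-style test.

    questions  -- list of (prompt, expected_answer) pairs
    get_answer -- function taking a prompt and returning the student's answer;
                  it is called again for the same prompt after a wrong answer

    Returns (correct, attempts). Every wrong try adds to attempts only,
    so mistakes show up as a larger denominator in correct/attempts.
    Note: the loop only ends when the student eventually answers correctly.
    """
    correct = 0
    attempts = 0
    for prompt, expected in questions:
        while True:
            attempts += 1
            if get_answer(prompt) == expected:
                correct += 1
                break  # move on only after a correct answer
    return correct, attempts

# A scripted "student" who gets the first question wrong once.
scripted = iter([1, 0, 13])
questions = [("a=-1 b=2 c=3 x=-1", 0), ("a=2 b=-1 c=3 x=-2", 13)]
print(run_mastery_test(questions, lambda prompt: next(scripted)))  # prints (2, 3)
```

In this example the student would see a running score of 2 out of 3: both questions eventually correct, with one extra attempt recorded.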

There was some interesting discussion at the end by students about their reasons for choosing one type of test over the other. Some high ability students said they found practising on the computer very useful but clearly saw it as risky to do their final test on the computer, given their established mastery of the pencil and paper medium. Other high ability students were confident enough to take that risk. Other students said they found it easier to solve the problems on the computer. Some made comments like "it's faster". This was interesting because the same problems (actually the computer test had a greater variety of problems) were being set in both mediums, but many students clearly felt that the experience was very different and expressed a preference for one over the other. Another factor was that doing the computer test was more public, less private. The room is set up with the computers around the walls so that all computer screens face towards the centre of the room. This made "collaboration" easier ("cheating") but also made mistakes more public.

A comparison between the final test results was also interesting. I offered 3 tests in total over 6 weeks of instruction (12 × 100-minute lessons). The first two tests were pencil and paper only but in the last test students were offered a choice (either computer or pencil and paper). Mainly due to high absenteeism only 17 out of the 27 class members sat for all 3 tests. Fortunately for the last test (test 3), this group of seventeen split themselves into two roughly equal groups, one group of 8 who chose to do the computer test and another group of 9 who chose to do the paper and pencil test. For the previous two tests (tests 1 and 2) the percentage results of these two groups were roughly the same (71% versus 68% average). But for the final test (test 3) the group who chose the computer test scored an average of 95% compared with 68% for the pencil and paper group. Quite a difference!

I have explained above that the two tests were not really comparable (even though the questions were of the same type) because the computer based test provided instant feedback and monitored progress. Once again I would argue that it would be ridiculous not to incorporate these features into the computer program since they greatly assist in keeping students focused and motivated. This introduces formative elements into a summative test, which from a learning viewpoint is surely a good thing.

Here is an example of how students who did the pencil and paper test were disadvantaged. One question asked for the 'a', 'b' and 'c' values of this quadratic:
y = x² - 4

Two of the top students in the class (averages in the mid 90s for the first two tests) got confused on this question and made this elementary mistake:-
a = 1 (correct)
b = -4 (wrong, the answer is b = 0)
c = 0 (wrong, the answer is c = -4)

Since they made this mistake they also got wrong the y intercept, axis of symmetry and y value at turning point, losing 5 marks in total.

If they had been doing the computer test then they would have received instant feedback on their first error, b = -4, and would have easily corrected it (being in the high ability range), resulting in the loss of only 1 mark.

The program at this stage has these features as displayed in the main menu:-
  • Practice number skills
  • Vary 'a' value
  • Vary 'b' value
  • Vary 'c' value
  • Do my own graph
  • Work out the axis of symmetry
  • Test
    • Solve y = ax² - c
    • Solve y = ax² + bx
    • Solve y = (dx + e)(fx + g)

Final evaluation by students:-

I prepared a final evaluation sheet for students seeking their opinion of how they had learnt about quadratics. Twenty students successfully completed the final evaluation sheet. I asked them to evaluate 8 possible modes of learning according to this scale:-

1 = helped lots
2 = helped a fair bit
3 = helped a little bit
4 = didn't help at all

When I totalled the results the Quadratics software program came out on the top of the list: 10 students wrote that it helped lots, 8 said helped a fair bit, 2 said helped a little bit and none said that it didn't help at all.

"Indicate how much each of the following helped you learn Quadratics using this code. Write a number next to each statement below."

Totals of the ratings given for each mode (the lower the total, the more that mode helped):

32 Quadratics software program
35 My own efforts in class
37 Help from friends, class mates
42 Help from teacher, one to one
49 Teacher explaining in front of the class
50 Doing lots of homework
54 Working through the textbook
71 Help from parents or other adults outside the class (eg. tutor)
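
The totals in this list are consistent with simply adding up the ratings each student gave, so a lower total means a more helpful mode. As a quick check, here is a small Python sketch using the counts reported above for the Quadratics software (10, 8, 2 and 0 students for ratings 1 to 4). The function name is my own.

```python
def weighted_total(counts):
    """Sum of rating * number_of_students over the four-point scale."""
    return sum(rating * n for rating, n in counts.items())

# Rating -> number of students, for the Quadratics software program
software = {1: 10, 2: 8, 3: 2, 4: 0}

# 10*1 + 8*2 + 2*3 + 0*4 = 10 + 16 + 6 + 0 = 32
print(weighted_total(software))  # prints 32
```

With 20 students each giving one rating, the best possible total is 20 (everyone answers 1) and the worst is 80 (everyone answers 4), which puts the 32 for the software and the 71 for outside help in perspective.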


Test 1 (end of week 3):

Average class mark = 56%

Plot these 3 graphs on the same set of axes. Show tables of values:-

y = 3x - 4
y = x² + 4x
y = -2x² + 2x - 1

Test 2 (week 6):

Average class mark = 82%

y = 2x² + 4x + 1
Find y when x = 1
What is the y intercept?
Calculate the axis of symmetry (Hint: AS = -b / 2a)
Is the graph upright or upside down?

y = x² - 2x - 3
Find y when x = 3
What is the y intercept?
What is the axis of symmetry?

y = -0.5x² + x
Find y when x = -2
What is the y intercept?
Calculate the axis of symmetry.
Is the graph upright or upside down?

y = x² - 2x - 3
Calculate a table of values, eg. x = +3 to -3
Draw axes, plot the graph
What is the y intercept?
Draw in the axis of symmetry.
Work out the x and y values at the turning point.
What are the x intercepts? (there are two of them).

Test 3 (week 6) pencil and paper version.

Average mark for those who chose this test = 68%
Average mark for those who chose comparable computer test = 95%

y = -2x² + 2x + 1
x = -2
Calculate the y value

y = 3x² - x - 2
x = -1
Calculate the y value

a = 2, b = 2, c = 0
Find the axis of symmetry

a = -2, b = 4, c = 3
Find the axis of symmetry

y = x² - 4
Find the a, b and c values
Find the y intercept
Find the axis of symmetry
Find the y value at the turning point
Find the x intercepts

y = 2x² + 4x
Find the a, b and c values
Find the y intercept
Find the axis of symmetry
Find the y value at the turning point
Find the x intercepts

y = (x + 3)(x - 2)
Find the x intercepts
Then expand the brackets using FOIL and
Find the a, b and c values
Find the y intercept
Find the axis of symmetry
Find the y value at the turning point