CHAPTER II
REVIEW OF THE LITERATURE
Most educators agree that we must help learners to think and use their knowledge base to solve problems (e.g., Bruer, 1993; Glaser, 1992; McGilly, 1994; Schank, 1994). In fact, some scholars have called problem-solving the single most important skill a human can possess (Jonassen, 1997; Polya, 1957). In light of this, it is a contradiction that many current educational practices facilitate neither the effective structuring of knowledge nor the acquisition of problem-solving skills. This may be due to the desire for breadth of curriculum coverage (Usiskin, 1997), the push to improve performance on standardized assessments (Schank, 1995), the intractable disposition of teachers toward change (Sarason, 1990), or educators' insufficient knowledge of cognitive science and of ways to teach relevant content that facilitates thinking (Skemp, 1987).
Bruer (1993) asserts that "acquisition of new knowledge depends in predictable ways upon the interaction of existing knowledge, encoding process, and the instructional environment." This statement implies that the tenets of cognitive science may be fundamental to understanding the learning process. This chapter reviews pertinent theories of cognitive science and extrapolates how they apply to the design and implementation of effective curriculum.
Cognitive Theory and Instruction
Heckman (1993) defines cognitive science as the exploration of mechanisms by which people acquire, process and use information. Examining knowledge types and memory structures can help in the understanding of information processing and use.
Knowledge Types
Most authorities operate under the supposition that there are two knowledge types, declarative and procedural (e.g., Anderson, 1983; Bransford, 1990; Bruer, 1992; Lawler, 1986; Schank, 1995). Others expand on this to include metacognitive knowledge (McGilly, 1994), which embodies contextual knowledge (Tennyson, 1991). Simply put, declarative knowledge is factual, procedural knowledge concerns skills, metacognitive knowledge is self-regulatory, and contextual knowledge is situational.
Declarative knowledge is knowledge about the world and its characteristics. It may be stored as isolated and disjoint pieces of information or as part of a body of interconnected pieces that are conceptually linked to one another (McGilly, 1994). According to Anderson's (1983) Tri-code Theory, these pieces of information can be represented as a temporal string, which encodes the order of a set of items; a spatial image, which encodes a spatial configuration; or an abstract proposition, which encodes meaning. The first two types, temporal string and spatial image, are declarative knowledge types, and the third, abstract proposition, is part of procedural knowledge.
Procedural knowledge is knowledge of how we do things. These representations involve structure, category, and attribute information (Anderson, 1983). This knowledge type is part of a production system that involves "condition-action pairs that specify that if a certain state occurs ..., then particular mental (and possible physical) actions should take place" (Anderson, 1987).
Metacognitive knowledge is knowledge of one's own skills, understanding, and abilities. This type of knowledge is needed to monitor and regulate one's own learning. A subset of metacognitive knowledge is contextual knowledge. Tennyson (1991) contends that the internal organization of information is based more on employment needs than on attributes or hierarchical organization. His research shows that individuals can solve complex problems only if they possess the necessary contextual knowledge, that is, knowledge of why and when (Tennyson, 1991). Contextual knowledge includes not only information but also cultural aspects directly associated with that information (Brown, 1989). Here, cultural means the selection of criteria, values, and feelings associated with the information of given contextual situations.
Memory Systems
Memory is not a single process (Halpern, 1996). Rather, it is a complex series of systems and processes within those systems, each with its own function and characteristics. Cognitive scientists are not in total agreement on the number or types of memory structures. The standard picture of memory architecture acknowledges the existence of a short-term memory, where the encoding of sensory perceptions and metalevel processing occur, as well as a long-term memory, where declarative knowledge is stored and procedural knowledge is processed.
The conventional memory model asserts that sensory information enters short-term memory from the external world, where it is initially encoded, related to information in long-term memory, then processed for storage or elimination. Information moves in and out of short-term memory relatively fast, within seconds. This is due to the limited storage capacity of short-term memory, which is believed to be three to nine units of information depending on maturation level and memory span (Bruer, 1993; Halpern, 1992). The terms working memory and short-term memory are used synonymously by most authors (Bruer, 1993; Halpern, 1996; McGilly, 1994; Schank, 1994), but one author, Anderson (1983), makes a distinction between them. He prefers the term working memory over short-term memory and contends that the momentary capacity of working memory is much greater than the traditional view of short-term memory allows. In his account, the capacity is not discretely quantifiable but instead depends upon the amount of activation created by a goal element. Even so, the capacity is still thought to be very limited.
Two systems, episodic and semantic, are thought to be part of declarative memory. Episodic memory enables people to store and recall events in which they have participated, such as a birthday or a graduation. Semantic memory is where we store facts like addition tables or significant dates in history (Bruer, 1993; Halpern, 1996). Since both of these memory structures contain information people can easily talk about, they are given the name declarative memory. Other tasks, such as hitting a curve-ball or interpolating regression lines, require a different kind of memory. This is typically referred to as procedural memory.
Procedural memory is remembering how to do something. Here knowledge is represented in the form of productions that are a series of condition-action pairs (Lajoie, 1993). These conditions specify some data pattern; then the basic action adds some new data elements to working memory resulting in a behavior or retrieval of other productions.
Information Processing
The central theory of cognitive science concerns information processing. According to this theory, information enters the system and activates mental processes that result in physical or mental actions. This complex interplay between knowledge types, memory structures, and process transactions should be a primary concern of educators and learners.
A popular model of information processing asserts that information comes to us from the task environment (external world) through our sensory system and enters our working memory. The working memory acts in a way similar to a computer's central processing unit as it discriminates what is to be temporarily retained for interaction with long-term memory and what is to be expunged (Bruer, 1993). The process of elaboration transfers information from working memory to long-term memory, where it connects this new information with existing information, or, through rehearsal, stores the information as isolated facts (McGilly, 1994). Information in long-term memory relevant to the new information, and to the task at hand, is retrieved to working memory, where it is combined with appropriate procedures, resulting in a behavior. Germane to this whole process is how knowledge is represented in the memory structures and what types of production systems are used to manage the knowledge.
When exploring problem-solving, it is appropriate to focus on the associative structures of semantic (facts) memory and procedural (skills) memory. Cognitive scientists commonly refer to these structures in semantic memory as schemata (Bruer, 1993; McGilly, 1994; Schank, 1992; West, 1991). Schemata are network structures where pieces of information are conceptually linked to other pieces (McGilly, 1994). These spiderweb-like structures are in turn linked to one another by nodes where common information lies (Anderson, 1983). A schema is not only a knowledge storage location but also a structure to aid in the assimilation of new information (Bruer, 1993). Similarly, the associative structures in procedural memory form rules where the learner links certain conditions to certain actions (Bruer, 1993). In turn, these rules can be linked together, forming a problem-solving sequence or process schemata (West, 1991). Because our working memory seeks to store incoming information in related schemata found in long-term memory, our prior knowledge, made up of existing schemata, influences what we notice and how we interpret the task environment (Bruer, 1993). Thus, prior knowledge, which affects how students interpret school instruction, should be a consideration in the design of curriculum and instruction.
John Anderson, in his book The Architecture of Cognition (1983), takes issue with the primacy of schemata theory in knowledge representation and processing. He claims that schema theory leaves unexplained the connection between declarative and procedural knowledge. Said another way, it is unclear how one gets any action between storage schemata and process schemata. He, like others, proposes an alternative architecture that incorporates schema theory but adds postulates that more precisely explain information processing transactions.
Anderson's ACT* theory (Adaptive Control of Thought) defines long-term memory as having two distinct components: declarative memory and production memory. Information processing involves multiple and continuous interactions between working memory, declarative memory, and production memory. The schematic, Figure 1, exhibits the rudimentary structural components:

Figure 1
Information Processing Model
Working memory contains information that the system can currently access. This can include information encoded from the task environment, information retrieved from declarative memory, or actions of productions from production memory. Encoding processes information from the task environment and deposits it in working memory; performance converts commands in working memory into behaviors; storage can create new records or strengthen existing records in declarative memory; and retrieval reclaims information from declarative memory. In the match process, data in working memory are paired with the conditions of productions; lastly, execution transfers the actions of matched productions to working memory. This process of production matching and execution is referred to as production application. The circular arrow called application cycles back into the production box to reflect the fact that "new productions are learned from a cognitive study of history of application of existing productions" (Anderson, 1983).
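The match and execution steps described above can be illustrated with a small sketch. It is not Anderson's implementation; the class name Production, the function names match and apply_productions, the string encoding of working-memory elements, and the rule of firing the first matched production are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Production:
    """A condition-action pair (hypothetical encoding: elements are plain strings)."""
    name: str
    condition: FrozenSet[str]  # data pattern that must be present in working memory
    action: FrozenSet[str]     # elements deposited in working memory on execution


def match(working_memory: Set[str], productions: List[Production]) -> List[Production]:
    """Match: keep productions whose condition is satisfied and whose action adds something new."""
    return [
        p for p in productions
        if p.condition <= working_memory and not p.action <= working_memory
    ]


def apply_productions(
    working_memory: Set[str], productions: List[Production], max_cycles: int = 10
) -> Tuple[Set[str], List[str]]:
    """Repeat match and execution until no production applies."""
    wm = set(working_memory)
    fired: List[str] = []
    for _ in range(max_cycles):
        matched = match(wm, productions)
        if not matched:
            break
        chosen = matched[0]   # assumption: fire the first match (no real conflict resolution)
        wm |= chosen.action   # execution: the action's elements enter working memory
        fired.append(chosen.name)
    return wm, fired


# Illustrative task: adding one column of digits, 7 + 5.
productions = [
    Production("find-sum", frozenset({"goal:add-column", "digits:7,5"}), frozenset({"sum:12"})),
    Production("write-and-carry", frozenset({"sum:12"}), frozenset({"write:2", "carry:1"})),
]
final_wm, fired = apply_productions({"goal:add-column", "digits:7,5"}, productions)
```

In this sketch the second production can fire only because the first deposited its result in working memory, which is the sense in which execution feeds the next match.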
Production system theories are a focus of ACT*. These productions are sets of condition-action pairs that constitute the procedural, how-to knowledge of ACT*. Central to this theory is the discrimination between declarative and procedural knowledge. ACT* theory maintains that all new knowledge is initially internalized in declarative memory in schemata-like structures called tangled hierarchies (Anderson, 1983). Each node of this tangled hierarchy has an associated strength dependent on the frequency of use. This node strength in turn determines how readily activation from a source in working memory will spread through the declarative network (Anderson, 1991).
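The role of node strength in the spread of activation can be sketched as follows. This is a simplified illustration rather than the ACT* equations: the function name spread_activation, the decay factor, the fixed number of steps, and the rule that a node divides its outgoing activation among its neighbors in proportion to their strength are assumptions.

```python
from typing import Dict, List


def spread_activation(
    links: Dict[str, List[str]],
    strength: Dict[str, float],
    source: str,
    source_activation: float = 1.0,
    decay: float = 0.5,   # assumed fraction of activation passed on at each step
    steps: int = 2,
) -> Dict[str, float]:
    """Spread activation outward from one source node in a declarative network."""
    activation = {node: 0.0 for node in strength}
    activation[source] = source_activation
    frontier = {source: source_activation}
    for _ in range(steps):
        next_frontier: Dict[str, float] = {}
        for node, amount in frontier.items():
            neighbors = links.get(node, [])
            total = sum(strength[n] for n in neighbors)
            if total == 0:
                continue
            for n in neighbors:
                # Stronger (more frequently used) nodes receive a larger share.
                share = decay * amount * strength[n] / total
                next_frontier[n] = next_frontier.get(n, 0.0) + share
        for n, a in next_frontier.items():
            activation[n] += a
        frontier = next_frontier
    return activation


# Illustrative network: "three sides" has been used more often than "angle sum 180".
links = {"triangle": ["three sides", "angle sum 180"]}
strength = {"triangle": 1.0, "three sides": 3.0, "angle sum 180": 1.0}
result = spread_activation(links, strength, source="triangle")
```

With these illustrative strengths, the frequently used node receives three times the activation of the rarely used one, so it is the more likely of the two to be available to productions.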
Sources of activation are important to the learner since they initiate the spread of activation and subsequent interactions between working memory, declarative memory, and production memory. An object becomes a source of activation in working memory in three ways. First, an environmental stimulus can be encoded into an activation source. Second, when a production executes, its actions can build structures that in turn become new sources of activation. Third, a goal structure placed in working memory by an individual can become a powerful source of activation (Anderson, 1981). Unique to a goal element is its ability to sustain activation without rehearsal; unlike other sources of activation, a goal element will not spontaneously turn off (Anderson, 1983).
Relevant to the pursuit of problem-solving is the development of new productions and execution of existing productions, since they provide a connection between declarative knowledge and behavior. Clearly a fact can be committed to memory in a few seconds; however, it appears that new productions can be created only after much practice (Hannafin, 1989). Anderson states:
The acquisition of productions is unlike acquisition of facts in the declarative component. It is not possible to simply add a production in the way it is possible to simply encode a cognitive unit. Rather, procedural learning occurs only in executing a skill; one learns by doing. This is one of the reasons why procedural learning is much more gradual than declarative learning.
Productions form the system's procedural component (Anderson, 1991). As noted earlier, productions are sets of condition-action pairs. The condition specifies some data pattern, and if elements matching this pattern are in working memory, then the production can be applied, resulting in behavior. Bringing declarative knowledge to working memory for access by productions provides some flexibility but also imposes some limitations. The relatively small working memory space and the time required for transfer place a heavy burden on working memory. In fact, Anderson attributes many novice learner errors to this strain on working memory. The more expert learner uses a compilation process to embed declarative knowledge within productions so that retrieval from declarative memory is not necessary. This greatly increases speed and reduces errors (Anderson, 1981).
Knowledge compilation has two subprocesses: composition and proceduralization. Composition takes a series of productions and condenses them into one production that effectively sequences the original series (Lewis, 1978). Composition speeds up processing greatly. Proceduralization builds new versions of existing productions. These productions have pertinent declarative knowledge embedded in them, thus alleviating the need to bring that kind of information to working memory for access. Having declarative knowledge incorporated in the production allows working memory to be available for a concurrent task. Relevant goal structures and active practice play major roles in proceduralization (Anderson, 1991).
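The two subprocesses can be sketched in a few lines. The sketch is only an illustration under stated assumptions: the Production class, the functions compose and proceduralize, and the string encoding of working-memory elements are hypothetical, and proceduralization is modeled simply as removing a retrieved fact from a production's condition.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Production:
    """A condition-action pair (hypothetical encoding: elements are plain strings)."""
    condition: FrozenSet[str]
    action: FrozenSet[str]


def compose(first: Production, second: Production) -> Production:
    """Composition: collapse two productions that fire in sequence into one.

    The new condition is everything the first production needs, plus whatever
    the second needs that the first does not itself supply.
    """
    condition = first.condition | (second.condition - first.action)
    action = first.action | second.action
    return Production(condition, action)


def proceduralize(production: Production, retrieved_facts: FrozenSet[str]) -> Production:
    """Proceduralization: build a version that no longer needs the given facts
    to be retrieved into working memory (they are treated as built in)."""
    return Production(production.condition - retrieved_facts, production.action)


# Illustrative novice productions for adding one column, 7 + 5.
find_sum = Production(
    frozenset({"goal:add", "digits:7,5", "fact:7+5=12"}),  # fact must be retrieved first
    frozenset({"sum:12"}),
)
write_and_carry = Production(frozenset({"sum:12"}), frozenset({"write:2", "carry:1"}))

composed = compose(find_sum, write_and_carry)
compiled = proceduralize(composed, frozenset({"fact:7+5=12"}))
```

The compiled production responds to the goal and the digits alone, which mirrors the claim that working memory is freed for a concurrent task once the declarative fact no longer has to be retrieved.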
The goal structures match declarative knowledge to appropriate procedures, but much repetition is needed before the knowledge is incorporated into the production, making retrieval into working memory no longer necessary. This is a relatively slow process. Anderson's theory of procedural learning is one of learning by doing. His contention that information structuring and processing require active learner involvement is supported by a large number of scholars (Bransford, 1990; Bruer, 1993; Champagne, 1992; Checkley, 1997; Dewey, 1938; Greeno, 1992; Halpern, 1992; Heid, 1997; Lawler, 1986; McGilly, 1994; Means, 1997; Ohlsson, 1992; Rudy, 1989; Schank, 1995; Sternberg, 1996; Usiskin, 1997; West, 1991).
The prevailing principles of the aforementioned theory of cognitive architecture are reflected in Roger Schank's (1995) Acquisition Hypothesis:
When considering what someone should know, it is
vital to simultaneously consider how they will come to
know it. How we learn determines what we learn.
He contends that in a natural learning process an individual must adopt a goal (goal structures), generate questions (accessing related declarative schemata and productions), and develop answers (construct productions containing specific knowledge). Schank (1995) claims that the process of wondering about self-generated questions creates indices and accesses schemata that facilitate transfer and long-term retention.
Expert and Novice Characteristics
"Learning is the process by which novices become experts" (Bruer, 1993). This quotation makes clear the need to examine the characteristics of expert and novice learners. Along with the characteristics comes the need to investigate the means that move a learner along the novice/expert continuum.
Expert behavior is a manifestation of three faculties: the types of knowledge people have, the organization of that knowledge, and the methods they learn to process or monitor the knowledge (Bruer, 1993). Hirsch emphasizes the necessity for domain-specific knowledge in learning, and others agree that this is a trait of an expert learner (Bruer, 1993; Carroll, 1997; Glasser, 1992; Mestre, 1986; Ohlsson, 1992). Within this knowledge-rich domain, experts employ strong methods, that is, methods that are situation and domain specific. Novices, on the other hand, tend to employ weak methods that are widely applicable and require little specific knowledge. Weak methods applied by novices tend to produce errors because they are overly general, causing problem-solving steps to be performed in situations where they are not appropriate (Ohlsson, 1992).
Experts and novices differ in significant ways in how they perceive and organize their knowledge. Anderson (1983) feels a key to expertise in many problem-solving domains, like geometry, is the expert's ability to perceive patterns. This capability is facilitated by the development of data-driven productions that combine specific declarative knowledge with productions through active learner practice.
In one instance, Bruer (1993) observed that experts perceive and categorize elements of a problem-solving task in physics according to deep features such as physical principles or laws. The novices sorted problems by surface features such as key word identification or