AKRI

Papers : Machine Intelligence in a Human Environment - 1993

John L. Gordon, AKRI.

Abstract

This paper describes a research project which considers the application of Artificial Intelligence to a monitoring and control system which shares an environment with humans. The system is installed in a domestic building and works with real world data. A highly constrained sensor network provides information to the system. The system must then attempt to create a model for control which is acceptable to the human occupants of the environment. Results from extended operation of the system show that various techniques can integrate and operate successfully in this environment. The conclusion attempts to show that integrated work of this sort is essential to the development of intelligent systems.

1. Introduction

The work described in this paper was the subject of a Ph.D. research project carried out at Liverpool Polytechnic (Gordon 1992). A working system was installed in a Domestic Environment and detailed results were obtained during a twelve month operational period. The research project lasted for four years. The system was operational for most of this time but under development for the first three years. Justification for an approach based on AI, considering alternatives such as automata and the transition matrix, has been discussed (Gordon 1990).

1.1 Broad outline of system

The system is based on two main processing elements which are linked to a small building through a sensor and actuator network. A simple processor called the Autonomic Microprocessor Controller (AMC) provides the direct environmental interface and some user interface through a small keypad and display. This processor is connected to a larger processor called the High Level Processor (HLP) via a serial data link. The purpose of the HLP is to monitor the occupation of the building through the sensor network and to make inductive observations. This function is termed the Occupant Location Monitor (OLM). Figure 1 shows the theoretical structure and connectivity of the complete system.

Figure 1 : Structure of the Research Project

The AMC is so labelled because it provides fast, reflex control of the environment. It operates from a simple rule set and can perform independently of the HLP. Data from the AMC is relayed to the HLP, where higher level computation attempts to maintain a computer representation of the environment; chiefly the occupation of the environment.

Occupant location is heavily constrained by the use of a very simple sensing technique. Sensing people is done either by a single movement detector in each room, or from external doorway contacts. It is the highly constrained nature of the sensing method which forces reliance on inductive processes and learning, to attempt to maintain a model of the occupation of the environment. This model is intended to show the number of people who occupy any room in the premises at all times.

Higher level operations rely chiefly on the use of two parallel memory models. Human analogies for such memory models discussing long term, short term and working memory, have been developed previously (Narayanan 1986). Short Term Memory (STM) is used to remember the most recent movement related events which arrive from the environment via the AMC. It is also used as the focus for the inductive processes which deposit a memory model at every cell in STM. This memory model reflects the current belief in the occupation of the environment.

Long Term Memory (LTM) is provided with averaged sensor data by the AMC. Unlike STM, LTM receives information about all sensors and actuators including temperature, light level etc. The information in LTM is generalised and applies to a span of time and not a time instant. This information is intended to provide predictions about the likely state of environmental sensors, to assist in event evaluation. LTM is also intended to be an experimental vehicle to evaluate the benefit that learning may bring to a Machine Control system which is to operate entirely in a Human Environment with heavily constrained sensing.

1.2 Description of the environment

A typical HLP main screen is shown in figure 2. A plan of the building shows the state of each room movement sensor, a star indicating an active sensor and a box an inactive sensor. The number of people believed to be in each room is also shown on the plan. This information is taken from the most recent cell in STM.

The AMC data table on the right of figure 2 shows the state of several of the other environmental sensors. Other information shown in figure 2 includes data communication from the HLP AMC link, messages from various system managers and an area reserved for user input through typed commands. Entry to and exit from this environment can be achieved through doorways marked L and M in figure 2.

The environment is monitored via a series of single bit digital sensors which detect movement in rooms, or doors open or closed. Analogue sensors measure inside and outside temperature, light level, water temperature and electricity consumption.

Six lights are under direct digital control along with the water heater and central heating. Other input/output is allocated to the user interface and communications channel.

1.3 Objectives of the research

The main aim is to investigate the use of machine intelligence to improve the information which is available from simple sensors in a small building. Attaching heavy constraints to the sensing network means that heuristic information has to be added in order to obtain a working system.

The techniques used in this work are also being investigated. In an environment which requires rapid motor response coupled with complex processing, the dual AMC HLP system can be evaluated. Memory models which are familiar to AI research are also the subject of investigation in a particular application domain. Short Term Memory can be evaluated in its role as system interface to immediate processing and memory indexing whilst Long Term Memory can be evaluated to discover if useful information may be learned and rapidly accessed when required.

This work has provided an opportunity to investigate the application of AI techniques in an embedded environment.

Figure 2 : Typical Information Screen from the HLP

2. System Description

Section 1.1 provided a broad outline of the system and introduced the component parts. A more detailed description of the system will now be provided.

2.1 The AMC

Figure 1 shows that the AMC is coupled directly to the environment through sensors, motor effectors and a user keypad and display. The AMC can also be seen to operate from a set of simple control rules of the form :-

    If the light level < switch over value
    and the kitchen is occupied
    and overrides are not in operation
    then set the kitchen light timer for a fixed time period

    If timer for device x is set
    then operate device x

    If timer for device x is zero
    then make device x inactive

    If people have been in bed > 6 hours
    and the water temperature is < 48°C
    and overrides are not in operation
    then bring the water to 48°C

Environmental sensors are monitored in a continuous loop. The length of the loop is slightly variable but averages 6 ms. This period is well within the time constant for all digital sensors used in this work and more than adequate for analogue sensors. Sensor readings are averaged to provide raw data for LTM. User interface and control functions are part of the basic processing loop.

Higher level control rules are performed less frequently, along with other less urgent system maintenance tasks. AMC-HLP communication is carried out as a parallel operation.

The AMC uses a matrix of variable timers to maintain the operation of peripheral devices. Timers are also controlled by overrides and set points. Control options may be performed autonomously by the AMC or through direct intervention from the HLP. Intervention is most successful if performed through an intermediate set point.
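The rule set and timer matrix described above can be illustrated with a minimal Python sketch. This is not the AMC's actual code. The names `AmcState`, `apply_rules` and `tick`, the light threshold and the timer length are all assumptions made for illustration; only the 48°C and 6 hour figures come from the rules quoted earlier.

```python
from dataclasses import dataclass, field

# Assumed values: the paper does not give the light threshold
# or the length of the "fixed time period".
LIGHT_SWITCH_OVER = 50       # assumed light level threshold
KITCHEN_LIGHT_TICKS = 100    # assumed timer length, in loop passes
WATER_TARGET_C = 48.0        # from the rule set in the text


@dataclass
class AmcState:
    light_level: float
    kitchen_occupied: bool
    overrides_active: bool
    hours_in_bed: float
    water_temp_c: float
    timers: dict = field(default_factory=dict)   # device -> remaining ticks
    active: dict = field(default_factory=dict)   # device -> on/off


def apply_rules(s: AmcState) -> None:
    """One pass of the reflex rules over the current sensor state."""
    if (s.light_level < LIGHT_SWITCH_OVER and s.kitchen_occupied
            and not s.overrides_active):
        s.timers["kitchen_light"] = KITCHEN_LIGHT_TICKS
    if (s.hours_in_bed > 6 and s.water_temp_c < WATER_TARGET_C
            and not s.overrides_active):
        s.timers["water_heater"] = 1   # re-armed each pass while below target
    # A device operates while its timer is set and is inactive at zero.
    for device, ticks in s.timers.items():
        s.active[device] = ticks > 0


def tick(s: AmcState) -> None:
    """Count every running timer down by one loop pass."""
    for device in s.timers:
        s.timers[device] = max(0, s.timers[device] - 1)
```

In this sketch the HLP could tune a response simply by changing a timer length or threshold, which is one way to read the idea of intervention through an intermediate set point.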

2.2 The HLP

The functions of the HLP were originally developed in LISP but later ported to 'C' and extended to allow real time operation. The HLP maintains a user interface and an open channel to the AMC. A menu system and simple command line processor are available for user analysis and control of the system. Control of the system is not the intention of the initial research. Analysis of automated performance and learning is the focus of investigation.

2.3 Occupant Location

The heart of the HLP is the Occupant Location Monitor. This function relies on an evaluate, apply and error-check loop and has been influenced by work on temporal differencing (Sutton 1987) and incremental learning (Vrain and Lu 1988). Communication with the AMC provides movement events in a holding buffer. These events are added, in order of arrival, to the top of Short Term Memory for evaluation. Evaluation takes place within STM by considering the current world state, the most recent event and the recent history of events and believed world states. The evaluation process produces a list of possible explanations for the event, given the current belief in the state of the world. Explanations are given a likelihood value based on several simple heuristics. The heuristics from earlier work have been slightly modified following analysis of operational results. Heuristics include:

  • An account of the type of movement predicted
  • Consideration of movement sequence

The most likely explanation from the list is then applied to STM to give a new world model which is believed to represent the real environment.

The OLM actively searches for errors and potential errors in the world model. In an environment where real data is sparse, errors are an important part of the processing cycle. The AMC can provide autonomous control of the environment during high level communication black out caused by error correction.

2.4 Memory Models

Memory models used in this work (Gordon 1992) are generally based on those models proposed by researchers in the cognitive and AI domains. The AMC provides reflex or Autonomic control where little memory is required. The precise nature of a response by the AMC can be tuned by the HLP in terms of length of activation of a lamp or heater etc.

STM forms a close link with environmental events and is the focus of the decision making process. LTM contains a more generalised memory which may be used in a predictive capacity to help the system make better live decisions.

3. Short Term Memory and Event Evaluation

In this work, STM provides a memory space for the most recent movement related events from the environment. Experiments have led to the use of an STM which contains 800 memory cells. This is considerably more than the seven items proposed for human STM (Miller 1956) but is based on extensive experimental observation and is particular to this area of Machine Intelligence.

3.1 The structure of STM

Figure 3 : The structure of each STM cell

The structure of the Short Term Memory model used in this work is dictated by the need to provide a specific machine function. STM is the focus of event evaluation and the maintenance of a world model of the occupancy of the building. Figure 3 shows the structure of each of the 800 STM cells used.

Each cell contains important information which is derived directly from the environment. This information includes:

  • Event identification (which movement sensor fired)
  • The time of the event
  • The time slot number 0..511.
  • State of all movement sensors (Active or Inactive)

A world model contained in each STM cell shows the OLM's belief that each room in the environment is occupied by a certain number of people. The decision tree route pointer contains a number which represents the path taken through the decision space during evaluation. This number will normally be zero unless error correction has been performed at this cell. The node memory and pointer are provided to avoid redundant searching.

STM also contains predictive information which is provided by an appropriate structure in LTM. The information contains predicted values for each sensor which cover the current time period.

3.2 Adding new events

When an event occurs in the environment, it is immediately processed by the AMC which also takes any necessary action such as operating a light. The AMC also relays information about the event to the HLP. This information includes sensor identification, the activity type (rising or falling) and the precise time of the event.

At the HLP, the event information is placed in a holding buffer. Normally, the event is immediately transferred to the top of STM along with the previous copy of the world model. STM is then sent for evaluation.
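The cell structure of section 3.1 and the way a new event is added can be sketched in Python as follows. This is an illustrative sketch only. The field names, the use of seconds since midnight, and the helper `push_event` are assumptions; the 800 cell limit and the 0..511 slot range are taken from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Optional

STM_CELLS = 800        # number of STM cells, from the text
SLOTS_PER_DAY = 512    # time slot numbers 0..511


@dataclass
class StmCell:
    sensor_id: str                    # which movement sensor fired
    event_time_s: float               # seconds since midnight (assumed unit)
    sensor_states: Dict[str, bool]    # True = active, False = inactive
    world_model: Dict[str, int]       # room -> believed number of people
    route_pointer: int = 0            # zero unless error correction ran here
    node_memory: set = field(default_factory=set)    # nodes already searched
    predictions: Optional[Dict[str, float]] = None   # supplied from LTM

    @property
    def time_slot(self) -> int:
        """Slot number 0..511 derived from the time of day."""
        return int(self.event_time_s // (86400 / SLOTS_PER_DAY)) % SLOTS_PER_DAY


def push_event(stm, sensor_id, event_time_s, sensor_states):
    """Place an event at the top of STM with a copy of the previous world model."""
    previous = dict(stm[0].world_model) if stm else {}
    cell = StmCell(sensor_id, event_time_s, dict(sensor_states), previous)
    stm.appendleft(cell)   # index 0 is the most recent cell
    return cell


# The oldest cell is discarded automatically once 800 cells are held.
stm = deque(maxlen=STM_CELLS)
```

Copying the previous world model into the new cell means that evaluation can then modify the copy without disturbing the belief history held in older cells.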

3.3 Evaluating events

During event evaluation, the OLM prepares a list of all possible explanations for an event. This list is based on room connectivity and on heuristic information concerning possible sensor imperfections. For instance, a sensor may not fire when a person moves within a room for one of several reasons. There is not space within this paper to discuss the details of event evaluation but an overview can be provided.

Movements can be divided into categories which reflect their likelihood of occurring. Movement between adjoining rooms and movement within a room are the most likely types. Movements through rooms which have active sensors, and which therefore will not fire to register a new movement, have been shown to occur frequently. Movements which are the result of sensors not firing at all are least likely and can be said to occur only rarely.

Additional factors influence the likelihood which can be attached to any explanation for an event. The time difference between the firing of sensors in a room and the previous firing of a sensor in an adjoining room can be shown to be an important factor in deciding which explanation for an event is the most likely. The number of people thought to be moving during any event will influence the likelihood value attached to that explanation; favouring fewer people moving. The list of explanations will also include combinations of separate explanations.

When the list of explanations is complete and each has a likelihood value attached, the explanation which is thought to be most likely is used to derive a new world model for the current cell in STM. This represents a path zero through the decision space.
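A simplified Python sketch of this evaluation step is given here. It is not the OLM's actual method. The numeric weights are assumptions (the text ranks the movement categories but gives no values), and timing, multiple movers and combined explanations are left out. The names `explain` and `apply_best` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed likelihoods for each movement category, ordered as in the text.
WITHIN_ROOM = 1.0
FROM_ADJOINING = 0.9
VIA_ACTIVE_ROOM = 0.6    # passed through a room whose sensor was already active
MISSED_SENSOR = 0.1      # passed through a room whose sensor did not fire

Explanation = Tuple[float, str, str]   # (likelihood, from_room, to_room)


def explain(event_room: str,
            world: Dict[str, int],
            doors: Dict[str, List[str]],
            active: Dict[str, bool]) -> List[Explanation]:
    """List explanations for a sensor firing in event_room, most likely first."""
    found: List[Explanation] = []
    if world.get(event_room, 0) > 0:
        found.append((WITHIN_ROOM, event_room, event_room))
    for near in doors.get(event_room, []):
        if world.get(near, 0) > 0:
            found.append((FROM_ADJOINING, near, event_room))
        # Two-step movement through the adjoining room.
        for far in doors.get(near, []):
            if far != event_room and world.get(far, 0) > 0:
                weight = VIA_ACTIVE_ROOM if active.get(near, False) else MISSED_SENSOR
                found.append((weight, far, event_room))
    return sorted(found, key=lambda e: e[0], reverse=True)


def apply_best(event_room, world, doors, active):
    """Apply the most likely explanation (path zero) to derive a new world model."""
    options = explain(event_room, world, doors, active)
    if not options:
        raise LookupError("no explanation found for event")
    _, src, dst = options[0]
    new_world = dict(world)
    if src != dst:
        new_world[src] -= 1
        new_world[dst] = new_world.get(dst, 0) + 1
    return new_world
```

An empty explanation list corresponds to the case where no explanation for the current event can be found, which is signalled here by raising `LookupError`.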

3.4 Error Detection and Correction

Errors may arise in one of two forms. A catastrophic error occurs when an explanation for the current event cannot be found. If a catastrophic error occurs, the OLM must have made an error in a previous world model derivation. An error correction mechanism backtracks through STM in an attempt to discover where the previous error has occurred and to rectify the belief history so that the latest event can be explained from the previous world model. Doyle's Truth Maintenance System (Doyle 1979) offers an alternative approach to problems of this type. That work shows how events can be stored and reasons for belief maintained, to help in the belief revision process and dependency-directed correction.

Error correction is a form of best first search which also provides initial tree pruning to locate a potential source of error. The heuristics which order the likelihood list of possible explanations for an event provide the best first ordering of tree branches to try. Figure 4 shows a section from a typical search tree.

Figure 4 : Section of a decision tree showing best first route

A great deal of redundancy exists within this type of event tree. An example of this is shown in figure 5. A temporary memory mechanism within the STM structure is used to eliminate redundant search.
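The search can be sketched in Python as backtracking over the decision tree, trying branches in best first order and remembering failed nodes so that redundant subtrees are not searched again. This is a sketch under stated assumptions, not the OLM's actual algorithm. The function `rebuild`, the `successors` callback and the representation of a world state as a hashable value are all hypothetical, and the initial pruning step mentioned above is omitted.

```python
from typing import Callable, Hashable, List, Optional, Sequence, Set, Tuple

World = Hashable   # for example, a tuple of occupant counts per room


def rebuild(start: World,
            events: Sequence[str],
            successors: Callable[[World, str], List[World]],
            ) -> Optional[List[int]]:
    """Find a route through the decision tree that explains every event.

    successors(world, event) must return the candidate next worlds, most
    likely first. The result is the branch number taken at each event
    (all zeros means path zero throughout), or None if no route exists.
    """
    dead: Set[Tuple[int, World]] = set()   # (depth, world) nodes known to fail

    def search(depth: int, world: World) -> Optional[List[int]]:
        if depth == len(events):
            return []
        if (depth, world) in dead:
            return None                    # redundant subtree, skip it
        for branch, nxt in enumerate(successors(world, events[depth])):
            rest = search(depth + 1, nxt)
            if rest is not None:
                return [branch] + rest
        dead.add((depth, world))
        return None

    return search(0, start)
```

The returned branch numbers play the role of the decision tree route pointer stored in each STM cell, and the `dead` set stands in for the temporary node memory. The sketch is recursive for brevity; a long event history would need an iterative form.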

The second type of error is termed non-catastrophic because new world models may be derived which result from the latest environmental event. Non-catastrophic errors are tested for when the OLM finds sensor patterns which are not normally expected. For instance, if a room sensor is inactive for an extended period of time and the world model shows that room to be occupied, then it is possible that in fact the room is not occupied and that a parallel and successful model exists which shows this. The new model can be searched for by the error correction method.

Figure 5 : Section of a decision tree showing redundancy

The existence of non-catastrophic errors adds complication to the strategy used for error correction. Correction of non-catastrophic errors and subsequent firing of the sensor in the now empty room can lead to a situation where the explanation for the sensor firing cannot be found in the search space because the previous error correction has blanked that search space from view. A strategy which includes partial and full rebuilding of STM can provide a solution to these problems.

4. Long Term Memory and Feedback

There are two main reasons for investigating a Long Term Memory structure in this work. The most immediate reason is to evaluate the influence which long term learning may have on immediate event evaluation. This involves attempting to discover if predictions about the likely activity of sensors can have a positive influence on OLM functions.

The second reason is to observe the generalisations which may be learned from the environment and examine how these can match instances of building occupation.

4.1 How learning may be of value

The most direct way that learned information about sensor activity can be of value in this system is to form the basis for the prediction of likely sensor activity. The state of sensors during particular time periods would be the source information to be learned. Other factors could form the basis of learning such as the state of building occupation during certain time periods. This option was not implemented during the research work because the world model is the subject of uncertain reasoning and would have added a further level of uncertainty to the structure of LTM.

Investigation of an LTM structure following an extended learning period may also provide information which can affect the design of subsequent systems.

4.2 The structure of Long Term Memory

This system learns from averaged sensor and motor data. Sensors and actuators are averaged by the AMC over a base period of 2.8125 minutes which is 1/512 part of one day. Considering the sensors and motor effectors used, shorter periods than this were thought unlikely to contain useful data. The HLP carries out subsequent averaging to provide periods which cover greater parts of each day and contain average value and trend of activity.

During this project, and resulting from the monitoring of environmental events, a period of 22.5 minutes was chosen to be the source period of averaged data for input to Long Term Memory. The original specification of LTM called for a series of hierarchical time periods ending with the base period of 2.8125 minutes. These periods were originally called Sensor Slots (SS) and each level was identified with a letter. So SSA periods represented one day, SSB periods half of one day, SSG periods 22.5 minutes and SSJ periods represented 2.8125 minutes.
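The period lengths quoted above are consistent with each level halving the one before it (SSA = 1 day, SSB = 1/2 day, through to SSJ = 1/512 day). The following Python sketch works through that arithmetic. The halving rule is inferred from the four values given, and the function names and the definition of trend (last reading minus first) are assumptions.

```python
import string

MINUTES_PER_DAY = 24 * 60


def period_minutes(level: str) -> float:
    """Length of one period at a level: 'A' is one day, halving at each step."""
    index = string.ascii_uppercase.index(level.upper())
    return MINUTES_PER_DAY / (2 ** index)


def slot_number(minutes_since_midnight: float, level: str) -> int:
    """Index of the period at `level` which contains the given time of day."""
    return int(minutes_since_midnight // period_minutes(level))


def to_ssg(ssj_values):
    """Combine each run of 8 SSJ readings into one (average, trend) SSG pair."""
    per_group = 8   # 22.5 minutes / 2.8125 minutes
    result = []
    for i in range(0, len(ssj_values), per_group):
        chunk = ssj_values[i:i + per_group]
        average = sum(chunk) / len(chunk)
        trend = chunk[-1] - chunk[0]   # assumed definition of trend
        result.append((average, trend))
    return result


print(period_minutes("G"))    # 22.5
print(period_minutes("J"))    # 2.8125
print(slot_number(180, "G"))  # 8, the SSG period starting at 3:00am
```

With this scheme one day contains 64 SSG periods, each built from 8 base SSJ periods.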

Experiments have all been conducted using the SSG period, which is believed to be the period most likely to capture generalisations about the human occupation of buildings. Other researchers investigating the efficient arrival and displacement of lifts (Al-Sharif 1992), for instance, use periods of five minutes for temporal estimations. This is not incompatible with the base period SSJ. The SSG period is likely to be more generally representative than a five minute period and be less prone to random fluctuations.

LTM is chiefly indexed through time. Each end effector may have many memories associated with an SSG period. A memory cell is the basic building block of LTM and is a frame-like structure (Minsky 1975) formed around a value and trend with a set of associated contextual information.

In the initial research work, each memory cell contained contextual information but was otherwise isolated from links with the rest of LTM. A more recent proposal for the structure of LTM and each memory cell, shown in figure 6, contains additional temporal and environmental context which is intended to improve the memory selection process. The main improvements in this structure include links or pointers to other memory cells related to environmental context and to temporal sequencing. These links also contain strength of link measures so that the belief in a context can be estimated. The importance of context to Machine Learning has been considered by other researchers, typically (Michalski 1983).

Figure 6 : The Structure of LTM

4.3 How learning takes place

Information is gathered by the HLP throughout the day. This information is averaged and relayed by the AMC at each new SSJ period. At the end of each day, all hierarchical time periods are constructed from the base data. The SSG period is currently the only one used in learning. It is intended that learning will take place during a period in which the HLP is otherwise lightly loaded. In the current system, the time at which learning starts is 3:00am. Learning will slow the response time of the OLM if environmental activity is high. The learning process should be complete within a 30 minute period but may last as long as 2 hours in the unlikely event of high environmental activity. Again, the AMC will provide effective control buffering for the environment during the learning period.

Details of the learning method adopted are outside the scope of this paper but the general ideas will be presented.

For each SSG period, end effectors are examined in turn. The new environmental value is matched with appropriate memories in LTM taking value, trend and context into account. If an appropriate match is found, the existing memory cell in LTM is reinforced and various contextual links are modified to reflect the addition. A utility value for that memory is also adjusted to reflect the recent utilisation of the LTM cell.

Neighbouring LTM cells (SSG+1 and SSG-1) are also tested with the new memory. Cells from these periods may also be reinforced if found, but by less than the cell found at the correct SSG period.

If no match is found for the new memory in LTM, the new memory is placed in an empty cell position. If no empty cell positions remain (memory is full), the least useful cell is displaced by the new memory. The process of finding the least useful LTM cell employs utility and time since last used as part of the selection, in a similar way to the strength of knowledge calculation used by Anderson (1990).
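The general shape of this learning step can be sketched in Python. The sketch is illustrative only and is not the method used in the research. Contextual links are omitted, and the tolerances, the reinforcement amounts, the wrap-around of neighbouring periods at midnight and the least useful score are all assumptions.

```python
from dataclasses import dataclass
from typing import List

# All constants here are assumed values chosen for illustration.
VALUE_TOL = 0.1
TREND_TOL = 0.1
NEIGHBOUR_GAIN = 0.5   # neighbours gain less than the cell at the correct period
SLOTS = 64             # SSG periods per day


@dataclass
class LtmCell:
    slot: int              # SSG period number
    value: float
    trend: float
    strength: float = 1.0
    utility: float = 1.0
    last_used_day: int = 0


def matches(cell: LtmCell, value: float, trend: float) -> bool:
    return (abs(cell.value - value) <= VALUE_TOL
            and abs(cell.trend - trend) <= TREND_TOL)


def learn(memory: List[LtmCell], capacity: int, slot: int,
          value: float, trend: float, today: int) -> None:
    """Reinforce a matching memory, or store the observation as a new one."""
    matched = False
    for cell in memory:
        if not matches(cell, value, trend):
            continue
        if cell.slot == slot:
            cell.strength += 1.0
            cell.utility += 1.0
            cell.last_used_day = today
            matched = True
        elif cell.slot in ((slot - 1) % SLOTS, (slot + 1) % SLOTS):
            cell.strength += NEIGHBOUR_GAIN
    if matched:
        return
    new = LtmCell(slot, value, trend, last_used_day=today)
    if len(memory) < capacity:
        memory.append(new)
        return
    # Memory is full: displace the cell with low utility and long disuse.
    worst = min(memory, key=lambda c: c.utility / (1 + today - c.last_used_day))
    memory[memory.index(worst)] = new
```

A memory which is matched regularly accumulates strength and utility and so survives, while one which is never matched again becomes the most likely candidate for displacement.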

4.4 The application of Long Term Memory data

The LTM structure contains many memories for a given sensor or actuator. The OLM must select an appropriate memory given the current environmental conditions. The memory for any sensor may then be used to anticipate or predict the likely state of that sensor or actuator in the near future. Knowing the likely state of every sensor in the near future can have a positive influence on the ordering of the list of explanations for an event at any instant in time.

STM is used as an index to assist in selecting the most appropriate memory from LTM for each sensor and actuator. An SSG number or period of the day will be used as the major indexing mechanism. The OLM will usually need to know the predictive information available from a memory in the current SSG frame. When this is the case, the most recent information from the environment is used as a key to an appropriate memory structure in LTM. Matching is performed between current and remembered environments. A memory which normally follows the one that has just been used is also favoured in the selection process. This provides temporal context to the selection process.

Once suitable memories have been recalled, the value and trend information from the memories are used in the evaluation process. For instance, if an event occurs in room A which has doorways to occupied rooms B and C, the initial evaluation may have preferred that a person moved from room B to room A. Now if sensor prediction shows that in the near future, room B is likely to be active but room C is normally quiet, this will provide support for the argument that a person moved from room C to room A and not B to A.
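The room A, B and C example can be expressed as a small Python sketch. The scaling rule and the `weight` parameter are assumptions made for illustration; the paper does not state how predictions are combined with the heuristic likelihoods.

```python
def rerank(explanations, predicted_activity, weight=0.5):
    """Re-order (likelihood, from_room, to_room) explanations using predictions.

    predicted_activity maps a room to its expected sensor activity (0 to 1)
    in the near future. A source room which is expected to stay active is
    a less likely origin, so its explanation is scaled down.
    """
    adjusted = []
    for likelihood, src, dst in explanations:
        stay = predicted_activity.get(src, 0.0) if src != dst else 0.0
        adjusted.append((likelihood * (1.0 - weight * stay), src, dst))
    return sorted(adjusted, key=lambda e: e[0], reverse=True)


initial = [(0.9, "B", "A"), (0.8, "C", "A")]   # B to A initially preferred
prediction = {"B": 0.9, "C": 0.1}              # B stays active, C goes quiet
best = rerank(initial, prediction)[0]
print(best[1], "->", best[2])                  # C -> A
```

Without any prediction the original ordering is unchanged, so the heuristics alone decide the outcome.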

5. Operational Results

Results from the research have been presented in a thesis. Subsequent observations and several minor modifications have led to the collection of additional data concerning the performance of the system.

5.1 General Observations

The HLP main screen provides a simple viewing option where the performance of the OLM can be evaluated at any instant in time. Observations of this window, though subjective and irregular, show surprising compliance between the real world and the occupation believed by the OLM. Furthermore, an operational log shows that where additional people are believed to be in the building, they are normally shown to be in otherwise occupied rooms. This being the case, a control system based on this world model would provide appropriate control decisions. During the original period of data recording, observational results were used to supplement the detailed results available in log files. Out of 186 observations of building occupation, the OLM contained the correct number of occupants in the correct rooms for 118 observations (64%). This is a positive indicator considering the low level of input data to the system.

Results following the initial research have not been as detailed but still show useful compliance with the real environment.

5.2 Using a performance measure

Taking objective results from the performance of the OLM in a real environment over an extended period of time (over 300 days) is difficult if no actual measure of the state of the environment is available. In order to provide a performance indicator for the OLM, the proportion of time spent performing catastrophic error correction, over run time, was recorded automatically. The time which the building was known to be unoccupied was excluded from these calculations.

Results over the period from October 1990 to January 1992 showed that the percentage of time spent performing error correction when LTM was not used was 1.091%. During periods where LTM was used to influence on line decisions, 0.8985% of the time was spent performing error correction. The total standard deviation for all results was 0.188 and the difference between results using and not using learning was 0.1925. These results offer a slight improvement over those presented in the original thesis and represent a total of 340 days of detailed records.
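The quoted difference can be checked, and expressed relative to the result without LTM, with a few lines of Python. Only the two percentages come from the results above; the relative figure is derived here and is not quoted in the original work.

```python
without_ltm = 1.091   # % of run time in error correction, LTM not used
with_ltm = 0.8985     # % of run time in error correction, LTM used

difference = without_ltm - with_ltm             # in percentage points
relative_reduction = difference / without_ltm   # as a fraction of the baseline

print(f"difference: {difference:.4f} percentage points")
print(f"relative reduction: {relative_reduction:.1%}")
```

On these figures the use of LTM corresponds to a reduction of about 17.6% in the time spent on catastrophic error correction.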

Additional results are also available but cannot now be compared with previous work because of alterations to the error correction method.

5.3 Observations concerning LTM

The figures from the performance measure suggest that the OLM makes fewer mistakes when LTM is used to supply predictive data for on line decision making.

Examination of LTM reveals that some memories are reinforced on a regular basis and those memories have persisted for over twelve months. Other memories are little used and may only have applied on the occasion of their creation. Such memories will have little chance of survival over extended periods of time.

Memories of other system parameters have shown a useful trend during extensive use. The AMC always operates in one of three modes. Mode one is normal daily operation, mode two is usually selected at night when the occupants of the premises are in bed and mode three is selected when the premises are empty. These modes clearly form the main source of alert state for the security sub-system.

Figure 7 shows a graphical view of the AMC mode as predicted by LTM and which applies generally on Wednesdays. The thickness of the line reflects the system's belief in its prediction. This prediction is generally accurate and would clearly be of use to a more intelligent security sentinel.

Figure 7 : Generalised Security Information from LTM

6. Conclusions

This research project has been conducted in the belief that the investigation of complete systems which encapsulate a range of techniques from the field of Artificial Intelligence is of considerable value to the science. Research in specific areas of AI which provides theories of a fundamental nature is clearly the cornerstone of the subject. Work of a highly integrated nature provides the opportunity to discover how theories and techniques can integrate to provide intelligence at an application level.

This work employs several approaches but chiefly investigates the role of Autonomic and Intentional processors in a machine control application and the use of specific memory mechanisms in the provision of higher level functions. Results from the work are encouraging and suggest that further development of the current system is worthwhile. Current development is taking place chiefly in the area of Long Term Memory and prediction. Better methods for error correction are also required and the use of LTM to control parts of this mechanism is suggested by previous investigations.

Placing heavy constraints on environmental sensing has forced the processing system to develop in support of the sensor and actuator network. This seems a more natural way for intelligence to develop in machines rather than attempting to marry incompatible processors and sensing methods. A next generation system would not go too far in the advancement of the sensors used but improvements are intended. Improvements in the integration of the HLP and AMC are also desirable. The AMC has already provided adequate control cover during the experimental period and could take on a greater role in decision making. Decisions would be adapted by interference from the HLP and only overridden in exceptional circumstances. The next version of the AMC should be able to accept re-definitions of sensor control loops. The current version has fixed loops which are subject to tuning or switching by the HLP.

Finally, to look back at the integration of sensors and processors may be to look back at the emergence of the processors which have eventually developed intelligence. Memory models used in higher level processing should be appropriate to the sensing employed and the control required.

7. References

  1. Al-Sharif L.R. and Barney G.C., "Lift system simulation and loading-Journey curves." Control Systems Centre Report No 750, UMIST, Manchester, 1992
  2. Anderson J.R., "A Theory of the Origins of Human Knowledge." Machine Learning Paradigms and Methods, Ed Jaime Carbonell, MIT Press, pp 313-351, 1990
  3. Doyle J., "A Truth Maintenance System." Artificial Intelligence, Vol 12, pp 231-272, 1979
  4. Gordon J.L., Williams D. & Hobson C.A., "Deriving complex location data from simple movement sensors." Robotica, Vol 8, pp 151-158, 1990
  5. Gordon J.L., Williams D. & Hobson C.A., "Using Short and Long Term Memory to induce environmental information from simple events." Robotica, Vol 10, pp 65-74, 1992
  6. Gordon J.L., "Investigation into the Application of Artificial Intelligence to Small Building Management." Ph.D. Thesis, Liverpool Polytechnic, 1992
  7. Michalski R.S. and Stepp R.E., "Learning from Observation: Conceptual Clustering." Machine Learning, Eds Michalski, Carbonell and Mitchell, Tioga Publishing Co., pp 331-363, 1983
  8. Miller G.A., "The magical number seven, plus or minus two: Some limits on our capacity for processing information." Psychological Review, Vol 63, pp 81-97, 1956
  9. Minsky M., "A Framework for Representing Knowledge." The Psychology of Computer Vision, Ed Winston P., McGraw-Hill, New York, pp 211-277, 1975
  10. Narayanan A., "Memory Models of Man & Machine." Artificial Intelligence, Principles and Actions, Ed Masoud Yazdani, Chapman & Hall, pp 226-259, 1986
  11. Sutton R.S., "Learning to predict by methods of temporal difference." Machine Learning, Vol 3, pp 9-44, 1987
  12. Vrain C. & Lu C-R., "An Analogical Method to do Incremental Learning of Concepts." Proceedings E.W.S.L., October, Ed Derek Sleeman, Pitman, pp 227-235, 1988