2.0 Ingredients for Building an Intelligent System
As one goes about building an intelligent system for an application, one needs to consider a few issues. In this section, we review them so that they can be highlighted when concrete examples are used later in the article.
2.1 Problem Context: Business Value and People Impact
Civilizations have long dreamt of creating a caring and progressive society. A reasonable, common-sense definition of such a society may be one where people respect each other, grow according to their potential, have basic comforts and live peacefully with their environment. There can, of course, be alternative principles grounded in politics, religion or ethics, but for the purpose of this article, we will adopt a common-sense definition that is neutral to extreme interpretations. In its modern form, the United Nations has defined Millennium Development Goals towards such a vision. One may argue that the trend of sustainability, or Smart City as it is popularly called, is a step towards meeting the vision, whereby Information and Communication Technology (ICT) is used to efficiently manage precious resources like water, land, air and food.
As far as technologies go, Artificial Intelligence (AI) is the technological flavor of the day. When one looks at AI for improving society, apart from the issue of how a specific technical problem will be solved, one should worry about the wider context:
- Should all insights that can be generated, be communicated?
- Should all activities that can be automated, be automated?
- What testing should be mandatory to deploy an automated product to work with humans?
- Who should be credited when something works or breaks?
Answering them requires going back to the ethical question of what we want:
- Do no harm to humans; treat them without bias from a technology perspective
- Do no harm to other life forms or the environment
Unfortunately, easy as this seems, stakeholders (businesses, researchers, governments) often do not consider the full picture and pass responsibility to each other, potentially causing harm to the public in the long run.
2.2 Data
Since an AI system works with data, data is often the logical starting point for understanding how such a system may work in practice. The most common forms of data are enterprise data in large databases, social data collected by collaboration companies, sensor data from Internet-of-Things (IoT) devices and open data. While access to data and its pricing remain open problems, open data is often an easier option with which to start building an intelligent system.
Open data refers to data made freely available for reuse. Although open data has been the norm in the academic community, it has received a major impetus in the past decade from government open data, where governments are increasingly taking initiatives to make their data available online in open formats and under licenses that allow use, reuse and redistribution. Over five hundred open data catalogs exist for city, state and federal governments that have made their data publicly available. Some prominent repositories are London (UK), Chicago (USA), Washington DC (USA), Dublin (Ireland), USA (data.gov), India (data.gov.in) and Kenya (opendata.go.ke). Some of these agencies have also opened up their data as a platform, encouraging the development of applications for public good. However, these datasets have to be prepared for analysis with richer data integration, semantics and contextual models.
2.3 AI Methods
Any good standard textbook of AI reviews the main ingredients needed to build an intelligent agent. Prominent among them are techniques to gather data (speech, image and vision processing); learn patterns from data (Machine Learning); formally represent knowledge extracted from data or explicitly given by people (Knowledge Representation); reason with knowledge and take decisions while balancing goals, optimizing resources and managing uncertainties (Reasoning); and act on those decisions (Execution Control and Robotics).
After an agent is built, it needs a means to interact with the outside world, including people (Human Computer Interaction) and other agents (Multi-Agent Systems). The agent can be embodied within a virtual entity like a chatbot, website or mobile application, or a physical entity like a robot or device.
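As a rough illustration of how these ingredients fit together, the sketch below wires a perceive, reason and act cycle into a toy agent. The thermostat setting, the class and method names and the 0.5-degree tolerance are all assumptions made for this example, and the learning ingredient is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Toy agent; every name and threshold here is hypothetical."""
    target_temp: float                              # goal supplied by a person
    knowledge: dict = field(default_factory=dict)   # very simple knowledge store

    def perceive(self, sensor_reading: float) -> None:
        # Gather data: record the latest observation as a known fact.
        self.knowledge["room_temp"] = sensor_reading

    def reason(self) -> str:
        # Reasoning: compare what is known against the goal and pick an action.
        gap = self.target_temp - self.knowledge["room_temp"]
        if gap > 0.5:
            return "heat"
        if gap < -0.5:
            return "cool"
        return "idle"

    def act(self, decision: str) -> str:
        # Execution: a real system would drive a device; here we only report.
        return f"actuator -> {decision}"

    def step(self, sensor_reading: float) -> str:
        # One full cycle: perceive, then reason, then act.
        self.perceive(sensor_reading)
        return self.act(self.reason())


agent = ThermostatAgent(target_temp=21.0)
print(agent.step(18.0))  # room is colder than the goal
print(agent.step(21.2))  # room is within tolerance of the goal
```

The same cycle could sit behind a chatbot or a physical device; only the `perceive` and `act` ends would change.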
To show the historic focus in AI, Figure 1 shows a “word cloud” created using the accepted papers at AAAI 1997. (A “word cloud” shows the most frequent words, with the size of the text representing their relative frequencies. The figure shows the top 150 words.) One notices that University is prominent and that many sub-areas of AI are represented, like Knowledge, Learning, Constraints, Planning and Search.
[Fig 1.] Image is a “word cloud” created with Wordle using data about papers published at AAAI 1997.
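The counting step behind a word cloud can be sketched in a few lines. This is not how Wordle itself is implemented; the stop-word list, the function name `top_words` and the sample titles are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; real tools ship a much longer one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def top_words(titles, n=150):
    """Return the n most frequent non-stop-words with their counts."""
    counts = Counter()
    for title in titles:
        # Lower-case the title and keep alphabetic tokens only.
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up titles for illustration only; not actual AAAI papers.
sample_titles = [
    "Learning to Plan with Constraints",
    "Constraints and Search in Planning",
    "Knowledge Representation for Learning Agents",
]

for word, count in top_words(sample_titles, n=3):
    print(word, count)
```

A drawing tool would then scale each word's font size by its count. Note that this sketch does no stemming, so "Plan" and "Planning" are counted as different words.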
However, in common usage by the business community and popular press today, the term AI is used for its narrow sub-field of machine learning (ML), and even there, for its sub-field of Deep Learning (DL). Figure 2 shows a “word cloud” created using the accepted papers at AAAI 2017. One may notice that Learning (and related terms) is quite prominent and so are Chinese authors.
[Fig 2.] Image is a “word cloud” created with Wordle using data about papers published at AAAI 2017 – two decades later than 1997.
There is no doubt that DL has revolutionized all aspects of Computer Science in the last 5 years and has become the next big buzz in the Information Technology (IT) industry. However, it is still a small (but growing) piece of the full puzzle of building a usable intelligent system, let alone one that solves problems people actually face without creating new problems like bias or safety concerns. The narrow usage of the term ignores the effort needed to bring together an intelligent system that not only learns insights but can also represent, reason, execute, monitor its environment and adapt itself to achieve short- and long-term goals. It also ignores the new work needed to understand the ethical issues, privacy concerns and security risks that systems built using black-box models learnt on large data pose when working with humans.
3.0 Levels of Cognitive Abilities
So, what is meant by a cognitive system when one reads the phrase today? For the purpose of this article, we will consider it a computer system that works with humans (social) and comes with different levels of thinking ability (intelligence) of its own (independently, also known as autonomously). These abilities can be arranged as levels in order of sophistication, akin to grading a personal secretary based on his or her proficiencies.
Level-1: Understand Data to Help
Systems at Level-1 take input data and give out potentially insightful patterns. The input can be textual, audio or video data. The output depends on the data and its format, but is broadly a set of topics (labels) and trends. A preferred characteristic of such systems is to interact and engage with end users in natural language rather than a specialized language like SQL.
Technologies used by Level-1 systems include machine learning (including deep learning), data mining and rules for analysis, along with natural language processing (NLP) and visualization for interaction with people. They help a person understand data but fall short of suggesting what to do with the insights. Such systems are also susceptible to spurious inputs, as the Tay system illustrated.
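A minimal sketch of a Level-1 system, assuming a hand-written rule base rather than a learned model: it labels short text reports with topics and counts how often each topic occurs. The keyword table, function names and sample reports are all hypothetical.

```python
from collections import Counter

# Hypothetical rule base: topic label -> keywords that signal it.
TOPIC_KEYWORDS = {
    "water": {"water", "leak", "pipe", "flood"},
    "traffic": {"traffic", "road", "congestion", "bus"},
    "air": {"air", "smog", "pollution"},
}


def label_topics(message):
    """Return the sorted topic labels whose keywords appear in a message."""
    words = set(message.lower().split())
    return sorted(t for t, kws in TOPIC_KEYWORDS.items() if words & kws)


def topic_trend(messages):
    """Count how often each topic label occurs across all messages."""
    trend = Counter()
    for message in messages:
        trend.update(label_topics(message))
    return trend


# Made-up citizen reports, for illustration only.
reports = [
    "water leak near the bus stop",
    "heavy traffic on the main road",
    "smog and traffic downtown again",
]
print(topic_trend(reports))
```

The output is a set of labels and counts, nothing more: the sketch reports that traffic is mentioned most often but says nothing about what to do about it. It also has no defense against spurious input, since any message containing a keyword is counted.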
Level-2: Suggest to Act
Systems at Level-2 process input data, use a model of what a person wants to achieve (goals) and give out recommendations to act. The input data can be textual, audio or video data. The goals can be those of achieving something or of maintaining a condition. The output is a prescription, which can be simple or complex depending on the uncertainties the system models.
Technologies used by such systems go beyond those of Level-1 and include rules, computational logics, planning and automated reasoning. They help a person find their way through alternative decision choices consistent with their goals, but rely on the person knowing what they want. Suggesting tags on a photo about who is in it is an example of a Level-2 system. Such systems are susceptible to spurious data and also to people with shifting goals. For example, a Level-2 system may help a person decide how to schedule their day. But if the person misses his anniversary because he never expressed it as a priority, explicitly or implicitly, and is later confronted by his complaining wife, who is at fault? Blame it on the secretary?
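The scheduling example can be made concrete with a small sketch. It uses a simple greedy rule (take goals in descending priority while time remains) as a stand-in for the planning and reasoning techniques named above; the `Goal` class, the function name and the sample goals are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    hours: int      # time the activity needs
    priority: int   # higher means more important to the person


def recommend_schedule(goals, hours_available):
    """Greedy recommendation: take goals by descending priority while time remains."""
    plan, remaining = [], hours_available
    for goal in sorted(goals, key=lambda g: g.priority, reverse=True):
        if goal.hours <= remaining:
            plan.append(goal.name)
            remaining -= goal.hours
    return plan


# Hypothetical stated goals; the anniversary dinner was never entered.
stated_goals = [
    Goal("finish report", hours=4, priority=3),
    Goal("team meeting", hours=2, priority=2),
    Goal("gym", hours=2, priority=1),
    Goal("read papers", hours=3, priority=1),
]

plan = recommend_schedule(stated_goals, hours_available=8)
print(plan)
```

The recommendation is consistent with the stated goals and fills the day, yet the anniversary cannot appear in it because it was never expressed as a goal. The system can only be as good as the goals it is given.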