It’s time again for a new entry in the Data Connections Blog! In my last entry, we looked at scary predictions about the future of Artificial Intelligence (AI) and why these might be motivated by any number of unstated factors. While I certainly don't mean to minimize anyone's concerns about AI, it makes sense to ensure those concerns are grounded in reality. If you have not read Blog entry #7, you may want to start there before reading this entry.
Let’s back up a bit and talk about a few AI basics. When I told my editor-in-chief (an unpaid and thankless role graciously filled by my much-appreciated wife and law firm partner) that I was going to do a Blog entry on AI basics, she emphasized the need to make the entry relatable. After spending hours staring at a blank screen… I decided that relatable might be a high bar for this topic. That said, I think it will be interesting - but, of course, you can be the judge!
Since it is easy to be confused about what AI actually is, I decided to go straight to the source - AI itself - for some advice. In this case, I consulted the well-known ChatGPT.
My ChatGPT 3.5 prompt was: Hi ChatGPT, can you explain artificial intelligence?
Here is ChatGPT’s response:
Absolutely, Artificial Intelligence (AI) refers to the development of computer systems capable of performing tasks that typically require human intelligence. These tasks include learning, reasoning, problem-solving, perception, understanding natural language, and more.
AI can be divided into two broad categories: Narrow AI and General AI.
- Narrow AI (Weak AI): This type of AI is designed to perform a narrow task, such as facial recognition, language translation, or playing a game like chess. Narrow AI systems are trained to accomplish specific tasks within a limited domain and do not possess the broad range of abilities associated with human intelligence.
- General AI (Strong AI): General AI, also known as artificial general intelligence (AGI), refers to a hypothetical AI system that exhibits intelligence and cognitive abilities similar to those of humans. AGI would be capable of understanding and learning any intellectual task that a human being can. However, achieving AGI remains a significant challenge and is currently only a subject of theoretical research.
Okay, let’s take a moment to consider this part of the answer before moving on. Perhaps unsurprisingly, this answer closely mirrors many AI descriptions available on the Internet. Maybe ChatGPT’s intent is to lull humans into a false sense of security but, by the above definition, all of today’s AI is clearly Narrow AI. ChatGPT characterizes that scary, apocalyptic Strong AI (i.e., AI smarter than humans) as only hypothetical.[1] This reinforces our discussion in the last Blog entry about how far away our current AI capabilities are from generally matching (or surpassing) the intellectual capabilities of humans. Notably, AI currently is task-oriented - i.e., really good, or even astoundingly good, at doing a defined thing or set of things. But not, as things currently stand, able on its own to expand its capabilities beyond the tasks for which it has been trained. So, that’s comforting….
Continuing on, here is the second part of ChatGPT’s answer:
AI systems often rely on various techniques and approaches, including:
- Machine Learning: A subset of AI that enables systems to learn from data without being explicitly programmed. It includes techniques such as supervised learning, unsupervised learning, and reinforcement learning.
- Deep Learning: A subfield of machine learning inspired by the structure and function of the human brain's neural networks. Deep learning models, known as artificial neural networks, can automatically learn to perform tasks like image and speech recognition.
- Natural Language Processing (NLP): The branch of AI concerned with the interaction between computers and humans through natural language. NLP enables computers to understand, interpret, and generate human language.
- Computer Vision: The field of AI that enables computers to interpret and understand the visual world through images or videos. Computer vision algorithms can recognize objects, people, places, and actions in visual data.
Let’s do another interrupt here to consider these AI approaches. The different ways AI learns guide its capabilities. Machine Learning (ML) systems, for example, use either labeled data (“supervised learning”), unlabeled data (“unsupervised learning”), or some hybrid of the two to learn discrete skills that enable them to take over tasks previously performed by people. For example, an ML system can identify emails that are spam and direct them to the junk folder, a helpful skill that works well most of the time (What? I had an appointment today? Please check your junk folder, sir...would you like to reschedule now?).
Labeled data is much easier to learn from since the learning cues are already there (e.g., “this is a calendar entry; it is for tomorrow” or “these are pictures of dogs; these dogs are German Shepherds”). However, there is much more unlabeled data in the world than labeled data, as labeling can be very labor-intensive - someone or some application has to add all those labels. More typically, with unlabeled data, the ML system looks for patterns in the data that can be used to derive inferences/connections. This is where mistakes can be made, and they are unfortunately common. For example, what could a rusty orange object in the road be? Traffic cone? Old road sign? Manhole cover? Squirrel? To counteract this, ML typically requires very large data sets so patterns can be discerned and anomalies discounted as statistically insignificant. We’ll explore the consequences of this approach more in an upcoming Blog post.
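For the more technically curious among you, here is a minimal sketch of both learning styles in Python, using the popular scikit-learn library. Everything in it is invented for illustration - the four “emails,” their spam labels, and the choice of models - so treat it as a sketch of the concept, not a production spam filter:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.cluster import KMeans

# Supervised learning: the labels are the "learning cues."
emails = [
    "win a free prize now",         # spam
    "claim your free reward",       # spam
    "meeting agenda for tomorrow",  # not spam
    "please review the contract",   # not spam
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)  # turn words into counts
classifier = MultinomialNB().fit(features, labels)

test = vectorizer.transform(["free prize inside"])
print("spam" if classifier.predict(test)[0] == 1 else "not spam")

# Unsupervised learning: same emails, no labels. KMeans groups them
# purely by word patterns and never sees "spam" or "not spam."
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(clusters)  # cluster ids such as [1 1 0 0] - groupings, not labels
```

The supervised model needed the labels to know what “spam” means; the clustering step had to discover the groupings on its own - exactly the trade-off described above.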
Next up in ChatGPT’s response is Deep Learning (DL). This is ML taken a step further. Like ML, DL uses trained algorithms - in this case, neural networks[2] modeled on the human brain - to make decisions. However, DL uses neural networks with more layers (three or more) than the shallower models typical of standard ML. So, DL can perform more complex tasks, like powering the chatbot you may communicate with when making an online order.[3] I am sure you have seen a website offering “Customer Service” which ends up basically being a set of FAQs set up to respond to standard customer questions. Try asking Amazon’s Customer Service (as I did) “Which city is the capital of the state of Ohio?”… you will get a blank stare (well, a blank screen - but it is essentially the same thing!).[4]
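Again for the technically curious, here is a minimal sketch of what “three or more layers” looks like in code, using the popular PyTorch library. The layer sizes and the random input are arbitrary, illustrative choices:

```python
import torch
import torch.nn as nn

# A small "deep" network: several stacked layers, each feeding the next.
model = nn.Sequential(
    nn.Linear(100, 64), nn.ReLU(),  # layer 1
    nn.Linear(64, 32), nn.ReLU(),   # layer 2
    nn.Linear(32, 16), nn.ReLU(),   # layer 3
    nn.Linear(16, 2),               # output layer: say, two categories
)

x = torch.randn(1, 100)  # one made-up input with 100 features
print(model(x))          # untrained output; training would tune the weights
```

Each extra layer lets the network combine simpler patterns into more complex ones, which is what gives DL its extra power over shallower models.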
We have previously discussed Natural Language Processing (NLP). That is the process of training an algorithm on the many ways humans use language to say things. If you have ever set up a new computer or mobile phone for voice recognition, you will know that, at least in the early days, so-called “chatbots” typically did not have NLP capability. So, you had to provide the bot with all the (MANY) ways you could think of to ask a question that matched the responses you wanted it to give. NLP instead allows us, in the astonishing ways we have seen with ChatGPT, to communicate directly with computers using everyday words and phrases rather than by using computer programming or by simply responding to pre-programmed questions (see my prompt to ChatGPT above).
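To make the contrast concrete, here is a toy Python sketch of that pre-NLP approach. Every phrasing has to be enumerated by hand, and anything else draws the dreaded blank stare; the questions and answers are, of course, invented:

```python
# Every acceptable phrasing must be listed explicitly.
CANNED = {
    "where is my order": "Your order is on the way.",
    "where's my package": "Your order is on the way.",
    "track my order": "Your order is on the way.",
    # ...and every other phrasing you manage to think of
}

def old_style_bot(question: str) -> str:
    key = question.lower().strip("?! ")
    return CANNED.get(key, "Sorry, I don't understand.")  # the blank stare

print(old_style_bot("Where is my order?"))     # matches a canned phrasing
print(old_style_bot("Has my stuff shipped?"))  # blank stare
```

An NLP model, by contrast, learns from examples how people actually phrase things, so it can handle questions its programmers never explicitly listed.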
Computer Vision (CV) is another approach that involves training AI with image data to enable image recognition. For a long time, computer search could not recognize or interpret images at all. This field of AI has been very beneficial but has also raised issues that will be difficult to cover fully in this Blog entry. There are at least two major ethical issues with Computer Vision (note, these are not entirely unique to CV). First, through selection of data, lack of data, or both, there is often inadequate diversity in the images provided to AI in this training. This has led AI to generate inaccurate and even biased results. Historical inequities in medical imaging, for example, tend to greatly under-represent women and minority groups in data such as x-ray images. This can lead to misdiagnosis or mistreatment of women and minorities when AI is used to assist.[5] Second, the image data that AI is trained on can be manipulated via mislabeled images (whether intentionally or otherwise) to misidentify items, e.g., mislabeling an image of a squirrel as a manhole cover.[6] The results can be dangerous or catastrophic when applied to critical systems such as those that guide self-driving cars.
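As a small illustration of one basic safeguard against both issues, a sketch like the following (plain Python, with invented labels) audits a training set’s label distribution before training - a first step toward catching both under-representation and a suspicious batch of labels:

```python
from collections import Counter

# Hypothetical labels for a road-object training set.
training_labels = (
    ["manhole cover"] * 950 + ["traffic cone"] * 45 + ["squirrel"] * 5
)

counts = Counter(training_labels)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label:>14}: {n:4d} ({n / total:.1%})")

# A class this rare (squirrel at 0.5%) is easy for a model to get wrong -
# and an easy place for mislabeled images to hide.
```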
That was all ChatGPT had to say in explaining AI. And, while a bit incomplete, it is not a bad overview. However, ChatGPT then rather casually ended its response by noting:
…[E]thical considerations surrounding AI development, such as bias mitigation, transparency, and accountability, are becoming increasingly important topics of discussion.
I think most of us could wholeheartedly agree! What is missing from this response is some idea of how to not only start but also make significant progress on AI ethics. Of course, that is a bigger issue with much of today’s AI. We will discuss this, too, in an upcoming Blog entry as we work to ensure the AI in your driverless car knows the important difference between a squirrel and a manhole cover.
I hope you’ll be back next time as we continue to explore data connections!
______________________________________________
[1] There are those who add a third category to the discussion of types of AI, captioned “Artificial Superintelligence”. In this state, AI would surpass human intelligence in every respect. This stage of AI is even more speculative than hypothetical AGI.
[2] Neural networks are computing systems made up of layers of interconnected nodes (“artificial neurons”), loosely modeled on the human brain, that learn from data to perform specific tasks.
[3] A chatbot, per Wikipedia: [is] a software application or web interface that is designed to mimic human conversation through text or voice interactions. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use deep learning and natural language processing, but simpler chatbots have existed for decades.
[4] See this humorous account of ordering a Domino’s pizza from a bot by text for a similar example: https://www.linkedin.com/pulse/i-tried-ordering-dominos-over-text-message-had-horrible-gigi-peccolo
[5] On image bias in cancer diagnosis, see: https://datascience.cancer.gov/news-events/blog/trusting-data-look-data-bias; see also, on bias in x-ray images used by AI: https://www.nature.com/articles/s41591-021-01595-0.
[6] https://www.technologyreview.com/2019/06/21/828/a-new-set-of-images-that-fool-ai-could-help-make-it-more-hacker-proof/. NOTE: this article actually refers to mislabeling a dragonfly as a manhole cover… but a squirrel is just as likely and provides a better image!
______________________________________________
If you have questions about your data and your legal compliance programs for data, Mortinger & Mortinger LLC can help! Contact me directly at: steve@mortingerlaw.com
Mortinger & Mortinger LLC - when experience is important and cost matters.