Lesson 15: AI Coding Assistance

This lesson was generated with assistance from Jupyter AI using ChatGPT 3.5 Turbo.

Overview#

Using an AI code assistant, we will explore how generative AI models, mainly language models (LMs) such as ChatGPT, can be used in Jupyter to perform coding tasks such as generating, completing, debugging, explaining, formatting, and optimizing Python code. By the end of this lesson, you will be able to:

explain the pros and cons of generative AI in enhancing your Python learning and productivity
use a generative AI model such as ChatGPT in Jupyter through an AI code assistant such as Jupyter AI
chat with and perform coding tasks in Jupyter with a generative AI model of your choice
utilize multiple generative AI models in an open crowdsourced environment with Chatbot Arena (optional)

%load_ext jupyter_ai              
from ai_assistant import api_key  #Import api_key module  
api_key.set_API_key('OPENAI')     #Set API key for selected provider: 'OPENAI' or 'ANTHROPIC'

1. Introduction#

1.1 Generative AI to enhance your Python learning and productivity#

Generative AI refers to artificial intelligence capabilities that can generate new content and insights automatically. In this lesson, we will explore how generative AI within the Jupyter notebook environment can augment human capabilities, and enhance learning and productivity.

Figure: Language models (LMs) differ in their reasoning, coding, mathematics, and language comprehension capabilities. The figure compares proficiency in mathematics (GSM8K score) with generalization ability (exam score on the Hungarian National High School Exam) for several models. (Image credit: DeepSeek-LLM)

1.2 AI code assistant#

AI code assistants such as Jupyter AI bring generative AI models from different providers, such as OpenAI's ChatGPT, into an IDE such as JupyterLab. An AI code assistant provides:

prompt engineering tailored to your programming language
context-aware code suggestions, completion, debugging, formatting, explanation, and generation
a chat user interface for asking questions and getting help on related topics such as installation troubleshooting
and many more

These features aim to improve learning and productivity.

1.3 Gallery of AI code assistants#

Selecting an AI code assistant depends on factors such as language support, integration with preferred IDEs, customization options, accuracy in suggestions, real-time feedback, resource efficiency, and cost considerations as shown in the table.

AI Code Assistant	Providers: Models	Compatible IDE	Pros	Cons	Usage fee
Jupyter AI	AI21, Anthropic, AWS, Cohere, Hugging Face, NVIDIA, OpenAI and more (via third-party plugins)	JupyterLab	1. Seamless Jupyter integration 2. Multi-feature chat user interface 3. Supports text embedding	1. Not user-friendly 2. Chat user interface may not function 3. API key authentication needed in each notebook 4. Requires plugins for each provider	Free but generative AI models may not be free
ChatGPT Jupyter AI Assistant	OpenAI: ChatGPT	JupyterLab	Impressive and user-friendly code assistant features	1. No longer maintained; users are advised to switch to Jupyter AI 2. Code assistant features may not function	Free but generative AI models may not be free
Amazon CodeWhisperer	Amazon: In-house AI models	JupyterLab, PyCharm, and VSCode	1. Seamless Jupyter integration 2. Real-time feedback	Installation build errors may occur	Individual Tier is free for individual use
GitHub Copilot	OpenAI: Codex	PyCharm and VSCode	1. Powerful and mature code assistant 2. Context-aware suggestions 3. Automated code refactoring	No Jupyter integration	Free for students and educators

Here we will use Jupyter AI, but you can also experiment with other ones.

2. Jupyter AI Extension#

This section is modified from the jupyter-ai documentation. You can also check the YouTube video AWS re:Invent 2023 - Jupyter AI, where the developers introduce this tool.

2.1 Overview#

Jupyter AI connects generative AI models with Jupyter notebooks, which can enhance your learning and productivity. Specifically, Jupyter AI:

turns your notebook into a generative AI playground
provides a chat user interface in JupyterLab for chatting with your generative AI model
supports a wide range of generative model providers including AI21, Anthropic, AWS, Cohere, Hugging Face, NVIDIA, and OpenAI
allows users to run generative AI models on their own machines through GPT4All rather than relying on cloud-based services

In this section we will cover the first two points, focusing on OpenAI's ChatGPT 3.5 Turbo.

2.2 Installing Jupyter AI#

2.2.1 Installation Steps#

Steps to install Jupyter AI:

Open an Anaconda Prompt (Anaconda3) or Anaconda Prompt (Miniconda3)
It is a good idea to update pip before installing a new package:

pip install --upgrade pip

Then you can install Python Jupyter AI with pip:

pip install jupyter-ai

Alternatively, you can install this extension with conda. Details on installing and using Jupyter AI can be found in the official Jupyter AI documentation. The above steps should work for Windows and Linux users. Mac users need a few additional steps, as shown in the jupyter-ai GitHub repository.

2.2.2 Installation Troubleshooting#

The Chat UI in the left menu may not work, in which case you will get this error message:

There seems to be a problem with the Chat backend, please look at the JupyterLab server logs or contact your administrator to correct this problem.

You might need to install a few extra packages, such as langchain_nvidia_ai_endpoints and cohere, and then restart your computer. Check this Stack Overflow post for details.
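Before reinstalling anything, it can help to check which of these packages Python can already find. The snippet below is a minimal sketch using the standard library; the helper name missing_modules is made up for this lesson, and the module names are simply the two mentioned above.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names in module_names that Python cannot locate for import."""
    return [name for name in module_names if find_spec(name) is None]


# Module names taken from the troubleshooting note above
candidates = ["langchain_nvidia_ai_endpoints", "cohere"]
print("Not importable:", missing_modules(candidates))
```

Any name printed in the list is a candidate for pip install. An empty list suggests the Chat UI problem lies elsewhere.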

2.3 Loading Jupyter AI magic commands#

To use Jupyter AI, you need to enable the %ai and %%ai magic commands in your notebook.

What is a magic command? Ask your LLM.

# # Load extension
# %load_ext jupyter_ai

2.4 Select provider#

Jupyter AI supports a wide range of model providers and models. To use Jupyter AI with a particular provider, you must install the Python plugins for that provider and set the provider’s API key (or other credentials) in your notebook or in the Jupyter AI Chat user interface (UI) in the left menu.

You can view the available providers and models as follows:

# List available LM
%ai list

The environment variable names of the API keys are used when setting up a model. If multiple variable names are listed for a provider, all must be specified. Check the official Jupyter AI documentation for information about how to use each of the providers listed above.

The label Set is

✅ if you provided the API-key for that provider
❌ if you did not provide the API-key for that provider
N/A if the provider does not require API-key

The label Models shows the provider_name:model_name.

Aliases are nicknames for models. For example, typing chatgpt is the same as typing openai-chat:gpt-3.5-turbo.
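The Set label described above reflects whether the relevant environment variable is defined. As a rough illustration, the sketch below performs a similar check by hand. The helper key_status is hypothetical, OPENAI_API_KEY is the variable used later in this lesson, and ANTHROPIC_API_KEY is an assumed name that you should confirm in the %ai list output.

```python
import os


def key_status(variable_names, environ=os.environ):
    """Map each environment variable name to True if it holds a non-empty value."""
    return {name: bool(environ.get(name, "").strip()) for name in variable_names}


# OPENAI_API_KEY appears later in this lesson; ANTHROPIC_API_KEY is an assumed name
for name, is_set in key_status(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]).items():
    print(name, "set" if is_set else "not set")
```

The function only reports whether a value exists and never prints the key itself, which keeps the notebook safe to share.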

2.5 Install provider plugins#

You need to select your language model, and you can also select an embedding model.

A language model is typically pre-trained.
An embedding model is used when learning and asking about local data.

You can select language model and embedding model through the Jupyter AI Chat interface at the left menu or manually. However, Jupyter AI requires third-party plugins, so before we use a model, we need to install the Python plugins for that model or provider.

The provider that we will select is OpenAI, the developer of ChatGPT 3.5 Turbo and many other LLMs. For other providers, you need to check the required plugins.

# Uncomment and run the command below to install the OpenAI plugin
#!pip install openai

2.6 Get API-key for the selected provider#

To use Jupyter AI in a given notebook, you need to perform environment variable authentication in that notebook using your unique API key. An API key is a special code that grants users access to the provider's services. This code is like a password, so you should not share it with anyone.

For this lesson the provider is OpenAI, but you can select any other provider of your choice. You need to create an OpenAI account to get an OpenAI API key. It will be free for a period of time, and then you can add a credit card number to be charged for your usage of paid services. This GitHub file provides information about the OpenAI API key and how to get one.

2.7 Set API-key for the selected provider#

One drawback of Jupyter AI is that environment variable authentication through the Chat UI may be insufficient, in which case you need to perform environment variable authentication in the notebook using your unique API key.

2.7.1 Option 1: Reading your API key from the notebook#

Pros: Convenient
Cons: Security risk; be sure to remove your API key before sharing the notebook

To set the API key from your notebook, import the operating system module os and use os.environ['Environment_variable'] = 'API_key' to set the API key for your selected provider.

The example below is for OpenAI provider:

%load_ext jupyter_ai
import os
os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'

Try it out:

## Load Jupyter AI extension
# %load_ext jupyter_ai 
# # Set API key
# import os
# os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'

2.7.2 Option 2: Reading API key from an external file#

Pros: Enhanced security; Separating the API key into a separate file enables sharing the notebook without exposing the key
Cons: Additional setup required; You must include the code snippet below at the start of each notebook and specify the path to your API key file

To set an API key from a file, copy and paste your API key into a text file, for example ‘OPENAI_API_KEY.txt’. Then place this code at the beginning of your notebook. Make sure to change file_path_name as needed.

# # Load Jupyter AI extension
# %load_ext jupyter_ai

# #Read API key from a file
# def read_API_Key(file_name):

#     # Open the file in read mode 
#     with open(file_name, 'r') as file:
#         # Read the content of the file
#         API_key = file.read().strip()  # strip() removes any leading or trailing whitespace
#     return API_key

# # Set API key
# import os
# file_path_name = 'ai_assistant/OPENAI_API_KEY.txt'
# os.environ['OPENAI_API_KEY'] = read_API_Key(file_path_name)

2.7.3 Option 3: Create an authentication module#

Pros: Versatile solution; Enables authentication for various providers from any location
Cons: Technical proficiency needed; Involves coding skills

At the beginning of my notebook, I can place these three lines to load AI magic commands and set up environment variable authentication:

%load_ext jupyter_ai
from ai_assistant import api_key  # Import the api_key module
api_key.set_API_key('OPENAI')     # Set the API key for the selected provider: 'OPENAI' or 'ANTHROPIC'

By passing the provider name to the api_key.set_API_key() function, authentication is performed based on the API key saved in a file. The module is saved somewhere on my machine, but I can import it from anywhere because I structured it as a package with __init__.py, rather than just a single Python module file as we covered in a previous lesson.

While we have not covered package structures, you can ask your generative AI model to demonstrate how to create a package like ai_assistant and develop a module such as api_key with the set_API_key function.

# %load_ext jupyter_ai              
# from ai_assistant import api_key  #Import api_key module  
# api_key.set_API_key('OPENAI')     #Set API key for selected provider: 'OPENAI' or 'ANTHROPIC'
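The author's api_key module is not shown in this lesson, so the following is only a sketch of what such a function might look like. It assumes each key lives in a text file named after its environment variable (for example OPENAI_API_KEY.txt) inside a folder passed as key_dir. Both the naming scheme and the key_dir parameter are assumptions made for illustration.

```python
import os
from pathlib import Path


def set_API_key(provider, key_dir="ai_assistant"):
    """Read <PROVIDER>_API_KEY.txt from key_dir and store it in os.environ.

    The file naming scheme and the key_dir default are assumptions of this sketch.
    """
    variable = f"{provider.upper()}_API_KEY"
    key_file = Path(key_dir) / f"{variable}.txt"
    if not key_file.is_file():
        raise FileNotFoundError(f"No key file found at {key_file}")
    key = key_file.read_text().strip()  # strip() drops the trailing newline
    if not key:
        raise ValueError(f"{key_file} is empty")
    os.environ[variable] = key
    return variable  # return the variable name, never the key itself
```

Saved as api_key.py inside a folder named ai_assistant that also contains an empty __init__.py, a function like this could be imported with the three lines shown above, provided the folder is on Python's import path.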

2.8 Getting help (optional)#

Let us look at the Jupyter AI help to learn what Jupyter AI offers and how to use this AI code assistant.

#Getting help
%ai --help

The above help tells us that the magic command

%%ai [OPTIONS] MODEL_ID
# Or
%%ai MODEL_ID [OPTIONS]

invokes a language model identified by MODEL_ID, with the prompt being contained in all lines after the first.

Among the OPTIONS, the most important option is -f, which lets you format your model output as code, html, image, json, markdown, math, md, or text. If this is unclear, an example will make it clear, so let us see a few examples.

You can get help on a specific command. For example, let us get help on the error command:

%ai error --help

2.9 Using `%%ai` and `%ai` magic commands (not particularly recommended)#

2.9.1 Using magic command with default format#

Now we want to use ChatGPT 3.5 Turbo to generate a function that finds the minimum value in a list.

Here is our prompt.

Write a function that identifies the minimum value in a list without relying on the built-in min() function.
Ensure the function is capable of handling various data types and edge cases.
Run at least two test cases to validate the accuracy of the minimum value identification process.

Here is the general format:

%%ai provider:model [OPTIONS]
prompt

In this case, the provider and model would be %%ai openai-chat:gpt-3.5-turbo, or you can simply use the provider-model alias, %%ai chatgpt.

Here is how to do it:

%%ai chatgpt
Function that identifies the minimum value in a list without relying on the built-in min() function
Function is capable of handling various data types and edge cases
Two test cases

In our prompt, we did not specify Python because Jupyter AI handles this automatically, supplying details such as the Python version and other relevant information needed to achieve the desired output.

More importantly, in the above example, the default output is in markdown format. We can change this with the [OPTIONS] argument.
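LM output varies from run to run, so it helps to have a reference to compare against. The function below is a hand-written sketch of the kind of answer the prompt asks for, not the output of ChatGPT. The name find_minimum and the choice to raise errors for empty or mixed-type input are assumptions of this sketch.

```python
def find_minimum(values):
    """Return the smallest item of a non-empty iterable without calling min()."""
    iterator = iter(values)
    try:
        smallest = next(iterator)
    except StopIteration:
        raise ValueError("find_minimum() requires at least one value") from None
    for item in iterator:
        # Non-comparable mixed types (for example int and str) raise TypeError here
        if item < smallest:
            smallest = item
    return smallest


print(find_minimum([3, -1.5, 7]))       # -1.5
print(find_minimum(["pear", "apple"]))  # apple
```

Comparing the LM's function against a few known cases like these is a quick way to spot mistakes in edge-case handling.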

2.9.2 Formatting the output#

By default the output of an %%ai command will be formatted as markdown. You can override this using the -f or --format argument to your magic command. Valid formats include: code, markdown, math, html, text, json, and image (for Hugging Face Hub’s text-to-image models).

Repeat the above example using -f code:

%%ai -f code chatgpt 
Function that identifies the minimum value in a list without relying on the built-in min() function
Function is capable of handling various data types and edge cases
Two test cases

Here is another example modified from Jupyter AI documentation

%%ai chatgpt -f math
Generate 3d solute transport equation

%%ai chatgpt -f md
Generate 3d solute transport equation in compact form with explanation

%%ai chatgpt -f markdown
Markdown code for 3d solute transport equation in LaTeX surrounded by `$$`. Do not include explanation.

\[ \frac{{\partial c}}{{\partial t}} = D \nabla^2 c - \nabla \cdot (\mathbf{v}c) \]

2.9.3 The error command#

The error command explains the most recent error. For usage:

%ai error MODEL_ID

Run the code below, and use the error command to understand the error:

a= 1
b= "2"
c= 1+b

%ai error chatgpt

To address and rectify this error, you can utilize the list variable Err[] or In[] as illustrated below.

%%ai chatgpt -f code
Fix {Err[12]}

Note

Using the list variable Err[] is not advisable for code with intricate formatting, as it may not yield the desired outcome.
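For reference, the error in the small example above is a TypeError, because Python cannot add an integer and a string. The sketch below reproduces the failure safely and shows one possible manual fix. Converting b with int() is an assumption about what the code was meant to do, and an LM may propose a different repair.

```python
a = 1
b = "2"

try:
    c = a + b  # same kind of failure as above: int + str
except TypeError as error:
    print("TypeError:", error)

c = a + int(b)  # convert the string to an integer before adding
print(c)        # 3
```

Knowing the expected fix in advance makes it easier to judge whether the explanation returned by the error command is accurate.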

2.10 Code Interaction with list variables (not recommended)#

Pros: Enables working solely within the notebook without the need for a Chat UI interface
Cons: Limited functionality for code with complex formatting

Jupyter AI can help you interact with code or markdown cells using Python expressions in curly braces {}. You can use the special list variables In[n], Out[n], or Err[n]:

{In[n]}: Retrieves the input
{Out[n]}: Retrieves the output
{Err[n]}: Retrieves the error

of a specific cell, where n is the sequential number that Jupyter Notebook assigns to each cell based on execution order in the notebook. This is the number on the left-hand side of the cell. For instance, {In[1]} would retrieve the input of cell [1].

{In[1]}

Now you can use these list variables to interact with your Jupyter notebook.

Taking the minimum function code above as an example, ask Jupyter AI to:

improve the code and run it for three test cases: a list, a dictionary, and a tuple

Use %%ai chatgpt -f code to try it out:

%%ai chatgpt -f code
Improve the code below and run it for three test cases (a list, a dictionary, and a tuple):
{In[15]}

2.11 Code Interaction with Chat UI (recommended)#

Pros: Provides the capability to execute various tasks, as demonstrated below
Cons: Requires more typing to customize the output as desired

With Chat UI you can ask your LM to perform many coding tasks.

Complete: LM provides code completion as suggested by developer
Debug: LM debugs an error message in your code
Explain: LM provides explanations, documentation, and insights about the code or part of the code
Translate: LM translates code between different programming languages or representations, such as converting a flowchart to code or code to a figure
Review: LM reviews and suggests refactoring improvements to existing code such as optimizing performance, improving readability, or adhering to best practices
Format: LM automatically adds comments, docstrings, formatting to code cell, and formatting to markdown cell
Troubleshoot: LM troubleshoots errors when installing a new package
Spellcheck: LM corrects your language errors
Improve: LM can improve your content
Chat: LM answers your questions and provides information
And much more: LM can perform many other tasks in your Jupyter notebook

The idea is simple. You have a chat user-interface that allows you to

ask questions
include a selection from a code or markdown cell
replace a selection in a code or markdown cell

You can try out this code completion example:

# Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida
# Columns:
# (1) 'TMIN' that is the minimum temperature,
# (2) 'TMAX' that is the maximum temperature,
# (3) 'PRCP' that is precipitation in inches,
# (4) 'AWDS' that is the average wind speed in miles per hour,
# (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.

# The index is the date

# Display the DataFrame in JupyterLab

# Pandas operation to find the rows of the days that have the maximum precipitation
# in the study period for each of the two stations for each year

# Display the DataFrame in JupyterLab

Try this with the Chat UI: copy and paste the incomplete code above into the cell below, then ask your LM to complete the code and return code only.

import pandas as pd
import numpy as np

# Generate data
dates = pd.date_range(start='2020-01-01', end='2023-12-31')
stations = ['Field Airport', 'SWF Airport']
data = {
    'TMIN': np.random.randint(50, 90, len(dates)),
    'TMAX': np.random.randint(70, 100, len(dates)),
    'PRCP': np.random.uniform(0, 2, len(dates)),
    'AWDS': np.random.randint(5, 15, len(dates)),
    'STATION': np.random.choice(stations, len(dates))
}

df = pd.DataFrame(data, index=dates)
df

# Find rows with max precipitation for each station for each year
max_indices = df.groupby([df.index.year, 'STATION'])['PRCP'].idxmax()
result = df.loc[max_indices]
result

3. Class exercise#

Complete this exercise by utilizing:

Jupyter AI,
another AI chat assistant,
or any Language Model (LM) of your preference directly without an AI chat assistant.

The exercise aims to teach the utilization of Language Models (LMs) for coding assistance and emphasizes the significance of prompt engineering.

3.1 Problem statement#

A student asked: For a Pandas DataFrame, how can I display the rows of the days with the maximum precipitation for each weather station in each year in our study period and area?

3.2 Prompt engineering and code generation#

Note

Prompt engineering involves crafting and refining the language or structure of prompts to improve the performance of a language model in generating accurate and relevant responses. Mastering prompt engineering enables effective utilization of LMs across tasks from problem-solving to creative writing. Learn more with Real Python's tutorial Prompt Engineering: A Practical Example.

Here is one prompt that we can start with and refine later as needed:

Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.

The index is the date.

Display the DataFrame to screen.

Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.

Let us see if our selected LM can do this. You can use an AI code assistant such as Jupyter AI or directly use any LM of your choice.

%%ai chatgpt -f code
Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.

The index is the date.

Display the DataFrame to screen.

Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.

This is a promising beginning.

We need to verify the output and not just rely on everything that our LM is providing. The code snippet above exhibits a few issues:

Our current LM setup does not have access to datasets.
It employs a for loop instead of utilizing Pandas operations.
It utilizes print instead of the display function, which presents data in a visually appealing tabular format for Jupyter notebooks.
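One practical way to verify LM-generated code is to compare its result with an independent, slower computation on a small dataset. The sketch below does this for the maximum-precipitation task. The data are synthetic, the fixed seed and the two-year date range are arbitrary choices, and the loop-based check is written by hand rather than taken from any LM output.

```python
import numpy as np
import pandas as pd

# Small synthetic dataset (made-up values; fixed seed for repeatability)
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", "2021-12-31", freq="D")
df = pd.DataFrame(
    {
        "PRCP": rng.uniform(0, 2, len(dates)),
        "STATION": rng.choice(["Field Airport", "SWF Airport"], len(dates)),
    },
    index=dates,
)

# Vectorized approach: index label of the wettest day per (year, station)
idx = df.groupby([df.index.year, "STATION"])["PRCP"].idxmax()
fast = df.loc[idx]

# Independent brute-force check with explicit loops
expected = []
for year in sorted(set(df.index.year)):
    for station in sorted(df["STATION"].unique()):
        subset = df[(df.index.year == year) & (df["STATION"] == station)]
        expected.append(subset["PRCP"].max())

# Both approaches should agree on the number of rows and on the maxima
assert len(fast) == len(expected)
assert np.allclose(sorted(fast["PRCP"]), sorted(expected))
```

If either assertion fails, the vectorized code and the brute-force check disagree, and the generated code deserves a closer look before it is trusted.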

To tackle the first issue, we can instruct our LM to access a specific data file online or on our machine, instead of generating random data.
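As a rough illustration of working from a data file rather than random numbers, the sketch below loads a tiny CSV with pandas. The column name DATE and every value in the sample are made up, and io.StringIO stands in for a real file so that the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for a real file; the DATE column name and all values are made up
csv_text = """DATE,STATION,TMIN,TMAX,PRCP,AWDS
2020-01-01,Field Airport,61,78,0.00,7.2
2020-01-01,SWF Airport,60,79,0.12,6.9
2020-01-02,Field Airport,63,80,0.45,8.1
"""

# For a file on disk or online, pass its path or URL instead of the StringIO object
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"], index_col="DATE")
print(df.dtypes)
```

Note that when a file holds one row per station per day, the date index is no longer unique, so selecting rows by date label returns every station's row for that date.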

Let us now address the second and third problems.

3.3 Code improvement#

Ask your LM to use Pandas operations instead of a for loop, and to display results in JupyterLab using the display function instead of print.

Here is what the LM suggested.

3.4 Handling challenging problems#

The above example demonstrates that our LM, which is ChatGPT 3.5 Turbo, is able to solve this relatively straightforward problem. However, a more challenging problem might require additional strategies to handle effectively.

When faced with a more complex problem, several approaches can be beneficial. I asked my LM to complete this section for me. The LM suggested the first four points. I added the heading of point 5 and asked my LM to complete it for me:

Break Down the Problem: If the problem is complex, breaking it down into smaller, more manageable sub-problems can help. Providing step-by-step instructions or dividing the problem into sequential tasks can guide the model in tackling each part systematically.
Provide Context and Examples: Offering context, examples, or related information can assist the model in understanding the problem better. Clear descriptions, relevant data samples, or background information can enhance the model’s comprehension and problem-solving capabilities.
Ask Specific Questions: Instead of presenting a broad or vague problem statement, asking specific questions or providing precise requirements can help direct the model’s attention to the key aspects of the problem.
Iterative Approach: In cases where the problem is intricate, an iterative approach may be beneficial. Engaging in a dialogue with the model, providing feedback on its responses, and refining the problem statement based on initial outputs can lead to a more targeted and accurate solution.
Consider Advanced Language Models: By leveraging a robust language model, you can potentially achieve more accurate results, handle more complex patterns, and tackle a wider range of tasks with greater efficiency.

Now let us try a different model, such as a version of GPT-4, in Chatbot Arena.

4. Other useful tools#

Chatbot Arena is an open-source research project that provides an open, crowdsourced platform for evaluating LMs.

Let us try to use Chatbot Arena.

5. Conclusions#

Here are the key points to consider:

Pros of Language Models: These models offer the potential to improve various aspects of the coding process, spanning from initial development to code optimization.
Cons of Language Models: Drawbacks include a dependency on AI for coding tasks, which can hinder personal skill development; no guarantee of error-free code without human review; and the inability to provide creative solutions that require human insight.
Effective Prompts: Step-by-step, detailed, clear, precise, and contextually relevant prompts can proficiently guide language models towards precise and targeted responses.
AI Code Assistants: Tools like Jupyter AI can boost your Python learning and productivity by aiding in coding tasks directly within your integrated development environment (IDE), such as JupyterLab.

To sum up, AI assistance is not here to replace the work that you do, but to help you. Try to balance the benefits of AI assistance with the need for personal skill development and critical thinking.
