Lesson 15: AI Coding Assistance#
This lesson was generated with assistance from Jupyter AI using ChatGPT 3.5 Turbo.
Overview#
Using an AI code assistant, we will explore the use of generative AI models, mainly language models (LMs) such as ChatGPT, in Jupyter to perform coding tasks such as generating, completing, debugging, explaining, formatting, and optimizing Python code. By the end of this lesson, you will be able to:
explain the pros and cons of generative AI in enhancing your Python learning and productivity
use a generative AI model such as ChatGPT in Jupyter through an AI code assistant such as Jupyter AI
chat with and perform coding tasks in Jupyter with a generative AI model of your choice
utilize multiple generative AI models in an open crowdsourced environment with Chatbot Arena (optional)
%load_ext jupyter_ai
from ai_assistant import api_key  # Import the api_key module
api_key.set_API_key('OPENAI')  # Set the API key for the selected provider: 'OPENAI' or 'ANTHROPIC'
1. Introduction#
1.1 Generative AI to enhance your Python learning and productivity#
Generative AI refers to artificial intelligence capabilities that can generate new content and insights automatically. In this lesson, we will explore how generative AI within the Jupyter notebook environment can augment human capabilities, and enhance learning and productivity.

Language models (LMs) differ in their capabilities with respect to reasoning, coding, mathematics, and language comprehension. This figure shows proficiency in mathematics (GSM8K score) and the models’ generalisation abilities (exam score on the Hungarian National High School Exam). (Image credit: DeepSeek-LLM)
1.2 AI code assistant#
AI code assistants such as Jupyter AI leverage generative AI models from different providers, such as OpenAI's ChatGPT, within an IDE such as JupyterLab. An AI code assistant provides:
prompt engineering with respect to your programming language,
context-aware code suggestions, completion, debugging, formatting, explanation, and generation,
a chat user interface to ask questions and get help on related topics such as installation troubleshooting,
and many more.
These features aim to improve learning and productivity.
1.3 Gallery of AI code assistants#
Selecting an AI code assistant depends on factors such as language support, integration with preferred IDEs, customization options, accuracy in suggestions, real-time feedback, resource efficiency, and cost considerations as shown in the table.
| AI Code Assistant | Providers: Models | Compatible IDE | Pros | Cons | Usage fee |
|---|---|---|---|---|---|
| Jupyter AI | AI21, Anthropic, AWS, Cohere, Hugging Face, NVIDIA, OpenAI and more (via third-party plugins) | | Seamless Jupyter integration | Not user-friendly | Free, but generative AI models may not be free |
| | OpenAI: ChatGPT | | Impressive and user-friendly code assistant features | No longer maintained; users are advised to switch to Jupyter AI | Free, but generative AI models may not be free |
| | Amazon: In-house AI models | JupyterLab, PyCharm, and VSCode | Seamless Jupyter integration | Installation build errors may occur | Individual Tier is free for individual use |
| | OpenAI: Codex | | Powerful and mature code assistant | No Jupyter integration | Free for students and educators |
Here we will use Jupyter AI, but you can also experiment with other ones.
2. Jupyter AI Extension#
This section is adapted from the jupyter-ai documentation. You can also check the YouTube video AWS re:Invent 2023 - Jupyter AI, where the developers introduce this tool.
2.1 Overview#
Jupyter AI connects generative AI models with Jupyter notebooks, which can enhance your learning and productivity. Specifically, Jupyter AI:
turns your notebook into a generative AI playground
provides a chat user interface in JupyterLab for chatting with your generative AI model
supports a wide range of generative model providers including AI21, Anthropic, AWS, Cohere, Hugging Face, NVIDIA, and OpenAI
allows users to run generative AI models on their own machines through GPT4All rather than relying on cloud-based services.
In this section we will cover the first two points, with a focus on OpenAI's ChatGPT 3.5 Turbo.
2.2 Installing Jupyter AI#
2.2.1 Installation Steps#
Steps to install Jupyter AI:
Open an Anaconda Prompt (Anaconda3) or Anaconda Prompt (Miniconda3)
It is a good idea to update pip before installing a new package:
pip install --upgrade pip
Then you can install Jupyter AI with pip:
pip install jupyter-ai
Alternatively, you can install this extension with conda. Details on installing and using Jupyter AI can be found in the official Jupyter AI documentation. The above steps should work for Windows and Linux users. Mac users need additional steps, as shown in the jupyter-ai GitHub repository.
2.2.2 Installation Troubleshooting#
The Chat UI on the left menu may not work, and you may get this error message:
There seems to be a problem with the Chat backend, please look at the JupyterLab server logs or contact your administrator to correct this problem.
You might need to install a few extra packages such as langchain_nvidia_ai_endpoints and cohere, and restart your computer. Check this stackoverflow post for details.
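Before reinstalling anything, it can help to check which of these packages Python can actually find. The sketch below is one possible check using only the standard library; the helper name missing_packages is hypothetical, and the list assumes that the package names mentioned above are also their import names.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in import_names if find_spec(name) is None]


# Assumption: these import names match the packages named in the note above.
needed = ["jupyter_ai", "langchain_nvidia_ai_endpoints", "cohere"]
print("Missing packages:", missing_packages(needed))
```

Any name reported as missing is a candidate for pip install in the same environment that runs JupyterLab.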
2.3 Loading Jupyter AI magic commands#
To use Jupyter AI, you need to enable the %ai and %%ai magic commands in your notebook.
What is a magic command? Ask your LLM.
# # Load extension
# %load_ext jupyter_ai
2.4 Select provider#
Jupyter AI supports a wide range of model providers and models. To use Jupyter AI with a particular provider, you must install the Python plugins for that provider and set the provider’s API key (or other credentials) in your notebook or in the Jupyter AI Chat user interface (UI) in the left menu.
You can view the available providers and models as follows:
# List available providers and models
%ai list
The environment variable names of the API keys are used when setting up a model. If multiple variable names are listed for a provider, all must be specified. Check the official Jupyter AI documentation for information about how to use each of the providers listed above.
The label Set is:
✅ if you provided the API key for that provider
❌ if you did not provide the API key for that provider
N/A if the provider does not require an API key
The label Models shows the provider_name:model_name.
Aliases are nicknames for models. For example, typing chatgpt is the same as typing openai-chat:gpt-3.5-turbo.
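To make the meaning of the Set label concrete, here is a minimal sketch of the same rule written as a plain Python function. It is not Jupyter AI's own code; the function name key_status and its parameters are hypothetical, and it only inspects environment variables.

```python
import os


def key_status(env_var_names, environ=None):
    """Mimic the Set label: N/A, check mark, or cross for one provider."""
    environ = os.environ if environ is None else environ
    if not env_var_names:
        return "N/A"  # the provider does not require an API key
    # Every listed variable must be present and non-empty
    all_set = all(environ.get(name) for name in env_var_names)
    return "✅" if all_set else "❌"


print("OpenAI key:", key_status(["OPENAI_API_KEY"]))
```

Note that a provider listing several variable names only counts as set when all of them are defined.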
2.5 Install provider plugins#
You need to select your language model, and you can also select an embedding model.
Language models are typically pre-trained.
An embedding model is used when learning from and asking about local data.
You can select a language model and an embedding model through the Jupyter AI Chat interface in the left menu or manually. However, Jupyter AI requires third-party plugins, so before we use a model, we need to install the Python plugins for that model or provider.
The provider that we will select is OpenAI, the developer of ChatGPT 3.5 Turbo and many other LLMs. For other providers, you need to check the required plugins.
# Uncomment and run the command below to install the OpenAI plugin
#!pip install openai
2.6 Get API-key for the selected provider#
To use Jupyter AI in a given notebook, you need to perform environment variable authentication in that notebook using your unique API key. An API key is a special code that grants users access to the provider's services. This code is like a password, so you should not share it with anyone.
For this lesson the provider is OpenAI, but you can select any other provider of your choice. You need to create an OpenAI account to get an OpenAI API key. It will be free for a period of time, and then you can add a credit card number to get charged for your usage of paid services. This GitHub file provides information about the OpenAI API key and how to get one.
2.7 Set API key for the selected provider#
One drawback of Jupyter AI is that environment variable authentication through the Chat UI may be insufficient, so you may need to perform environment variable authentication in the notebook using your unique API key.
2.7.1 Option 1: Reading your API key from the notebook#
Pros: Convenient
Cons: Security risk; make sure to remove your API key before sharing the notebook
To set the API key from your notebook, you need to import the operating system module os and use os.environ['Environment_variable'] = 'API_key' to set the API key for your selected provider.
The example below is for the OpenAI provider:
%load_ext jupyter_ai
import os
os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'
Try it out:
## Load Jupyter AI extension
# %load_ext jupyter_ai
# # Set API key
# import os
# os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'
2.7.2 Option 2: Reading API key from an external file#
Pros: Enhanced security; keeping the API key in a separate file enables sharing the notebook without exposing the key
Cons: Additional setup required; you must include the code snippet below at the start of each notebook and specify the path to your API key file
To set an API key from a file, copy and paste your API key into a text file, for example ‘OPENAI_API_KEY.txt’.
Then place this code at the beginning of your notebook. Make sure to change file_path_name as needed.
# # Load Jupyter AI extension
# %load_ext jupyter_ai
# # Read API key from a file
# def read_API_Key(file_name):
#     # Open the file in read mode
#     with open(file_name, 'r') as file:
#         # Read the content of the file
#         API_key = file.read().strip()  # strip() removes any leading or trailing whitespace
#     return API_key
# # Set API key
# import os
# file_path_name = 'ai_assistant/OPENAI_API_KEY.txt'
# os.environ['OPENAI_API_KEY'] = read_API_Key(file_path_name)
2.7.3 Option 3: Create an authentication module#
Pros: Versatile solution; enables authentication for various providers from any location
Cons: Technical proficiency needed; involves coding skills
At the beginning of my notebook, I can place these three lines to load the AI magic commands and set up environment variable authentication:
%load_ext jupyter_ai
from ai_assistant import api_key # Import the api_key module
api_key.set_API_key('OPENAI') # Set the API key for the selected provider: 'OPENAI' or 'ANTHROPIC'
When the provider name is passed to the api_key.set_API_key() function, authentication is performed based on the API key saved in a file. The module is saved somewhere on my machine, but I can import it from anywhere because I structured it as a package with __init__.py, rather than as a single Python module file of the kind we covered in a previous lesson.
While we have not covered package structures, you can ask your generative AI model to demonstrate how to create a package like ai_assistant and develop a module such as api_key with the set_API_key function.
# %load_ext jupyter_ai
# from ai_assistant import api_key #Import api_key module
# api_key.set_API_key('OPENAI') #Set API key for selected Provider: 'OPENAI' or 'ANTHROPIC'
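For readers who want a starting point, the following is a minimal sketch of what a module like api_key might contain. It is not the author's actual module: the layout (an ai_assistant folder holding an empty __init__.py and this file), the key_dir parameter, the ENV_VAR_NAMES mapping, and the convention of one key file per provider named after its environment variable are all assumptions made for illustration.

```python
# Hypothetical file: ai_assistant/api_key.py
import os
from pathlib import Path

# Assumed mapping from provider name to the environment variable it expects
ENV_VAR_NAMES = {
    "OPENAI": "OPENAI_API_KEY",
    "ANTHROPIC": "ANTHROPIC_API_KEY",
}


def set_API_key(provider, key_dir=None):
    """Read the provider's key from '<ENV_VAR>.txt' and export it."""
    name = provider.upper()
    if name not in ENV_VAR_NAMES:
        raise ValueError(f"Unknown provider: {provider!r}")
    env_var = ENV_VAR_NAMES[name]
    # Default (assumption): key files sit next to this module
    folder = Path(key_dir) if key_dir is not None else Path(__file__).parent
    key_file = folder / f"{env_var}.txt"
    # strip() removes any leading or trailing whitespace around the key
    os.environ[env_var] = key_file.read_text().strip()
    return env_var
```

For the import to work from any notebook, the folder that contains ai_assistant has to be on Python's import path, for example by installing the package or adding that folder to sys.path.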
2.8 Getting help (optional)#
Let us look at the Jupyter AI help to learn what Jupyter AI offers and how to use this AI code assistant.
#Getting help
%ai --help
The above help tells us that the magic command
%%ai [OPTIONS] COMMAND
# Or
%%ai COMMAND [OPTIONS]
invokes a language model identified by MODEL_ID, with the prompt being contained in all lines after the first.
From OPTIONS, the most important option is -f, which allows you to format your model output as code, html, image, json, markdown, math, md, or text. If this is unclear, the examples that follow should make it clear.
You can get help on a specific command. For example, let us get help on the error command:
%ai error --help
2.9 Using %%ai and %ai magic commands (not very much recommended)#
2.9.1 Using magic command with default format#
Now we want to use ChatGPT 3.5 Turbo to generate a function that finds the minimum value in a list.
Here is our prompt.
Write a function that identifies the minimum value in a list without relying on the built-in min() function.
Ensure the function is capable of handling various data types and edge cases.
Run at least two test cases to validate the accuracy of the minimum value identification process.
Here is the general format:
%%ai provider:model [OPTIONS]
prompt
In this case, the provider and model are specified as %%ai openai-chat:gpt-3.5-turbo, or simply with the provider-model alias, %%ai chatgpt.
Here is how to do it:
%%ai chatgpt
Function that identifies the minimum value in a list without relying on the built-in min() function
Function is capable of handling various data types and edge cases
Two test cases
In our prompt, we omitted specifying Python as Jupyter AI will automatically manage the task, providing necessary details like Python version and other relevant information to achieve the desired output.
More importantly, in the above example, the default output is in markdown format. Can we change this with the [OPTIONS] argument?
2.9.2 Formatting the output#
By default, the output of an %%ai command will be formatted as markdown. You can override this using the -f or --format argument to your magic command. Valid formats include: code, markdown, math, html, text, json, and image (for Hugging Face Hub’s text-to-image models).
Repeat the above example using -f code:
%%ai -f code chatgpt
Function that identifies the minimum value in a list without relying on the built-in min() function
Function is capable of handling various data types and edge cases
Two test cases
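The model's output varies from run to run. As a point of comparison when you review it, here is a hand-written sketch (not model output) of the kind of function the prompt asks for. The name find_minimum and the choice to raise ValueError on empty input are assumptions for illustration.

```python
def find_minimum(values):
    """Return the smallest item of an iterable without using min()."""
    iterator = iter(values)
    try:
        smallest = next(iterator)  # start with the first item
    except StopIteration:
        raise ValueError("find_minimum() requires a non-empty iterable") from None
    for item in iterator:
        # Items that cannot be compared (e.g. int vs str) raise TypeError here
        if item < smallest:
            smallest = item
    return smallest


# Two test cases
print(find_minimum([3, 1.5, 2]))         # 1.5
print(find_minimum(["pear", "apple"]))   # apple
```

Checking the model's answer against edge cases like an empty list or mixed types is a quick way to verify what it produced.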
Here is another example, modified from the Jupyter AI documentation:
%%ai chatgpt -f math
Generate 3d solute transport equation
%%ai chatgpt -f md
Generate 3d solute transport equation in compact form with explanation
%%ai chatgpt -f markdown
Markdown code for 3d solute transport equation in LaTeX surrounded by `$$`. Do not include explanation.
2.9.3 The error command#
The error command explains the most recent error. Usage:
%ai error MODEL_ID
Run the code below, and use the error command to understand the error:
a= 1
b= "2"
c= 1+b
%ai error chatgpt
To address and rectify this error, you can use the list variable Err[] or In[], as illustrated below.
%%ai chatgpt -f code
Fix {Err[12]}
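The failing cell adds an integer to a string, which Python rejects with a TypeError. The model's suggested fix may differ, but a minimal hand-written sketch of one possible correction, assuming the intent was numeric addition, looks like this:

```python
a = 1
b = "2"

try:
    c = a + b  # int + str is not defined, so this raises TypeError
except TypeError as err:
    print("TypeError:", err)

# One possible fix: convert the string to an integer before adding
c = a + int(b)
print(c)  # 3
```

If the intent had been text concatenation instead, the fix would be str(a) + b, which is why the model's answer should be checked against what you actually meant.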
Note
Using the list variable Err[] is not advisable for code with intricate formatting, as it may not yield the desired outcome.
2.10 Code Interaction with list variables (not recommended)#
Pros: Enables working solely within the notebook without the need for a Chat UI interface
Cons: Limited functionality for codes with complex formatting
Jupyter AI can assist you in interacting with code or markdown cells using Python expressions in curly braces {}. You can use the special list variables In[n], Out[n], or Err[n]:
{In[n]}: Retrieves the input
{Out[n]}: Retrieves the output
{Err[n]}: Retrieves the error
of a specific cell, where n is the sequential number that Jupyter notebook assigns to each cell based on execution order in the notebook. This is the number on the left-hand side of the cell. For instance, {In[1]} would retrieve the input of cell [1].
{In[1]}
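The substitution idea can be illustrated with plain Python string formatting, which also understands indexed fields such as {In[2]}. This is only an analogy under stated assumptions: the In list below is a stand-in for the notebook's input history, and Jupyter AI's own substitution mechanism may work differently.

```python
# Stand-in for the notebook's input history (index 0 is unused, as in IPython)
In = [
    "",
    "a = 1",
    "def smallest(xs):\n    return sorted(xs)[0]",
]

# The field {In[2]} is replaced by the text of the second executed cell
prompt_template = "Improve the code below:\n{In[2]}"
prompt = prompt_template.format(In=In)
print(prompt)
```

The model never sees the placeholder itself, only the expanded prompt, so the referenced cell must have been executed before the prompt is sent.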
Now you can use these list variables to interact with your Jupyter notebook.
Taking the minimum function code above as an example, ask Jupyter AI to:
improve the code and run it for three test cases: a list, a dictionary, and a tuple
Call %%ai chatgpt -f code to try it out:
%%ai chatgpt -f code
Improve the code below and run it for three test cases: a list, a dictionary, and a tuple:
{In[15]}
2.11 Code Interaction with Chat UI (recommended)#
Pros: Provides the capability to execute various tasks, as demonstrated below
Cons: Requires more typing to customize the output as desired
With the Chat UI, you can ask your LM to perform many coding tasks:
Complete: LM provides code completion as suggested by the developer
Debug: LM debugs an error message in your code
Explain: LM provides explanations, documentation, and insights about the code or part of the code
Translate: LM translates code between different programming languages or paradigms, such as converting flowchart symbols to code, or code to a figure
Review: LM reviews and suggests refactoring improvements to existing code such as optimizing performance, improving readability, or adhering to best practices
Format: LM automatically adds comments and docstrings, and formats code cells and markdown cells
Troubleshoot: LM troubleshoots errors when installing a new package
Spellcheck: LM corrects your language errors
Improve: LM can improve your content
Chat: LM answers your questions and provides information
And much more: LM can perform many other tasks in your Jupyter notebook
The idea is simple. You have a chat user interface that allows you to:
ask questions
include a selection from a code or markdown cell
replace a selection in a code or markdown cell
You can try out this code completion example:
# Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida
#columns:
#(1) 'TMIN' that is the minimum temperature,
#(2) 'TMAX' that is the maximum temperature,
#(3) 'PRCP' that is precipitation in inches,
#(4) 'AWDS' that is the average wind speed in miles per hour,
# (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.
# The index is the date
# Display the DataFrame in JupyterLab
# Pandas operation to find rows of the days that have the maximum precipitation
# in the study period for each of the two stations for each year
# Display the DataFrame in JupyterLab
Try this using the Chat UI: copy and paste the incomplete code above into the cell below, then ask your LM to complete the code and return code only.
import pandas as pd
import numpy as np
# Generate data
dates = pd.date_range(start='2020-01-01', end='2023-12-31')
stations = ['Field Airport', 'SWF Airport']
data = {
    'TMIN': np.random.randint(50, 90, len(dates)),
    'TMAX': np.random.randint(70, 100, len(dates)),
    'PRCP': np.random.uniform(0, 2, len(dates)),
    'AWDS': np.random.randint(5, 15, len(dates)),
    'STATION': np.random.choice(stations, len(dates))
}
df = pd.DataFrame(data, index=dates)
df
# Find rows with max precipitation for each station for each year
max_indices = df.groupby([df.index.year, 'STATION'])['PRCP'].idxmax()
result = df.loc[max_indices]
result
3. Class exercise#
Complete this exercise by utilizing:
Jupyter AI,
another AI chat assistant,
or any Language Model (LM) of your preference directly without an AI chat assistant.
The exercise aims to teach the utilization of Language Models (LMs) for coding assistance and emphasizes the significance of prompt engineering.
3.1 Problem statement#
A student asked: For a Pandas DataFrame, how do we display the rows of the days with the maximum precipitation for each weather station in each year in our study period and area?
3.2 Prompt engineering and code generation#
Note
Prompt engineering involves crafting and refining the language or structure of prompts to improve the performance of a language model in generating accurate and relevant responses. Mastering prompt engineering enables effective utilization of LMs across tasks from problem-solving to creative writing. Learn more with Real Python's tutorial Prompt Engineering: A Practical Example.
Here is one prompt that we can start with and refine later as needed:
Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.
The index is the date.
Display the DataFrame to screen.
Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.
Let us see if our selected LM can do this. You can use an AI code assistant such as Jupyter AI or directly use any LM of your choice.
%%ai chatgpt -f code
Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.
The index is the date.
Display the DataFrame to screen.
Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.
This is a promising beginning.
We need to verify the output rather than rely on everything that our LM provides. The code snippet above exhibits a few issues:
Our current LM setup does not have access to datasets.
It employs a for loop instead of utilizing Pandas operations.
It utilizes print instead of the display function, which presents data in a visually appealing tabular format in Jupyter notebooks.
To tackle the first issue, we can instruct our LM to access a specific data file online or on our machine, instead of generating random data.
Let us now address the second and third problems.
3.3 Code improvement#
Ask your LM to use Pandas operations instead of a for loop, and to display results in JupyterLab, that is, to use the display function instead of print.
Here is what the LM suggested.
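The model's response will vary from run to run. For comparison, the following is a hand-written sketch (not model output) of a loop-free approach. It assumes random placeholder data with one row per date and station, keeps only the PRCP column for brevity, and uses the hypothetical names wettest and YEAR.

```python
import numpy as np
import pandas as pd

# Placeholder data: one row per (date, station); values are random, not real
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", "2023-12-31", freq="D")
stations = ["Field Airport", "SWF Airport"]
index = pd.MultiIndex.from_product([dates, stations], names=["DATE", "STATION"])
df = pd.DataFrame({"PRCP": rng.uniform(0, 2, len(index))}, index=index).reset_index()

# Group by year and station, then locate the row label of each group's maximum
year = df["DATE"].dt.year.rename("YEAR")
max_rows = df.groupby([year, "STATION"])["PRCP"].idxmax()

# Select those rows with a single .loc call instead of a for loop
wettest = df.loc[max_rows].set_index("DATE")
print(wettest)
```

In a notebook, replacing the final print(wettest) with display(wettest) renders the result as a formatted table.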
3.4 Handling challenging problems#
The above example demonstrates that our LM, which is ChatGPT 3.5 Turbo, is able to solve this relatively straightforward problem. However, a more challenging problem might require additional strategies to handle effectively.
When faced with a more complex problem, several approaches can be beneficial. I asked my LM to complete this section for me. The LM suggests the first four points. I added the heading of point 5 and asked my LM to complete it for me:
Break Down the Problem: If the problem is complex, breaking it down into smaller, more manageable sub-problems can help. Providing step-by-step instructions or dividing the problem into sequential tasks can guide the model in tackling each part systematically.
Provide Context and Examples: Offering context, examples, or related information can assist the model in understanding the problem better. Clear descriptions, relevant data samples, or background information can enhance the model’s comprehension and problem-solving capabilities.
Ask Specific Questions: Instead of presenting a broad or vague problem statement, asking specific questions or providing precise requirements can help direct the model’s attention to the key aspects of the problem.
Iterative Approach: In cases where the problem is intricate, an iterative approach may be beneficial. Engaging in a dialogue with the model, providing feedback on its responses, and refining the problem statement based on initial outputs can lead to a more targeted and accurate solution.
Consider Advanced Language Models: By leveraging a robust language model, you can potentially achieve more accurate results, handle more complex patterns, and tackle a wider range of tasks with greater efficiency.
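Several of the points above (breaking the problem down, giving context, stating precise requirements) can be applied by assembling the prompt from labelled parts. The helper below is a minimal sketch of that idea; the function name build_prompt and its parameters are hypothetical, and the wording of the labels is only one possible choice.

```python
def build_prompt(task, context="", requirements=(), steps=()):
    """Assemble a structured prompt from a task, context, and explicit lists."""
    parts = [f"Task: {task}"]
    if context:
        parts.append(f"Context: {context}")
    if requirements:
        parts.append("Requirements:")
        parts.extend(f"- {item}" for item in requirements)
    if steps:
        parts.append("Work through these steps in order:")
        parts.extend(f"{n}. {step}" for n, step in enumerate(steps, start=1))
    return "\n".join(parts)


prompt = build_prompt(
    task="Find the wettest day per station per year in a Pandas DataFrame.",
    context="Daily weather data with columns PRCP and STATION, indexed by date.",
    requirements=["Use Pandas operations instead of a for loop", "Return code only"],
    steps=["Group by year and station", "Locate the maximum PRCP in each group"],
)
print(prompt)
```

Keeping the parts separate also supports the iterative approach: after reviewing a response, you can tighten one requirement or add a step without rewriting the whole prompt.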
Now let us try a different model, such as a version of ChatGPT 4, in Chatbot Arena.
4. Other useful tools#
Chatbot Arena is an open-source research project that provides an open crowdsourced platform to evaluate LMs.
Let us try to use Chatbot Arena.
5. Conclusions#
Here are the key points to consider:
Pros of Language Models: These models offer the potential to improve various aspects of the coding process, spanning from initial development to code optimization.
Cons of Language Models: Drawbacks include dependency on AI for coding tasks, which can hinder personal skill development; no guarantee of error-free code without human review; and the inability to provide creative solutions that require human insight.
Effective Prompts: Step-by-step, detailed, clear, precise, and contextually relevant prompts can proficiently guide language models towards precise and targeted responses.
AI Code Assistants: Tools like Jupyter AI can boost your Python learning and productivity by aiding in coding tasks directly within your integrated development environment (IDE), such as JupyterLab.
To sum up, AI assistance is not here to replace the work that you do, but to help you. Try to balance the benefits of AI assistance with the need for personal skill development and critical thinking.