Lesson 15: AI Coding Assistance


This lesson was generated with assistance from Jupyter AI using ChatGPT 3.5 Turbo.


Overview#

Using an AI code assistant, we will explore how generative AI models, mainly language models (LMs) such as ChatGPT, can be used in Jupyter to perform coding tasks such as generating, completing, debugging, explaining, formatting, and optimizing Python code. By the end of this lesson, you will be able to:

  • explain the pros and cons of generative AI in enhancing your Python learning and productivity

  • use a generative AI model such as ChatGPT in Jupyter through an AI code assistant such as Jupyter AI

  • chat with and perform coding tasks in Jupyter with a generative AI model of your choice

  • utilize multiple generative AI models in an open crowdsourced environment with Chatbot Arena (optional)


# %load_ext jupyter_ai
# from ai_assistant import api_key  # Import the api_key module
# api_key.set_API_key('OPENAI')     # Set the API key for the selected provider: 'OPENAI' or 'ANTHROPIC'

1. Introduction#

1.1 Generative AI to enhance your Python learning and productivity#

Generative AI refers to artificial intelligence capabilities that can generate new content and insights automatically. In this lesson, we will explore how generative AI within the Jupyter notebook environment can augment human capabilities, and enhance learning and productivity.

Language models (LMs) have different capabilities with respect to reasoning, coding, mathematics, and language comprehension. This figure shows proficiency in mathematics (GSM8K score) and a model's generalisation ability (exam score) on the Hungarian National High School Exam (Image Credit: DeepSeek-LLM).

1.2 AI code assistant#

AI code assistants such as Jupyter AI leverage generative AI models from different providers, such as OpenAI's ChatGPT, within an IDE environment such as JupyterLab. An AI code assistant provides:

  • prompt engineering with respect to your programming language,

  • context-aware code suggestions, completion, debugging, formatting, explanation, and generation,

  • a chat user interface to ask questions and get help on related topics such as installation troubleshooting,

  • and many more.

All of this aims to improve learning and productivity.

(Figure: flowchart of an AI code assistant workflow)

2. Jupyter AI Extension#

This section is modified from the jupyter-ai documentation. You can also check the YouTube video AWS re:Invent 2023 - Jupyter AI, where the developers introduce this tool.

2.1 Overview#

Jupyter AI connects generative AI models with Jupyter notebooks, which can enhance your learning and productivity. Specifically, Jupyter AI:

  1. turns your notebook into a generative AI playground

  2. provides a chat user interface in JupyterLab for chatting with your generative AI model

  3. supports a wide range of generative model providers including AI21, Anthropic, AWS, Cohere, Hugging Face, NVIDIA, and OpenAI

  4. allows users to run generative AI models on their own machines through GPT4All rather than relying on cloud-based services.

In this section we will cover the first two points, with a focus on OpenAI's ChatGPT 3.5 Turbo.

2.2 Installing Jupyter AI#

2.2.1 Installation Steps#

Steps to install Jupyter AI:

  1. Open an Anaconda Prompt (Anaconda3) or Anaconda Prompt (Miniconda3)

  2. It is a good idea to update pip before installing a new package:

pip install --upgrade pip

  3. Then you can install Jupyter AI with pip:

pip install jupyter-ai

Alternatively, you can install this extension with conda. Details on installing and using Jupyter AI can be found in the official Jupyter AI documentation. The above steps should work for Windows and Linux users. For macOS users, a few additional steps are needed, as shown in the jupyter-ai GitHub repository.
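
If you go the conda route, a typical command, assuming the package is available to you on the conda-forge channel, looks like this:

```shell
# Install Jupyter AI from the conda-forge channel
conda install -c conda-forge jupyter-ai
```

Check the official documentation for the exact channel and package name recommended for your platform.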

2.2.2 Installation Troubleshooting#

The Chat UI in the left menu may not work, and you may get this error message:

There seems to be a problem with the Chat backend, please look at the JupyterLab server logs or contact your administrator to correct this problem.

You might need to install a few extra packages, such as langchain_nvidia_ai_endpoints and cohere, and restart your computer. Check this Stack Overflow post for details.

2.3 Loading Jupyter AI magic commands#

To use Jupyter AI, you need to enable the %ai and %%ai magic commands in your notebook.

What is a magic command? Ask your LLM.

# # Load extension
# %load_ext jupyter_ai

2.4 Select provider#

Jupyter AI supports a wide range of model providers and models. To use Jupyter AI with a particular provider, you must install the Python plugins for that provider and set the provider's API key (or other credentials) in your notebook or in the Jupyter AI Chat user interface (UI) in the left menu.

You can view the available providers and models as follows:

# List available LM
%ai list

The environment variable names of the API keys are used when setting up a model. If multiple variable names are listed for a provider, all must be specified. Check the official Jupyter AI documentation for information about how to use each of the listed providers.

The label Set is

  • ✅ if you provided the API-key for that provider

  • ❌ if you did not provide the API-key for that provider

  • N/A if the provider does not require an API key

The label Models shows the provider_name:model_name.

Aliases are nicknames for models. For example, typing chatgpt is the same as typing openai-chat:gpt-3.5-turbo.

2.5 Install provider plugins#

You need to select your language model, and you can also select an embedding model.

  • A language model is typically pre-trained.

  • An embedding model is used when learning and asking about local data.

You can select the language model and embedding model through the Jupyter AI Chat interface in the left menu or manually. However, Jupyter AI requires third-party plugins, so before we use a model, we need to install the Python plugins for that model or provider.

The provider that we will select is OpenAI, the developer of ChatGPT 3.5 Turbo and many other LLMs. For other providers, you need to check the required plugins.

# Uncomment and run the command below to install openAI plugins
#!pip install openai

2.6 Get API-key for the selected provider#

To be able to use Jupyter AI in a given notebook, you need to perform environment variable authentication in that notebook using your unique API key. An API key is a special code that grants users access to the provider's services. This code is like a password, so you should not share it with anyone.

For this lesson the provider is OpenAI, but you can select any other provider of your choice. You need to create an OpenAI account to get an OpenAI API key. It is free for a period of time, after which you can add a credit card number to be charged for your usage of paid services. This GitHub file provides information about the OpenAI API key and how to get one.

2.7 Set API-key for the selected provider#

One drawback of Jupyter AI is that environment variable authentication through the Chat UI may be insufficient, so you may need to perform environment variable authentication using your unique API key in the notebook itself.

2.7.1 Option 1: Reading your API key from the notebook#

  • Pros: Convenient

  • Cons: Security risk; Ensure to remove your API key before sharing the notebook

To set the API key from your notebook, you need to import the operating system module os and use os.environ['Environment_variable'] = 'API_key' to set the API key for your selected provider.

The example below is for the OpenAI provider:

%load_ext jupyter_ai
import os
os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'

Try it out:

## Load Jupyter AI extension
# %load_ext jupyter_ai 
# # Set API key
# import os
# os.environ['OPENAI_API_KEY'] = 'add your OpenAI API key here'

2.7.2 Option 2: Reading API key from an external file#

  • Pros: Enhanced security; Separating the API key into a separate file enables sharing the notebook without exposing the key

  • Cons: Additional setup required; You must include the code snippet below at the start of each notebook and specify the path to your API key file

To set an API key from a file, copy and paste your API key into a text file, say 'OPENAI_API_KEY.txt'. Then place this code at the beginning of your notebook. Make sure to change the file_path_name as needed.

# # Load Jupyter AI extension
# %load_ext jupyter_ai

# #Read API key from a file
# def read_API_Key(file_name):

#     # Open the file in read mode 
#     with open(file_name, 'r') as file:
#         # Read the content of the file
#         API_key = file.read().strip()  # strip() removes any leading or trailing whitespace
#     return API_key

# # Set API key
# import os
# file_path_name = 'ai_assistant/OPENAI_API_KEY.txt'
# os.environ['OPENAI_API_KEY'] = read_API_Key(file_path_name)

2.7.3 Option 3: Create an authentication module#

  • Pros: Versatile solution; Enables authentication for various providers from any location

  • Cons: Technical proficiency needed; Involves coding skills

At the beginning of my notebook, I can place these three lines to load AI magic commands and set up environment variable authentication:

%load_ext jupyter_ai
from ai_assistant import api_key  # Import the api_key module
api_key.set_API_key('OPENAI')     # Set the API key for the selected provider: 'OPENAI' or 'ANTHROPIC'

By passing the provider name to the api_key.set_API_key() function, authentication is performed based on the API key saved in a file. The module is saved somewhere on my machine, but I can import it from anywhere because I structured it as a package with __init__.py, rather than just a single Python module file as we covered in a previous lesson.

While we have not covered package structures, you can ask your generative AI model to demonstrate how to create a package like ai_assistant and develop a module such as api_key with the set_API_key function.
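
Since the ai_assistant package itself is not shown in this lesson, here is a minimal, hypothetical sketch of what its api_key module might contain; the _ENV_VARS mapping, the key_dir default, and the file naming are assumptions for illustration, not the actual implementation:

```python
import os

# Hypothetical sketch of ai_assistant/api_key.py (names and file layout
# are assumptions, not the lesson author's actual code).

# Map each supported provider name to its API-key environment variable.
_ENV_VARS = {
    'OPENAI': 'OPENAI_API_KEY',
    'ANTHROPIC': 'ANTHROPIC_API_KEY',
}

def set_API_key(provider, key_dir='ai_assistant'):
    """Read the provider's API key from <key_dir>/<ENV_VAR>.txt and
    export it as an environment variable for Jupyter AI to pick up."""
    env_var = _ENV_VARS[provider]
    key_file = os.path.join(key_dir, env_var + '.txt')
    with open(key_file) as f:
        os.environ[env_var] = f.read().strip()
```

With this layout, calling api_key.set_API_key('OPENAI') reads the key from a text file and exports it, so the notebook itself never contains the key.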

# %load_ext jupyter_ai              
# from ai_assistant import api_key  #Import api_key module  
# api_key.set_API_key('OPENAI')     #Set API key for selected Provider: 'OPENAI' and 'ANTHROPIC'

2.8 Getting help (optional)#

Let us look at the help of Jupyter AI to learn what Jupyter AI offers and how to use this AI code assistant.

#Getting help
%ai --help

The above help tells us that the magic command

%%ai [OPTIONS] MODEL_ID
# Or
%%ai MODEL_ID [OPTIONS]

invokes a language model identified by MODEL_ID, with the prompt contained in all lines after the first.

Among the OPTIONS, the most important option is -f, which allows you to format your model output as code, html, image, json, markdown, math, md, or text. If this is unclear, it will become clear with an example, so let us see a few examples.
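
For instance, a hypothetical cell using -f to render the output as mathematics might look like this (the exact response will vary by model):

```
%%ai chatgpt -f math
Generate the quadratic formula in LaTeX
```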

You can get help on a specific command. For example, let us get help on the error command:

%ai error --help

3. Class exercise#

Complete this exercise by utilizing:

  • Jupyter AI,

  • another AI chat assistant,

  • or any Language Model (LM) of your preference directly without an AI chat assistant.

The exercise aims to teach the utilization of Language Models (LMs) for coding assistance and emphasizes the significance of prompt engineering.

3.1 Problem statement#

A student asked: for a Pandas DataFrame, how do we display the rows of the days with the maximum precipitation for each weather station in each year in our study period and area?

3.2 Prompt engineering and code generation#

Note

Prompt engineering involves crafting and refining the language or structure of prompts to improve the performance of a language model in generating accurate and relevant responses. Mastering prompt engineering enables effective utilization of LMs across tasks from problem-solving to creative writing. Learn more with Real Python's tutorial Prompt Engineering: A Practical Example.

Here is one prompt that we can start with and refine later as needed:

Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.

The index is the date.

Display the DataFrame to screen.

Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.

Let us see if our selected LM can do this. You can use an AI code assistant such as Jupyter AI or directly use any LM of your choice.

%%ai chatgpt -f code
Generate a Pandas DataFrame with daily data from 2020-01-01 to 2023-12-31 in Fort Myers, Florida with the following columns:
(1) 'TMIN' that is the minimum temperature,
(2) 'TMAX' that is the maximum temperature,
(3) 'PRCP' that is precipitation in inches,
(4) 'AWDS' that is the average wind speed in miles per hour,
and (5) 'STATION' which has two stations, 'Field Airport' and 'SWF Airport'.

The index is the date.

Display the DataFrame to screen.

Find and display the rows of the days that have the maximum precipitation in the study period for each of the two stations for each year.

This is a promising beginning.

We need to verify the output and not just rely on everything that our LM is providing. The code snippet above exhibits a few issues:

  1. Our current LM setup does not have access to datasets.

  2. It employs a for loop instead of utilizing Pandas operations.

  3. It utilizes print instead of the display function, which presents data in a visually appealing tabular format for Jupyter notebooks.

To tackle the first issue, we can instruct our LM to access a specific data file online or on our machine, instead of generating random data.

Let us now address the second and third problems.

3.3 Code improvement#

Ask your LM to use Pandas operations instead of for loop, and to display results in JupyterLab that is to use display instead of print function.

Here is what the LM suggested.
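
The LM's actual response is not reproduced here, but a Pandas-idiomatic solution along these lines might look like the following sketch; the station names and columns follow the prompt above, while the random placeholder data and the sort-then-groupby approach are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Random placeholder data (our LM setup has no access to real datasets).
rng = np.random.default_rng(0)
dates = pd.date_range('2020-01-01', '2023-12-31', freq='D')
stations = ['Field Airport', 'SWF Airport']

df = pd.concat([
    pd.DataFrame({
        'TMIN': rng.uniform(50, 75, len(dates)).round(1),
        'TMAX': rng.uniform(75, 95, len(dates)).round(1),
        'PRCP': rng.uniform(0, 2, len(dates)).round(2),
        'AWDS': rng.uniform(0, 20, len(dates)).round(1),
        'STATION': s,
    }, index=dates)
    for s in stations
])

# Vectorized replacement for the for loop: sort by precipitation, then
# keep the last (wettest) row of each (station, year) group.
df_sorted = df.sort_values('PRCP')
max_rows = df_sorted.groupby(['STATION', df_sorted.index.year]).tail(1).sort_index()
```

In a notebook you would then call display(max_rows) rather than print(max_rows) to get the nicely rendered table.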

3.4 Handling challenging problems#

The above example demonstrates that our LM, which is ChatGPT 3.5 Turbo, is able to solve this relatively straightforward problem. However, a more challenging problem might require additional strategies to handle effectively.

When faced with a more complex problem, several approaches can be beneficial. I asked my LM to complete this section for me. The LM suggests the first four points. I added the heading of point 5 and asked my LM to complete it for me:

  1. Break Down the Problem: If the problem is complex, breaking it down into smaller, more manageable sub-problems can help. Providing step-by-step instructions or dividing the problem into sequential tasks can guide the model in tackling each part systematically.

  2. Provide Context and Examples: Offering context, examples, or related information can assist the model in understanding the problem better. Clear descriptions, relevant data samples, or background information can enhance the model’s comprehension and problem-solving capabilities.

  3. Ask Specific Questions: Instead of presenting a broad or vague problem statement, asking specific questions or providing precise requirements can help direct the model’s attention to the key aspects of the problem.

  4. Iterative Approach: In cases where the problem is intricate, an iterative approach may be beneficial. Engaging in a dialogue with the model, providing feedback on its responses, and refining the problem statement based on initial outputs can lead to a more targeted and accurate solution.

  5. Consider Advanced Language Models: By leveraging a robust language model, you can potentially achieve more accurate results, handle more complex patterns, and tackle a wider range of tasks with greater efficiency.

Now let us try using different versions of ChatGPT 4 in Chatbot Arena.

4. Other useful tools#

Chatbot Arena is an open-source research project that provides an open crowdsourced platform to evaluate LMs.

Let us try to use Chatbot Arena.

5. Conclusions#

Here are the key points to consider:

  • Pros of Language Models: These models offer the potential to improve various aspects of the coding process, spanning from initial development to code optimization.

  • Cons of Language Models: Drawbacks include dependency on AI for coding tasks, which can hinder personal skill development; no guarantee of error-free code without human review; and the inability to provide creative solutions that require human insight.

  • Effective Prompts: Step-by-step, detailed, clear, precise, and contextually relevant prompts can proficiently guide language models towards precise and targeted responses.

  • AI Code Assistants: Tools like Jupyter AI can boost your Python learning and productivity by aiding in coding tasks directly within your integrated development environment (IDE), such as JupyterLab.

To sum up, AI assistance is not here to replace the work that you do, but to help you. Try to balance the benefits of AI assistance with the need for personal skill development and critical thinking.