How to use ChatGPT for data integration in finance software?


By Artem Rymar
Business Analyst


Intro

Nowadays, ChatGPT needs no introduction, given its widespread adoption and reputation as the go-to conversational AI for tackling everything from coding conundrums to existential questions. It understands and generates human-like text based on the input it receives, making it a powerful tool for a wide range of applications, from writing code to automating tasks and even creating content. Its ability to generate code snippets makes it particularly useful for developers looking to streamline their workflow. In short, what an incredible time to live and engineer in! 🚀

Why Use ChatGPT for Colab Notebooks?

  • Efficiency: Automate boilerplate code generation, allowing developers to focus on complex logic and problem-solving.

  • Accuracy: Reduce human error in repetitive tasks and improve the overall quality of code.

  • Innovation: Explore new solutions and approaches suggested by AI, potentially uncovering more efficient or effective methods.

Practical example: How I used ChatGPT to retrieve interest rate data from public API sources

In the pursuit of automating financial operations within Kyriba, specifically for the integration of market rates, I leveraged ChatGPT's AI capabilities to streamline the process of fetching Shanghai Interbank Offered Rate (SHIBOR) data and preparing it for import.

The task at hand involved two main objectives: retrieving SHIBOR interest rates over a specified period and formatting this data for Kyriba's system in an efficient and error-free manner.

Fetching SHIBOR Interest Rates and Importing Them into Kyriba

My initial approach was to obtain SHIBOR interest rates via a GET request to a specified URL, incorporating parameters for the start and end dates. Recognizing the potential of AI to simplify this process, I prompted ChatGPT with a request to generate a Python script capable of executing the GET request, parsing the JSON response, and formatting the data into a structured CSV file. ChatGPT responded with a concise and functional Python script, adeptly using the requests and pandas libraries to fetch the data and save it in the desired CSV format.

First prompt:

"Can you help me write a Python script that fetches SHIBOR interest rates from a specified URL?"

https://www.shibor.org/ags/ms/cm-u-bk-shibor/ShiborHis?lang=en&startDate={start_date}&endDate={end_date}

The script should make a GET request using start and end dates as parameters. Upon receiving the JSON response, it should parse the data, focusing on extracting relevant fields, and then format this information into a structured CSV file. The fields of interest from the JSON response include the business date and interest rates. Please provide a function in the script for this process, including an example usage of the function with hypothetical start and end dates.

The output code:
# Import necessary libraries
import requests
import pandas as pd
from datetime import datetime


# Define the function to fetch data
def fetch_shibor_data(start_date, end_date):
    # Construct the URL
    url = f"https://www.shibor.org/ags/ms/cm-u-bk-shibor/ShiborHis?lang=en&startDate={start_date}&endDate={end_date}"

    # Make the HTTP GET request
    response = requests.get(url)

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the JSON response
        data = response.json()

        # Extract records
        records = data.get("records", [])

        # Convert the records to a DataFrame
        df = pd.DataFrame(records)

        # Add date parsing and formatting if needed, e.g., to ensure dates are in a specific format
        df['showDateEN'] = pd.to_datetime(df['showDateEN']).dt.strftime('%Y-%m-%d')

        # Save the DataFrame to a CSV file
        csv_filename = f"shibor_data_{start_date}_to_{end_date}.csv"
        df.to_csv(csv_filename, index=False)

        print(f"Data successfully saved to {csv_filename}")
    else:
        print(f"Failed to fetch data. Status code: {response.status_code}")


# Example usage
start_date = "2021-10-21"  # Change to your desired start date
end_date = "2021-10-21"    # Change to your desired end date
fetch_shibor_data(start_date, end_date)

The output file:

Here I noticed that the date format in the CSV output did not meet the requirements, so I requested a correction to ensure dates were formatted as "MM-DD-YYYY". Another challenge was to organize the resulting data properly, so that each tenor from the results would have its rate as a separate row in the CSV table.

Second prompt:

After the first attempt, I noticed that the output data had to match the input structure uploaded to the Kyriba application: the date format was not the proper one, so the output CSV needed additional formatting.

For business dates, we should fetch showDateEN and convert it to the MM-DD-YYYY format. For each business date, the JSON response contains eight tenor fields: 1Y, 3M, ON, 1M, 6M, 9M, 1W, 2W.

Each tenor's rate should become a separate row. The first column, businessDate, will hold showDateEN in the aforementioned format. The rate column should contain the value of each tenor field, for example 2.3960 (taken from "1Y"), and the product code column should contain the tenor itself, formatted with the prefix SHIBOR, so it would be SHIBOR1Y or SHIBOR3M.

The script returned by ChatGPT makes a GET request to the SHIBOR website, extracts the relevant data from the JSON response, and saves it as a CSV file:

The output code:
import requests
import pandas as pd
from datetime import datetime

def fetch_and_process_shibor_data(start_date, end_date):
    url = f"https://www.shibor.org/ags/ms/cm-u-bk-shibor/ShiborHis?lang=en&startDate={start_date}&endDate={end_date}"
    response = requests.get(url)
    if response.status_code == 200:
        data = response.json()
        records = data['records']
        processed_records = [{
            "businessDate": datetime.strptime(record["showDateEN"], "%d %b %Y").strftime("%m-%d-%Y"),
            "rate": value,
            "productCode": f"SHIBOR{key}"
        } for record in records for key, value in record.items() if key not in ["showDateEN", "showDateCN"]]
        df = pd.DataFrame(processed_records)
        csv_filename = f"shibor_rates_{start_date}_to_{end_date}.csv"
        df.to_csv(csv_filename, index=False)
        print(f"Data successfully processed and saved to {csv_filename}")
    else:
        print("Failed to fetch data.")

# Example usage
fetch_and_process_shibor_data("2024-02-01", "2024-02-02")
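Before pointing the script at the live endpoint, the pivot logic (one row per tenor) can be checked offline with a hand-made record. The sketch below isolates that step; the field names and date format are assumptions taken from the prompts above, not a documented API schema:

```python
from datetime import datetime

def pivot_record(record):
    """Turn one daily SHIBOR record into one row per tenor, skipping the date fields."""
    business_date = datetime.strptime(record["showDateEN"], "%d %b %Y").strftime("%m-%d-%Y")
    return [
        {"businessDate": business_date, "rate": value, "productCode": f"SHIBOR{key}"}
        for key, value in record.items()
        if key not in ("showDateEN", "showDateCN")
    ]

# A minimal mocked record in the shape the prompts describe
sample = {"showDateEN": "01 Feb 2024", "showDateCN": "x", "1Y": "2.3960", "ON": "1.7850"}
rows = pivot_record(sample)
print(rows[0])  # {'businessDate': '02-01-2024', 'rate': '2.3960', 'productCode': 'SHIBOR1Y'}
```

Testing the transformation against mocked data like this avoids repeated live API calls while you iterate on the prompt.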

The output file:

Third prompt:

Seeking to enrich the data set with additional information, I asked ChatGPT to modify the script to include empty columns for transactionTime, securityId, and productDescription. ChatGPT promptly adjusted the script, demonstrating its ability to understand and implement specific data structure requirements, ensuring the output aligned with Kyriba's import specifications.

"I need to enhance this script to include additional empty columns in the CSV for transactionTime, securityId, and productDescription. These columns won't be populated with data from the JSON response; instead, they should be initialized as empty strings for each record in the dataset. Could you modify the script to accommodate these requirements, ensuring the CSV output aligns with specific import specifications?"

The output code:
# Inside the fetch_and_process_shibor_data function, adjust the processed_records list comprehension
processed_records = [{
    "businessDate": datetime.strptime(record["showDateEN"], "%d %b %Y").strftime("%m-%d-%Y"),
    "transactionTime": "",
    "rate": value,
    "productCode": f"SHIBOR{key}",
    "securityId": "",
    "productDescription": ""
} for record in records for key, value in record.items() if key not in ["showDateEN", "showDateCN"]]
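For completeness, here is how that fragment slots into the second script: a reassembled sketch of the full function, unchanged apart from the extra empty columns and a request timeout. The URL and field names are the same assumptions as above:

```python
import requests
import pandas as pd
from datetime import datetime

def fetch_and_process_shibor_data(start_date, end_date):
    url = (f"https://www.shibor.org/ags/ms/cm-u-bk-shibor/ShiborHis"
           f"?lang=en&startDate={start_date}&endDate={end_date}")
    response = requests.get(url, timeout=30)
    if response.status_code == 200:
        records = response.json()["records"]
        # One row per tenor, plus the empty columns the import spec requires
        processed_records = [{
            "businessDate": datetime.strptime(record["showDateEN"], "%d %b %Y").strftime("%m-%d-%Y"),
            "transactionTime": "",
            "rate": value,
            "productCode": f"SHIBOR{key}",
            "securityId": "",
            "productDescription": ""
        } for record in records for key, value in record.items()
            if key not in ["showDateEN", "showDateCN"]]
        df = pd.DataFrame(processed_records)
        csv_filename = f"shibor_rates_{start_date}_to_{end_date}.csv"
        df.to_csv(csv_filename, index=False)
        print(f"Data successfully processed and saved to {csv_filename}")
    else:
        print(f"Failed to fetch data. Status code: {response.status_code}")
```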

The output file:

Reflection

This example illustrates the evolution of the solution, from a simple data-fetching Python script to one capable of dynamic data transformation. Within three attempts, we achieved the desired result file and imported it into our Kyriba application.

All of this demonstrates the practical application of programming and automation in financial data management, showcasing adaptability and problem-solving in real-world scenarios.

Important Notice: Safeguarding sensitive information

When leveraging ChatGPT or any AI-driven tool in your development process, it is imperative to prioritize data security and privacy. Do not input or share any sensitive, proprietary, or personally identifiable information (PII) when generating code or queries with ChatGPT. This includes, but is not limited to, API keys, passwords, personal data of clients, or any confidential business logic.

Best practices and tips

  • Iterate on Prompts: If the first code generation isn't perfect, refine your prompts and try again.

  • Use Placeholder Data: When working with ChatGPT, always use generic or placeholder data that does not reveal actual details. Store sensitive information in environment variables or secure vaults and reference these in your code without exposing their values.

  • Stay Updated: Both ChatGPT and the Kyriba API evolve, so staying informed about updates and best practices is crucial.
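As an illustration of the environment-variable tip above (KYRIBA_API_KEY is a hypothetical variable name used purely for this sketch, not a real Kyriba setting):

```python
import os

# Read the secret from the environment rather than hard-coding it in the
# script or pasting it into a prompt. KYRIBA_API_KEY is a hypothetical name.
api_key = os.environ.get("KYRIBA_API_KEY", "")
if not api_key:
    print("Warning: KYRIBA_API_KEY is not set; requests will be unauthenticated.")

# The key is referenced by name only; its value never appears in the code
# you share with ChatGPT or commit to version control.
headers = {"Authorization": f"Bearer {api_key}"}
```

The same pattern works for any credential: the code you paste into a prompt shows only the variable name, never the value.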

Conclusion

Integrating ChatGPT into your Colab notebooks opens up a new realm of possibilities for coding and problem-solving. By following this guide, Kyriba developers can enhance their productivity, reduce errors, shorten code-testing runs, and foster innovation. Dive into this AI-powered coding journey and unlock the potential of your projects.