Analyze CSV Data with ChatGPT: Tutorial, Challenges, and Limitations

ChatGPT has revolutionized how humans interact with computers: You can now retrieve a wealth of information on almost every subject using natural language queries. However, ChatGPT’s capabilities are not limited to responding to standalone text queries. You can also upload structured data in the form of files in CSV, Excel, JSON, and other formats and analyze the file contents in natural language through prompting. 

This article describes how to analyze CSV data with ChatGPT. You will learn how to analyze single and multiple CSV files, create graphs and charts from CSV data, and generate CSV files from dummy data and images. It closes with ChatGPT’s main limitations and how purpose-built AI platforms address them.

Key concepts to analyze CSV data with ChatGPT

Concept | Description
--- | ---
Data Analyst GPT by ChatGPT | You can use the ChatGPT interface or a specialized Data Analyst GPT from ChatGPT to analyze CSV data. The latter generally performs better.
Analyzing single or multiple CSV files | You can analyze both single and multiple CSV files. ChatGPT will infer the relationships among multiple CSV files. However, it is a good practice to provide ChatGPT with all the necessary metadata for accurate analysis, particularly in cases involving multiple files.
Plotting graphs and charts with ChatGPT | ChatGPT enables you to create both static and interactive plots using CSV files. You can modify the plot style and format using text queries.
Generating CSV files from dummy data and screenshots | ChatGPT can generate dummy data based on text queries. This is useful when you want to create synthetic data to test an algorithm. Additionally, you can copy and paste table contents and images containing structured data, and ChatGPT will convert them into CSV files.
Analyzing CSV data with AI platforms | AI platforms like WisdomAI offer a comprehensive environment for developing production-grade business intelligence systems that utilize structured data, including CSV files.

Analyzing CSV data in a single file with ChatGPT

Analyzing a single CSV file in ChatGPT is as simple as uploading the file and asking questions in natural language. You can use a default ChatGPT model to analyze a CSV file, or you can use a GPT version that specializes in data analysis.

Analyzing CSV data with the default ChatGPT interface

To analyze a CSV file, go to ChatGPT, upload your CSV file using the “+” button at the bottom left of the text field, and ask questions about the CSV data. 

As an example, we will upload the CSV file from a Kaggle sales dataset and ask about the product category that yielded the highest profit in 2023. 

The screenshot above shows the result; it was double-checked using the Python Pandas library and is correct. You can ask as many questions as you want, and ChatGPT will attempt to answer them to the best of its ability. 
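A cross-check of this kind can be sketched in a few lines of pandas. The column names (“Order Date”, “Category”, “Profit”) and the sample rows below are assumptions for illustration; the actual Kaggle file may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for the uploaded sales file; the real dataset
# may use different column names and will contain far more rows.
SAMPLE_CSV = """Order Date,Category,Profit
2023-01-15,Furniture,120.50
2023-03-02,Technology,340.00
2023-07-19,Technology,210.25
2023-11-30,Office Supplies,95.75
2022-12-31,Furniture,999.00
"""


def top_profit_category(df: pd.DataFrame, year: int) -> tuple[str, float]:
    """Return the category with the highest total profit in the given year."""
    dates = pd.to_datetime(df["Order Date"])
    in_year = df[dates.dt.year == year]
    totals = in_year.groupby("Category")["Profit"].sum()
    return totals.idxmax(), float(totals.max())


sales = pd.read_csv(io.StringIO(SAMPLE_CSV))
category, profit = top_profit_category(sales, 2023)
print(f"{category}: {profit:.2f}")
```

Comparing this output with ChatGPT’s answer is a quick way to catch a misread column or a wrong date filter.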

Analyze CSV data with a specialist tool: Data Analyst by ChatGPT

While the default ChatGPT is quite capable of answering basic data analysis queries, ChatGPT offers a specialist Data Analyst GPT for more complex data analysis tasks. To use it, click “Explore GPTs” on the left sidebar of your ChatGPT dashboard and search for “Data Analyst.” Then, click the one from ChatGPT. 

Click the “Start chat” button to interact with the Data Analyst GPT. 

The rest of the process is straightforward: Simply upload your CSV file and ask questions, as illustrated in the following screenshot.

Modifying and downloading a CSV file

You can modify an existing CSV file and download the modified version with ChatGPT. For example, in the following screenshot, we ask ChatGPT to add a unit price column in the CSV file. To ensure that we obtain the correct answer, we ask ChatGPT to explain its “thought process” for generating the file. 

The output above shows that ChatGPT correctly guessed the columns and added a new column. If you click the link to download the file, you should see the “Unit Price” column in the downloaded CSV file. 

You can click on the “Analysis” button at the end of the response to see the Python code used by ChatGPT to perform the analysis. 
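The code behind a request like this is usually a short pandas script. The sketch below is not the code from the screenshot; it assumes the file has “Quantity” and “Sales” columns and derives the unit price by dividing one by the other.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are assumptions.
SAMPLE_CSV = """Product,Quantity,Sales
Desk,2,400.00
Lamp,5,125.00
Chair,0,0.00
"""


def add_unit_price(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a "Unit Price" column (Sales divided by Quantity)."""
    out = df.copy()
    # Treat zero quantities as missing so the division yields NaN, not inf.
    quantity = out["Quantity"].where(out["Quantity"] != 0)
    out["Unit Price"] = (out["Sales"] / quantity).round(2)
    return out


orders = add_unit_price(pd.read_csv(io.StringIO(SAMPLE_CSV)))
csv_text = orders.to_csv(index=False)  # contents of the modified file
print(csv_text)
```

Reading the generated code this way shows which columns ChatGPT chose for the calculation, which matters when several columns could plausibly mean “sales” or “quantity.”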

Analyzing multiple CSV files with ChatGPT

ChatGPT allows you to analyze up to 10 CSV files simultaneously. It can also usually infer the relationship among the files. 
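Because that inference is not guaranteed, it can help to list the likely join keys yourself and include them in the prompt. The following is a minimal sketch using only the Python standard library; the file names and headers are hypothetical miniatures, not the actual Kaggle files.

```python
import csv
import io
from itertools import combinations

# Hypothetical file contents; only the header rows matter here.
FILES = {
    "Employees.csv": "EMPLOYEEID,NAME\n1,Ana\n2,Ben\n",
    "SalesOrders.csv": "SALESORDERID,CREATEDBY,GROSSAMOUNT\n10,1,500\n11,2,300\n",
    "SalesOrderItems.csv": "SALESORDERID,PRODUCTID,NETAMOUNT\n10,P1,250\n11,P1,300\n",
}


def read_header(text: str) -> list[str]:
    """Return the column names from the first row of CSV text."""
    return next(csv.reader(io.StringIO(text)))


def shared_columns(files: dict[str, str]) -> dict[tuple[str, str], list[str]]:
    """Map each pair of files to the column names they have in common."""
    headers = {name: set(read_header(text)) for name, text in files.items()}
    shared = {}
    for name_a, name_b in combinations(headers, 2):
        common = sorted(headers[name_a] & headers[name_b])
        if common:
            shared[(name_a, name_b)] = common
    return shared


shared = shared_columns(FILES)
for (name_a, name_b), columns in shared.items():
    print(f"{name_a} and {name_b} share: {', '.join(columns)}")
```

In this sample, name matching finds SALESORDERID but misses the link between CREATEDBY and EMPLOYEEID, because the two columns are named differently. Relationships like that are the ones worth spelling out explicitly in your prompt.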

For sample analysis of multiple files, we will upload CSV files from the Bike Sales data sample from Kaggle. As an example, we will request the names of employees with the highest sales. 

The output above shows how ChatGPT identified the names of employees with the most sales. It generated a CSV file containing the employee names and total sales, as shown in the following screenshot.

You can click on the “Download” icon at the top right of the file to download it. 
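To verify a result like this independently, you can join the order and employee files and total the sales per employee. This is a sketch under assumed column names (CREATEDBY, EMPLOYEEID, NAME, GROSSAMOUNT) with made-up rows, so adjust it to the actual file layout.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for two of the uploaded files.
EMPLOYEES_CSV = """EMPLOYEEID,NAME
1,Ana
2,Ben
3,Chloe
"""
ORDERS_CSV = """SALESORDERID,CREATEDBY,GROSSAMOUNT
10,1,500
11,2,300
12,1,200
13,3,650
"""


def sales_by_employee(employees: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    """Total order amounts per employee, highest first."""
    merged = orders.merge(
        employees, left_on="CREATEDBY", right_on="EMPLOYEEID", how="left"
    )
    totals = merged.groupby("NAME", as_index=False)["GROSSAMOUNT"].sum()
    totals = totals.rename(
        columns={"NAME": "Employee Name", "GROSSAMOUNT": "Total Sales"}
    )
    return totals.sort_values("Total Sales", ascending=False, ignore_index=True)


top = sales_by_employee(
    pd.read_csv(io.StringIO(EMPLOYEES_CSV)), pd.read_csv(io.StringIO(ORDERS_CSV))
)
print(top.to_csv(index=False))
```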

You can make a modification in a CSV file based on data from multiple files. For example, in the following screenshot, we asked ChatGPT to add the employee and product names in the “SalesOrderItems” file. You can see in the output that ChatGPT was unable to find the “PRODUCTNAME” column in the Products table, so it referred to the “ProductsTexts.csv” file to extract the column name. 

Next, it creates the file and presents it for downloading. 

If you download the CSV file, you should see “Employee Name” and “Product Name” columns. 
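The enrichment described above amounts to a chain of lookups across files. The sketch below shows one way to do it in pandas; every column name (including SHORT_DESCR for the product name) and every row is an assumption for illustration.

```python
import io

import pandas as pd

# Hypothetical miniature versions of the four files involved.
ITEMS_CSV = "SALESORDERID,PRODUCTID,NETAMOUNT\n10,P1,250\n10,P2,250\n11,P1,300\n"
ORDERS_CSV = "SALESORDERID,CREATEDBY\n10,1\n11,2\n"
EMPLOYEES_CSV = "EMPLOYEEID,NAME\n1,Ana\n2,Ben\n"
PRODUCT_TEXTS_CSV = "PRODUCTID,SHORT_DESCR\nP1,Road Bike\nP2,Helmet\n"


def read(text: str) -> pd.DataFrame:
    return pd.read_csv(io.StringIO(text))


def add_names(items, orders, employees, product_texts) -> pd.DataFrame:
    """Add "Employee Name" and "Product Name" columns to the order items."""
    employee_names = employees[["EMPLOYEEID", "NAME"]].rename(
        columns={"EMPLOYEEID": "CREATEDBY", "NAME": "Employee Name"}
    )
    product_names = product_texts[["PRODUCTID", "SHORT_DESCR"]].rename(
        columns={"SHORT_DESCR": "Product Name"}
    )
    # validate="many_to_one" raises if a lookup key is duplicated, which
    # would otherwise silently multiply rows in the output.
    enriched = (
        items.merge(orders[["SALESORDERID", "CREATEDBY"]],
                    on="SALESORDERID", how="left", validate="many_to_one")
        .merge(employee_names, on="CREATEDBY", how="left", validate="many_to_one")
        .merge(product_names, on="PRODUCTID", how="left", validate="many_to_one")
    )
    return enriched.drop(columns="CREATEDBY")


items = read(ITEMS_CSV)
enriched = add_names(items, read(ORDERS_CSV), read(EMPLOYEES_CSV), read(PRODUCT_TEXTS_CSV))
print(enriched)
```

Checking that the enriched file has the same number of rows as the original is a simple test that no join duplicated or dropped records.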

Generating plots and graphs using CSV data in ChatGPT

You can easily generate both static and interactive graphs using CSV data in ChatGPT. 

Creating static charts

In the following screenshot, we ask ChatGPT to generate a chart showing the average monthly sales for all the years in the data file. 

ChatGPT selects the chart type it considers most appropriate for your request, which is a line plot in this case. Let’s ask it for a bar graph instead. 

Changing plot styles

You can specify plot details such as color, width, and legends in your text query. For example, we can ask ChatGPT to increase bar widths for the previous graph. 

You can see that the bars are thicker now. You can provide various style specifications, and ChatGPT will make the corresponding changes to your plots. 
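The chart requests in this section map roughly onto a pandas aggregation followed by a matplotlib call. The sketch below assumes “Order Date” and “Sales” columns and interprets “average monthly sales” as the mean sale amount per calendar month; the `kind` and `bar_width` parameters are illustrative names, not anything ChatGPT exposes.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are assumptions.
SAMPLE_CSV = """Order Date,Sales
2022-01-10,100
2022-01-20,300
2022-02-05,150
2023-01-15,200
2023-02-11,250
2023-02-25,350
"""


def monthly_average_sales(df: pd.DataFrame) -> pd.Series:
    """Mean sale amount per calendar month, indexed by year-month."""
    dates = pd.to_datetime(df["Order Date"])
    return df.groupby(dates.dt.to_period("M"))["Sales"].mean()


def plot_monthly_average(averages, path, kind="line", bar_width=0.8):
    """Save a line or bar chart of the monthly averages to an image file."""
    # Imported here so the aggregation above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display required
    import matplotlib.pyplot as plt

    labels = [str(period) for period in averages.index]
    fig, ax = plt.subplots(figsize=(8, 4))
    if kind == "bar":
        ax.bar(labels, averages.to_numpy(), width=bar_width)
    else:
        ax.plot(labels, averages.to_numpy(), marker="o")
    ax.set_xlabel("Month")
    ax.set_ylabel("Average sales")
    ax.set_title("Average monthly sales")
    fig.autofmt_xdate()
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)


averages = monthly_average_sales(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(averages)
```

Calling `plot_monthly_average(averages, "monthly_sales.png", kind="bar", bar_width=0.95)` would correspond to asking for a bar graph with wider bars.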

Adding interactivity 

You can also add interactivity to your plots. OpenAI says that bar, pie, scatter, and line charts currently have interactivity support.

For instance, we can ask ChatGPT to show a pie chart of total sales by category.

To make this pie chart interactive, click the “Switch to interactive chart” icon at the top right corner. Now, if you hover over a category, you will see the total sales for it. 

Generating CSV files from scratch with ChatGPT

Besides modifying existing CSV files and downloading the modified versions, you can generate dummy CSV files from text instructions. ChatGPT can also generate CSV files from screenshots, images, and structured data copied from sources such as websites, Word files, and PDFs. 

Generating a dummy CSV file through a prompt

Let’s see an example of creating some dummy data to train a machine learning model. 

ChatGPT generated a CSV file with 500 records in around 50 seconds. The file contains both numerical and categorical data, making it suitable for testing classification models. 

Note: The synthetic data generated by ChatGPT must be carefully validated before training any machine learning model for production. 
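For comparison, a dummy dataset of the same general shape can be produced locally with the Python standard library. The field names, value ranges, and labeling rule below are all invented for illustration and do not reflect the file ChatGPT generated.

```python
import csv
import io
import random

# Hypothetical schema: two numerical features, two categorical features,
# and a binary target column.
FIELDS = ["age", "income", "region", "plan", "churned"]


def generate_rows(n: int, seed: int = 42) -> list[dict]:
    """Generate n synthetic records; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        plan = rng.choice(["basic", "plus", "premium"])
        # Invented rule so the target depends on at least one feature.
        churn_probability = 0.5 if plan == "basic" else 0.2
        rows.append({
            "age": rng.randint(18, 70),
            "income": max(0.0, round(rng.gauss(55_000, 15_000), 2)),
            "region": rng.choice(["north", "south", "east", "west"]),
            "plan": plan,
            "churned": int(rng.random() < churn_probability),
        })
    return rows


def to_csv_text(rows: list[dict]) -> str:
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = generate_rows(500)
csv_text = to_csv_text(rows)
churn_rate = sum(row["churned"] for row in rows) / len(rows)
print(f"{len(rows)} rows, churn rate {churn_rate:.2f}")
```

Simple checks such as the class balance printed at the end are a starting point for the validation mentioned in the note above.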

Generating CSV through copy-pasting structured data

You can copy and paste structured data from other sources and generate CSV files. For example, we can copy the data from the following table and ask ChatGPT to generate a CSV file. 

You should see the following downloaded file. 
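When the pasted data is already cleanly delimited, the same conversion can be done without ChatGPT. The sketch below assumes the copied table arrives as tab-separated text, which is common when copying from web pages and spreadsheets; the sample rows are made up.

```python
import csv
import io

# Hypothetical pasted table: cells separated by tabs, one row per line.
PASTED = "Name\tCity\tScore\nLee, Ann\tOslo\t91\nRaj Patel\tPune\t85\n"


def pasted_table_to_csv(text: str, delimiter: str = "\t") -> str:
    """Convert delimiter-separated text into CSV text with proper quoting."""
    reader = csv.reader(io.StringIO(text.strip("\n")), delimiter=delimiter)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if not rows:
        return ""
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"row {number} has {len(row)} cells, expected {width}")
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


csv_text = pasted_table_to_csv(PASTED)
print(csv_text)
```

The `csv` module quotes the cell that contains a comma, which is the detail most likely to go wrong in a manual conversion. ChatGPT remains the more convenient option when the pasted text has irregular spacing or merged cells.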

Generating a CSV file from images and screenshots

In the next example, we paste a screenshot of the table and ask ChatGPT to generate a CSV file from the table data. We also ask it to read the team logo in the “Next Match” column and replace it with the full team name. 

The output below shows that ChatGPT not only used its vision capabilities to identify the team logos but also drew on its built-in knowledge to convert them to full team names. 

Challenges and limitations

While ChatGPT excels at analyzing CSV files, it has certain limitations, and several challenges can impact its performance in data analysis:

  • You cannot upload more than 10 CSV files simultaneously for data analysis in a single conversation.
  • The file size cannot exceed approximately 50 MB, depending on the data in each row. 
  • ChatGPT struggles when multiple tables are added to a single spreadsheet.
  • ChatGPT often hallucinates when analyzing large CSV files with complex relationships, or when column names and the relationships among columns and CSV files are not clearly defined. 
  • Privacy is another issue: You may not want to upload sensitive data to ChatGPT for analysis. 
  • For large datasets, ChatGPT may only read a small portion of the rows, leading to incomplete analysis.
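Some of these limits can be checked before uploading. The sketch below reports a file’s size and row count using only the standard library; the 50 MB threshold mirrors the approximate figure in the list above and should be treated as an assumption rather than an exact cutoff.

```python
import csv
import os
import tempfile

# Approximate limit taken from the list above; not an exact cutoff.
MAX_BYTES = 50 * 1024 * 1024


def describe_csv(path: str) -> dict:
    """Report size, column names, and data-row count for a CSV file."""
    size = os.path.getsize(path)
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)  # streams; no full load in memory
    return {
        "bytes": size,
        "columns": header,
        "rows": row_count,
        "within_limit": size <= MAX_BYTES,
    }


# Demonstration on a small temporary file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([["id", "amount"], [1, 10], [2, 20], [3, 30]])
    summary = describe_csv(path)

print(summary)
```

Knowing the true row count also gives you something to compare against: if you ask ChatGPT how many rows it read and the number is lower, the analysis covers only part of the file.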

Analyzing CSV files with AI platforms 

While ChatGPT is an excellent tool for analyzing single or multiple CSV files, it is not suitable for production-grade business intelligence (BI) systems involving multiple large CSV files in a team environment. This is where purpose-built AI platforms for BI come into play. Like ChatGPT, AI-based BI solutions enable the analysis of CSV files using natural language queries. However, they address the limitations of ChatGPT by allowing larger file sizes and more files, and they produce fewer hallucinations. 

These platforms typically utilize a semantic layer to capture global knowledge and integrate it into ontologies and knowledge graphs, thereby generating more accurate responses with fewer errors. Furthermore, they enable team collaboration, data privacy, and seamless integration with other tools, resulting in a production-grade BI system.

WisdomAI is one such tool that offers a robust and personalized business analytics solution. It employs a semantic layer and context engine to extract data and metadata information from structured datasets, along with global domain knowledge, and integrates it into knowledge graphs. This results in a more robust and error-free analysis compared to ChatGPT. 

With WisdomAI, you can create domains where each domain contains a set of files you want to analyze simultaneously. A domain also contains associated rules, metadata, global information, knowledge sources, and advanced configurations you can use to tweak the data analysis settings. 

WisdomAI example

The screenshot below shows an existing domain called “BikeSales.” You can create a new domain by clicking on the “Add domain” button at the top right. 

Once you open a domain, you will see an overview of your domain, data sources, knowledge, and advanced settings.

In the “DATA SOURCES” tab, you can add descriptions about what data the CSV files contain. Ideally, you should clarify any ambiguous terms in the description. 

If you click on the “Create Relationship” button from the top right in the above screenshot, you can define relationships among CSV files. 

You can add global domain knowledge about the data in CSV files using the “KNOWLEDGE” tab. 

Finally, in the “ADVANCED” tab, you can customize the chat by specifying system instructions, setting the start date of the fiscal year for your data, and specifying the AI thinking level (low, medium, high) and other advanced settings. 

Once you have chosen descriptions, metadata, and settings, you can start chatting with WisdomAI. It will generate responses based on information in the data sources, its global knowledge, and your advanced settings. 

Let’s ask some questions about the dataset. In the following screenshot, we asked WisdomAI for the names of the three suppliers whose products generated the highest sales. WisdomAI responded with the answer, the SQL query it used to generate the response, and a bar plot displaying the supplier names and sales bars. 
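The query WisdomAI generated is not reproduced in this text. As a rough idea of the kind of SQL that answers such a question, here is a hypothetical query run against an in-memory SQLite database; the table names, column names, and rows are all invented for illustration.

```python
import sqlite3

# Hypothetical query: total sales per supplier, top three only.
QUERY = """
SELECT s.name AS supplier, SUM(i.net_amount) AS total_sales
FROM sales_order_items AS i
JOIN products AS p ON p.product_id = i.product_id
JOIN suppliers AS s ON s.supplier_id = p.supplier_id
GROUP BY s.name
ORDER BY total_sales DESC
LIMIT 3
"""

# Invented schema and rows, small enough to check by hand.
SETUP = """
CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (product_id TEXT PRIMARY KEY, supplier_id INTEGER);
CREATE TABLE sales_order_items (product_id TEXT, net_amount REAL);
INSERT INTO suppliers VALUES
    (1, 'Supplier A'), (2, 'Supplier B'), (3, 'Supplier C'), (4, 'Supplier D');
INSERT INTO products VALUES ('P1', 1), ('P2', 2), ('P3', 3), ('P4', 4);
INSERT INTO sales_order_items VALUES
    ('P1', 100), ('P1', 250), ('P2', 400), ('P3', 50), ('P4', 300);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
top_suppliers = conn.execute(QUERY).fetchall()
conn.close()

for supplier, total in top_suppliers:
    print(f"{supplier}: {total:.2f}")
```

Seeing the generated SQL alongside the answer lets you confirm which tables were joined and which column was summed.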

You can edit the plot, expand it, and save it as a CSV file, Excel file, or PNG image. You can also provide feedback by clicking the thumbs-up or thumbs-down icon at the bottom left of the response.

WisdomAI also presents a few follow-up questions that we might want to ask. 

Clicking on the fourth question produces the following response:

You can view the sales for a company over time. You can change the company from the dropdown list at runtime, and the plot will update automatically. Also, the plots are interactive by default.

Last thoughts

ChatGPT provides a broad set of features for working with CSV files: You can analyze multiple files simultaneously, generate static and interactive plots, and create new CSV files from prompts, pasted tables, and images.

However, ChatGPT has certain limitations when analyzing CSV files. Both the number of files you can upload and the size of each file are capped, and ChatGPT often hallucinates while analyzing large CSV files with complex relationships.

AI-based business intelligence tools overcome these challenges by implementing a semantic context layer that generates more accurate and robust responses grounded in metadata and domain knowledge. WisdomAI is a data analytics and business intelligence solution that allows you to develop production-grade data analytics capabilities in a minimal timeframe. To learn more about how WisdomAI can solve your business intelligence and data analytics problems, book a demo with WisdomAI.
