AI for Data Analysis: Best Practices for Descriptive, Diagnostic, Predictive, and Prescriptive Analytics

Data analysis is central to business decision-making, particularly in identifying performance bottlenecks, creating product strategies, preparing for market shifts, and driving operational efficiency. Traditional data analytics workflows, however, often remain limited by rigid processes, siloed data sources, slow turnaround times, and a reliance on specialized technical teams. These constraints can delay insights and hinder the agility needed in today’s fast-paced environments.

Generative AI offers a fundamentally different approach. By incorporating large language models (LLMs) and other AI techniques, data analysts can move beyond manual queries and static dashboards. Instead, they can interact with their data using natural language prompts and automate tasks like exploratory data analysis and visualization. As a result, analytics becomes more accessible (or “democratized”), allowing teams to self-serve insights while maintaining the precision demanded by technical stakeholders.

This article is aimed at businesses and data analysts interested in integrating generative AI (GenAI) into their day-to-day workflows. We briefly discuss the four types of data analytics and fully explore how GenAI techniques can improve each type. Finally, we discuss best practices for GenAI data analysis, including data governance, implementing guardrails, and capturing user feedback.

Four types of data analytics (source)

Maturity scale for data analytics

There are four types of data-driven decision-making: descriptive, diagnostic, predictive, and prescriptive. Each stage builds upon the previous one, moving from reporting to insights and ultimately informing business strategy. While many organizations remain anchored in the early stages, GenAI offers a path toward more agile, accessible, and automated analytics: it already handles descriptive and diagnostic work well. It's worth noting, however, that its application to predictive and prescriptive analytics remains largely aspirational, though it holds significant potential for the future.

| Type of data analysis | Description | Output | Limitations |
| --- | --- | --- | --- |
| Descriptive | Summarizes historical or current data to know "what happened" and to identify patterns and trends | Visualization | Static charts do not explain why events occurred; limited to past and present |
| Diagnostic | Shows relationships and trends to know "why it happened"; typically involves correlation analysis and deeper explorations of data subsets | Root cause analysis | Often slowed by manual queries and specialized skill sets |
| Predictive | Uses historical data and models to know "what might happen next"; commonly involves statistical forecasting and machine learning | Forecasting | Vulnerable to model drift and data shifts; can require extensive feature engineering and iterative model tuning |
| Prescriptive | Provides actionable steps and strategies to answer "what should we do next?"; relies on optimization algorithms, simulations, etc. | Business strategy | Results depend on the accuracy of underlying models and organizational buy-in |
How the types of data analytics evolve (source)

In the following sections, we will explore how GenAI can accelerate progress up this ladder, making analytics more accessible and reducing friction in the workflow.

{{banner-large-1="/banners"}}

The role of AI in data analysis

As data volumes and varieties grow, AI techniques play an increasingly central role in automating the entire analytics pipeline. Today, organizations are using machine learning, deep learning, natural language processing, and generative AI to streamline the data analysis workflow, eliminating a great deal of manual overhead while increasing access to actionable insights:

  • Machine learning (ML): ML models establish a foundation for pattern recognition and predictive analytics. Using historical data and feature extraction, these models detect recurring signals and forecast likely outcomes. Best practices include version-controlling models and features, implementing CI/CD for model training and deployment, and using techniques like cross-validation to measure generalization on hold-out or test data. These approaches make sure that models remain robust as data evolves.
  • Deep learning (DL): Deep learning techniques help in handling complex, unstructured data—such as text, images, and audio—without exhaustive preprocessing. Implementing deep learning typically involves optimizing training pipelines with GPUs, using transfer learning to reduce data requirements, and continuously monitoring model performance for signs of drift or unexpected behavior.
  • Natural language processing (NLP): NLP techniques enable analysts to query, summarize, and manipulate data sets using natural language prompts. Implementing NLP involves training or fine-tuning domain-specific language models, applying role-based access controls to ensure that the right stakeholders can access the right data, and establishing prompt patterns to ensure consistent outputs.
  • Generative AI and LLMs: GenAI techniques, powered by LLMs, help you create entirely new insights and engage in an iterative conversation with your data. GenAI has many use cases in data analytics, including sales, supply chain, customer success, and anomaly detection. However, you have to implement guardrails to avoid producing sensitive or misleading outputs, and you need to integrate feedback loops that let users flag inaccurate results. Over time, these measures help align GenAI outputs with business needs.
Generative AI use cases (source)

Structured data

When working with structured data in relational databases, ML models are commonly used for tasks like regression, classification, automated feature selection, anomaly detection, and forecasting. It’s important to ensure that the data is consistent, properly labeled, and easily interpretable. For example, if you have a categorical feature like accident_severity with values “high” and “low,” you may need to binary encode it (i.e., 0 and 1 for the two values) for compatibility with ML pipelines.
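As a quick illustration, here is a minimal pandas sketch of that encoding step. The accident_severity column and its values come from the example above; the surrounding DataFrame is hypothetical:

```python
import pandas as pd

# Hypothetical claims table with a categorical severity column
df = pd.DataFrame({
    "claim_id": [101, 102, 103],
    "accident_severity": ["high", "low", "high"],
})

# Binary-encode "high"/"low" as 1/0 for compatibility with ML pipelines
df["accident_severity"] = df["accident_severity"].map({"low": 0, "high": 1})

print(df)
```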

Meanwhile, LLMs focus on natural language processing. Instead of writing complex queries, you can simply ask questions like “what were the top three factors influencing last quarter’s sales dip?” and quickly extract actionable insights without extensive preprocessing.

Semi-structured data 

For semi-structured data like JSON, XML, and logs, a hybrid approach combines traditional ML techniques with LLMs. This hybrid approach is necessary because semi-structured data often contains both well-defined fields and more free-form text, requiring ML methods to extract and process predictable elements while relying on LLMs to interpret and summarize more nested content. 

Key fields, such as request latency or error codes, can be extracted and mapped into structured formats for ML models to analyze. Meanwhile, LLMs handle nested or irregular elements, like error messages or metadata, to identify patterns, summarize logs, and surface correlations without predefined parsing logic. This method combines the precision of structured processing with the adaptability of natural language understanding.
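To make the split concrete, here is a minimal sketch of this hybrid pattern on JSON log lines. The field names (latency_ms, error_code, message) and the summarize_with_llm stub are hypothetical stand-ins:

```python
import json
import pandas as pd

log_lines = [
    '{"latency_ms": 120, "error_code": 500, "message": "upstream timeout calling payments"}',
    '{"latency_ms": 35, "error_code": 200, "message": "ok"}',
]

records = [json.loads(line) for line in log_lines]

# Well-defined fields go into a structured frame for ML-style analysis
metrics = pd.DataFrame(records)[["latency_ms", "error_code"]]
print(metrics.describe())

# Free-form text is collected for an LLM to interpret and summarize
free_text = [r["message"] for r in records if r["error_code"] != 200]

def summarize_with_llm(texts):
    # Hypothetical stub: in practice, send these texts to your LLM of choice
    return f"{len(texts)} error message(s) to summarize"

print(summarize_with_llm(free_text))
```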

Structured, unstructured, and semi-structured data (source)

For more details on different data types and how to prepare your data for AI, read our guide on AI-ready data.

Unstructured data

Working with unstructured data like text (e.g., PDFs or chat transcripts), images, and audio starts with embedding models and vector databases, which transform the raw content into semantically searchable representations. These embeddings enable semantic search, summarization, sentiment analysis, and Q&A tasks. 

For example, you can convert hundreds of PDF reports into embeddings stored in a vector database and later use LLMs to retrieve the most relevant content and provide concise, human-readable explanations. For images and audio, adopt multimodal models that interpret these formats and produce textual summaries. Over time, you can optimize your embedding models, fine-tune LLM prompts, and integrate feedback loops to improve accuracy and context-awareness.
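As one way to implement this, the sketch below uses the sentence-transformers library for embeddings and a plain cosine-similarity search standing in for a vector database; the document snippets are invented for illustration:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical chunks extracted from PDF reports
chunks = [
    "Q3 revenue in EMEA grew 12% year over year.",
    "Customer churn rose sharply after the March pricing change.",
    "The new onboarding flow cut support tickets by a third.",
]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

query = "Why did churn increase?"
query_vector = model.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, the dot product equals cosine similarity
scores = chunk_vectors @ query_vector
best = chunks[int(np.argmax(scores))]
print(best)  # most relevant chunk, ready to pass to an LLM for explanation
```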

Descriptive analytics with generative AI

In descriptive analytics, the focus is on understanding what happened in the past, identifying patterns, and tracking KPIs. Traditionally, this might involve sifting through dashboards or writing SQL queries to extract a few key metrics. GenAI accelerates this process. 

For structured data, the focus is on “access and analyze” workflows. Instead of building complex queries or navigating multiple reports, you can prompt an LLM with a natural-language request, and it will return key metrics, trends, or top-level summaries. For unstructured data—such as a repository of customer feedback, event logs, or PDF documents—you can use GenAI to “search, summarize, and write,” transforming raw text into concise, actionable insights. The result is faster, more intuitive access to the core facts behind your data.

Beyond textual summaries, GenAI can simplify data visualization. Instead of manually configuring charts, you can say “show me last quarter’s sales trends in the US” and receive a generated chart highlighting any spikes or anomalies. Similarly, consider a scenario where you need to identify which customer segment is showing the most volatility in sales in EMEA. A static dashboard might provide aggregate metrics, but it would require drilling down through multiple filters and dimensions to pinpoint the exact segment. Instead, a generative AI model can let you ask a direct question—“which EMEA customer segment experienced the greatest month-over-month fluctuation in sales last quarter?”—and return a focused summary. It might highlight a particular segment, provide the exact percentage change, and even generate a comparison chart so you can see how that segment’s behavior differs from others.
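In practice, this kind of conversational analysis is often a thin wrapper around an LLM API. The sketch below assumes the OpenAI Python SDK and a gpt-4o-mini model purely for illustration, and the sales extract is invented; it grounds the EMEA question from above in a small, explicit dataset:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical quarterly sales extract serialized for the prompt
sales_csv = """region,segment,month,sales
EMEA,SMB,Jan,120000
EMEA,SMB,Feb,95000
EMEA,Enterprise,Jan,310000
EMEA,Enterprise,Feb,305000"""

question = ("Which EMEA customer segment experienced the greatest "
            "month-over-month fluctuation in sales last quarter?")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a data analyst. Answer only from the provided data."},
        {"role": "user", "content": f"Data:\n{sales_csv}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```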

Diagnostic analytics with generative AI

Once you know what happened, the next question is: Why did it happen? Diagnostic analytics helps you pinpoint root causes, seeking to explain anomalies, performance drops, or unexpected trends. GenAI helps streamline the investigative process without requiring you to manually parse through endless datasets.

For instance, to implement GenAI for anomaly detection, consider using LLMs to sift through logs, metrics, and telemetry data. You might prompt an LLM with a request like: "Identify any anomalies in server response times over the past week." To improve the LLM's performance, you can provide a few examples of the desired output format in the prompt. This approach is known as few-shot learning and is shown in the sample prompt below.

Identify anomalies in server response times based on the provided data.

Let your response follow the examples below:
Example 1:
Data: [Monday 8 AM - 5ms, Monday 9 AM - 7ms, Monday 10 AM - 50ms, Monday 11 AM - 6ms, Monday 12 PM - 8ms]
Response: "An anomaly occurred on Monday at 10 AM when latency spiked by 614% compared to the previous hour."

Example 2:
Data: [Tuesday 8 AM - 8ms, Tuesday 9 AM - 8ms, Tuesday 10 AM - 10ms, Tuesday 11 AM - 30ms, Tuesday 12 PM - 10ms]
Response: "An anomaly occurred on Tuesday at 11 AM when latency increased by 200% compared to the previous hour."

Now, it’s your turn:
Here is the provided data: [Wednesday 8 AM - 10ms, Wednesday 9 AM - 15ms, Wednesday 10 AM - 100ms, Wednesday 11 AM - 12ms, Wednesday 12 PM - 10ms]
Provide a response. 
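If you generate such prompts programmatically, a small helper can format raw telemetry into the few-shot template above before sending it to the model. Everything here, including the sample readings, is illustrative (only the first example is reproduced for brevity):

```python
# First few-shot example, copied from the template above
EXAMPLES = """Example 1:
Data: [Monday 8 AM - 5ms, Monday 9 AM - 7ms, Monday 10 AM - 50ms, Monday 11 AM - 6ms, Monday 12 PM - 8ms]
Response: "An anomaly occurred on Monday at 10 AM when latency spiked by 614% compared to the previous hour."
"""

def build_anomaly_prompt(readings):
    """Format (timestamp, latency_ms) pairs into the few-shot prompt."""
    data = ", ".join(f"{ts} - {ms}ms" for ts, ms in readings)
    return (
        "Identify anomalies in server response times based on the provided data.\n\n"
        f"Let your response follow the examples below:\n{EXAMPLES}\n"
        f"Now, it's your turn:\nHere is the provided data: [{data}]\nProvide a response."
    )

readings = [("Wednesday 8 AM", 10), ("Wednesday 9 AM", 15), ("Wednesday 10 AM", 100)]
print(build_anomaly_prompt(readings))  # send this string to your LLM of choice
```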

By incorporating domain-specific context into these prompts, you ensure that the model focuses on business-critical dimensions (e.g., response latency, transaction volumes, etc.) rather than irrelevant details. For diagnosing root causes of issues like customer churn, you can combine structured features (e.g., demographics and usage frequency) with unstructured data (e.g., customer support transcripts and product feedback) and feed them into an LLM. 

You can get immediate insights by prompting the model with something like: “Given this set of churned customer records and their support tickets, identify the most likely reasons they stopped using our product.” Perhaps the model reveals that churned users often cited poor documentation or frequent billing errors. With these findings, you can quickly prioritize improvements, such as updating the help docs or refining payment workflows.

In both examples, the key is to guide GenAI with precise, context-rich prompts and to combine it with relevant data. This approach helps you arrive at the “why” behind performance issues, customer dissatisfaction, or unusual behaviors with greater speed and accuracy than traditional, manual diagnostic methods.

Predictive analytics with generative AI

Once you understand what happened and why, the next logical step is anticipating future trends and outcomes. "What might happen next?" is the question predictive analytics aims to answer. Traditionally, this has meant training time-series models or regression algorithms on historical data. Currently, GenAI offers exploratory support for predictive tasks, such as synthesizing data sources or summarizing trends, but it requires additional integration with statistical and machine learning models for high accuracy.

In a traditional approach, predictive analytics starts with a data pipeline, where relevant historical data is collected. This is followed by data preprocessing steps that address missing values, data imbalance, and outliers. Then, engineers either apply a statistical method or build a machine learning model on the data. Finally, model performance is evaluated using metrics such as mean squared error (MSE), R-squared, and accuracy on both training and test datasets.

Predictive analytics process (source)
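For reference, a compact scikit-learn version of that traditional pipeline might look like the following; the feature and target arrays are synthetic placeholders:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic historical data standing in for a real feature pipeline
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))  # e.g., price, promo spend, seasonality index
y = X @ np.array([3.0, 1.5, -2.0]) + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = LinearRegression().fit(X_train, y_train)

# Evaluate on held-out data with the metrics mentioned above
preds = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, preds))
print("R-squared:", r2_score(y_test, preds))
```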

Meanwhile, GenAI models can synthesize diverse data sources—such as sales transactions, market signals, and seasonal patterns—into intuitive forecasts. For instance, you might prompt an LLM as follows: “Based on the last 12 months of sales data and the upcoming holiday calendar, predict next quarter’s revenue by region.” The model can return a set of scenarios that factor in known industry events, evolving customer behavior, and even external variables like weather.

Businesses are already experimenting with GenAI to forecast customer lifetime value (CLV). By combining purchase histories, support interactions, and engagement metrics, you can ask: “Which customer segments are most likely to increase spending in the next six months?” The LLM might highlight a specific cohort—such as customers who recently engaged with a particular product feature—and estimate a percentage growth rate in their average spend. Another example is inventory forecasting, where you might integrate sales logs, supplier lead times, and social media sentiment data. You could prompt the model as follows: “Forecast the inventory needed for Product X over the next three months to avoid stockouts.” The resulting forecast could guide your procurement decisions, ensuring that you keep just enough inventory on hand without incurring excessive holding costs.

Since it can digest and summarize large amounts of information, GenAI can present outcomes in clear, natural language, rather than just raw numbers, so that business stakeholders can quickly understand and act on the insights. Beyond forecasting and summarization, it can also generate code to help build predictive models from historical data, making it easier to explore various algorithms and rapidly prototype solutions.

Prescriptive analytics with generative AI

After determining what happened, why it happened, and predicting what might happen next, you are ready to answer the ultimate question: “What should we do next?” As noted by KPMG, prescriptive analytics equals predictive analytics plus optimization. With GenAI, you can simulate various scenarios, optimize strategies, and recommend actions. However, for now, GenAI’s role in prescriptive analytics is best viewed as a collaborative tool for brainstorming and exploring possible scenarios rather than as a definitive decision-making tool.

The traditional approach to prescriptive analytics builds on predictive models or forecasts by applying optimization techniques to identify the best course of action under given constraints (e.g., budget, capacity, or time). This often involves linear programming, integer programming, or other operations research methods. The outputs guide decision-makers toward actions that maximize desired outcomes, such as profit, efficiency, or market share, while accounting for resource limitations.

Prescriptive analytics techniques (source)
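As a concrete, if toy, instance of that optimization step, the SciPy sketch below allocates a fixed budget across two hypothetical campaigns to maximize expected return; the return rates and constraints are invented for illustration:

```python
from scipy.optimize import linprog

# Maximize 1.8*a + 1.4*b (expected return per dollar on campaigns A and B);
# linprog minimizes, so negate the objective coefficients
objective = [-1.8, -1.4]

# Constraint: total spend a + b <= 100 (budget, in $k)
A_ub = [[1, 1]]
b_ub = [100]

# Campaign A is capped at $60k; both spends are non-negative
bounds = [(0, 60), (0, None)]

result = linprog(objective, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("Spend on A, B ($k):", result.x)       # optimal allocation
print("Expected return ($k):", -result.fun)  # undo the sign flip
```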

To complement the traditional approach, GenAI can help brainstorm multiple scenarios before finalizing an optimal solution. For example, consider simulating the impact of different marketing campaigns on product adoption. You could prompt an LLM this way: “Assume that I can run three different marketing campaigns targeting our EMEA region: A, B, and C. Predict the likely outcomes of each campaign on sales over the next two quarters.” The model might return a comparative summary showing how Campaign A might boost short-term conversions while Campaign B could yield steadier long-term growth. Campaign C could deliver higher engagement but at a greater cost, prompting you to rework the messaging or target different segments.

Beyond marketing, you can apply a similar approach to resource allocation, pricing strategies, or supply chain optimizations. By defining constraints (e.g., budget, inventory limits) and objectives (e.g., maximize profit, minimize downtime), you can prompt GenAI to suggest multiple action plans. Over time, you can refine the model’s understanding of your constraints and priorities by incorporating real-world feedback. This iterative process ensures that, as conditions evolve, your prescriptive analytics tools remain aligned with your business goals.

Best practices for GenAI data analysis

Generative AI for data analysis is maturing, and it’s clear that the industry is just getting started. Current tools—like OpenAI’s advanced data analysis or Claude’s analysis tool—offer a glimpse of what’s possible, but they often struggle with the complexity, accuracy, and governance challenges inherent in enterprise data environments. To move beyond experimentation and ensure meaningful adoption, you need to implement robust best practices and consider solutions that address the unique demands of your business domain.

Establish data quality and governance standards

Ensure that your data is clean, well-documented, and managed with proper version control. Generative AI depends heavily on data quality, and without proper governance, LLMs can produce misleading insights. Implement strict access controls and enforce human-in-the-loop validation to prevent errors and maintain trust with stakeholders.

Account for nondeterminism and hallucinations

LLMs do not guarantee deterministic outputs and can produce hallucinations: plausible-sounding but incorrect answers. To mitigate this, set up rigorous prompt-testing frameworks, monitor outputs against known benchmarks, and integrate guardrails.
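One lightweight mitigation, sketched below with the OpenAI SDK as an illustrative provider, is to pin sampling parameters and check outputs against a benchmark question with a known answer before trusting the pipeline; the benchmark itself is hypothetical:

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,  # reduce (but not eliminate) output variability
        seed=42,        # best-effort reproducibility where the provider supports it
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical benchmark with a known answer, used as a regression test
benchmark_prompt = "What is the sum of 17 and 25? Reply with only the number."
assert "42" in ask(benchmark_prompt), "Benchmark failed: review prompts and guardrails"
```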

Integrate continuous monitoring and feedback loops

Capture logs, metrics, and user feedback to continuously improve model performance. By recording prompt requests, model outputs, and associated user actions, you can identify where the AI falls short and adjust accordingly. However, ensure that you have appropriate consent from users to avoid violating data privacy rights. Over time, this iterative feedback loop refines both your models and your data pipelines, leading to more reliable and context-aware insights.
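A minimal version of such a feedback loop can be as simple as appending each interaction, plus the user's rating, to a JSONL log for later review; the record fields shown here are one possible shape, not a standard:

```python
import json
import time

def log_interaction(prompt: str, output: str, user_rating: int,
                    path: str = "genai_feedback.jsonl") -> None:
    """Append one prompt/response/feedback record for later analysis."""
    record = {
        "timestamp": time.time(),
        "prompt": prompt,
        "output": output,
        "user_rating": user_rating,  # e.g., 1 = flagged as wrong, 5 = confirmed useful
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_interaction("Top 3 drivers of churn?", "Billing errors, docs, onboarding", user_rating=4)
```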

Enforce strong security and role-based controls

GenAI systems need secure data handling, role-based access permissions, and robust authentication measures. Improper governance can expose sensitive data or grant unauthorized access to critical insights. Pairing LLMs with a platform designed for enterprise security can ensure that models respect organizational boundaries and regulatory requirements.

Explore AI vendors

Enterprise data often involves custom business logic, multiple data silos, and domain-specific taxonomies. While general-purpose tools provide a starting point, they may not suffice for complex, regulated environments. Specialized platforms such as WisdomAI are purpose-built to handle these challenges, ensuring that you can connect all your data sources, analyze your data, and generate insights without compromising data security.

{{banner-small-1="/banners"}}

Last thoughts

By breaking down traditional barriers like rigid workflows, siloed data, and technical bottlenecks, GenAI empowers teams to use data more intuitively, efficiently, and effectively. However, challenges remain: organizations must prioritize data quality, implement strong governance, and address security and accuracy concerns. 

When these fundamental challenges are resolved—ensuring consistent data quality, enforcing clear data governance practices, and aligning AI initiatives with robust compliance frameworks—AI-driven analytics can yield far more reliable, trustworthy, and actionable insights. Standardized governance protocols and clear data lineage also help teams trace insights to their sources, reinforcing trust and enabling more reliable decision-making. Finally, by adopting best practices such as continuous monitoring, feedback integration, and enterprise-tailored solutions, you can mitigate risks while maximizing the value of generative AI.

Businesses willing to embrace this innovation early will undoubtedly gain a competitive edge, driving smarter decisions and fostering a culture of agility and adaptability.
