Data Analysis

Data Cleaning Assistant

Describe your messy dataset. Get a step-by-step cleaning plan with Python or SQL code for handling missing values, duplicates, and formats.

By Arshad Hossain

Describe your data issues (column names, data types, problems). Get ready-to-run code for your preferred tool.

Act as a senior data analyst. I need help cleaning and preprocessing a dataset before analysis.

Dataset description:
- Source: [WHERE THE DATA COMES FROM]
- Number of rows: [APPROXIMATE]
- Number of columns: [NUMBER]
- Column names and types: [LIST THEM, e.g., name (text), date (mixed formats), revenue (numbers with $ signs)]
- Tool I am using: [PYTHON PANDAS / SQL / EXCEL / R]

Known issues:
[DESCRIBE YOUR DATA PROBLEMS, e.g.:
- Date column has mixed formats (MM/DD/YYYY and YYYY-MM-DD)
- Price column has some entries with $ signs and commas
- About 15% of the email column is blank
- Duplicate rows based on customer_id
- Some names have extra whitespace or inconsistent capitalization]

For each issue, provide:
1. What the problem is and why it matters for analysis
2. The recommended approach to fix it
3. The exact code to implement the fix (in my preferred tool)
4. A validation check to confirm the fix worked

Also provide:
- A summary statistics check I should run after cleaning
- A data quality report template I can reuse
- Suggestions for any additional cleaning steps I might have missed
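
To illustrate the kind of answer this prompt asks for, here is a minimal pandas sketch covering the example issues listed above. The sample rows and the helper names (`clean`, `validate`) are hypothetical. It also assumes the date column contains only the two formats named in the prompt. Treat it as a starting point, not as the output any particular model will return.

```python
import pandas as pd

# Hypothetical sample data reproducing the example issues from the prompt.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "name": ["  alice SMITH ", "Bob  jones", "Bob  jones", "carol lee"],
    "date": ["01/15/2024", "2024-02-01", "2024-02-01", "03/10/2024"],
    "price": ["$1,200.50", "300", "300", "$45"],
    "email": ["a@example.com", "", "", None],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Dates: parse each known format explicitly, then combine the results.
    # Values matching neither format become NaT.
    iso = pd.to_datetime(out["date"], format="%Y-%m-%d", errors="coerce")
    us = pd.to_datetime(out["date"], format="%m/%d/%Y", errors="coerce")
    out["date"] = iso.fillna(us)

    # Prices: strip "$" and "," before converting to numbers.
    stripped = out["price"].astype(str).str.replace(r"[$,]", "", regex=True)
    out["price"] = pd.to_numeric(stripped, errors="coerce")

    # Emails: treat empty or whitespace-only strings as missing and flag them,
    # rather than guessing a replacement value.
    email = out["email"].str.strip()
    out["email"] = email.mask(email == "")
    out["email_missing"] = out["email"].isna()

    # Names: trim, collapse repeated whitespace, normalize capitalization.
    out["name"] = (
        out["name"].str.strip().str.replace(r"\s+", " ", regex=True).str.title()
    )

    # Duplicates: keep the first row for each customer_id.
    out = out.drop_duplicates(subset="customer_id", keep="first")
    return out.reset_index(drop=True)


def validate(df: pd.DataFrame) -> dict:
    # One check per fix; counts should be zero where the fix fully applies.
    return {
        "missing_or_unparsed_dates": int(df["date"].isna().sum()),
        "missing_or_non_numeric_prices": int(df["price"].isna().sum()),
        "duplicate_customer_ids": int(df["customer_id"].duplicated().sum()),
        "missing_email_share": float(df["email"].isna().mean()),
    }


cleaned = clean(raw)
report = validate(cleaned)
print(cleaned)
print(report)
```

Parsing each date format separately avoids silently misreading ambiguous values. Keeping an `email_missing` flag preserves the information that a value was blank instead of hiding it behind a placeholder.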

Why "Data Cleaning Assistant" Works

"Data Cleaning Assistant" is built on a principle many AI users overlook: models tend to give more specific, usable answers when a prompt assigns a role and breaks the task into sequential steps than when it asks an open-ended question. Here the role is a senior data analyst, and each issue is worked through in four steps: the problem, the recommended fix, the code, and a validation check. The result is a per-issue cleaning plan with code in your preferred tool, plus a summary statistics check and a reusable data quality report template.

These data analysis tips will help you get stronger results when using "Data Cleaning Assistant" and similar prompts in this category.

When to Use "Data Cleaning Assistant"

"Data Cleaning Assistant" is particularly useful in these situations. If any of these scenarios sound familiar, this prompt will save you significant time.

What You Will Get from "Data Cleaning Assistant"

When you use "Data Cleaning Assistant" with ChatGPT, Claude, or Gemini, here is what to expect in the AI output.

How to Customize "Data Cleaning Assistant"

Adapt "Data Cleaning Assistant" to your specific situation by modifying these key areas. The more context you add, the better the results.
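
If you reuse the prompt across many datasets, one option is to fill the bracketed placeholders programmatically. The sketch below is only an illustration: the shortened placeholder names (`[SOURCE]`, `[ROWS]`, `[TOOL]`) and the `fill_template` helper are hypothetical and stand in for the longer bracketed fields in the prompt above.

```python
import re

# Shortened, hypothetical excerpt of the prompt's dataset description.
TEMPLATE = """Dataset description:
- Source: [SOURCE]
- Number of rows: [ROWS]
- Tool I am using: [TOOL]"""


def fill_template(template: str, values: dict) -> str:
    """Replace each [PLACEHOLDER] with its value; fail loudly if one is missing."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"No value supplied for placeholder [{key}]")
        return values[key]

    # Placeholders are assumed to be uppercase letters and underscores in brackets.
    return re.sub(r"\[([A-Z_]+)\]", substitute, template)


filled = fill_template(
    TEMPLATE,
    {"SOURCE": "CRM export", "ROWS": "50,000", "TOOL": "Python pandas"},
)
print(filled)
```

Raising an error on a missing value keeps a half-filled prompt, with stray brackets still in it, from being sent by accident.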