Step 1: Import the Excel files you want to work with.
If you’re using Google Sheets or Excel Online, it’s a good idea to pull the file using an API and save a local copy. This gives you time to manually review the cleaned data before re-uploading it.
Step 2: Decide what each column should contain.
Determine the data types and formatting rules for each column, and identify consistent fields across files, this helps with merging and linking files later.
Step 3: Clean the data.
Ask yourself: if I had to do this in Excel manually, what would I do? List each small step.
Examples:
- Capitalize the first letter of first and last names.
- Normalize phone numbers (e.g., remove country codes, parentheses) to keep things consistent, especially if you’re using WhatsApp automation.
Ensure each type of data is in its own column, and clean accordingly:
- Remove duplicates
- Fix inconsistent formatting
- Address missing or misplaced data
Step 4 (optional): Merge data from other sheets.
Use primary or foreign keys to link datasets. This is similar to using VLOOKUP or INDEX MATCH in Excel.
Step 5: Review everything.
Look over the final result to ensure accuracy and completeness. Handle edge cases as they appear.
Step 6: Upload the final files or submit your work.
If you’re using cloud storage or Google Sheets, this is when you’d push the cleaned version back.
Step 7: Document and improve.
Keep a running list of edge cases and refine your script each time. Once you’re happy with the results, create clear documentation so others can use your program. If possible, host it on an external server for broader access.