Advice for Data Cleaning - Standardizing and aggregating names

xp032766 · June 27, 2024, 6:35pm

Hey guys,

I have a question about constructing a prompt for identifying and aggregating data.
I have an Excel file with a list of raw names and want a new column with standardized names. Here’s a snapshot of the ideal result that Julius might help with: it would check adjacent entries to see whether names can be standardized/aggregated and write the result to a new column. For example, “albright, benjamin b.” has variations like “albright, benjamin” and “albright, benjamin b”, and I hope Julius could compare them and output the most complete version, which is “albright, benjamin b.”. If a name doesn’t need aggregation, like “agrawal, arpana” in the snapshot, it should be copied to the new column as is. Does anyone have any good ideas for constructing an effective prompt for this?
Screenshot 2024-06-27 at 2.27.49 PM
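
For anyone who wants to spell the rule out explicitly (in a prompt or in code), here is a minimal Python sketch of the adjacent-entry comparison described above. It is only an illustration under some assumptions: names look like “last, first middle”, periods and letter case are not meaningful, and two entries are merged only when the shorter one's tokens are a prefix of the longer one's and at least last and first name match. The helper names `tokens` and `standardize` are made up for this example; this is not how Julius works internally.

```python
def tokens(name):
    """Split 'last, first middle' into lowercase tokens, ignoring periods."""
    last, _, rest = name.lower().partition(",")
    return [last.strip()] + rest.replace(".", " ").split()


def standardize(raw_names):
    """Return {raw name: most complete variant}, comparing adjacent sorted entries."""
    ordered = sorted(set(raw_names), key=lambda n: (tokens(n), n))
    groups = []
    for name in ordered:
        prev = tokens(groups[-1][-1]) if groups else None
        cur = tokens(name)
        # Join the previous group only if the previous entry has at least a
        # last and a first name, and its tokens are a prefix of this entry's
        # tokens (so nothing in the two names contradicts).
        if prev and len(prev) >= 2 and cur[:len(prev)] == prev:
            groups[-1].append(name)
        else:
            groups.append([name])

    mapping = {}
    for group in groups:
        # "Most complete" here means: most tokens, then longest raw string.
        best = max(group, key=lambda n: (len(tokens(n)), len(n)))
        for name in group:
            mapping[name] = best
    return mapping


names = [
    "agrawal, arpana",
    "albright, benjamin",
    "albright, benjamin b",
    "albright, benjamin b.",
    "adams, marci",
    "adams, marie",
]
mapping = standardize(names)
for raw in names:
    print(raw, "->", mapping[raw])
```

The sketch is deliberately conservative: an initial is not matched to a full first name, and names that share only a last name are left alone.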

Mahmed · June 28, 2024, 3:40am

Hi, can you share the dataset on Google Drive, please?

chrisdavis92 · June 28, 2024, 6:48pm

You actually may have a good prompt in this post: “…where it could identify if names can be standardized/aggregated by checking adjacent entries and adding to a new column.”
Have you tried asking Julius to do this with this prompt? For example, “can you identify names that can be standardized/aggregated by checking adjacent entries and adding them to a new column?”.

xp032766 · June 29, 2024, 5:33pm

Absolutely! Here’s the link: Loading Google Sheets
Thanks a lot!

xp032766 · June 29, 2024, 5:38pm

Yes, I’ve tried that, but it incorrectly aggregated all names with the same last name into one specific name. Here are some of the wrong entries it returned (raw name -> standardized name):
adams, j.c. -> adams, elizabeth troutman
adams, leslie b. -> adams, elizabeth troutman
adams, marci -> adams, elizabeth troutman
adams, marie -> adams, elizabeth troutman
adams, molly -> adams, elizabeth troutman
adams, ursula -> adams, elizabeth troutman
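
One way to catch this kind of over-aggregation automatically is a quick sanity check on the output: flag every row where the raw name and the standardized name disagree on last name or first name. The Python sketch below is a rough illustration, assuming the “last, first middle” layout; `first_and_last` and `suspicious_rows` are hypothetical helper names, and the last pair in the sample is the valid example from the first post, added for contrast.

```python
def first_and_last(name):
    """Return (last name, first given-name token), lowercased, periods ignored."""
    last, _, rest = name.lower().partition(",")
    given = rest.replace(".", " ").split()
    return last.strip(), (given[0] if given else "")


def suspicious_rows(pairs):
    """Return (raw, standardized) pairs whose last or first name disagree."""
    return [(raw, std) for raw, std in pairs
            if first_and_last(raw) != first_and_last(std)]


pairs = [
    ("adams, j.c.", "adams, elizabeth troutman"),
    ("adams, leslie b.", "adams, elizabeth troutman"),
    ("adams, marci", "adams, elizabeth troutman"),
    ("adams, marie", "adams, elizabeth troutman"),
    ("adams, molly", "adams, elizabeth troutman"),
    ("adams, ursula", "adams, elizabeth troutman"),
    ("albright, benjamin", "albright, benjamin b."),  # valid merge, for contrast
]
flagged = suspicious_rows(pairs)
for raw, std in flagged:
    print("check:", raw, "->", std)
```

Note that this check is strict: it would also flag a row that maps an initial to a full first name, so flagged rows are candidates for review rather than confirmed errors.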

Mahmed · June 29, 2024, 6:17pm

Ignore my previous file.
Please consider this one instead: WeTransfer - Send Large Files & Share Photos Online - Up to 2GB Free

xp032766 · June 29, 2024, 7:11pm

I clicked the link and it directed me to the registration page for this website…

Mahmed · June 29, 2024, 11:02pm

Just click agree and proceed to load the file.

xp032766 · June 29, 2024, 11:41pm

Wow, it worked perfectly! Would you mind sharing how you formulated the prompt?
Many thanks!

xp032766 · June 30, 2024, 4:21pm

Hey Mahmed,

Thank you for quickly providing this standardized dataset. I’d appreciate tips on constructing effective prompts to achieve this output.

chrisdavis92 · July 1, 2024, 2:18pm

Thanks for sharing this information, Mahmed! This is super useful; I didn’t know that something like this existed.

Mahmed · July 1, 2024, 4:45pm

You're most welcome, always!
