Guide: How can Julius WRITE Excel functions?

I got the idea for this post after I saw the one titled “Can Julius read and analyse spreadsheets with multiple tabs?”.

Besides reading spreadsheets (with multiple tabs), can Julius help you populate tabs with Excel functions? Absolutely!

This can be a bit more finicky so it’s important to keep a few considerations in mind when prompting. Julius writes Python to achieve all these tasks, and therefore operates within those bounds. Some basic knowledge of what to expect can go a long way in reducing the error rates.

The data I will be using here is the superstore data, easily found on Kaggle. It helps me make my point and it is easily accessible to anyone else.

Figure 1. Superstore Dataset

As always, I start by asking for only a part of my overall task to be completed, because Julius operates really well in Steps. I just ask for the Excel workbook to be read and displayed for inspection. An important part here is this:

Hint: don’t use the read_all_sheets method, it will throw an error because some tables are empty. Use some other approach that will let you retrieve all tabs and their names and also let you write functions back into the sheets later on.

Julius can use several Python libraries to handle Excel files and the choice can be very important depending on what you want it to do with this data and what the data looks like. In our case, I want Julius to write Excel functions for me that will actually work when the file is saved and I want it to load a few tabs where some tables might be empty and confusing to the data import process. This hint lets it know ahead of time what the expectation is. In the past I’ve had it tell me it doesn’t have the ability to write to Excel, but that’s because it already loaded the wrong library.

After that comes the next most important step in ensuring that Julius correctly writes Excel functions. After it imports the data from my first tab, it will only know column names but it won’t have a direct link to their positions in the table (like for example how in Figure 1 you can see that “Row ID” is column A). So I want to specifically ensure it creates this mapping so that it can use it later on, instead of doing it on it’s own. I have found that it often messes up column locations and renders most of the functions useless and Excel functions really depend on column locations.

Then, the final step, I ask Julius to fill the second tab, tell it kinda in general what I want there and remind it to use the mapping we just created.I can verify that it does indeed use the mapping that was just created by pulling up the code it writes and checking what the Excel functions look like.

Figure 1. Mapping


As you can see, mapping was used and all the required functions were created that way.

This is what my table looked like before Julius:
table_before

and this is after:
table_after

This can be further elaborated on, by asking Julius to round the numbers, concatenate, write VLOOKUPs, etc.

Tips for lowering the error rate:

  • Work in Steps
  • Let it know to skip the read_all_sheets() method to avoid errors
  • Map column names to Excel column locations to ensure data integrity

These steps can be added to the general instructions as well, I just opted to leave them in the conversation this time so it is clearer what I did.

You can read my original conversation here.

5 Likes

Wait this is really cool. We (the Julius team) had no idea you could get Julius to generate an Excel file with formulas built into the excel file.

We are talking about this in the office right now about how we knew Julius can write Excel formulas but this is the first time seeing it read in your Excel file structure and transform it with accurately mapped formula.

I wonder what else is possible with openpyxl . Seems like a really powerful module with lots of possibilities with Excel files.

1 Like

If this process can be somewhat optimized so it doesn’t feel so tricky to do, I bet you could just have seamless workflows between Julius and Excel.

1 Like

OMG this is a game changer! We need a follow-up! Can we have more examples of this??

1 Like

err, random question, under what conditions are you seeing read_all_sheets() breaking? if possible we’d want to fix it. we’re using that for speeding up reading larger excel files but debating if we should just get rid of it and let larger excel files just take longer to load for higher reliability.

1 Like

The examples in which it breaks were always only when trying to get it to write to empty tables, so I started skipping it then.

Example, I would bring in a completely empty table (except the headers) to try and get Julius to populate the excel in a few steps so that it can end up with a completely new VLOOKUP based table.

Here, Julius would have to first use Python to determine all distinct values in a categorical variable (like in Superstore data I would want it all to be Product Names so that I don’t have to do that part myself).

Then, Julius would save that workbook, and then open it again to generate VLOOKUPs, SUMIFs etc, based on these values. If I could get it to perform this in several steps it would. create more complex workbooks.

The issue was always when reading these empty tables to begin with, it would just throw an error on them and then devolve into a series of errors until it either gives up or abandons using this method.

3 Likes

i think i found the problem and pushed a fix. should be live in about 5 minutes.

2 Likes