45+ minutes for 45k lookups feels like something is wrong with your data (i.e. it is struggling to match anything)
I've just done a `MATCH` on 167k rows into a dataset over ~325k rows (and then run a series of `INDEX`s on that), and the whole thing took, at most, a couple of minutes
I do find with `INDEX`/`MATCH` combos that if there is no `MATCH` return, the whole thing is a lot slower (but using `IFERROR` wrappers seems to help speed up by a fair margin)
I know - but point being, 45k `XLOOKUP`s taking >20-30x longer than 167k `MATCH`s (when the difficult part in either case is the lookup) is surely indicative of an issue with the operation.
It took me longer to write the series of `INDEX`s (as they were not consistent / contiguous - for good reasons) than it did for it to calculate
Underscore over spaces.
Also YYYYMMDD, often at the beginning of the file name, so frequently run files sort in order. Situations are situational.
And I feel your pain. Especially when working remotely, going into office network, then the hospital that holds the data. And every connection has some level of crazy encryption to protect PHI.
I don't think this is a hardware issue. I see this a lot with files. My rule of thumb is as follows:
- Want to use Pivot Tables: Great, Excel will work fine for 800K rows. It'll be clunky, but it'll work;
- Do you need formulas to update rows individually for over 18K rows, and on multiple columns? Yeah, you're going to collapse your machine
Yeah, something else is wrong. My work PC is an 8th-gen laptop i5 (8365U) where half the processing power is consistently bogged down running security and document control software, and my Excel (32-bit, because it needs to interface with obsolete software from the 90s) churns through tens of thousands of rows of XLOOKUP in just a few seconds.
Underscores. Also, I feel your pain my friend. We had a data set that was over 50k with roughly 5 tabs. Multiple vlookups, an ass ton of formulas in each sheet. It was just ridiculous to navigate.
PowerQuery is the way to go. It’s quicker, easier to make changes and frees up a ton of memory.
The other alternative is to use INDEX-MATCH instead of XLOOKUP.
One thing I'm not seeing mentioned.
Assuming one or both files are on your machine for this...does the machine have an SSD or is the machine using a disk hard drive for storage?
There can often be Excel situations with a lot of storage access including situations where there theoretically should not be much. If it needs to access a hard drive for those requests it can really slow things down.
90% of your CPU with over 45 minutes of churning through tables? That doesn't sound right. I don't think XLOOKUP should be taking a whole minute for one thousand rows.
I have an Excel file with over 300 tables, ranging from 1 column with 20 rows to 5 columns with 1250 rows. I have roughly 20 XLOOKUPs which all pull up different kinds of information when I type something in one cell, with all of those having 7 other XLOOKUPs within them. I have around 30 more XLOOKUPs which pull data from another set of tables when I fill in something related to the first cell in 4-5 other cells. All of that is repeated 10 times in my file, so I can run multiple searches at once.
Even with the file searching through everything multiple times, it only takes one second to process.
I know my dataset doesn't come close to yours in terms of size, but I don't think XLOOKUP should take nearly an hour to process a request.
In your XLOOKUP are you using the row reference with the @ character, e.g. [@column_with_lookup_value]? I think you'd get a spill error if not, but I'm on mobile and can't test whether that's the case in a table.
Where are they pulling information from? I had a book that was just sheets of lookups that would take hours if it didn't freeze up completely. I changed it to VBA and it takes 45 seconds or so now. If you're reading from remote, closed workbooks though, it would still take a while.
I just started messing around with xlookup the other day so very new still but I did notice a big performance issue when trying to lookup a whole column vs just the range of the data….especially when using multiple conditions.
Churning through 45K rows will definitely use 99% of your CPU, but it shouldn't take 45 minutes.
Your data might have formulas in it, instead of hard coded data. That's the only likely explanation for a 45 minute calculation time.
Not an Excel expert but I would look at doing this with a macro (if it currently is just cell entries). Simply shutting off screen updating will save a lot of time.
Try Python: learn about Jupyter notebooks or use VS Code natively on your laptop. Explore pandas DataFrames. I believe that way you can slice your data a lot more efficiently. Excel uses quite a bit of CPU for a task like yours. Just a general suggestion; I'm no expert, just a very casual user, but I have noticed that Python does things a lot more efficiently.
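For anyone curious about the pandas route, the rough equivalent of a PQ merge / mass XLOOKUP is a single left join. A minimal sketch (the `key` and `price` columns are made up for illustration):

```python
import pandas as pd

small = pd.DataFrame({"key": ["B2", "A1", "ZZ"]})             # 45k rows in practice
big = pd.DataFrame({"key": ["A1", "B2"], "price": [10, 20]})  # 180k rows in practice

# Left join keeps every row of `small`; unmatched keys get NaN,
# which is the pandas analog of XLOOKUP's "not found" result.
merged = small.merge(big, on="key", how="left")
```

The join is hash-based under the hood, so it scales far better than formula-per-row lookups.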
Delete all floats (round up decimals)
Also, do not leave those lookup formulas in the file. Once looked up, paste as VALUES. Removing decimals in very large Excel files will drop file size by like 30% and speed things up.
If you have several columns doing a lookup in formulas in the background on 45k rows, it'll be slow AF.
Very likely you have other sheets or workbooks open with a lot of formulas. Excel recalculates all formulas in a file every time you update a cell.
Copy your 45k rows to a new Excel file, close all the other open files, and work in the new one.
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:
|Fewer Letters|More Letters|
|-------|---------|
|[IFERROR](/r/Excel/comments/1ccunmt/stub/l1844bx "Last usage")|[Returns a value you specify if a formula evaluates to an error; otherwise, returns the result of the formula](https://support.microsoft.com/en-us/office/iferror-function-c526fd07-caeb-47b8-8bb6-63f3e417f611)|
|[INDEX](/r/Excel/comments/1ccunmt/stub/l1amubi "Last usage")|[Uses an index to choose a value from a reference or array](https://support.microsoft.com/en-us/office/index-function-a5dcf0dd-996d-40a4-a822-b56b061328bd)|
|[MATCH](/r/Excel/comments/1ccunmt/stub/l1amubi "Last usage")|[Looks up values in a reference or array](https://support.microsoft.com/en-us/office/match-function-e8dffd45-c762-47d6-bf89-533f4a37673a)|
|[NOT](/r/Excel/comments/1ccunmt/stub/l199g8z "Last usage")|[Reverses the logic of its argument](https://support.microsoft.com/en-us/office/not-function-9cfc6011-a054-40c7-a140-cd4ba2d87d77)|
|[OFFSET](/r/Excel/comments/1ccunmt/stub/l1b0k9d "Last usage")|[Returns a reference offset from a given reference](https://support.microsoft.com/en-us/office/offset-function-c8de19ae-dd79-4b9b-a14e-b4d906d11b66)|
|[PHI](/r/Excel/comments/1ccunmt/stub/l1898t7 "Last usage")|[*Excel 2013*+: Returns the value of the density function for a standard normal distribution](https://support.microsoft.com/en-us/office/phi-function-23e49bc6-a8e8-402d-98d3-9ded87f6295c)|
|[VLOOKUP](/r/Excel/comments/1ccunmt/stub/l17uh8p "Last usage")|[Looks in the first column of an array and moves across the row to return the value of a cell](https://support.microsoft.com/en-us/office/vlookup-function-0bbc8083-26fe-4963-8ab8-93a18ad188a1)|
|[XLOOKUP](/r/Excel/comments/1ccunmt/stub/l1amubi "Last usage")|[*Office 365*+: Searches a range or an array, and returns an item corresponding to the first match it finds. If a match doesn't exist, then XLOOKUP can return the closest (approximate) match. ](https://support.microsoft.com/en-us/office/xlookup-function-b7fd680e-6d10-43e6-84f9-88eae8bf5929)|
|[XMATCH](/r/Excel/comments/1ccunmt/stub/l1amubi "Last usage")|[*Office 365*+: Returns the relative position of an item in an array or range of cells. ](https://support.microsoft.com/en-us/office/xmatch-function-d966da31-7a6b-4a13-a1c6-5a33ed6a0312)|
^([Thread #32918 for this sub, first seen 25th Apr 2024, 15:58])
Highly recommend taking the dataset into Power Query and matching there over an XLOOKUP. You could also clean up the data in PQ (remove zeros, remove nulls, unnecessary columns) to speed up the process.
Agree with this.
Agree to agreement with the solution
I second the agreement to the agreement to the solution
I second the agreement to the agreement to the agreement of the solution
I concur to all past present and future concurments to the aforementioned agreements to the agreement to which the specific solution was stated in proposition to the problem presented to the respected parties herewith of
I wholeheartedly agree to all past, present, and future agreements, including this one, to concur with the aforementioned agreements, which align with the agreement pertaining to the solution proposed in response to the problem presented to the esteemed parties herein.
Uhhhh…. What they said.
Fully understood
10-4
4-oct
Roger
i third the agreement to the -- oh the hell with it. OP, do as directed
Howard Johnson is right!!!
This is not the solution you're looking for *<..waves arm airily..>* You can go on about your business.. *<..smiles enigmatically..>* Move along..
I also agree with this. The timing doesn't correspond with the size of the data. Something else is afoot, but PQ is designed for merges with larger data sets than Excel can handle, so you'll be better served either way.

PQ will also not redo the lookup for each column, which I suspect might be the missing element. Searching 45k in 180k isn't bad, but searching 45k × 30 columns, or 1.35 million lookups, in 180k is where things could reach the performance OP is experiencing. If this is the case, a single MATCH formula with 30 INDEXes referencing the single match result per row is the way to go in formula space.
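The "single MATCH plus 30 INDEXes" pattern can be sketched outside Excel too. A minimal Python analog (table contents and column names are made up), locating each key once and reusing that one hit for every column:

```python
# Sketch of "match once, index many times": locate each key in the big
# table a single time, then pull every needed column from that one hit,
# instead of running a separate lookup per column.

big_table = {  # key -> full record (30 columns in the real case)
    "A1": {"price": 10, "qty": 5, "region": "EU"},
    "B2": {"price": 20, "qty": 3, "region": "US"},
}
small_keys = ["B2", "A1", "ZZ"]  # the 45k lookup keys
wanted_cols = ["price", "qty", "region"]

results = []
for key in small_keys:
    hit = big_table.get(key)  # one "MATCH" per row
    if hit is None:
        results.append({c: None for c in wanted_cols})
    else:
        # the 30 "INDEX"es all reuse the single match result
        results.append({c: hit[c] for c in wanted_cols})
```

The search cost is paid once per row instead of once per row per column, which is exactly the 30x saving described above.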
My guess is more than one (and possibly a lot more than one) formula × 45k rows.
More people need to use PQ! Especially for large datasets. Once you use it, it's hard to go back.
Just wish the editor was faster. I don’t get why it feels the need to rerun the entire query every time I make a change; even renaming a query prompts it to rerun. Other than that, I agree it’s incredibly useful.
After you make a change, hit Cancel Refresh, then Refresh Preview. The default behavior of PQ is a full refresh of all queries after you make a change. By cancelling and refreshing the preview, you force PQ to refresh just the query you're looking at.

Also, don't use a network location to source files when designing, and use csv or txt files as your data sources over Excel files; it's like 10x faster.
I’ll keep those tips in mind. Thanks!
>Taking the dataset into Power Query and matching there over an XLOOKUP

Any tutorials on how to do that?
[Merge queries (Power Query) - Microsoft Support](https://support.microsoft.com/en-us/office/merge-queries-power-query-fd157620-5470-4c0f-b132-7ca2616d17f9)

[Merge queries and join tables - Microsoft Support](https://support.microsoft.com/en-us/office/merge-queries-and-join-tables-cbd17828-7a50-4dc6-9aac-20af4ef6d8a6)

These two should get you started.
I've switched over from xlookup to PQ and it's a game changer, another understated benefit I don't see mentioned often is when you need multiple values returned. I still use xlookup for small quick things, but I've tried getting all things done in PQ now with minimal formulas so the workbook feels quicker and smoother.
I have an Excel sheet that’s 6.4 mb, and so heavy it slows down my computer just to have it open, but I’m starting to rely on it more. I’ll have to look into rebuilding it with PQ to be lighter.
I am just getting started with PQ and it’s for this reason alone I know it’s worth my time to learn. My job would literally be WAY easier if I learned how to do this. It’s slow going, but I’ve started with Master Your Data book whenever I get a chance.
Some guy at work got frustrated when I showed him PQ for merging. His complaint was that I was using Excel as a database, which it was not, and then he demonstrated how he would accomplish the same task in Access. I hate Access, though.
This is the way
Does PQ have better performance for large datasets than xlookup?
Yes, it is what it was designed to do.
Good to know this.
Nowadays I'm using Power BI when I run into this type of problem. I've never had good luck with using Power BI in Excel, but maybe I'm doing something wrong. Is the source in Power Query the Excel table? And is the resulting table also sent back to the same Excel file?
I just learned this feature 2 weeks ago and it’s really blowing my mind.
Me from the future agree to all these agreements
What’s Power Query? I’ve never heard of that.
It's a data lookup tool that was added to Excel; it acts more like SQL lookups. Power Query is also Power BI's main data-management engine, and it can handle large data volumes more efficiently than Excel. It should be there already but needs to be turned on:

[https://www.simplilearn.com/tutorials/excel-tutorial/power-query-in-excel](https://www.simplilearn.com/tutorials/excel-tutorial/power-query-in-excel)
What formula are you using? For example, XLOOKUP can engage in a binary search, which can improve performance on large datasets if the set is sorted.

I've also found that if you put something in the 'if not found' field of XLOOKUP, then that formula is always evaluated, even if the lookup was successful. This can slow things down compared to wrapping the lookup in IFERROR.
=XLOOKUP(value in small table, value in big table, target value)
Maybe FALSE for range lookup (the fourth argument) if you don't need it? Sorry, I'm thinking of VLOOKUP.
If it is always the same column, have you tried looking up the whole array? Something like =XLOOKUP(value1:value45000, value in big table, target value) and let it SPILL
Never tried it, as I usually merge the small table to the big table or I'll use PQ, but it might be worthwhile to take the time to sort the large table and use the binary search option? Binary search should only take O(log n) time instead of going through all the keys.
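The O(log n) claim is easy to sketch in Python with the standard `bisect` module; the keys below are made up, but the shape matches a sorted ~180k-row lookup column:

```python
import bisect

sorted_keys = list(range(0, 360_000, 2))  # sorted lookup column, 180k keys

def binary_lookup(key):
    """Return the position of key in sorted_keys, or None. O(log n)."""
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return None

# 180k sorted keys need at most ~18 comparisons per lookup (2^18 > 180k),
# versus up to 180k comparisons for a linear scan.
```

The same trade-off applies in Excel: a sorted table plus binary match mode turns each lookup from a full scan into a handful of probes.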
Sort the lookup table. Tell Excel to perform the lookup in binary mode.
If XLOOKUP is slower than INDEX/MATCH, and even its error handling is slower than IFERROR, then why should we use XLOOKUP anywhere?
Ain't no way xlookup is slower than index match
It’s not faster in my experience. I don’t know if I could say it is slower. I was pretty disappointed when I started using it. You’d think it would be much faster/more efficient than index match.
Uh hell yes.
Unless you use double approximate INDEX/MATCH on sorted data, I don't think there's a significant speed difference.

PQ, as mentioned, will significantly cut down on time; depending on the data set, 30 sec or less. If you really need to speed it up, you could write VBA code utilizing arrays and Scripting.Dictionary (100k+ rows doing a lookup on 50 records will be done in less than a sec). But that's overkill in this case.

If you need further performance improvement, instead of doing the merge in PQ you can build a relationship between the two tables in the Data Model and use Power Pivot & DAX measures to build the report.
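The Scripting.Dictionary approach described above is essentially a hash index: build it once in O(n), then every lookup is constant time. A minimal Python sketch of the same pattern (keys and values are made up):

```python
# Build a hash index over the big table once (the Scripting.Dictionary
# step in VBA), then each of the 45k lookups is a constant-time hit
# instead of a scan over 180k rows.

big_rows = [("A1", 100), ("B2", 200), ("C3", 300)]  # (key, value) pairs
index = {key: value for key, value in big_rows}     # one O(n) build pass

lookups = ["C3", "A1", "missing"]
answers = [index.get(k, "#N/A") for k in lookups]   # O(1) per lookup
```

This is why the VBA dictionary version finishes in under a second: total work is O(n + m) rather than O(n × m).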
--red is correct - XLOOKUP is ~40% slower than VLOOKUP, and 30% slower than INDEX/MATCH. This difference amplifies, up to ~100%, if you use double-TRUE VLOOKUP (binary).

Source: ~~trust me bro~~ Professor Excel, plus your own VBA code (it takes like 6 rows of code + a simple vbs run).
That's an interesting detail about the IFERROR use. Thanks!
Definitely underscore over space. Underscores are less likely to be interpreted as special characters, while spaces always have specific rules applied, are "invisible", often have variable length, can be destroyed by TRIM, etc.
Mudspace was the bane of my existence. Pro tip: regex doesn’t count it as a space in the vba engine!
Do you have 45k columns to go with those rows? That's not a lot of rows for XLookup to be churning through and a 10th gen i7 is very capable.
So the XLOOKUP is in a 45K row/25 column table searching through a 180K row/30 column imported query table. We're well on our way to 2 hours now.
Forgive me, what are you actually trying to do? That performance is feeling pretty pathological.
Searching for the value in the small table in the large table, and returning the corresponding value.
OK. More dumb questions from me: are you using a binary XLOOKUP? Also, how many columns are you looking up values in? Just the one or multiple?
Ahhh, OK, I gotcha now.
I don't think they're searching 45k rows - I think they have 45k rows searching
You need a database...
Wrong sub
Right sub. To complain about slow processing when you're handed an i7 processor is a bit of a stretch.

Different tools exist for different uses. Not every Excel formula is appropriate (or efficient) for the job you want it to do. Not everything is meant to be thrown into Excel.

"What's a better way to do this?" is a question worth asking more often, IMO.
OP is in the wrong sub. He is using Excel for a use case it was not designed for, and that's why he's having trouble.
Bingo
*”I’m using a spreadsheet when I should be using a database HELP!”*
I've personally found XLOOKUP to be much slower than VLOOKUP. I only use XLOOKUP if I need the additional functionality that the feature offers.
Index and match always felt like it worked the quickest for me. Vlookup only if I need to do something quick. But actual reports it’s usually index and match.
What's the reason for xlookup being slower? Any observations in which scenarios it's slower vs vlookup?
https://deployn.de/en/blog/xverweis-schneller-als-sverweis/ here is one speed test
Index match should be quicker than xlookup in this case - maybe try that?
Is it formatted as a table? 45k is a lot but not something that should take more than 3-4 minutes. However, my excel ALWAYS freezes if it's a table
It is, yes.
Right click the table. In the drop-down, click Table, then Convert to Range. It'll remove the table format but leave the data, and won't freeze your computer as much (it will still take a minute or two for 45k rows, but not 30 min+).
I would personally never not use tables.
Exact opposite, I despise them
Genuinely: why?
Same issue as the OP. With a lot of data, it slows down a ton when formatted as a table. There are also issues if you have multiple columns with the same name, as the table tries to rename them. And I can easily format the data on my own with cell borders and coloring, which lets me do more than a basic table.
Yeah, I just don't get that. Are you going ham on Named Ranges then to better manage formula complexity? If you're not using structured references, how do you manage the data sets changing size (i.e. row count)? I would think needing OFFSET would erase any performance gains you might preserve by not using a table.

As a matter of routine, I work with large raw datasets across multiple tabs (a dozen+ quite often) that usually need a lot of work to clean and then stage for outputs; I can't imagine not using tables or PQ to make writing & auditing formulas easier and to handle all of the initial cleaning & updating of reference tables with new unique values.

If your dataset is too large to drop into a table because it will cause performance issues, what could you possibly gain / need from being able to format cells more flexibly over the same exact data set? I am genuinely stumped.
If I could ask you the same: why do you need it as a table?

Usually what I'll find is that tables have very defined references for a row and cell, so what would be D564 in a sheet is @ytd actual or some other name. Where this becomes an issue is that we do a lot of model-driven output that uses formulas to do calculations off those cells. It is much easier to edit and understand these formula references when it's a cell I can easily go to, rather than having to go back into a formula with multiple table cell references and untangle the mess.

The formatting is usually only around the headers and top row of my table. It's very easy to add borders and multiple colors to that and keep the rest of the data as clean white.

But at the end of the day this is all personal preference. Do what works for you, obviously.
This is a very common scenario for me - I'm doing a lot of lookups, merging, and appending - it's typical for me to get six or seven clients' data sets and then have to consolidate them with our internal data so we can holistically track & forecast performance. Each client sends things in their own format (often from CRM / Marketing systems) - we get daily files sent to us. However, automating the ingestion has a high upkeep cost - people change their data format all the time - it's easier to have a workbook that can be rapidly updated to account for changes in the raw data and then upload the prepped data to our DW, than it is to have reporting break a couple times a month and then need to wait for ticketing and escalations to run their course. Given the complexity of the lookups, the level of cleaning that goes on, and the fact that multiple people need access either to do the update or to audit things, if we weren't using structured references with tables it would be a nightmare to maintain, let alone train people up on.
There are a few instances in which tables work really well, but I’ve had endless headaches if my use case isn’t just right.
Use the double VLOOKUP TRUE trick, it's instantaneous. https://analystcave.com/excel-vlookup-vs-index-match-vs-sql-performance/ P.S.: Even VBA or Power Query loses to it. Try and let me know :)
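For reference, the double-VLOOKUP-TRUE pattern looks something like this (ranges and column positions are illustrative; the lookup table must be sorted ascending on its first column):

```
=IF(VLOOKUP(A2, $F$2:$H$180001, 1, TRUE) = A2,
    VLOOKUP(A2, $F$2:$H$180001, 3, TRUE),
    NA())
```

Both VLOOKUPs use approximate match, which is a binary search, so each lookup is O(log n) instead of a linear scan; the first one just confirms the key actually exists before the second returns the value.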
Seeing terrible job-provided hardware next to 10th gen i7 made me cry.
Yeah, I've got an 8th gen i7 at work and while I have complaints I've never had lookups take literal hours. There must be something OP could do to either clean up the data or optimize the formula before blaming the hardware.
IT here, making do with a company provided 8th Gen i5. I would feel lucky to have the 10th Gen i7.
You can use Data Tables to perform reiterating XLOOKUPs without bloating your excel file and only using one formula.
Can you explain this more?
Most people use Data Tables for sensitivity analysis, to recalculate different outputs for DCF models etc. Instead, I use it to re-perform a single formula's logic across every row in a data set.

1. Add a "Row #" helper column with sequential numbers from 1 to 45K beside each row of data.
2. In a blank cell on the same sheet, create a trigger cell with "1" as your input.
3. Next to the trigger cell from step 2, create a single XLOOKUP formula that uses the trigger to return the correct XLOOKUP result for row #1. Test the formula by changing the trigger cell to 2...3...4...5 and make sure it works as intended.
4. In the cell beside your "Row #" header, point to the XLOOKUP formula you just created in step 3.
5. Highlight the Row # and XLOOKUP columns and create a Data Table (Data Ribbon > What-If Analysis > Data Table). Row input cell is blank; Column input cell should be your trigger cell from step 2.
6. Adjust calculation options to Automatic Except for Data Tables.

I use this for customer-level forecasting by contract going out several years, and it reduces file size bloat.
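A minimal sketch of that layout, with hypothetical cell addresses and placeholder named ranges (LookupKeys / LookupValues):

```
D2 (trigger cell):   1
E2 (single lookup):  =XLOOKUP(INDEX($A$2:$A$45001, $D$2), LookupKeys, LookupValues)

A1: Row #            B1: =E2      (result header points at the single formula)
A2: 1
A3: 2
...
A45001: 45000

Select A1:B45001 > Data > What-If Analysis > Data Table
  Row input cell:    (blank)
  Column input cell: $D$2
```

The Data Table substitutes each Row # into the trigger cell and records E2's result, so only one live XLOOKUP formula exists in the workbook.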
Need 45 minutes to wrap my head around this.
Do you use the arguments of xlookup as intended? =xlookup(A1, B1:B108000, D1:D108000,, 0) A1 is in the small table, B and D are in the big table.
45+ minutes for 45k lookups feels like something is wrong with your data (i.e. it is struggling to match anything). I've just done a `MATCH` on 167k rows into a dataset of ~325k rows (and then run a series of `INDEX`es on that), and the whole thing took, at most, a couple of minutes. I do find with `INDEX`/`MATCH` combos that if there is no `MATCH` return, the whole thing is a lot slower (but using `IFERROR` wrappers seems to help speed things up by a fair margin)
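A sketch of the single-`MATCH`-per-row pattern (sheet names and ranges are made up): one helper column does the expensive search once, and every output column reuses it.

```
M2 (one MATCH per row):
=IFERROR(MATCH(A2, Lookup!$A$2:$A$325001, 0), NA())

Each output column reuses that match instead of searching again:
N2: =IF(ISNA($M2), "", INDEX(Lookup!B$2:B$325001, $M2))
O2: =IF(ISNA($M2), "", INDEX(Lookup!C$2:C$325001, $M2))
```

With 30 output columns, this turns 30 searches per row into one search plus 30 trivially cheap `INDEX` reads.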
> and then run a series of INDEXs on that super minor point but INDEX is trivially fast. MATCH does all the heavy lifting in that pair
I know - but point being, 45k `XLOOKUP`s taking >20-30x longer than 167k `MATCH`s (when the difficult part in either case is the lookup) is surely indicative of there being an issue with the operation. It took me longer to write the series of `INDEX`es (as they were not consistent / contiguous - for good reasons) than it did for it to calculate
Is the XLOOKUP using full columns A:A or a range A1:A50000? The latter uses less CPU.
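To illustrate the difference (sheet and range addresses are made up):

```
Full columns - can scan up to ~1,048,576 rows per lookup:
=XLOOKUP(A2, Data!A:A, Data!D:D)

Bounded range (or a table column) - scans only the actual data:
=XLOOKUP(A2, Data!$A$2:$A$45001, Data!$D$2:$D$45001)
```

Multiply the wasted rows by 45k formulas and the full-column version can do orders of magnitude more work.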
No, it's formatted as a table, so just using the named ranges.
Wrong tools for the job. Power query, data model, cube functions, etc.
Underscore over spaces. Also YYYYMMDD, often at the beginning of the file name, so frequently run files sort in order. Situations are situational. And I feel your pain. Especially when working remotely: going into the office network, then the hospital that holds the data, and every connection has some level of crazy encryption to protect PHI.
You can upload your workbook to [https://rowzero.io](https://rowzero.io) - the XLOOKUPs will be instant for multimillion row data sets.
I don't think this is a hardware issue. I see this a lot with files. My rule of thumb is as follows:

- Want to use Pivot Tables? Great, Excel will work fine for 800K rows. It'll be clunky, but it'll work.
- Do you need formulas to update rows individually for over 18K rows, and on multiple columns? Yeah, you're going to collapse your machine.
Something sounds wrong. This should take a couple minutes. Anything volatile?
Yeah, something else is wrong. My work PC is an 8th-gen laptop i5 (8365U) where half the processing power is consistently bogged down running security and document control software, and my Excel (32-bit because it needs to interface with obsolete software from the 90s) churns through tens of thousands of rows of XLOOKUP in just a few seconds.
Try it with an i5 which is my lovely work computer. I have to say I don’t find PQ much faster.
Install DBeaver. Use SQLite.
Was going to suggest this, this would run in 1-2 seconds in a SQL db
Seems overkill when you can just use Power Query
Excel is not database software. Why is this not a database? You wouldn't experience this with a database, even running locally on your "terrible" CPU.
How much data are we talking here? There's something out of line if 45k lookups are taking anywhere near that long.
Is your formula using a column reference, i.e. A:A? Then it's not looking at 45k rows but rather millions.
Underscores. Also, I feel your pain my friend. We had a data set that was over 50k with roughly 5 tabs. Multiple vlookups, an ass ton of formulas in each sheet. It was just ridiculous to navigate.
Underscores definitely
Index/match instead
I would be glad if it's using 90% and not 9%, that would take 10 times longer!
PowerQuery is the way to go. It’s quicker, easier to make changes and frees up a ton of memory. The other alternative is to use INDEX-MATCH instead of XLOOKUP.
One thing I'm not seeing mentioned. Assuming one or both files are on your machine for this...does the machine have an SSD or is the machine using a disk hard drive for storage? There can often be Excel situations with a lot of storage access including situations where there theoretically should not be much. If it needs to access a hard drive for those requests it can really slow things down.
Power query or python are the answer for you.
90% of your CPU with over 45 minutes of churning through tables? That doesn't sound right. I don't think XLOOKUP should be taking a whole minute for one thousand rows.

I have an Excel file with over 300 tables, ranging from 1 column with 20 rows to 5 columns with 1250 rows. I have roughly 20 XLOOKUPs which all pull up different kinds of information when I type something in one cell, with all of those having 7 other XLOOKUPs within them. I have around 30 more XLOOKUPs which pull data from another set of tables when I fill in something related to the first cell in 4-5 other cells. All of that is repeated 10 times in my file, so I can run multiple searches at once. Even with the file searching through everything multiple times, it only takes one second to process.

I know my dataset doesn't come close to yours in terms of size, but I don't think XLOOKUP should take nearly an hour to process a request.
This is not a software or hardware limitation, this is user skillset limitation. You should NOT be doing it this way.
>I'm just complaining about my terrible job-provided hardware (10th Gen i7) Ooooh fancy gen 10 i7. I'm stuck on gen 7 i5 and 32 bit excel :'(
Use python
In your XLOOKUP, are you using the row reference with the @ character, e.g. `[@column_with_lookup_value]`? I think you would be getting a spill error if not, but I'm on mobile and can't test whether that's the case in a table.
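For what it's worth, inside a table the per-row form looks like this (table and column names are hypothetical):

```
=XLOOKUP([@ID], tblLookup[ID], tblLookup[Value], "not found")
```

`[@ID]` picks the ID on the formula's own row, so each row returns a single value instead of trying to spill an array.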
I use PQ a little bit, but I'd just do what I know how to - pull the required columns into vba and do the work there. Formulas take too long to calc.
Where are they pulling information from? I had a book that was just sheets of lookups that would take hours if it didn't freeze up completely. I changed it to VBA and it takes 45 seconds or so now. If you're reading from remote closed workbooks though, it would still take a while.
You have an i7?!? I have an i5 for this same type of work!!! Cue the “We’re the Millers” meme…
Time to upgrade to a real database.
I just started messing around with XLOOKUP the other day, so I'm still very new, but I did notice a big performance issue when trying to look up a whole column vs just the range of the data, especially when using multiple conditions.
People will go to insane lengths to avoid learning how to set up a real database
That seems like normal CPU usage for 45K rows; it'll definitely use 99% of your CPU, but it shouldn't take 45 minutes. Your data might have formulas in it instead of hard-coded values. That's the only likely explanation for a 45-minute calculation time.
Not an Excel expert but I would look at doing this with a macro (if it currently is just cell entries). Simply shutting off screen updating will save a lot of time.
Folks recommending INDEX/MATCH instead of XLOOKUP – how would INDEX/XMATCH compare?
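A sketch of the INDEX/XMATCH form (sheet and ranges are illustrative):

```
=INDEX(Lookup!$D$2:$D$180001, XMATCH(A2, Lookup!$A$2:$A$180001, 0))
```

Since the match step does nearly all the work in any of these combinations, INDEX/XMATCH should perform roughly like XLOOKUP; the practical difference from classic INDEX/MATCH is mainly defaults (MATCH defaults to approximate match, XMATCH defaults to exact).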
I don't use XLOOKUP because it's a memory killer. INDEX/MATCH is just as flexible and uses less memory.
Is your file on auto calculate?
Do Python: learn about Jupyter notebooks or use VS Code natively on your laptop. Explore pandas DataFrames. I believe that way you can slice your data a lot more efficiently. Excel does use quite a bit of CPU for a task like yours. Just a general suggestion. I am not an expert, very casual user, but I have noticed that Python does things a lot more efficiently.
Underscores for me. Anything >1000 rows and i am heading to R or python. My god, 45k rows.
oh 4 Oct is my BD, I agree too on this
DM me I'll automate it for ya.
If you have such a constraint, why not use Python to make it faster? You can use the Polars package, and you can use it with Excel as well.
Excel front-end to Python running on AWS parallel p5.48xlarge instances. ;-) https://preview.redd.it/ixr15x553swc1.png?width=836&format=png&auto=webp&s=c4e99e884e197d14832bf982fffba7ced9998c42
"terrible job-provided hardware (10th Gen i7)". (sic)
This is not an Excel or hardware problem. It's a use-case problem. Excel is not a database.
Delete all floats (round off decimals). Also, do not leave those lookup formulas in the file; once looked up, paste VALUES. Removing decimals in very large Excel files will drop file size by like 30% and speed things up. If you have several columns each doing a lookup in the background on 45k rows, it'll be slow AF.
What is the formula you're using? I know XLOOKUP... but specifically, can you paste it in exactly as it's written?
Run Power BI, it's designed for big data like this but uses way, way less CPU.
It's Excel and XLOOKUP. If you use the appropriate lookup (HLOOKUP, VLOOKUP, or INDEX/MATCH) it'll be way faster.
Very likely you have other sheets or workbooks open with a lot of formulas. Excel recalculates all formulas in a file every time you update a cell. Copy your 45k rows to another Excel file, close all other open workbooks, and work on this new file.
Pandas
Bro's FAANG company runs on an Excel file
Save the file as binary
Mine has issues with memory. Massive company that refuses to upgrade to 64 bit excel. We are stuck with 32 bit till IT gets its head out of its ass.
Sort the two tables and do binary search.
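XLOOKUP can actually do this natively via its search_mode argument (ranges are illustrative; the lookup column must be sorted ascending):

```
=XLOOKUP(A2, Sorted!$A$2:$A$180001, Sorted!$D$2:$D$180001, "missing", 0, 2)
```

search_mode 2 performs a binary search on ascending-sorted data, turning each lookup from a linear scan into O(log n).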
Underscores all the way. Also just about anything is faster than excel lookups. 45k is nothing.
I've heard that INDEX/MATCH requires the least processing power. Try using that instead and see how it improves.
Index Match > XLOOKUP, faster too!
Index match it