Recently I started building Excel addons with custom formulas (e.g. =company.inventory(upc, WAREHOUSE), which call an api. This has blown the minds of non-tech folks among my clients. They equate this to magic.
I was forced into doing this because after a year of digging to find out what reporting they wanted in the dashboard (“oh a thousand things… where’s the Excel spreadsheet export button?”), I gave up and now default to giving them all reporting via Excel, using these custom formulas.
This is probably the second-most-proud-of thing I worked on in my career, based on an off-hand comment from one of our salespeople: "Why can't our customers use Excel to view their data?" Another developer and I wrote the first version in a couple of weeks - an Excel VBA add-in that used IBM SNA APPC to communicate with their AS/400 (I later upgraded it to use TCP/IP sockets).
Because the salespeople didn't know how to use or sell it yet, I got to be the one to travel to our first customer and train them on it. After showing their CFO how to add our formulas to a couple of cells, make a chart from it, and then hit Shift+F9 to refresh the formulas and get current data from their AS/400, he pushed me away from the keyboard and quickly entered several more formulas to bring in additional data. And then told me I had just saved him three day's work every month when he created charts for his reports to the board.
I eventually left the company, and they got bought a couple of times by ever-bigger software companies. But the product is still out there and is very versatile, having been extended to talk to many different ERP & accounting systems. It even still has the same product name: Spreadsheet Server.
Isn’t this exactly what Microsoft access was invented for? I’ve seen technical but non-cs people built magical things in forms by (or views). And when access got the ability to talk to MSSQL it really blew up. The quintessential low-code environment. I wonder why it never caught on.
I was hired a couple decades ago to replace a Microsoft Access system that was written by a non-software engineer and was running an entire company with around 50 users with all sorts of complex forms.
It got a little too big and I rewrote it as a C# desktop application talking to a SQL Server, but it was impressive what this person was able to create with no real programming knowledge and just Microsoft Access.
This is exactly what I meant. For real world situations and when you don’t have SWE “to do a proper” job you can get very far fast with just access. Especially if you hook it up to a running db instead of using the internal one.
Which gains did you or the users get when rewriting to C#? What were the downsides?
> Which gains did you or the users get when rewriting to C#? What were the downsides?
The gains were performance. The MS Access application was not backed by a proper database, I forget the exact setup but it was more of just a shared MDB file over the network? Whatever Access was capable of at the time. There were contention/locking issues and all the other sorts of problems you'd expect given the setup. My memory of it is a bit hazy, but I know it was lacking a true database outside of its own MDB format.
Funny story, after the rewrite the software was so succesful internally that we decided to start selling it, and it became one of the industry leaders. The person who wrote it originally was highly knowledgable in their field, and I happened to have decent enough programming knowledge. The two put together ended up with something special.
Microsoft essentially abandoned it. They hoped that a bunch of random Azure services would be its replacement, but they're too scattered and incomplete compared to Access.
It's not abandoned. Access still has an engineering team working on it, and that team still goes to conferences to talk with their users. Source: I work in the same org as them, and ate lunch daily with a few of them before the pandemic hit.
My theory is that InfoPath killed Access. People wanted to use the "easy interface for Nancy in Marketing to submit a purchasing request" and got bogged down with XPath and SharePoint/ActiveDirectory fuckery, and weren't quite to the level where they could grasp MSSQL (or IT had disabled this or that, forcing the one ambitious "computer guy" in Finance to cludge together something with the tools available). Using Access would have been ideal but it was a time when MS was pushing things like InfoPath and Sway and the like, and simultaneously there were issues with concurrency and "omg macro virus" cybersecurity focus. Access got better but it was already tainted.
Access is a bunch less flexible than excel. You need to plan how to query data more carefully and as soon as you want to do something weird, you basically need to dump your query result into excel anyway.
What is your language of choice for building custom formula?
Python + xlwings: most successful (relatively speaking) to build custom finance functions in my line of work. Downside is the slowness and deploying on other computers.
Rust + xladd: really enjoyed this but feels immature still. Better performance than python and easier to distribute as single dll.
VBA: options above make this almost obsolete, however can’t beat being embedded.
For the "task pane", if you need one, you can use React or Angular, but I prefer to use Vue (or no front-end framework is fine too).
For the backend logic, just straight up Node + webpack. You could probably do this better with Vite, but I think Microsoft's starter templates all use webpack.
You end up distributing a manifest file that tells Excel which server your assets and API live at.
Warning: hot reload and dev debugging is quite a pain on macOS.
Yep, we have a expensive reporting tool, can too all kinds of things, query every type of database we have...
Number one function used. Export to Excel, and normally it is on reports of basic table join queries, none of the advanced things it can do with the data
This is because businesspeople like to fiddle with formulas, graphs, and filters untill they isolate (i.e. cherry-pick) the subset of data that [makes them look good | supports their agendas]. They don't want canned reports. Excel makes this easy because they already know how to do those things in Excel.
People get really mad when you call this p-hacking, btw. I used to try and make this point to ”analysts” that I worked with, that science involves coming up with a hypothesis and then looking at the data. The response I got often enough that I’m sure it’s being taught somewhere was that you “need to let the data tell it’s own story”.
Turns out lots of people have decided what the conclusion is and are taking that conclusion and the data, and then blindly munging until they get a path which connects them.
I think a lot of times unless you are operating at hyperscale and bps matter, most business data analysis is more art than science. Ex, if you have < 200 sales in a month and you want to report this, it's totally normal to want to pull it into excel and fiddle with the numbers, make ad-hoc corrections, divide those 200 sales up in very subjective ways, etc. The smaller the company the more the reporting layer is only a weak approximation of reality, there isn't enough there to create a narrative around. "Hypothesis testing" such data is honestly a waste of time.
Of course you want to be intellectually honest about it, and I agree that cherry-picking the "right" data can be a big problem in the wrong kind of organization. I remember one time a Jr analyst I managed was asked by the CEO to create a certain chart. I was not looped in. The CEO then used that chart to convince himself and many others on the executive team to make a huge product change that ended up being a complete disaster.
If you have data that's so 'dirty' that you can't decide on the filtering rules in advance (or based on only historic data), then what you have is garbage, not data. Therefore, we could call the art of shaping this into meaningful stories garbage science.
Tell me in a comment you have never worked with business data in your life.
Business data is full of minor inconsistencies which are not obvious until you sit in front of it. Products are sold by different units. Reporting ranges and aggregates are slightly different. Subsidiaries use categories which are close but not exactly identical.
There is generally plenty of massaging to do before you can get the information you need.
As one of those business people, I find I’m not very good at requesting reports. There is a gap between what I think is interesting and what actually is interesting. I use spreadsheets (and sometimes visualization tools) to help bridge the gap and make a better on going request to our reporting team.
Often, this is learning for me. I have a bunch of Stephen Few’s books and use exported CSV files to figure out which reports are useful to me and my org. When I find them, I do make a request for standardized reporting. These often become the basis for regular review meetings. In those meetings, we still come up with instances where we need to export to CSV and get into the data to understand what we are looking at.
Our work is going through a big shift this year that means our historic data is not helpful in predicting trends. That’s increased the need for this kind of engagement with the underlying data.
It is sometimes more possible to determine if your analysis is wrong in excel because you can look at the numbers and the numbers have meanings. It can be harder to debug things if you can’t see intermediate computations. I guess I would say that excel prevents some kinds of bad problems with faulty analyses but opens you up to other problems like overfitting or the cherry-picking you describe. But it also seems impossible to produce a reporting tool that both gives useful analyses and doesn’t do incorrect analyses.
Excel also makes it easier to experiment. Pandas requires you to change an entire column which can frequently be processor-intensive, take forever, and be hard to understand. With Excel, you can just take a random cell, try something out, and if it comes up with nonsense, undo it
Is it fast? For instance there is a large overhead to using VBA UDF, whereas XLL (and ExcelDNA) is pretty fast (matters if the formula is repeated over and over on a spreadsheet).
=IFERROR(IF(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(SEARCH("Banner",AC5),SEARCH("EBL2",AC5)),SEARCH("Movie Art",AC5)),SEARCH("Use as is",AC5)),SEARCH("TTT",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1)<IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(SEARCH("Banner",AC5),SEARCH("EBL2",AC5)),SEARCH("Movie Art",AC5)),SEARCH("Use as is",AC5)),SEARCH("TTT",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1),MID(AC5,IFERROR(IFERROR(IFERROR(SEARCH("Customs",AC5),SEARCH("Custom",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1),IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(SEARCH("Banner",AC5),SEARCH("EBL2",AC5)),SEARCH("Movie Art",AC5)),SEARCH("Use as is",AC5)),SEARCH("TTT",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1)-IFERROR(IFERROR(IFERROR(SEARCH("Customs",AC5),SEARCH("Custom",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1)),MID(AC5,IFERROR(IFERROR(IFERROR(SEARCH("Customs",AC5),SEARCH("Custom",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1),IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(SEARCH("Banner",AC5),SEARCH("EBL2",AC5)),SEARCH("Movie Art",AC5)),SEARCH("Use as is",AC5)),SEARCH("TTT",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1)-IFERROR(IFERROR(IFERROR(SEARCH("Customs",AC5),SEARCH("Custom",AC5)),SEARCH("Generic",AC5)),LEN(AC5)+1))),"")
For fun, I took this formula and asked ChatGPT to convert it to a VBA Function (with comments):
Function extractText(cell As Range) As String
' Declare an array of search terms
Dim searchTerms As Variant
searchTerms = Array("Banner", "EBL2", "Movie Art", "Use as is", "TTT", "Generic")
' Initialize start and end positions to 0
Dim startPos As Long
startPos = 0
Dim endPos As Long
endPos = 0
' Loop through search terms and find the first occurrence of any of them in the cell
For Each searchTerm In searchTerms
startPos = WorksheetFunction.Search(searchTerm, cell)
If startPos > 0 Then
Exit For
End If
Next searchTerm
' If none of the search terms were found, try finding "Customs" or "Custom"
If startPos = 0 Then
startPos = WorksheetFunction.Search("Customs", cell)
If startPos = 0 Then
startPos = WorksheetFunction.Search("Custom", cell)
End If
End If
' If none of the above were found, try finding "Generic"
If startPos = 0 Then
startPos = WorksheetFunction.Search("Generic", cell)
End If
' If none of the search terms were found, return an empty string
If startPos = 0 Then
extractText = ""
Else
' Loop through search terms and find the next occurrence of any of them after the start position
For Each searchTerm In searchTerms
endPos = WorksheetFunction.Search(searchTerm, cell, startPos + 1)
If endPos > 0 Then
Exit For
End If
Next searchTerm
' If none of the search terms were found after the start position, set the end position to the end of the cell
If endPos = 0 Then
endPos = Len(cell) + 1
End If
' Extract the text between the start and end positions
extractText = Mid(cell, startPos, endPos - startPos)
End If
End Function
Edit: Sadly this doesn't work at all lol, and after half an hour of prompting ChatGPT can't figure out why, it just gets stuck in a loop :-(
That is one of the big issues with AIs like ChatGPT though. It gives you a plausible sounding answer, but that answer might be total BS. It doesn't tell you what degree of confidence it has in the answer. Maybe the formulas it gives are right 100%, 99% or 0% of the time.
Yes, and I think that in order for helper AIs like ChatGPT to really break through, people (especially in marketing) will have to learn to embrace confidence intervals, or at least some quantitative concept of confidence. The AI already knows how confident it is in its prediction, it's just a UI question. If plain numbers are too nerdy or Spock-y, at least tell me something like "I'm fairly sure" or "I'm not sure, but..."
I want a pop-out formula editor window, and a menu along the side of the formula editor that lets me search for functions by name, and drag/drop them into the formula editor window.
And [Home] without jumping to select column A, and the arrow keys should work consistently instead of sometimes moving the cursor and sometimes inserting a cell selection.
@airstrike gave me the following very helpful clue when I complained about this a couple years ago: "Just hit F2 while editing a formula to toggle between Edit and Enter modes, one of which will behave as you expect. The other mode, which you hate, is very useful when you want to add references to other cells into your formula"*
There is a designation in the lower left of the window to show whether you're in Edit or Enter mode.
Ah, I usually begin editing a pre-existing cell by tapping F2, but sometimes when I start entering a formula from scratch I just hit the "equals" sign. It feels like I'm editing a line of text but I'm actually in "enter" mode.
Also, the little designation in the lower left has been there, roughly 30 inches from my eyeballs, for hundreds or possibly thousands of hours. It's changed state thousands if not millions of times. How have I only just now seen it?
Haha. I'm in the same boat. When writing my comment, I opened Excel, clicked on a cell, and tapped F2 repeatedly, just to see if there was anything on the screen that changed...
THANK YOU. This doesn't fix my original gripe, but it solves another one I've had. When doing conditional formatting rules, I thought I was stuck with a "dangerous" arrow key. F2 works in those boxes as well to let me position the cursor instead of clobbering my formatting rule with cell references.
Not consistent with any program I know onWindows, Linux, macOS, or even ye olde Macintosh System Software, unless you count spreadsheet programs that are trying to be more like Excel.
Not even consistent inside Excel, because there are many edit fields that default to "evil mode" (my personal feeling about "insert cell references when arrows keys are pressed, and disable Undo", while some default to "normal text editing mode", and some cannot be placed in "evil mode".
I'd like a visual indicator on or adjacent to the text box, and setting to force it to default to one mode or the other.
Yeah, I do the same, but I wish I didn't have to. It's opposite of how I write text in all other contexts, and is really annoying when I forget to hold down [Alt] and get a modal admonishing me for having written a shit formula. Then I have to dismiss that modal, click on the cell again, and get back to where I was.
Even worse, I spend a good chunk of time in Power BI, and it has a similar formula field (for DAX expressions), that mimics Excel a bit, but there you use [Shift] to insert newlines. So I'm always using the wrong modifier key and spewing insults at my computer.
Right they explicitly mentioned it in the comment you replied to. They're asking for a regular normal human multiline text entry box that doesn't require a special mode where all of the keys mean different things than they're used to in code editors.
> Please let me use [Tab] and [Enter] in the formula bar without trying to commit the change
Exactly, there should be a contextual difference between editing in the cell directly or via the formula bar, when in the formula bar tab and enter should insert a tab or new line. Comment/Ctrl Enter (committing the change) or Escape (reverting the change) should be the only way to exit the formula bar via the keyboard.
You can insert a tab character by pasting it. I think you could use AutoHotKey to detect when you're in the formula bar (the class of the active control is "EXCEL<1"), and then have it map Enter to Alt+Enter and Tab to pasting a tab character.
Oh, right, I missed that. I generally prefer spaces, but tab should be possible. It might be worth a try to ControlSend a tab directly to the control; that could conceivably work without having to clobber the clipboard.
If you have multiple lines of formula, you should probably break up the formula across a few cells. This is similar to breaking up a long script into modules... each cell should have 1 purpose. Allows better testing of the formulas too.
I've used both VBA functions as well as the new LAMBDA() function, and they have their place. But there are legitimate reasons I don't want "magic" cells in my worksheet that exist only to be referred to by other cells. That kind of indirection comes with its own headaches. I try to make each cell useful for someone looking at that cell. Sometimes it makes sense to show the user the intermediate calculations. That's kind of the fundamental reason for spreadsheets—the paper kind!—in the first place. But I don't think it's right to spread a specific calculation out over many cells solely to avoid a complex function. Keeping it in the single cell—and using line breaks and indentation to make it more readable—is easier for me to maintain later, rather than bouncing around different locations in the sheet trying to reason about a given formula.
Here's a real example where I'm listing the unique items from a data table that meet user-supplied threshold criteria:
Those three filter criteria are booleans that are multiplied together. (Huh, should I have used AND() instead?) If all three are true, then the resulting list is UNIQUE'd and shown on the report page.
[Ctrl]+[Enter] would be awesome. Or just a double enter would work. Or clicking on the sheet somewhere.
I don't want Excel to change the default behavior. The way it works now is the right way for most people. I just want to be able to enter a mode where I get to freely edit the formula as if it were in a text editor, then exit that mode when I'm satisfied. Whether that is some checkbox option buried in the settings ("Options > Formulas > Working with formulas"), or an F-key, I don't care.
CTR+ENTER is already taken. Select a range of cells, press F2, enter your formula, CTR+ENTER applies and fills that formula to the whole range (very useful).
Oof. I actually use that frequently as well. Okay, okay, How about this? My wished-for option would just be to swap the behavior of [Enter] and [Alt]+[Enter].
Normally the first one commits the formula, the second one inserts a newline. I want to reverse that and make a naked [Enter] insert the newline, and the [Alt]+[Enter] commit the formula.
I think shift+enter for new line would be a better choice if you were to highjack an existing shortcut. Probably barely used by anyone and it would be consistent with a line break shortcut in word/powerpoint and elsewhere.
Committing a formula needs to be a simple shortcut because it has to be used by the least technical users. You don't want to create a "how to exit VIM" mess in a retail product.
You're right, I don't want to change defaults. In fact, this whole back and forth reminded me that AutoHotKey exists, and I'm actually running it already, and I have the power to make this change myself. So I did!
#if IsExcelFormulaBox() ; Whenever the formula edit box has focus
Tab::Send {Space}{Space}{Space}{Space} ; insert four spaces when I hit [Tab]
$!Enter::Send {Enter} ; commit the formula with [Alt]+[Enter]
$Enter::Send !{Enter} ; insert a newline with bare [Enter]
#if
does what I want, with the helper function:
IsExcelFormulaBox() {
ControlGetFocus, F, A
return (F="EXCEL<1")
}
If I had to ask for one killer new feature in Excel, it would be to adopt Apple's numbers approach to the design of a spreadsheet. I.e. not one grid per tab, but one canvas per tab, and that canvas can contain multiple elements which can be grids, charts, pivot tables, something else.
Your grids can overflow with a scroll bar, so I can put one table above another one without them colliding when the top one expands.
You can do that in a backward compatible way, if a canvas is not defined on an old spreadsheet, just assume one canvas that contains one grid set to full screen.
It helps presentation, it helps splitting the logic of your spreadsheet in discrete components, I only see upside.
As noted, Google Sheets introduced this over a year ago [1].
But this is exactly what competition is about -- I love seeing this come to Excel precisely as an answer to Google's version. You have to wonder if Microsoft would have tried it otherwise, since Excel is so entrenched there's less profit motivation for innovating.
Sometimes it feels like "office" software hasn't changed much since the 90's, but when you look at cloud, collaboration, and machine learning, it's still constantly reinventing itself even if the interface still looks largely the same.
Quick shout out to Ellx[0] (I presume, Excel pronounced backwards), which I discovered on HN a couple of years ago when discussing where I'd like to see spreadsheets evolve. It's basically functional reactive spreadsheets in JavaScript, with a storage format that's friendly toward code review, source control, diffs, etc. It automatically performs function lifting, so functions over single values become time-varying functions when applied to time series inputs.
It's pretty impressive, especially since my impression is that it's mostly one person's hobby project.
Creator or Ellx here. The reason is quite prosaic: I haven't maintained the site for almost a year :/ For various reasons. However, I'm getting back at it now. Ellx as a framework has evolved during this period, and is capable of much more than just a spreadsheet now, but this progress hasn't made it to ellx.io just yet.
May I ask what you liked most about Ellx? What is your use case?
My use case is a tool that produces living documents mixing writing with charts, data, and tabular results that can be delivered to nontechnical clients and updated and modified by staff with excel knowledge.
I’ve found various JS notebooks like starboard and observable JS to be the closest thing, but they’re really not there. PowerBI is a Microsoft solution but it’s dashboard focused.
This sounds pretty much indeed like something I made Ellx for. When you say the notebooks like Observable are "really not there", what exactly is the show stopper for you?
Btw, please ping me on Ellx Discord: let's talk!
One thing is that it's possible that Ellx uses exceptions to detect time series and lift regular functions into time series functions. I'm not sure the implementation details. There are all kinds of ways exceptions can be abused to implement near-magic. (I believe there's a library that uses exceptions in OCaml to implement coroutines.) I wouldn't be surprised if Ellx used some try-catch near-magic that doesn't quite work on Safari.
Sounds like a frustrating waste of time. Autocorrect/suggest is already one of least useful aspects of Excel. I would much rather they focus on improving the UI of writing formulas - perhaps a dedicated sidebar that let you blow up heavily nested equations?
Excel is a generic spreadsheet program that's designed to be used by 90% of people for 90% of use cases. Accommodating for this niche use case will definitely break things for many existing users. The real solution is for geneticists (and other data scientists) to use software designed more specifically for their use cases, such as CSV files and pandas or something more GUI-based but meant for researchers.
CSV files are unsuitable for any serious use and especially not a ‘better replacement for Excel files’. It’s an ill defined, locale dependent format of fail.
Sounds great, but I hope they bring this feature to the local versions of Excel too. Using feature exclusivity to lure people into digital serfdom isn't cool.
My #1 wish for Excel is named aliases in a formula so that I don't end up with repetitive incomprehensible formulas that are impossible to follow because I'm using the same VLOOKUP in three different places for a calculation.
My #2 wish is a way for it to automatically break down and indent nested function calls for readability. The color coded brackets help, but only a little.
One aspect that Excel almost seems to enforce for any significantly complex computation is... testing.
If you don't known have test input (e.g. an array of values) for which you have known output to observe for such an opaque programming language, you will be bitten! This innovation just makes it even more important.
That said Excel lets non-programmers create visual representations of data on a regular basis, and everyone has it on their desktop so there's no, "I can't use this" excuse.
I tend to deal with that by using intermediate calculations. I can verify subsets of the formula and then use the results of those subsets in the next stage of the overall formula
One of the best ways Microsoft has made Excel formulas better is by allowing line breaks (using alt-enter) and tab/spaces alignment. So instead of seeing something like this:
For anyone who writes Excel formulas, it is one of the use cases for ChatGPT (a currently free chatbot that appears highly intelligent in some contexts) that users rave the most about. The next time you're writing an excel formula, give chatgpt a try. It might just get it right the first time.
Or it might get it wrong, but not obviously so. Who's job is it to debug/prove the ChatGPT-authored formulas that business people will certainly start using (if they haven't already). I have enough trouble debugging code I write myself. Debugging other people's code is much harder.
When a company makes a multi-million dollar error because an analyst used ChatGPT formulas in Excel, I expect we'll get the answer.
Why do you think it isn’t due to the programming by example work that MSR have been doing for over 5 years and which already partly made it into excel as flash fill?
You're right, "almost certainly" is far too strong. I was unaware of that research, and read a lot into that this is only being released in the web version. And am just generally anticipating a wave of LLM applications to eat everything that could be even remotely, abstractly reframed as an "autocomplete" task. That it only works in English is also evidence against, with GPT-3 you'd get every language it knows pretty much for free.
Is writing a headline as if it were promotional marketing copy just a reflex? I don't think that The Verge is owned by Microsoft (I could be wrong), but otherwise I don't get the tone.
edit: maybe that tone is necessary to convince people that this is a story at all.
Advertorial content is essentially a press release written by the business and passed off to friendly, connected journalists. Sometimes they only cut it for length and don't even change the headline.
https://superintendent.app (paid with free trial) enables you to load a bunch of CSVs and write SQL on those CSV files.
It's a much faster to work with if you know SQL well. It can also handle millions of rows easily (e.g. Loading 1GB CSV file takes 10s on Macbook Pro). Excel can't load a CSV larger than 1M rows.
I initially built it because I had to identify the mismatched transactions between 2 giant CSVs using. Using "full outer join" with Superintendent.app took only a minute to do.
You are eventually gonna have to write comments about other things than just plugging your own projects, especially when they are just barely on topic.
I was forced into doing this because after a year of digging to find out what reporting they wanted in the dashboard (“oh a thousand things… where’s the Excel spreadsheet export button?”), I gave up and now default to giving them all reporting via Excel, using these custom formulas.