It's only Monday and I'm already done with this week. I watched a work issue turn increasingly acrimonious over either miscommunication, misunderstanding of procedure, or being too wed to a tool to admit it's problematic. I drafted about half a sentence in chat, then stopped because I didn't think I was adding anything necessary and didn't want to make the conflict any worse.

I'm increasingly convinced that Excel spreadsheet files should self-destruct or become permanently read-only after 90 days. That's plenty of time to migrate data into a proper database or to migrate calculations into a programming language amenable to auditing, verification, and change control. Excel should be treated as an attractive hazard with a time limit on how long an unverifiable and hostile-to-revision-control worksheet should be suffered to live.
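For a sense of what "migrating calculations" could look like, here is a minimal sketch in Python. The thin-wall hoop stress formula (sigma = p * r / t) is standard, but the function name, the input checks, and the numbers are illustrative assumptions, not taken from any real worksheet.

```python
import math


def hoop_stress(pressure_pa: float, radius_m: float, thickness_m: float) -> float:
    """Thin-wall hoop stress in pascals: sigma = p * r / t.

    Hypothetical example calculation; units are stated in the argument
    names so a reviewer can check them at a glance.
    """
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    if thickness_m <= 0:
        raise ValueError("wall thickness must be positive")
    return pressure_pa * radius_m / thickness_m


def check_hoop_stress() -> None:
    """Compare the function against a hand calculation."""
    # Hand calculation: 2.0e6 Pa * 0.5 m / 0.01 m = 1.0e8 Pa
    assert math.isclose(hoop_stress(2.0e6, 0.5, 0.01), 1.0e8)


if __name__ == "__main__":
    check_hoop_stress()
    print("hoop stress check passed")
```

The formula, its input checks, and its verification case all live in one plain-text file that can be reviewed line by line and put under revision control, which is exactly what a worksheet cell can't offer.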

Sadly, these two thoughts are related. Spreadsheets in their current form are too dangerous to be used for engineering work.

If you are questioned by a regulator about how your spreadsheet of engineering calculations/data complies with QA requirements on engineering software, I cannot help you. I'd love to be able to suggest a solution but there simply isn't one beyond migrating to a more appropriate, safer tool. And this isn't just about Excel; it applies to every calculational worksheet system (MathCAD, etc.) with similar characteristics of being effectively unauditable and unverifiable with no easy way to check versions or compare revisions. Spreadsheets are software.
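By contrast, comparing revisions of a plain-text calculation is nearly free. A minimal sketch using Python's standard `difflib` module; the two "revisions" and the `diff_revisions` helper are made up for illustration.

```python
import difflib

# Two hypothetical revisions of the same calculation script.
REV_A = """\
SAFETY_FACTOR = 1.5
allowable = yield_strength / SAFETY_FACTOR
"""

REV_B = """\
SAFETY_FACTOR = 2.0
allowable = yield_strength / SAFETY_FACTOR
"""


def diff_revisions(old: str, new: str) -> list[str]:
    """Return a unified diff of two text revisions, one entry per line."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="rev_a",
            tofile="rev_b",
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in diff_revisions(REV_A, REV_B):
        print(line)
```

The changed constant shows up as one removed line and one added line; anyone reviewing the change sees exactly what moved and nothing else.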

I will never forgive Space Karen for wrecking Twitter. That bastard broke a massive number of links I had to varied and sundry experts and professionals. One of the people I most need to reconnect with is Martin Nyx Brain, who I believe was a researcher into provable(ish) computing. We had developed an intricate and very niche inside joke about a mythical startup we formed to make Excel verifiable, complete with software products and consulting services. It went under the name of #JägerXL. It was a fine joke and about 40% serious, which put it close to that gray zone of humor where nobody is quite sure if people are joking, including the participants.

People die because of bad spreadsheets. We've known about Excel's flaws for decades: automatic type conversion, silent limits on worksheet size, silent precision loss, the intractability of review and change detection/control. In every other programming environment, cut-and-paste code is considered unmaintainable and bad practice. Microsoft knows and they do nothing. Nobody can innovate in this space - it's business suicide to take on the 800-lb gorilla. With publish-or-perish being the law of academia, no progress can be made there - there's no prestige in tool development, it's not sufficiently novel, and novelty is what gets published and funded, not utility and need. Where does safer software come from? Nowhere. There's no way to create it outside a place like Microsoft and those organizations haven't shown an interest in it for at least the past three decades - why would anyone expect that to change now?

@arclight On the bright side, I've been seeing more and more students coming into the lab and using and documenting work in R from day 0, instead of using Excel. How did this happen? Well, we taught them during their undergraduate years.
We're far from an ideal situation, but in a better place than we were 5 or 10 years ago.

@nicolaromano It's difficult to deploy R programs in a controlled production environment but as an interactive calculational worksheet for personal use, it's head and shoulders above Excel. It's straightforward to review and has very good documentation tools. I've traditionally used Gnuplot for making graphs but I'm considering moving that to R because R is more available internally. Having basic plotting scripts to share would help displace Excel for plotting (a major use and also something Excel has always been mediocre at). Python is popular here but I wouldn't wish matplotlib on anyone.

@arclight Sure, it all depends on use cases, and I agree that R can be tricky in production (but so is Python!), although there are some solutions depending on what you need to do. Still, no matter the tool, I think the main point is to teach concepts like reproducibility, traceability, version control, and well, professionalism in general! :)

@nicolaromano Exactly! The specific language is less important than the ability to write dependable and understandable code. R wasn't developed as a general purpose programming language and it's not fair to judge it that way (Python on the other hand...). Even then, the R community adapted the language and practice to better support making standalone applications, to make documentation and testing easier, etc. There's at least movement in a good direction. Excel just keeps adding less and less useful features without addressing any of its core shortcomings. Excel doesn't improve, it just gets bigger and its UI keeps churning.
