A customer asked, “Please tell us about the CF_SYLK and CF_DIF clipboard formats and if you have information on the specific format to use.”
Okay, let’s start with the first part of the question: What are these things?
The Data Interchange Format is a file format introduced by VisiCalc. VisiCalc was the first electronic spreadsheet for personal computers, a landmark product which expanded the appeal of personal computers into the business world. It was arguably the first “killer app”, the program so compelling that it by itself served as justification for buying a computer. In 1983, VisiCalc ended up being overshadowed by Lotus 1-2-3 and quickly faded into history.
The SYLK format is a file format introduced by Microsoft Multiplan, one of the company’s early forays into integrated productivity software suites. Multiplan didn’t last for long, but it lasted long enough for Windows to add a clipboard format for it. Somewhat ironically, the Microsoft product that replaced Multiplan was Excel, which also served as one of the “killer apps” for that new operating system called Windows.
Now the second part of the question: Which one should we use?
That’s easy: Neither.
There’s no point adding support for these clipboard formats. You’d be interoperating with programs that nobody uses any more.
I asked why the customer was interested in these data formats in particular. After all, there are thousands of data formats out there. Why are they so interested in learning about DIF and SYLK? The customer seems to be running around looking for things, finding out what they are, and then deciding whether those things will help their program.
I can understand reaching DIF when starting with the requirement to support interoperability with VisiCalc: “We need to interoperate with VisiCalc. What data format does it use? Oh, DIF, thanks.”
But starting with DIF and asking what it is used for is going about things backward. Start with product requirements, and then identify the things that will help you achieve those requirements. “Will the DIF data format help us achieve the goal of X?” is a reasonable question. But “Tell me about the DIF data format” is an open-ended question that takes a lot of time to research, and it’s pointless to make someone do all that research only to say, “Oh, then I’m not interested.”
It’s like saying, “The documentation says that if I have a document written in the Hittite language, I can tag it with CF_HITTITE. Where can I find documentation on the Hittite language?”
Why are you writing a document in Hittite? The last native speaker of Hittite died over 3000 years ago. There may be some academic scholars who can read Hittite, but if your intent is to communicate with those scholars, you probably should use a language they are more fluent in. If you’re going to send the document to, say, the Cuneiform Studies department at the University of Chicago, then you probably want to write your document in English.
What is your use case that led you to want to add DIF support? Do you have any customers who need DIF support? The last native speaker of DIF died in 1983.
One reason to support DIF and SYLK is that I want to be a good developer and pay my programmer taxes.
If I'm trying to make data available to other applications through an `IDataObject`, and I already support:
- HTML
- Text
- Unicode Text
- PNG
- Bitmap handle
If it's trivial enough for me to include other formats that might help out another application, I'd like to do that:
- CSV
- DIF
- SYLK
- FileDescriptor
I used to like DIF files. I was a bit sad when CSV files displaced them as the lowest common denominator for this sort of data interchange.
I seem to recall that the slightly weird format was so that it would work with the file IO limitations of some very primitive versions of BASIC (specifically if I recall correctly they would get upset if you were expecting a string delimited by "" and got a...
Amusingly
a) Sylk is also the brand name of a product you may not want to search for
b) within living memory (well into the 2010s though not in Office 365 ProPlus which is what I’m using) having a CSV file (with the CSV suffix) beginning ID, in the first line would give you a weird message about SYLK files.
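For anyone generating CSV who wants to dodge the problem in (b), here is a minimal Python sketch. It assumes the trigger is simply the two uppercase characters "ID" at the very start of the file, and that wrapping the first field in ordinary CSV double quotes is enough to avoid it. Both are assumptions drawn from the description above, not something verified against any particular Excel version, and the function names are hypothetical.

```python
import re


def looks_like_sylk(text: str) -> bool:
    """True if the text starts with the characters assumed to trigger
    SYLK detection (uppercase "ID" at offset 0)."""
    return text.startswith("ID")


def quote_first_field(csv_text: str) -> str:
    """Wrap the first CSV field in double quotes when it starts with "ID".

    Assumption: changing the first two characters of the file is enough
    to stop the file being treated as SYLK.
    """
    if not looks_like_sylk(csv_text):
        return csv_text
    # The first field ends at the first comma or line break.
    field = re.match(r"[^,\r\n]*", csv_text).group(0)
    quoted = '"' + field.replace('"', '""') + '"'
    return quoted + csv_text[len(field):]


if __name__ == "__main__":
    print(quote_first_field("ID,Name\n1,Bob\n"))
```

Lowercasing the header would be an alternative, but that changes the data rather than just its quoting.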
Again, not the clipboard format, but for a project at a company that I used to work for, we needed to be able to generate a file that would automatically open in Excel for the user. As I recall, CSV wasn’t quite automatic enough to keep the user happy, so we ended up creating the file in (I think) SYLK format.
Both are documented in this not-so-long list of standard clipboard formats: https://docs.microsoft.com/en-us/windows/win32/dataxchg/standard-clipboard-formats
Emphasis on “documented” and “standard”.
Both have the sparsest descriptions in that list. It looks like the customer was going through the list while implementing clipboard support and had a valid request for more information on the parts that are not well described, but worded it poorly.
What sort of additional description would have helped the customer? Should the documentation say something like “This format is obsolete”? Who’s to say that it’s obsolete? Windows itself never used the format, but some spreadsheet programs still do (probably only reluctantly).
How about something like "Data Interchange Format, a spreadsheet format introduced by Software Arts' VisiCalc"? Key words being "spreadsheet" and "VisiCalc" (probably a more recognizable name than "Software Arts").
I'd then expect developers browsing the list to have one of three reactions:
1. My program has nothing to do with spreadsheets, so I'll ignore it.
2. VisiCalc? Now, there's a name I haven't heard in years! Surely nobody uses that format anymore.
3. Well,...
Not sure about the specific customer, I'm just giving the benefit of the doubt here.
The main issue, I believe, is that a vendor/application specific format has a constant value in the documentation instead of being in an application-specific range (CF_PRIVATEFIRST...CF_PRIVATELAST ?), so it must be important or a special case, right? Adding that it's a spreadsheet could help. When implementing clipboard support you want to cover all relevant formats: I want the user to...
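To make the range question concrete, here is a small Python sketch that sorts a clipboard format number into the ranges defined in the Windows headers. The numeric values are the ones from winuser.h (CF_SYLK is 4, CF_DIF is 5, the private range is 0x0200 through 0x02FF, and registered formats fall in 0xC000 through 0xFFFF). The `classify_format` helper and the deliberately partial `STANDARD` table are hypothetical names made up for illustration.

```python
# Values as defined in winuser.h.
CF_TEXT = 1
CF_SYLK = 4
CF_DIF = 5
CF_RIFF = 11
CF_WAVE = 12
CF_UNICODETEXT = 13
CF_PRIVATEFIRST = 0x0200
CF_PRIVATELAST = 0x02FF
CF_GDIOBJFIRST = 0x0300
CF_GDIOBJLAST = 0x03FF

# Partial table for illustration only; the real list is longer.
STANDARD = {
    CF_TEXT: "CF_TEXT",
    CF_SYLK: "CF_SYLK",
    CF_DIF: "CF_DIF",
    CF_RIFF: "CF_RIFF",
    CF_WAVE: "CF_WAVE",
    CF_UNICODETEXT: "CF_UNICODETEXT",
}


def classify_format(fmt: int) -> str:
    """Say which numeric range a clipboard format ID falls in."""
    if fmt in STANDARD:
        return "standard"
    if CF_PRIVATEFIRST <= fmt <= CF_PRIVATELAST:
        return "private"
    if CF_GDIOBJFIRST <= fmt <= CF_GDIOBJLAST:
        return "gdi-object"
    if 0xC000 <= fmt <= 0xFFFF:
        return "registered"  # range used by RegisterClipboardFormat
    return "other"


if __name__ == "__main__":
    for fmt in (CF_DIF, 0x0250, 0xC123):
        print(hex(fmt), classify_format(fmt))
```

The point of the comment stands out in the output: CF_SYLK and CF_DIF land in "standard" alongside CF_TEXT, even though they belong to specific applications.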
Trawling the documentation for clipboard data formats to support seems like a bad idea. The clipboard formats to support should follow from the purpose of the app. If it's to work with audio files, then you'd probably want to support CF_WAVE, maybe CF_RIFF, and anything you want to pass along in the CF_PRIVATE range. If support for the DIF and SYLK format didn't arise from the development of the program (or from it's...
Doing a bit of searching generally solves the problem for me: Wikipedia has a fairly comprehensive article on both DIF and SYLK where it’s pretty clear what the formats are. There are cases where the search results are ambiguous or non-existent, but I’d probably check on Stack Exchange or similar before spending one of my limited Microsoft support incidents.
Or maybe it’s an issue of poor naming and poor documentation rather than silly customers doing things backwards.
“Data Interchange Format” and “Symbolic Link format” sound pretty useful when it comes to clipboard operations. I don’t know that I’d have identified them as decades dead spreadsheet formats either…
In order to leave this comment I had to answer a disturbing question as “Yes”
That said,
https://outflank.nl/blog/2019/10/30/abusing-the-sylk-file-format/
In less malicious news, I also heard about DIF being used to this day as a trivially generable format, still readable by pretty much every spreadsheet software out there.
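As a rough illustration of "trivially generable", here is a Python sketch of a DIF writer based on the commonly described layout: a header of TABLE, VECTORS, TUPLES and DATA items, then each row introduced by BOT and the whole thing closed by EOT. Implementations are reported to disagree about which of VECTORS and TUPLES holds the row count; this sketch puts the column count under VECTORS and the row count under TUPLES, which is an assumption, as is escaping a double quote by doubling it. The `to_dif` name is hypothetical.

```python
def to_dif(rows):
    """Serialize rows (lists of str, int or float) to DIF text.

    Sketch only: VECTORS = columns and TUPLES = rows is an assumption,
    and some programs are said to expect the two swapped.
    """
    n_rows = len(rows)
    n_cols = max((len(r) for r in rows), default=0)
    out = [
        "TABLE", "0,1", '""',
        "VECTORS", f"0,{n_cols}", '""',
        "TUPLES", f"0,{n_rows}", '""',
        "DATA", "0,0", '""',
    ]
    for row in rows:
        out += ["-1,0", "BOT"]  # beginning of tuple
        # Pad short rows so every tuple has the same number of cells.
        cells = list(row) + [""] * (n_cols - len(row))
        for cell in cells:
            is_number = isinstance(cell, (int, float)) and not isinstance(cell, bool)
            if is_number:
                out += [f"0,{cell}", "V"]  # numeric value
            else:
                text = str(cell).replace('"', '""')  # assumed escaping
                out += ["1,0", f'"{text}"']  # string value
    out += ["-1,0", "EOT"]  # end of data
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(to_dif([["Item", "Qty"], ["Widget", 3]]))
```

Compared with CSV there is no delimiter guessing, since every value sits on its own pair of lines with an explicit type marker.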
Great post! Thanks for sharing it.
I suppose it’s possible the client wondered what their program ought to do if it was asked to paste from the clipboard and encountered data in one of those formats.
If the program encounters data in these formats and neither it nor the programmers know anything about these formats, then the program should do nothing. That's what I would expect of well-written programs when they encounter data in a format that they don't understand. Better that than copying/interpreting/parsing/mangling data in an unknown format. Things go quickly from bad to worse once you start down that road.
On the other hand, if the program...
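The "do nothing with formats you don't understand" rule is easy to sketch in Python. The idea is the same one behind the Win32 GetPriorityClipboardFormat function: walk your own preference list and take the first format that is actually on offer. The `pick_paste_format` helper is a hypothetical name, and the lists passed to it are made-up examples; only the CF_ values come from winuser.h.

```python
# Values as defined in winuser.h.
CF_TEXT = 1
CF_SYLK = 4
CF_DIF = 5
CF_UNICODETEXT = 13


def pick_paste_format(available, understood):
    """Return the first format from `understood` (our preference order)
    that is present in `available`, or None if there is no overlap."""
    offered = set(available)
    for fmt in understood:
        if fmt in offered:
            return fmt
    return None  # nothing we understand: paste nothing


if __name__ == "__main__":
    ours = [CF_UNICODETEXT, CF_TEXT]
    # Only spreadsheet formats on the clipboard: we leave them alone.
    print(pick_paste_format([CF_SYLK, CF_DIF], ours))
    # Text is offered alongside DIF: we take the text and ignore the rest.
    print(pick_paste_format([CF_DIF, CF_TEXT, CF_UNICODETEXT], ours))
```

A program written this way never needs to know what CF_SYLK or CF_DIF contain, because formats outside its own list are simply never selected.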