What happened if you tried to access a network file bigger than 2GB from MS-DOS?
One of my friends is into retrocomputing, and he wondered what happened on MS-DOS if you asked it to access a file on a network share that was bigger than what FAT16 could express.
My friend was under the mistaken impression that when MS-DOS accessed a network resource, it was the sector access that was remoted. Under this model, MS-DOS would still open the boot sector, look for the FAT, parse it, then calculate where the directories were, read them directly from the network hard drive, and write raw data directly to the network hard drive.
This is not how it works.
For one thing, if it worked like that, then two clients accessing the same network hard drive would corrupt each other's data. Each one would have its own locally cached copy of the FAT, and when it came time to allocate a new cluster, each one would pick a cluster (probably the same one) and assign that cluster to the new data.
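To make the corruption scenario concrete, here is a toy simulation of two clients that each cache the FAT and allocate from their private copy. Everything in it (the eight-entry table, the marker values, the `first_free_cluster` helper) is made up for illustration and is not how any real FAT driver is written.

```python
# Toy model: two clients share one disk but each trusts its own cached FAT.
# The table size and marker values are illustrative assumptions.
FREE = 0
RESERVED = 0xFFF8
END_OF_CHAIN = 0xFFFF


def first_free_cluster(fat):
    """Return the first free cluster number (entries 0 and 1 are reserved)."""
    for cluster in range(2, len(fat)):
        if fat[cluster] == FREE:
            return cluster
    raise RuntimeError("disk full")


# The FAT as stored on the shared disk: clusters 2 and 3 hold one file,
# and clusters 4 through 7 are free.
disk_fat = [RESERVED, RESERVED, 3, END_OF_CHAIN, FREE, FREE, FREE, FREE]

# Each client reads the FAT once and then works from its private copy.
client_a_fat = list(disk_fat)
client_b_fat = list(disk_fat)

# Both clients create a new file without knowing about each other.
cluster_a = first_free_cluster(client_a_fat)
client_a_fat[cluster_a] = END_OF_CHAIN

cluster_b = first_free_cluster(client_b_fat)
client_b_fat[cluster_b] = END_OF_CHAIN

collision = cluster_a == cluster_b
print(f"client A chose cluster {cluster_a}, client B chose cluster {cluster_b}")
print("both files claim the same cluster" if collision else "no collision")
```

Whichever client writes its data last wins, and the other client's file now silently contains somebody else's bytes.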
What actually happens is that the file system operations themselves are sent remotely, rather than the low-level disk operations. You would send requests to the server like “Please open a file called AWESOME.TXT in write mode” or “Please tell me how big the file README.DOC is.” The remote server translated these requests into its own native file system, performed the operation, and sent the results back to the MS-DOS client.
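A rough sketch of that division of labor, with the network removed so it fits on one screen: the client only ever names files and operations, and the server decides how to carry them out on whatever file system it happens to have. The `ToyFileServer` class and its method names are hypothetical and do not correspond to any real protocol's commands.

```python
import os
import tempfile


class ToyFileServer:
    """Hypothetical server that accepts file-level requests (not sector reads)
    and carries them out on its own native file system."""

    def __init__(self, root):
        self.root = root
        self.handles = {}
        self.next_handle = 1

    def open_file(self, name, mode):
        # "Please open a file called NAME in MODE."
        f = open(os.path.join(self.root, name), mode)
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = f
        return handle

    def write(self, handle, data):
        return self.handles[handle].write(data)

    def close(self, handle):
        self.handles.pop(handle).close()

    def file_size(self, name):
        # "Please tell me how big the file NAME is."
        return os.path.getsize(os.path.join(self.root, name))


# The "client" never sees sectors, clusters, or a FAT; it only names files.
with tempfile.TemporaryDirectory() as root:
    server = ToyFileServer(root)
    handle = server.open_file("AWESOME.TXT", "wb")
    server.write(handle, b"hello from DOS")
    server.close(handle)
    reported_size = server.file_size("AWESOME.TXT")

print(f"server reports {reported_size} bytes")
```

Because allocation decisions live entirely on the server, two clients cannot hand out the same cluster twice, and the server's disk format is nobody's business but its own.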
The server need not be running a FAT file system. In practice, it was probably running Novell NetWare.
The next question was, “But what happens if the file is bigger than 2GB?”
A file bigger than 2GB? What planet are you from?
We’re talking 1984 here. A 20 megabyte hard drive cost around $1800. To get to 2GB, you’d first have to invent RAID. Then create an array of 100 drives, which would put you at $180,000. Though you could probably get a bulk discount. And you’d have to be able to connect them and power them all. And then to create that file, you’d need to push 2GB of data over a T1 line, which would take about three hours.
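For anyone who wants to check the back-of-the-envelope math, here it is spelled out. The assumptions are mine: decimal units (2GB as 2,000,000,000 bytes, which is what makes the drive count come out to a round 100), the nominal T1 line rate of 1.544 Mbit/s, and no allowance for framing or protocol overhead.

```python
# Back-of-the-envelope figures for building and filling a 2GB file in 1984.
# Assumptions: decimal units, nominal T1 line rate, zero protocol overhead.
DRIVE_BYTES = 20 * 1_000_000        # one 20 megabyte hard drive
DRIVE_COST_USD = 1800
TARGET_BYTES = 2 * 1_000_000_000    # the 2GB file
T1_BITS_PER_SECOND = 1_544_000

drives_needed = TARGET_BYTES // DRIVE_BYTES
total_cost_usd = drives_needed * DRIVE_COST_USD

transfer_seconds = TARGET_BYTES * 8 / T1_BITS_PER_SECOND
transfer_hours = transfer_seconds / 3600

print(f"drives needed: {drives_needed}")
print(f"total cost:    ${total_cost_usd:,}")
print(f"T1 transfer:   about {transfer_hours:.1f} hours")
```

Real-world overhead would only push the transfer time up, so "about three hours" is, if anything, optimistic.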
My friend explained, “Well, let’s say that the super-huge file is on a supercomputer somewhere. You’re not downloading the file, but rather seeking to selected portions and reading little bits. How would that work for a file bigger than 2GB?”
The response to the request for file attributes had a 32-bit value for the file size, which can express sizes up to just under 4GB. So if your file was 2GB, that would still fit.
Read requests took the form of a 32-bit file offset and a 16-bit size. If your file was bigger than 4GB, you would have no way to access any bytes beyond 4GB because the position wouldn’t fit in the 32-bit file offset.
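Here is a small sketch of what those field widths imply. The six-byte layout (offset first, then length, little-endian, both unsigned) and the helper names are assumptions made for illustration; the real wire format has more fields and this is not it.

```python
import struct

# Illustrative read request: 32-bit unsigned offset, 16-bit unsigned length.
# The layout and byte order are assumptions, not the real wire format.
READ_REQUEST = struct.Struct("<IH")

MAX_OFFSET = 2**32 - 1   # last byte position a 32-bit offset can name
MAX_LENGTH = 2**16 - 1   # most bytes a single request can ask for


def pack_read(offset, length):
    """Encode a read request, raising struct.error if a field overflows."""
    return READ_REQUEST.pack(offset, length)


def can_address(offset):
    """Report whether a one-byte read at this offset can be expressed."""
    try:
        pack_read(offset, 1)
        return True
    except struct.error:
        return False


two_gb = 2**31
four_gb = 2**32

print("request size in bytes:", READ_REQUEST.size)
print("byte at 2GB reachable:", can_address(two_gb))
print("last byte below 4GB reachable:", can_address(four_gb - 1))
print("byte at 4GB reachable:", can_address(four_gb))
```

The 16-bit length also means a single request moves at most 65,535 bytes, so seeking around and reading little bits is exactly the access pattern this shape of request is suited to.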
Of course, if you’re retrocomputing, then your poor 1984 MS-DOS system is going to be seeing wild and crazy things from the future, like hard drives bigger than 20MB and processors that can count to a billion in less than 55ms. Its brain might explode (or divide by zero). But that’s part of what makes retrocomputing fun, I guess.
Vaguely related to MS-DOS retrocomputing: I designed (but did not implement) a TSR at one point that would implement a user account system and (try to) enforce access control on disk files. It’s not like it would have actually stopped anyone from, say, bypassing DOS and reading the disk directly, but it was one of those “wouldn’t it be cool if” projects.
I would be surprised if *any* remoting system would remote the low-level operations at all… BTW, a recent episode of XKCD also talks about cute things that happen when retro-software is emulated on a modern computer.
There are definitely distributed storage systems that work with concurrent block level access. Something like a StorNext SAN deployment works by connecting a big pile of disks to a bunch of systems over fibrechannel, and each system connected to the disks can do low-level block I/O basically as if it was talking to a private internal drive. You need a dedicated metadata server (cluster) that the clients all talk to to coordinate where exactly files are supposed to live, and who owns which blocks, etc. It certainly wouldn’t have been impossible to make that sort of a network storage sharing setup work on PC’s in the 80’s. It just would have been… odd.
Basically, to try and do a SAN style distributed block filesystem in the 1980’s on DOS, you probably would have had one of the two machines acting as the metadata server cluster. (Since, it’s not like you can spare a whole computer just for that in 1984!) So the client would ask “what is the address of hello.txt” and then ask “what is the data at block address 12345.” But it would be asking both questions of the same server, which is silly and redundant, so it would have been a performance anti-optimization with the hardware and workloads of the time.
I wonder if the flag in the INT 21/6C00h call that allows access to 4GB files had any effect on DOS redirectors.
Though SMB is not the same as HTTP. One typically runs on 10Mbit Ethernet, while HTTP did run on T1 lines. This is why it took until IE8 for files larger than 4GB to be added there.
On the other hand, “what would happen if MS-DOS running under virtual machine tried to access a network file bigger than 2GB?” is a valid question, and you can just tell him to set up a VM or two to check. 😛
I suppose you could, if you like daisy chaining VMs. We discovered that DOS didn’t like talking to Windows Server 2003 when it came out, so anything newer is right out.
Was this a domain controller? It might insist on NTLM2 authentication or SMB signing, which would prevent DOS clients from connecting.
I think you’ll want to enable IPX/SPX for it to work, although the latest DOS network kit should support TCP/IP already.
In the reinventing, it might gain a new acronym: RAVED – Redundant Array of Very Expensive Disks?
Wasn’t some form of cheap digital tape mass storage available in the mid 80’s to the PC? At any rate the need for a 32 bit file pointer is fairly obvious, because 64K isn’t enough for everyone, and nobody wants to waste RAM by reducing resolution, reading in whole blocks or whatever.
Well, except one good use case of such a large file would be for a clever algorithm that streamed a stored constant from a multi gb mainframe tape file (RLYBIGPI.TXT, for example) to efficiently piece-meal calculate a very accurate result (like the perimeter of a circle) for display on a CGA monitor.
Extremely nitpicky, and not actually a problem with the concept, but less than 40 digits of pi is enough precision for even ridiculously extreme precision in enormous calculations. NASA JPL uses 15 digits in their interplanetary navigation calculations. But the point still stands; it’s reasonable that someone would stream a very large amount of data from a single file on a mainframe.
I can’t remember what sizes were actually available on tape storage back then. But in 1991/92 or so, we had QIC-80 tapes for the consumer market, and they stored 128 megabytes uncompressed. If you let the backup software compress the data, you could store roughly 250 megabytes of original data. That said, if the files were already zipped or otherwise compressed, you could still fit only 128 megabytes.
That was in the early 1990s, and though the tools for the drive ran on DOS, you could not use it as a network drive. Tapes were backup solutions, not drives as such. Two really different types of storage.
The same question applies to Win 9x? SetFilePointer and friends do not support 64-bit offsets on these systems but is this limit high up in kernel32 or down in the local FAT filesystem driver?
I think no version of DOS or DOS-based Windows supports 64-bit offsets anywhere.