Welcome to the OzoneAsylum FaqWiki

Why do I get a lot of whitespace in my files/code when I upload and download files?

This is caused by the different line-ending characters (hard returns/line feeds) used by different operating systems. It shows up when users FTP or share files, usually plain text, that were created on a different OS from the one they end up on.

This is a reply given to JKMabry's request for help on his host's board:

quote:
Windows uses a CR/LF pair for the end of a line, while Linux uses just an LF. The extra lines are added when you upload a text file from a Windows computer to a Linux computer in binary mode, and download it back to the Windows computer in ASCII mode. The text file becomes corrupted because for each LF character in the file, ASCII mode puts a CR before it, whether or not there are any CR characters already before the LF character. The file remains corrupted until you remove the extra CR characters. To remove the extra CR characters, download the text file from the Linux computer as BINARY, then upload it to the Linux computer as ASCII, and download it to the Windows computer again as ASCII.

i.e.
Windows "textCRLF" (BINARY mode) -> Linux "textCRLF" = OOPS!
Linux "textCRLF" (ASCII mode) -> Windows "textCRCRLF" = DOUBLE OOPS!
Windows "textCRCRLF" (BINARY mode) -> Linux "textCRCRLF" = TRIPLE OOPS!
Linux "textCRCRLF" (BINARY mode) -> Windows "textCRCRLF" = NOT OK YET
Windows "textCRCRLF" (ASCII mode) -> Linux "textLF" = OK on Linux
Linux "textLF" (ASCII mode) -> Windows "textCRLF" = OK on Windows
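
The sequence above can be played through in a few lines of Python. This is only a toy model of the behaviour described in the quote, not of any particular FTP client or server: the function names are made up for illustration, and the rule that an ASCII upload to Linux collapses any run of CRs before an LF is an assumption taken from the "textCRCRLF" -> "textLF" step in the table.

```python
import re

# Toy model of the transfer modes described in the quote.
# Function names and conversion rules are assumptions for illustration;
# real FTP clients and servers may behave differently.

def binary_transfer(data: bytes) -> bytes:
    """BINARY mode: every byte is copied unchanged."""
    return data

def ascii_to_windows(data: bytes) -> bytes:
    """ASCII download to Windows: put a CR before every LF,
    whether or not a CR is already there."""
    return data.replace(b"\n", b"\r\n")

def ascii_to_linux(data: bytes) -> bytes:
    """ASCII upload to Linux: turn any run of CRs followed by LF
    into a single LF (assumed, to match the table)."""
    return re.sub(rb"\r+\n", b"\n", data)

original = b"text\r\n"                          # file created on Windows

# How the corruption happens
on_linux = binary_transfer(original)            # CR LF kept as-is: OOPS
back_on_windows = ascii_to_windows(on_linux)    # CR CR LF: DOUBLE OOPS

# The repair sequence from the quote
local_copy = binary_transfer(back_on_windows)   # still CR CR LF
repaired_on_linux = ascii_to_linux(local_copy)  # LF only
repaired_on_windows = ascii_to_windows(repaired_on_linux)  # CR LF again

print(back_on_windows)
print(repaired_on_linux)
print(repaired_on_windows)
```

The useful point of the model is that BINARY mode never changes anything, so corruption only appears (or disappears) at the ASCII steps.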

And, to add to the confusion, older Apple systems (from the ancient Apple II through the classic Mac OS, up to Mac OS 9) use CR by itself as the line break character, so that you can have even more wild and wonderful forms of file corruption if one of those Macs has been somewhere in the file's history. Mac OS X and later are Unix-based and use LF alone, like Linux. (I think the Commodore 64 also used CR alone, along with even weirder perversions of ASCII, but that's ancient history.)
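
If a file has already picked up a mix of line endings, it can also be checked and cleaned locally instead of being sent back and forth over FTP. The following is a minimal sketch, assuming the file is read as raw bytes; `count_line_endings` and `normalize_line_endings` are hypothetical helper names, and treating a run of CRs followed by LF as one line break is an assumption that suits the FTP corruption described above (it would merge a genuine CR-only blank line that happens to sit directly before a CR/LF).

```python
import re

def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs and lone LFs in raw file bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CRLF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}

def normalize_line_endings(data: bytes, newline: bytes = b"\n") -> bytes:
    """Replace every line break with `newline`.

    Alternatives are tried in order: a run of CRs ending in LF
    (covers CRLF and the corrupted CRCRLF), then a lone CR,
    then a lone LF.
    """
    return re.sub(rb"\r+\n|\r|\n", lambda match: newline, data)

# A file that has been through Windows, an old Mac, Linux and a bad FTP transfer
mixed = b"one\r\ntwo\rthree\nfour\r\r\n"

print(count_line_endings(mixed))
print(normalize_line_endings(mixed))             # Unix style
print(normalize_line_endings(mixed, b"\r\n"))    # Windows style
```

A non-zero count in more than one category is a good hint that the file has crossed operating systems at some point. Note that the extra CR in a corrupted CRCRLF sequence is counted as a lone CR.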

Surprisingly, of all of these systems, it's actually the ones from Microsoft that got the standards right (a rare thing for MS). The traditional line break sequence, dating back to ancient Teletypes and continuing on mainframe systems like the DECSYSTEM-20, is CR+LF, with the CR signifying that the cursor should move to the start of the line and the LF indicating that it should jump down to the next line. The decoupling of the two operations, instead of having one imply the other, was one of those "seemed like a good idea at the time" things for the early standards-makers, as it enabled some special effects on Teletype terminals, like using a CR without LF to overstrike things on the current line. Later programmers thought it wasteful and decided the newline should be a single character, but unfortunately couldn't be consistent about which one.




-------------------------------
Relevant threads:

extra hard returns in code after FTP transfer?

_______________________
Emperor

(Added by: Emperor on Wed 06-Nov-2002)
