Bulk Fix for CSV (Macintosh) Excel Files On Windows
I have a bunch of CSV files that I received from someone using Excel on Mac. When she saved them, she did not choose a special CSV format that's Windows-friendly. As a result, if I open one of her files, make a small change, save the file, and reopen it later, the columns all end up jammed together. The program that is ultimately supposed to consume the CSV data also has a problem reading the files that were saved on a Mac.
Using Notepad++, I inspected the line-endings and the file encoding. The line-endings were correct for Windows (CRLF), but I noticed that the file encoding was "UTF-8 BOM". The BOM stands for byte-order mark. When I opened CSV files saved from Excel for Windows, the file encoding was just "UTF-8" -- no byte-order mark.
I wanted a way to bulk-edit all the files, switching the encoding to UTF-8. I thought about writing a program in C#, but it seemed likely that I could use existing command-line tools. With a little searching, I found this sed command on Stack Exchange. That was great for updating a single file, but I wanted to process well over a hundred files -- many of them with spaces in their names. Using find and piping the results to xargs, I thought this would be easy, but it turned out that find doesn't wrap the resulting file paths in quotes. I also discovered that sed for Cygwin writes out files using the Unix default line-ending, "LF".
What I ended up with is this:
find myTargetDirectory -name "*.csv" | sed -e 's/^/"/g' -e 's/$/"/g' | tr '\n' ' ' | xargs sed -b -i '1s/^\xEF\xBB\xBF//'
The find command is straightforward enough, but the results are piped to sed and tr in order to wrap the file paths in double-quotes, using an approach I lifted from this Stack Overflow answer. And I had to add the -b flag to the BOM-removal sed in order to get sed on Cygwin to leave the Windows end-of-line characters (CRLF) as-is.
End result: defused the BOM.
I've noted this here for my own future reference. I hope it helps someone else, too!
If you found this helpful, you'd like to thank me, and you enjoy reading science fiction, consider buying my book! More info here.