Garbled File Name Converter


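Today I downloaded some songs from the internet, only to find that the file names were garbled!! What to do?

Opening a file tells me which song it is, of course, but the real problem is that I downloaded more than one, and listening through them one by one would waste far too much time.

As a programmer, I naturally thought of solving the problem with code: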

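I chose .NET's text encoding classes for the conversion, because .NET is what I know best.

Here is the code: http://files.cnblogs.com/pcy0/chgEncode.zip  (by the way, cnblogs won't accept .tgz uploads, so I had to relearn the zip command)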

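The core conversion algorithm (if it even deserves the name):

Encode the string into a byte stream using encoding A, then decode those same bytes into a string using encoding B. That completes the conversion from A to B.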

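	// Convert @s from encoding @A to @B:
	// recover the raw bytes of @s using @A, then decode those bytes as @B.
	// Needs: using System; using System.Text;
	public static string ConvertEncoding (String s, Encoding A, Encoding B)
	{
		try {
			var bytes = A.GetBytes (s);   // the bytes behind the garbled text
			return B.GetString (bytes);   // the same bytes, read with the intended encoding
		} catch (Exception ex) {
			debug (ex.ToString ());
			return ">>>FAILED:" + ex.Message + "<<<";
		}
	}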
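
To see why this round trip works, here is a minimal, self-contained sketch. It sticks to Latin-1 (iso-8859-1) and UTF-8, on the assumption that both are available on any .NET runtime without extra setup; the GB encodings used later in this post may need additional code-page support on some runtimes. The class `MojibakeDemo` and the method `Repair` are made-up names for illustration and are not part of chgEncode. The sketch first manufactures a garbled name by reading UTF-8 bytes as Latin-1, then undoes the damage the same way `ConvertEncoding` does.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper with the same shape as ConvertEncoding:
    // get the bytes back with the wrong encoding, decode them with the right one.
    public static string Repair (string garbled, Encoding wrong, Encoding right)
    {
        byte[] bytes = wrong.GetBytes (garbled);
        return right.GetString (bytes);
    }

    public static void Main ()
    {
        Encoding latin1 = Encoding.GetEncoding ("iso-8859-1");
        Encoding utf8 = Encoding.UTF8;

        string original = "caf\u00e9.mp3";   // 8 characters, 9 bytes in UTF-8

        // Simulate the damage: UTF-8 bytes displayed as if they were Latin-1.
        string garbled = latin1.GetString (utf8.GetBytes (original));
        string repaired = Repair (garbled, latin1, utf8);

        Console.WriteLine (garbled.Length);        // 9: the two-byte character became two characters
        Console.WriteLine (repaired == original);  // True
    }
}
```

The repair is lossless here because Latin-1 maps every byte value to exactly one character, so `GetBytes` returns precisely the bytes that were misread. If the wrong encoding cannot represent some character of the garbled string, bytes are lost and the result will not match the original name.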

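Argument handling and the rest of the logic (I have written so much C that my C# comes out procedural):

It does the following:

1. List the N encodings available (command-line option -l / --list-all-encoding)

2. Try every N -> N encoding conversion and print the converted file name (command-line option -t / --try-all-encoding)

  -- In the end I found that Chinese file names are usually in one of the GB encodings (GBK/GB2312/GB18030), while on my Linux machine the garbled names turned out to be read as ISO-8859-X. Even with N*N conversions in the output, the right pair of encodings is easy to find with CTRL+F

3. Once the encodings are confirmed, convert and rename the files (interactively!)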
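
The N*N output of step 2 can get long, so one way to shorten it before searching is to drop pairs that obviously lose information. The sketch below is only an illustration of that idea, not what chgEncode does: the class `CandidateFilter`, the method `Plausible`, and the filtering rules (byte-exact round trip, no control characters in the result) are all assumptions of mine. It narrows the list; it does not pick the answer for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CandidateFilter
{
    // Returns "from -> to : result" lines for encoding pairs that lose no bytes.
    public static List<string> Plausible (string garbled)
    {
        var found = new List<string> ();
        var encodings = new List<Encoding> ();
        foreach (var info in Encoding.GetEncodings ()) {
            // Some listed encodings may not be usable on every runtime.
            try { encodings.Add (info.GetEncoding ()); } catch (Exception) { }
        }
        foreach (var from in encodings) {
            byte[] bytes;
            try {
                bytes = from.GetBytes (garbled);
                // Skip "from" if it cannot represent the garbled text exactly.
                if (from.GetString (bytes) != garbled) continue;
            } catch (Exception) { continue; }
            foreach (var to in encodings) {
                try {
                    string result = to.GetString (bytes);
                    // Keep the pair only if decoding changed the text, produced no
                    // control characters, and re-encoding gives the same bytes back.
                    if (result != garbled
                        && !result.Any (char.IsControl)
                        && to.GetBytes (result).SequenceEqual (bytes))
                        found.Add (from.WebName + " -> " + to.WebName + " : " + result);
                } catch (Exception) { }
            }
        }
        return found;
    }

    public static void Main ()
    {
        // "café.mp3" whose UTF-8 bytes were read as Latin-1.
        foreach (var line in Plausible ("caf\u00c3\u00a9.mp3"))
            Console.WriteLine (line);
    }
}
```

The surviving lines still have to be checked by eye, since several pairs can pass the filter for the same input; which encodings appear at all depends on what `Encoding.GetEncodings` reports on the runtime in use.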

 

	// Compile: mono-csc chgEncode.cs
	// Example: chgEncode.exe ~/Music iso-8859-9  GB18030
	// Needs: using System; using System.IO; using System.Linq; using System.Text;
	public static int Main (String[]args)
	{
		if (args.Length == 1 
			&& (args [0] == "--list-all-encoding"
			|| args [0] == "-l")) {
			// Mode 1: list every encoding the runtime reports.
			var encodings = Encoding.GetEncodings ();
			debug ("Found " + encodings.Length + " encoding(s)");
			println ("name\tcode-page\tdisplay-name");
			foreach (var e in encodings) {
				println (e.Name + "\t" + e.CodePage + "\t" + e.DisplayName);
			}
			return 0;
		} else if (args.Length == 2 && (args [0] == "--try-all-encoding" || args [0] == "-t")) {
			// Mode 2: try every (from, to) pair on the first file of the directory.
			var dir = args [1];
			debug ("Input dir is " + dir);
			var files = Directory.EnumerateFiles (dir);
			debug ("Found " + files.Count () + " file(s)");
			var filename = files.FirstOrDefault ();
			if (null == filename) {
				println ("Error: no files found in " + dir);
				return 1;
			}
			var encodings = Encoding.GetEncodings ();
			debug ("Found " + encodings.Length + " encoding(s)");
			foreach (var e in encodings) {
				foreach (var f in encodings) {
					try { // I don't know the cause, but this always failed here, so I added the try...catch.
						println ("file " + filename  + "  " + e.Name + " -> " + f.Name + " : "
					         + ConvertEncoding (filename, e.GetEncoding (), f.GetEncoding ()));
					}
					catch { } // skip pairs that cannot be tried
				}
			}
		} else if (args.Length == 3) {
			// Mode 3: convert each file name and rename after confirmation.
			var dir = args [0];
			debug ("Input dir is " + dir);
			// Take a snapshot so that renaming does not disturb the enumeration.
			var files = Directory.EnumerateFiles (dir).ToArray ();
			debug ("Found " + files.Length + " file(s)");
			var encodings = Encoding.GetEncodings ();
			debug ("Found " + encodings.Length + " encoding(s)");
			// Match encoding names case-insensitively, so both "GB18030" and "gb18030" work.
			var A = encodings.FirstOrDefault (e => string.Equals (e.Name, args [1], StringComparison.OrdinalIgnoreCase));
			var B = encodings.FirstOrDefault (e => string.Equals (e.Name, args [2], StringComparison.OrdinalIgnoreCase));
			if (null == A || null == B)
			{
				println ("Error: failed to find the encodings!");
				return 1;
			}
			foreach (var file in files) {
				// Convert only the file name; the directory part of the path stays untouched.
				var newName = ConvertEncoding (Path.GetFileName (file), A.GetEncoding (), B.GetEncoding ());
				var newFile = Path.Combine (Path.GetDirectoryName (file), newName);
				Console.Write(file + "  ->  \"" + newFile + "\" [Y/N]?");
				Console.Out.Flush();
				try
				{
					if ("y" == Console.ReadLine().ToLower())
					{
						File.Move(file, newFile);
					}
				}
				catch(Exception ex)
				{
					debug (ex.ToString());
					println("Failed: "+ex.Message);
				}
			}
		}
		else {
			// No arguments, --help, or anything unrecognized: print the usage.
			println ("Usage1: chgEncode <dir> <encode-A> <encode-B>");
			println ("        change the encoding of files in @dir");
			println ("Usage2: chgEncode --list-all-encoding");
			println ("        chgEncode -l");
			println ("        list all encodings");
			println ("Usage3: chgEncode --try-all-encoding <dir>");
			println ("        chgEncode -t <dir>");
			println ("        try all encoding to convert the first file of @dir");
		}
		return 0;
	}

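It took me a whole hour to get this working. My skills really have slipped.

Still, it works quite well; at least it did what I needed.

Here it is in action: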

# Convert the files under ~/tmp from iso-8859-1 to gb2312

peter@peter-K43SJ:~/Code/chgEncode$ ./chgEncode.exe ~/tmp iso-8859-1 gb2312
Input dir is /home/peter/tmp
Found 2 file(s)
Found 95 encoding(s)
/home/peter/tmp/Å£×ÐºÜÃ¦.mp3 -> "/home/peter/tmp/牛仔很忙.mp3" [Y/N]?y
/home/peter/tmp/ÌýÂèÂèµÄ»°.mp3 -> "/home/peter/tmp/听妈妈的话.mp3" [Y/N]?y
# Show help
peter@peter-K43SJ:~/Code/chgEncode$ ./chgEncode.exe --help
Usage1: chgEncode <dir> <encode-A> <encode-B>
        change the encoding of files in @dir
Usage2: chgEncode --list-all-encoding
        chgEncode -l
        list all encodings
Usage3: chgEncode --try-all-encoding <dir>
        chgEncode -t <dir>
        try all encoding to convert the first file of @dir

  

  

