My First Web Crawler, C# Edition: A "Perks" Ride for Programmers Only


I've been teaching myself Python lately, and I came across a Zhihu answer (https://www.zhihu.com/question/20799742) about crawling videos from a "perks" (adult) site...

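So I started following along. The answerer's example was written for Python 2.x, though, and I had started out on 3.x. Even after I fixed the print statements, a lot of modules still failed to import, and as a beginner I had no idea how to fix that.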

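So, for the sake of learning (ahem, hopping on board), I rewrote one piece of that code in C#. Even with some delay added between requests, it ate more than 3 GB of disk space in no time... Take care of your health, folks.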

The code is below. As first posted, it had a few deliberate bugs in it, to keep non-programmers from hopping on.

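using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading;

class Program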
    {
        static void Main(string[] args)
        {
            var baseString = "http://w*w.46ek.c*m/view/{0}.html";
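            // Dots are escaped so they match a literal '.', and the hyphen sits last in the character class.
            Regex regex = new Regex(@"http://m4\.26ts\.com/[.0-9a-zA-Z-]*\.mp4");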
            WebClient wc = new WebClient();


            uint startIndex = ReadStartIndex();
            uint loop = ReadLoopLen();

            for (uint i = 0; i < loop; i++)
            {
                var subUrl = string.Format(baseString, startIndex + i);
                WebRequest wReq = WebRequest.Create(subUrl);

                try
                {
                    WebResponse wResp = wReq.GetResponse();
                    Stream respStream = wResp.GetResponseStream();

                    using (StreamReader reader = new StreamReader(respStream, Encoding.GetEncoding("GB18030")))
                    {
                        var htmlString = reader.ReadToEnd();

                        Match m = regex.Match(htmlString);
                        if (m.Success)
                        {
                            DownloadFile(wc, m.Value, string.Format("{0}.mp4", startIndex + i));
                        }
                    }
                }
                catch (Exception exc)
                {
                    Console.WriteLine("Error : {0}", exc.Message);
                }

                // Thread.Sleep takes milliseconds, so this pauses for only 5 ms between pages.
                Thread.Sleep(5);
            }
            
        }

        private static uint ReadStartIndex()
        {
            while (true)
            {
                Console.Write("Set start index :");

                string line = Console.ReadLine();

                uint index = 0;

                if (UInt32.TryParse(line, out index))
                {
                    Console.WriteLine("Start index set : " + index);
                    return index;
                }

                Thread.Sleep(500);
            }
        }

        private static uint ReadLoopLen()
        {
            while (true)
            {
                Console.Write("Set loop len :");

                string line = Console.ReadLine();

                uint index = 0;

                if (UInt32.TryParse(line, out index))
                {
                    Console.WriteLine("Loop len set : " + index);
                    return index;
                }

                Thread.Sleep(500);
            }
        }

        private static void DownloadFile(WebClient wc, string url, string localname)
        {
            // Placeholders are zero-based; {1}/{2} with two arguments would throw a FormatException.
            Console.WriteLine("Downloading file {0} to {1}", url, localname);

            wc.DownloadFile(url, localname);

            Console.WriteLine("File {0} download completed!", localname);
        }
    }
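
The link-extraction step is the part most worth trying on its own, since it needs no network access. Below is a minimal sketch of it, not the code from the post: the class name `VideoLinkExtractor`, the method `TryFindFirstMp4`, and the host `media.example.com` are all made up for illustration. It shows the same idea as the `regex.Match(htmlString)` call above, with every literal dot escaped so that `.` cannot match an arbitrary character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VideoLinkExtractor
{
    // Hypothetical host used only for this sketch; swap in the real media host.
    // "\." matches a literal dot, and the hyphen is escaped inside the character class.
    private static readonly Regex Mp4Url =
        new Regex(@"http://media\.example\.com/[0-9A-Za-z.\-]+\.mp4");

    // Returns true and sets url to the first .mp4 link found in the page text.
    // Returns false and sets url to an empty string when nothing matches.
    public static bool TryFindFirstMp4(string html, out string url)
    {
        Match m = Mp4Url.Match(html);
        url = m.Success ? m.Value : string.Empty;
        return m.Success;
    }
}
```

Calling `VideoLinkExtractor.TryFindFirstMp4(htmlString, out url)` in place of the inline `Match` check would keep the pattern in one spot and make it easy to test against saved page text.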

 

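ReadStartIndex and ReadLoopLen are the same loop with different prompt text, so they could be folded into one helper. The sketch below is only one possible way to do that, under my own assumptions: the names `ConsolePrompt` and `ReadUInt32` are hypothetical, and the reader and writer are passed in so the helper can be exercised without a real console. It also stops when the input runs out, because `UInt32.TryParse` returns false for a null line and a loop like the ones above would otherwise keep retrying.

```csharp
using System;
using System.IO;

public static class ConsolePrompt
{
    // Keeps prompting until a line parses as an unsigned 32-bit integer.
    // Throws if the input ends first, so redirected input cannot loop forever.
    public static uint ReadUInt32(string label, TextReader input, TextWriter output)
    {
        while (true)
        {
            output.Write("Set {0} : ", label);

            string line = input.ReadLine();
            if (line == null)
            {
                throw new EndOfStreamException("Input ended before a valid number was entered.");
            }

            uint value;
            if (uint.TryParse(line, out value))
            {
                output.WriteLine("{0} set to {1}", label, value);
                return value;
            }
        }
    }
}
```

In Main this would be called as `ConsolePrompt.ReadUInt32("start index", Console.In, Console.Out)` and `ConsolePrompt.ReadUInt32("loop len", Console.In, Console.Out)`.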
