Saving a C# CookieContainer to a File

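Scraping data almost always involves logging in. Typing a CAPTCHA on every test run is tedious, so I wanted to store the cookies in a file, the way a browser does, and reuse them directly on the next run.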


Searching Google for “C# CookieContainer 存文件” (save to file), the best code I could find was this:
http://www.huxu.net.cn/2011_03/154.html

// Requires: using System.Collections; using System.Collections.Generic;
//           using System.Net; using System.Reflection;
// Reads the private fields m_domainTable and m_list via reflection, so it
// depends on the internal layout of CookieContainer.
public static List<Cookie> GetAllCookies(CookieContainer cc) {
    const BindingFlags flags =
        BindingFlags.NonPublic | BindingFlags.GetField | BindingFlags.Instance;
    List<Cookie> lstCookies = new List<Cookie>();
    // domain -> path list
    Hashtable table = (Hashtable)cc.GetType().InvokeMember("m_domainTable",
        flags, null, cc, new object[] { });
    foreach (object pathList in table.Values) {
        // path -> CookieCollection
        SortedList lstCookieCol = (SortedList)pathList.GetType().InvokeMember("m_list",
            flags, null, pathList, new object[] { });
        foreach (CookieCollection colCookies in lstCookieCol.Values)
            foreach (Cookie c in colCookies) lstCookies.Add(c);
    }
    return lstCookies;
}
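
The reflection above reaches into private fields. When the sites being scraped are known in advance, a similar list can be collected through the public GetCookies(Uri) method instead. The following is only a sketch under that assumption; the class and method names (CookieListing, GetCookiesFor) are made up for illustration, and overlapping URIs may return the same cookie more than once.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class CookieListing
{
    // Hypothetical helper: gathers the cookies that would be sent to each
    // of the given URIs, using only the public CookieContainer API.
    public static List<Cookie> GetCookiesFor(CookieContainer cc, IEnumerable<Uri> uris)
    {
        var result = new List<Cookie>();
        foreach (Uri uri in uris)
        {
            // GetCookies returns the cookies that match this URI.
            foreach (Cookie c in cc.GetCookies(uri))
                result.Add(c);
        }
        return result;
    }
}
```

Usage would look something like CookieListing.GetCookiesFor(jar, new[] { new Uri("http://example.com/") }), where example.com stands in for the real site.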

// Save
StringBuilder sbc = new StringBuilder();
List<Cookie> cooklist = Code.ProgTool.GetAllCookies(CookieContainer);
foreach (Cookie cookie in cooklist) {
    sbc.AppendFormat("{0};{1};{2};{3};{4};{5}\r\n",
        cookie.Domain, cookie.Name, cookie.Path, cookie.Port,
        cookie.Secure.ToString(), cookie.Value);
}
// File.WriteAllText creates or overwrites the file itself, so no separate File.Create is needed.
File.WriteAllText("d:\\chinarencookies.txt", sbc.ToString(), System.Text.Encoding.Default);

// Load
string[] cookies = File.ReadAllText("d:\\chinarencookies.txt", System.Text.Encoding.Default)
    .Split("\r\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
foreach (string c in cookies) {
    string[] cc = c.Split(";".ToCharArray());
    Cookie ck = new Cookie();
    ck.Discard = false;
    ck.Domain = cc[0];
    ck.Expired = true;
    ck.HttpOnly = true;
    ck.Name = cc[1];
    ck.Path = cc[2];
    ck.Port = cc[3];
    ck.Secure = bool.Parse(cc[4]);
    ck.Value = cc[5];
    CookieContainer.Add(ck);
}

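This approach requires a detailed understanding of how cookies are put together; as a template it is acceptable. IE stores its cookies in a similar way.
But code written this way is very bug-prone, and the line ck.Expired = true; is one such hazard: setting Expired to true stamps the cookie's expiry with the current time, so the restored cookie is treated as already expired.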
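
One way to sidestep the Expired hazard is to write the cookie's real expiry time into the file and restore it on load, rather than setting Expired at all. The sketch below is an assumption-laden illustration, not the original author's code: the class name CookieLines, the tab-separated layout, and the field order are all invented here, and it assumes cookie names, paths, and domains contain no tab characters.

```csharp
using System;
using System.Globalization;
using System.Net;

public static class CookieLines
{
    // Hypothetical format: Domain, Name, Path, Secure, HttpOnly, Expires, Value
    // separated by tabs. Value comes last so it may itself contain tabs.
    public static string ToLine(Cookie c)
    {
        return string.Join("\t",
            c.Domain, c.Name, c.Path,
            c.Secure.ToString(), c.HttpOnly.ToString(),
            c.Expires.ToString("o", CultureInfo.InvariantCulture),
            c.Value);
    }

    public static Cookie FromLine(string line)
    {
        // Split into at most 7 fields; anything after the 6th tab is the value.
        string[] f = line.Split(new[] { '\t' }, 7);
        var c = new Cookie(f[1], f[6], f[2], f[0]);
        c.Secure = bool.Parse(f[3]);
        c.HttpOnly = bool.Parse(f[4]);
        // Restore the saved expiry instead of touching c.Expired.
        c.Expires = DateTime.Parse(f[5], CultureInfo.InvariantCulture,
            DateTimeStyles.RoundtripKind);
        return c;
    }
}
```

A session cookie keeps the default Expires value (DateTime.MinValue) through this round trip, so it is not marked as expired on load either.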

I then tried searching for “C# save CookieContainer to file” and, sure enough, found a more elegant piece of code.
http://stackoverflow.com/questions/1777203/c-writing-a-cookiecontainer-to-disk-and-loading-back-in-for-use

    public static void WriteCookiesToDisk(string file, CookieContainer cookieJar)
    {
        using(Stream stream = File.Create(file))
        {
            try {
                Console.Out.Write("Writing cookies to disk... ");
                BinaryFormatter formatter = new BinaryFormatter();
                formatter.Serialize(stream, cookieJar);
                Console.Out.WriteLine("Done.");
            } catch(Exception e) {
                Console.Out.WriteLine("Problem writing cookies to disk: " + e.GetType());
            }
        }
    }   

    public static CookieContainer ReadCookiesFromDisk(string file)
    {
        try {
            using(Stream stream = File.Open(file, FileMode.Open))
            {
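                Console.Out.Write("Reading cookies from disk... ");
                BinaryFormatter formatter = new BinaryFormatter();
                // Deserialize first, so "Done." is only printed on success.
                CookieContainer cookieJar = (CookieContainer)formatter.Deserialize(stream);
                Console.Out.WriteLine("Done.");
                return cookieJar;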
            }
        } catch(Exception e) {
            Console.Out.WriteLine("Problem reading cookies from disk: " + e.GetType());
            return new CookieContainer();
        }
    }

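This uses the serialization support that ships with C#, so the whole task takes only a few lines of code. All the tedious work is handed over to the library.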
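
BinaryFormatter has since been marked obsolete in newer .NET versions, so the same "let the library do it" idea may need a different serializer there. The following is a rough sketch of one option, assuming .NET 6 or later (for CookieContainer.GetAllCookies and System.Text.Json). CookieRecord and JsonCookieStore are hypothetical names introduced for this example, and only a handful of cookie fields are carried over.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.Json;

// Hypothetical plain data class holding the fields worth persisting.
public class CookieRecord
{
    public string Domain { get; set; } = "";
    public string Name { get; set; } = "";
    public string Value { get; set; } = "";
    public string Path { get; set; } = "/";
    public bool Secure { get; set; }
    public bool HttpOnly { get; set; }
    public DateTime Expires { get; set; }
}

public static class JsonCookieStore
{
    public static void Save(string file, CookieContainer jar)
    {
        var records = new List<CookieRecord>();
        // GetAllCookies is public API from .NET 6 onward (assumed available).
        foreach (Cookie c in jar.GetAllCookies())
        {
            records.Add(new CookieRecord
            {
                Domain = c.Domain, Name = c.Name, Value = c.Value, Path = c.Path,
                Secure = c.Secure, HttpOnly = c.HttpOnly, Expires = c.Expires
            });
        }
        File.WriteAllText(file, JsonSerializer.Serialize(records));
    }

    public static CookieContainer Load(string file)
    {
        var jar = new CookieContainer();
        if (!File.Exists(file)) return jar;   // nothing saved yet
        var records = JsonSerializer.Deserialize<List<CookieRecord>>(File.ReadAllText(file));
        if (records == null) return jar;
        foreach (CookieRecord r in records)
        {
            var c = new Cookie(r.Name, r.Value, r.Path, r.Domain)
            {
                Secure = r.Secure, HttpOnly = r.HttpOnly, Expires = r.Expires
            };
            jar.Add(c);
        }
        return jar;
    }
}
```

The trade-off compared with the BinaryFormatter version is a little more code in exchange for a readable file and no dependency on an obsolete serializer.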

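C# ranked third in the most recent (February 2012) programming language rankings. I was drawn in by its powerful libraries, and now I am gradually starting to appreciate the strengths of the language itself. I can't keep using it as if it were C anymore... time to try thinking in abstractions.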
