Automated MCM Readiness Videos download

March 25, 2011

Welcome to the “All Things C#” blog, at least for today’s post.

Earlier today I found out that additional videos had been added to the SQL Server MCM Readiness Videos site since the last time I browsed it several months ago. Last time, I downloaded the videos by opening up each page and manually clicking the appropriate links. Since there are now 70 videos available (!), the “manual method” would be a pain in the neck. So, I figured I’d spin a few cycles automating the process.

Yeah, I’m a SQL guy. But I know that something this simple can be automated. I saw “The Social Network” too. So with Visual Studio on one monitor and Google on the other, I hammered out a C# class that took care of business. And, since I’m such a nice SQL guy, I’m sharing it with everyone else too! The source code is below.

To run this, open Visual Studio and create a new C# Console Application project. Replace the contents of “Program.cs” with the code below, edit the static variables at the top of the class (downloadPath and downloadMediaType) to suit your setup, and fire it off!

No, it’s not pretty. Yes, it could be commented, annotated, parameterized, and a ton of other things before I published it. Right now I don’t really care! Like I said, I’m a SQL guy and not a C# guy, and I just want to get on with watching some of the 41 hours of video that I just downloaded. So, maybe someday I’ll clean this up ... but not today.

If you don’t have Visual Studio but want a compiled EXE, let me know and we can work something out.

Enjoy!

Oh … and I should probably mention that all 70 videos, at least in WMV format, total 7.26 GB, so you’ll probably want to fire this off and let it run overnight. Or at least let it download in the background while you watch the first video or two that come in.
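For a rough idea of what "overnight" means, here is a back-of-the-envelope sketch. The link speeds are hypothetical examples rather than anything I measured, and the math assumes 1 GB = 1024 MB with no protocol overhead, so treat the numbers as ballpark only.

```csharp
using System;

static class DownloadEstimate
{
    // Estimated transfer time for a payload at a sustained link speed.
    // Assumes 1 GB = 1024 MB and 8 bits per byte; protocol overhead is ignored.
    public static TimeSpan Estimate(double sizeGb, double megabitsPerSecond)
    {
        double megabits = sizeGb * 1024 * 8;
        return TimeSpan.FromSeconds(megabits / megabitsPerSecond);
    }
}

static class DownloadEstimateDemo
{
    static void Main()
    {
        // Hypothetical link speeds in Mbit/s; substitute your own.
        foreach (double mbps in new[] { 1.5, 5.0, 20.0 })
        {
            double hours = DownloadEstimate.Estimate(7.26, mbps).TotalHours;
            Console.WriteLine("{0,5} Mbit/s -> {1:0.0} hours", mbps, hours);
        }
    }
}
```

Under those assumptions, a sustained 1.5 Mbit/s link needs about 11 hours for the full set, while 5 Mbit/s gets it done in a bit over 3.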

using System;
using System.Collections.Generic;
using System.Text;
using System.Collections.Specialized;
using System.Text.RegularExpressions;
using System.Xml;
using System.IO;
using System.Net;

/*
 * AUTHOR: Randy Rabin - aka mailto://rtpsqlguy@hotmail.com - see blog post at https://rtpsqlguy.wordpress.com
 * DATE: March 25, 2011
 * BRIEF DESCRIPTION: Automated download of all videos from the SQL Server MCM Readiness Videos site
 * 
 * NOTE: This program downloads all videos found under the "More..." link off of the main page
 *      By default it downloads WMV files to folder C:\MCM Readiness Videos. Edit the class-level statics below to change these settings
*/

namespace GetMCMVideosFromTechnet
{
    class Program
    {
        static readonly string mainMcmReadinessVideosPage = @"http://technet.microsoft.com/en-us/sqlserver/ff977043.aspx";
        static string downloadPath = @"C:\MCM Readiness Videos\";
        static readonly string downloadMediaType = "wmv";

        // class-level collection populated and used by multiple methods below
        static StringCollection mainUrlList = new StringCollection();

        static void Main(string[] args)
        {
            // Create the download folder if it doesn't already exist
            if(!Directory.Exists(downloadPath))
                Directory.CreateDirectory(downloadPath);

            if (!downloadPath.EndsWith(@"\"))
                downloadPath += @"\";

            // Grab the list of page links from the XML stream linked to by the "More..." button on the MCM page
            ReadXmlList(mainUrlList);

            // For each page link, download the video from that page to local disk
            DownloadWebPageList(mainUrlList);
        }

        static void ReadXmlList(StringCollection urlList)
        {
            Console.WriteLine("Loading main web page...");

            string mainMcmPageSource = GetWebPage(mainMcmReadinessVideosPage);
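            // [^"]* keeps the match inside a single href attribute; the dots are escaped so they match literally
            Match match = Regex.Match(mainMcmPageSource, "<a href=\"([^\"]*\\.xml)\">More\\.\\.\\.</a>");
            if (!match.Success)
                throw new InvalidOperationException("Could not find the \"More...\" link on the main page.");

            string xmllink = match.Groups[1].Value;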

            Console.WriteLine("Loading \"More...\" link information...");

            XmlDocument doc = new XmlDocument();
            doc.Load(xmllink);

            XmlNodeList nodelist = doc.SelectNodes("/rss/channel/item/link");
            foreach(XmlNode node in nodelist)
                urlList.Add(node.InnerText);
        }

        static string GetWebPage(string url)
        {
            // Dispose the WebClient once the page has been fetched
            using (WebClient wc = new WebClient())
            {
                return wc.DownloadString(url);
            }
        }

        static void DownloadWebPageList(StringCollection urlList)
        {
            int cntr = 0;

            foreach (string url in urlList)
                DownloadMediaFromWebPage(url, ++cntr, urlList.Count);
        }

        static void DownloadMediaFromWebPage(string url, int cnt, int listsize)
        {
            string webPageSource = GetWebPage(url);
            string pageTitle = FindPageTitleInPageSource(webPageSource);
            string mediaUrl = FindMediaLinkInPageSource(webPageSource);

            Console.WriteLine("Downloading {0} of {1}: {2}", cnt, listsize, pageTitle);

            DownloadMedia(pageTitle, mediaUrl, downloadPath);
        }

        static string FindPageTitleInPageSource(string webPageSource)
        {
            Regex regex = new Regex("span class=\"EyebrowElement\">.*?</span>");
            string pageTitle = regex.Match(webPageSource).Value;

            pageTitle = pageTitle.Replace("span class=\"EyebrowElement\">", "").Replace("</span>", "");

            return pageTitle;
        }

        static string FindMediaLinkInPageSource(string webPageSource)
        {
            // [^"]* keeps the match inside one href, so a link to a different media type earlier on the page is skipped
            Regex regex = new Regex("<a href=\"(http://download\\.microsoft\\.com[^\"]*\\." + Regex.Escape(downloadMediaType) + ")\"");
            string medialink = regex.Match(webPageSource).Groups[1].Value;

            return medialink;
        }

        static void DownloadMedia(string pageTitle, string mediaUrl, string downloadPath)
        {
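            // Nothing to download if no link for the requested media type was found on the page
            if (string.IsNullOrEmpty(mediaUrl))
                return;

            // Page titles can contain characters that are not legal in file names (':' or '?', for example)
            foreach (char c in Path.GetInvalidFileNameChars())
                pageTitle = pageTitle.Replace(c, '_');

            string downloadFile = downloadPath + pageTitle + " -- " + mediaUrl.Substring(mediaUrl.LastIndexOf('/') + 1);

            if (!File.Exists(downloadFile))
            {
                // Dispose the WebClient once the file has been written
                using (WebClient wc = new WebClient())
                    wc.DownloadFile(mediaUrl, downloadFile);
            }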
        }
    }
}
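
If you want to poke at the two parsing steps in ReadXmlList without hitting the network, here is a small offline sketch. The HTML and RSS samples are made up for illustration (the real TechNet page and feed will look different), and the helper names FindMoreLink and ReadItemLinks are my own, not part of the program above. The pattern uses a capture group and [^"]* so that the match stays inside a single href attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;

static class LinkExtractionSketch
{
    // Returns the href of the anchor whose text is "More...", or null if there is none.
    public static string FindMoreLink(string html)
    {
        Match m = Regex.Match(html, "<a href=\"([^\"]*\\.xml)\">More\\.\\.\\.</a>");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns the text of every /rss/channel/item/link node in an RSS document.
    public static List<string> ReadItemLinks(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        List<string> links = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/rss/channel/item/link"))
            links.Add(node.InnerText);
        return links;
    }

    static void Main()
    {
        // Both samples are hypothetical; only their shape matters here.
        string html = "<a href=\"http://example.com/other.aspx\">Other</a> " +
                      "<a href=\"http://example.com/feed.xml\">More...</a>";
        string rss = "<rss><channel>" +
                     "<item><link>http://example.com/video1.aspx</link></item>" +
                     "<item><link>http://example.com/video2.aspx</link></item>" +
                     "</channel></rss>";

        Console.WriteLine(FindMoreLink(html));
        foreach (string link in ReadItemLinks(rss))
            Console.WriteLine(link);
    }
}
```

The first anchor in the sample is skipped because its href does not end in .xml, so only the feed URL comes back, followed by the two item links from the RSS sample.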

2 Comments
  1. Great for post. Always keep more interesting publications. Been following blog for seven days now and I should say I am beginning to like your post this site. I need to know how can I subscribe to your blog?

  2. rrabin permalink

    Thanks Eloy! I believe you can subscribe from the main page, just click the Sign Me Up link and go from there.
