Tuesday, December 11, 2012

Sandboxing managed code with an AppDomain

Sandboxing is a technique used to restrict the resources that an application can access and is generally used to run untrusted code.

An example of a sandboxed environment is a Virtual Machine (VM), which hosts an entire operating system.  This is useful for running untrusted code, but it can be too much overhead when we only need to sandbox a runtime or a single application rather than a whole operating system.  Fortunately, .NET lets us place restrictions on code running inside a process in the form of an AppDomain.


Microsoft's definition of an AppDomain is "an isolated environment where applications execute".  Perfect.  Now how do we work with it?

First, let's define a use-case for creating a sandbox and we'll go from there.

Imagine that you are writing an application that loads plugins written in C#.  These plugins have access to the same resources that the hosting application does, but that is a bit too much control considering that the plugin:

* Is untrusted
* Has access to the registry
* Has full access to the file system
* Has access to network resources
* etc.

Now, let's imagine that the hosting application needs and uses all of these resources, while the purpose of each plugin is only to run a calculation and write its results to an output stream.  A plugin may access only the files that it has created and must be completely isolated from other plugins and their data.

So, we need to ensure that each plugin has restrictions on the files it can access.  In short, each plugin must:

* Only be able to access a directory assigned to that plugin
* Only be able to modify files inside that directory
* Not be able to access files created by other plugins
* Be denied all other resources aside from limited filesystem access
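
The restrictions above map fairly directly onto a partial-trust AppDomain.  Here's a rough sketch, assuming .NET Framework 4.x (code access security isn't available on .NET Core and later).  The names PluginSandbox and Create and the one-directory-per-plugin layout are made up for illustration, and the sketch only builds the restricted domain; it doesn't load or run a plugin.

```csharp
using System;
using System.IO;
using System.Security;
using System.Security.Permissions;

// Hypothetical helper: builds a partial-trust AppDomain for a single plugin.
public static class PluginSandbox
{
    public static AppDomain Create(string pluginName, string pluginDirectory)
    {
        // FileIOPermission requires an absolute path.
        string fullPath = Path.GetFullPath(pluginDirectory);

        // Start from an empty grant set so everything is denied by default
        // (registry, network, the rest of the file system, and so on).
        var permissions = new PermissionSet(PermissionState.None);

        // Without Execution the plugin's code cannot run at all.
        permissions.AddPermission(
            new SecurityPermission(SecurityPermissionFlag.Execution));

        // Read/write/append/path discovery, but only inside the plugin's own directory.
        permissions.AddPermission(
            new FileIOPermission(FileIOPermissionAccess.AllAccess, fullPath));

        // Assumption: the plugin's assemblies live in the same directory it may write to.
        var setup = new AppDomainSetup { ApplicationBase = fullPath };

        // Evidence is null; the grant set comes entirely from 'permissions'.
        return AppDomain.CreateDomain(pluginName, null, setup, permissions);
    }
}
```

Giving every plugin its own directory and its own AppDomain is what keeps one plugin away from another's files.  When a plugin is finished, the host can tear the domain down with AppDomain.Unload.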

We also need to take into account that, since each plugin is untrusted, we need to set boundaries on memory and hard disk I/O.  We cannot have a plugin consume excessive memory, and we need to ensure that it can't maliciously consume all of our disk space.  Fortunately, Win32 offers a mechanism for enforcing resource limits called Job Objects.  To use them from C# you will need a bit of P/Invoke, but it's not very difficult.
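
To give an idea of the P/Invoke involved, here is a minimal sketch that caps the committed memory of a process through a Job Object.  It is Windows-only, the class and method names (JobObjectLimits, LimitProcessMemory) are my own, and it covers only the memory side; it doesn't address disk space.  Keep in mind that a job limits whole processes rather than AppDomains, so this assumes the plugin runs in its own process.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical helper around the Win32 Job Object API.
public static class JobObjectLimits
{
    private const int JobObjectExtendedLimitInformation = 9;
    private const uint JOB_OBJECT_LIMIT_PROCESS_MEMORY = 0x00000100;

    [StructLayout(LayoutKind.Sequential)]
    private struct JOBOBJECT_BASIC_LIMIT_INFORMATION
    {
        public long PerProcessUserTimeLimit;
        public long PerJobUserTimeLimit;
        public uint LimitFlags;
        public UIntPtr MinimumWorkingSetSize;
        public UIntPtr MaximumWorkingSetSize;
        public uint ActiveProcessLimit;
        public UIntPtr Affinity;
        public uint PriorityClass;
        public uint SchedulingClass;
    }

    [StructLayout(LayoutKind.Sequential)]
    private struct IO_COUNTERS
    {
        public ulong ReadOperationCount, WriteOperationCount, OtherOperationCount;
        public ulong ReadTransferCount, WriteTransferCount, OtherTransferCount;
    }

    [StructLayout(LayoutKind.Sequential)]
    private struct JOBOBJECT_EXTENDED_LIMIT_INFORMATION
    {
        public JOBOBJECT_BASIC_LIMIT_INFORMATION BasicLimitInformation;
        public IO_COUNTERS IoInfo;
        public UIntPtr ProcessMemoryLimit;
        public UIntPtr JobMemoryLimit;
        public UIntPtr PeakProcessMemoryUsed;
        public UIntPtr PeakJobMemoryUsed;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string lpName);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool SetInformationJobObject(IntPtr hJob, int infoClass,
        ref JOBOBJECT_EXTENDED_LIMIT_INFORMATION info, uint cbInfoLength);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool AssignProcessToJobObject(IntPtr hJob, IntPtr hProcess);

    // Creates an unnamed job, sets a per-process committed-memory cap,
    // and places 'process' in it. Returns the job handle (the caller owns it).
    public static IntPtr LimitProcessMemory(Process process, ulong maxBytes)
    {
        IntPtr job = CreateJobObject(IntPtr.Zero, null);
        if (job == IntPtr.Zero) throw new Win32Exception();

        var info = new JOBOBJECT_EXTENDED_LIMIT_INFORMATION();
        info.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_PROCESS_MEMORY;
        info.ProcessMemoryLimit = new UIntPtr(maxBytes);

        uint size = (uint)Marshal.SizeOf(typeof(JOBOBJECT_EXTENDED_LIMIT_INFORMATION));
        if (!SetInformationJobObject(job, JobObjectExtendedLimitInformation, ref info, size))
            throw new Win32Exception();
        if (!AssignProcessToJobObject(job, process.Handle))
            throw new Win32Exception();

        return job;
    }
}
```

For brevity the sketch never closes the job handle, not even on the failure paths; a real wrapper would hold it in a SafeHandle.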

Combining these two ideas, one should be able to place limits on I/O as well as virtual memory usage.  I recently began working on a simple library that combines the two into a unified, basic sandbox; time is the only thing holding back a release.  I will share whatever code I can when I get the opportunity, preferably in the form of a usable API.

Saturday, December 8, 2012

Basic HttpClient class

So, I do a lot of web scraping - work fit for an episode of Dirty Jobs I'd say - and I end up writing a lot of code to do this.  My standard toolkit includes HtmlAgilityPack for HTML parsing, Fiddler for monitoring network traffic, Firebug for both, HttpWeb(Request|Response) (System.Net), and a web scraping library that I wrote to simplify my life, modifying the code as needed.

So, here's a bare-bones version of a basic HttpWebClient that stores cookies for authentication, sessions, and so on - an excerpt of the library I use.  Obviously it doesn't process JavaScript or deal with JavaScript-set cookies; have fun with those.  It uses the C# 5 async keyword, so .NET 4.5 is required.


Code:
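using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading.Tasks;

public class HttpWebClient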
{
    public WebProxy WebProxy { get; set; }
    public CookieContainer CookieContainer { get; set; }

    private readonly int _timeoutMilliseconds;

    static HttpWebClient()
    {
        ServicePointManager.UseNagleAlgorithm = true;
        ServicePointManager.MaxServicePoints = 500;
        ServicePointManager.DefaultConnectionLimit = 500;
        ServicePointManager.Expect100Continue = false;
    }

    // Default to 30 second timeout
    public HttpWebClient(WebProxy proxy = null, int timeoutMilliseconds = 30000)
    {
        this.WebProxy = proxy;
        this.CookieContainer = new CookieContainer();

        _timeoutMilliseconds = timeoutMilliseconds;
    }

    public async Task<HttpWebResponse> HttpGet(string url)
    {
        var request = ConstructHttpGetRequest(url);
        return await GetHttpResponse(request);
    }

    public async Task<HttpWebResponse> HttpPost(string url, string postData)
    {
        var request = ConstructHttpPostRequest(url, postData);
        return await GetHttpResponse(request);
    }

    public async Task<HttpWebResponse> HttpPost(string url, Dictionary<string, string> valueDictionary)
    {
        var postData = GeneratePostBody(valueDictionary);
        var request = ConstructHttpPostRequest(url, postData);
        return await GetHttpResponse(request);
    }

    public HttpWebRequest ConstructHttpGetRequest(string url)
    {
        return CreateDefaultHttpWebRequest(url, "GET");
    }

    public HttpWebRequest ConstructHttpPostRequest(string url, Dictionary<string, string> valueDictionary, string host = null)
    {
        var postData = GeneratePostBody(valueDictionary);
        return ConstructHttpPostRequest(url, postData);
    }

    public HttpWebRequest ConstructHttpPostRequest(string url, string postData)
    {
        var request = CreateDefaultHttpWebRequest(url, "POST");
        WriteToHttpWebRequestStream(request, postData);
        return request;
    }

    protected void WriteToHttpWebRequestStream(HttpWebRequest httpWebRequest, string data)
    {
        // UTF-8 matches ASCII for plain ASCII input but does not turn other characters into '?'
        WriteToHttpWebRequestStream(httpWebRequest, Encoding.UTF8.GetBytes(data));
    }

    protected void WriteToHttpWebRequestStream(HttpWebRequest httpWebRequest, byte[] data)
    {
        using (var requestStream = httpWebRequest.GetRequestStream())
        {
            var contentBytes = data;
            requestStream.Write(contentBytes, 0, contentBytes.Length);
        }
    }

    protected HttpWebRequest CreateDefaultHttpWebRequest(string url, string method, string accept=null)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        // Default to HTTP 1.0
        request.ProtocolVersion = HttpVersion.Version10;

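        // Note: Timeout only applies to synchronous calls such as GetRequestStream();
        // it has no effect on the asynchronous BeginGetResponse used in GetHttpResponse.
        request.Timeout = _timeoutMilliseconds;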
        request.Host = new Uri(url).Host;
        request.CookieContainer = CookieContainer;
        request.Method = method;
        // Use the caller's Accept header if one was supplied, otherwise a browser-like default
        request.Accept = accept ?? "application/json,text/javascript,text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.ContentType = "application/x-www-form-urlencoded";
        request.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
            
        if(this.WebProxy != null)
            request.Proxy = this.WebProxy;

        return request;
    }

    public async Task<HttpWebResponse> GetHttpResponse(HttpWebRequest request)
    {
        HttpWebResponse response = await Task<HttpWebResponse>.Factory.FromAsync(request.BeginGetResponse, r => (HttpWebResponse) request.EndGetResponse(r), null);
        return response;
    }
        
    public void ClearSession()
    {
        // Always start over with an empty container, even if the property was set to null
        CookieContainer = new CookieContainer();
    }

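    public static string GeneratePostBody(Dictionary<string, string> postValues)
    {
        // Percent-encode keys and values so characters like '&', '=' and spaces don't corrupt the body
        var values = String.Join("&", postValues.Select(kv =>
            Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value ?? String.Empty)));
        return values;
    }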
}   


On top of this basic class, you can build APIs for various websites and HTTP services.  Nothing fancy at all, just saves a bit of typing.  Change whatever properties you need and enjoy!
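
Since the class hands back a raw HttpWebResponse, the first thing most callers will want is the body as a string.  Here's a small sketch of a helper for that; ResponseReader and its method names are made up for this post, and it assumes UTF-8 when no encoding is given rather than inspecting the response's charset.

```csharp
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: turns a response (or any stream) into a string.
public static class ResponseReader
{
    // Reads the whole response body as text and disposes the response afterwards.
    public static async Task<string> ReadBodyAsync(WebResponse response, Encoding encoding = null)
    {
        using (response)
        using (var stream = response.GetResponseStream())
        {
            return await ReadAllTextAsync(stream, encoding ?? Encoding.UTF8);
        }
    }

    // Reads a stream to the end using the given encoding; the reader closes the stream.
    public static async Task<string> ReadAllTextAsync(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

With an HttpWebClient instance named client, fetching a page then becomes one line inside an async method: var html = await ResponseReader.ReadBodyAsync(await client.HttpGet(url));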

Thursday, May 10, 2012

Using Dropbox as a free source code repository

So, I'm not sure why I didn't think of this sooner, but it turns out that you can use Dropbox, which comes with 2GB of free storage space, to host version-controlled projects (Git, SVN, etc.).

I found a nice tutorial for doing this using Visual Studio and Git Extensions at the following address:
http://www.remondo.net/git-source-control-for-visual-studio-2010-on-dropbox/

Definitely useful and something to keep in mind for personal projects and such.

Friday, April 20, 2012

CodeMirror Rendering Issues

So, I was having a very, very annoying problem with CodeMirror where the editor would leak outside of the bounds it was given, causing the text to be outside of the CodeMirror div.  Typing into the editor would cause it to grow to a ridiculous size (10000px+) with each character typed.  It turns out I was simply missing a reference to the codemirror.css file.

This is an ASP.NET MVC4 application.  I guess I missed that part of the CodeMirror tutorial, but it bothered me enough to blog about.  Fix is as simple as:

<link href="/Scripts/CodeMirror/lib/codemirror.css" rel="stylesheet" type="text/css" />