Share What We Say




Releasing Astor: A developer tool for token-based authentication

Leandro Boffi - Sun, 2014-04-13 10:09

I’ve just published to npm the first version of Astor. Astor is a command-line developer tool that helps you when working with token-based authentication systems.

At the moment it allows you to issue tokens (right now it supports the JWT and SWT formats) to test your APIs. Basically, you can do something like this:

$ astor issue -issuer myissuer -profile admin -audience http://myapi.com/

The result of running that command will be something like this:

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJodHRwOi8vc2NoZW1hcy54bWxzb2FwLm9yZy93cy8yMDA1LzA1L2lkZW50aXR5L2NsYWltcy9uYW1lIjoiTGVhbkIiLCJhdWQiOiJodHRwOi8vcmVseWluZ3BhcnR5LmNvbS8iLCJpc3MiOiJodHRwOi8vbXlpc3N1ZXIuY29tLyIsImlhdCI6MTM5NzM3NjU5MX0.d6Cb0IQsltocjOtLsfXhjseLcZpcNIWnHeIv4bqrCv4

Yes! A signed JWT ready to send to your API!

Astor basically works with a configuration file that stores issuers, user profiles and issueSessions configurations; that’s why you can say -issuer myissuer or -profile admin without specifying the issuer key and user claims. To clarify, this is how astor.config looks:

{ "profiles": { "me@leandrob.com": { "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name": "Leandro Boffi", "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/email": "me@leandrob.com" }, "admin": { "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name": "John Smith", "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/email": "John Smith", "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/role": "Administrator", } }, "issuers": { "contoso": { "name": "contoso", "privateKey": "-----BEGIN RSA PRIVATE KEY-----\nMIIEow.... AKCAQEAwST\n-----END RSA PRIVATE KEY-----\n" }, "myissuer": { "name": "http://myissuer.com/", "privateKey": "MIICDzCCAXygAwIBAgIQVWXAvbbQyI5BcFe0ssmeKTAJBg=" } } }

Did you get that? Once you have created the different profiles and issuers, you can combine them very easily to get several different tokens.
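For example, here is a hypothetical combination of the profiles and issuers defined in the config above, using the same flags shown earlier:

$ astor issue -issuer contoso -profile me@leandrob.com -audience http://relyingparty.com/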

Of course, you can start from scratch and specify all the parameters in a single command without using the config file:

$ astor issue -n http://myissuer.com/ -l privateKey.key -a http://relyingparty.com/
Create user profile...
Here you have some common claimtypes, just in case:
- Name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
- Email: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/email
- Name Identifier: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier
- User Principal: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn
claim type (empty for finish): http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
claim value: Leandro Boffi
claim type (empty for finish): http://schemas.xmlsoap.org/ws/2005/05/identity/claims/email
claim value: me@leandrob.com
claim type (empty for finish):
Would you like to save the profile? y
Enter a name for saving the profile: me@leandrob.com
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJodHRwOi8vc2NoZW1hcy54bWxzb2FwLm9yZy93cy8yMDA1LzA1L2lkZW50aXR5L2NsYWltcy9lbWFpbCI6Im1lQGxlYW5kcm9iLmNvbSIsImh0dHA6Ly9zY2hlbWFzLnhtbHNvYXAub3JnL3dzLzIwMDUvMDUvaWRlbnRpdHkvY2xhaW1zL25hbWUiOiJMZWFuZHJvIEJvZmZpIiwiYXVkIjoiaHR0cDovL3JlbHlpbmdwYXJ0eS5jb20vIiwiaXNzIjoiaHR0cDovL215aXNzdWVyLmNvbS8iLCJpYXQiOjEzOTczODMwMzR9.1vy9kyY26NwjOQ4gqfy5ZBIQgovgw0gxd4TcVXWzFok
Would you like to save the session settings? y
Enter session name: token-for-test

As you can see, if you don’t use a stored profile you will be prompted to create one on the spot, and once you have created the profile you can save it in the configuration for future use!

And finally, you can provide a name for the whole session (token-for-test in the example), so next time you need the same settings you can just do:

$ astor issue -s token-for-test

How to install it?

$ npm install -g astor

Next steps?

I’ll be adding token validation functionality, together with other token formats like SAML and maybe authentication flows!

Check the README on GitHub for detailed documentation: https://github.com/leandrob/astor

Hope you found it useful!

Categories: Blogs

Code Coverage in Node.js

Leandro Boffi - Tue, 2014-03-11 23:09

This time I want to share with you something I found very useful: Istanbul (http://gotwarlost.github.io/istanbul/), a code coverage reporting tool.

It’s very easy to use; you just need to install it:

$ npm install -g istanbul

And run it on your solution. For example, if you use mocha:

$ istanbul cover _mocha -- -R spec

If you are using Windows, you must provide the relative path to the _mocha file:

$ istanbul cover node_modules/mocha/bin/_mocha -- -R spec

or if you have mocha installed globally:

$ istanbul cover c:\Users\[your user]\AppData\Roaming\npm\node_modules\mocha\bin\_mocha -- -R spec

and this will be the result:

[Screenshot: coverage summary output]

Once it has run, you can open the HTML report like this:

$ open coverage/lcov-report/index.html

This is how it looks:

[Screenshot: HTML coverage report]

You can even look at the lines that you are not testing:

[Screenshot: line-by-line coverage view]

Just a final tip: add the command to your package.json so it runs every time you do npm test:

"scripts": { "test": "istanbul cover _mocha -- -R spec" }
Categories: Blogs

SAML 2.0 Tokens and Node.js

Leandro Boffi - Wed, 2014-03-05 17:32

As part of my work at Kidozen related to identity management, I’ve just published a new version of a Node.js module that allows you to parse and validate SAML 2.0 assertions (just like the ones that ADFS uses). It also supports SAML 1.1 tokens.

Installation

$ npm install saml20

Usage

The module exposes two methods, validate and parse. The first one validates the signature, the expiration and (optionally) the audience URI, while the second one just parses the token without running any validations, which is useful in multi-IdP scenarios.

This is an example of how to validate a SAML assertion:

var saml = require('saml20');

var options = {
  thumbprint: '1aeabdfa4473ecc7efc5947b18436c575574baf8',
  audience: 'http://myservice.com/'
};

saml.validate(rawAssertion, options, function(err, profile) {
  // err
  var claims = profile.claims;  // Array of user attributes
  var issuer = profile.issuer;  // String issuer name
});

You can use either the thumbprint or the full public key in the options. Check out the GitHub repository page for more examples: https://github.com/leandrob/saml20
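If you only need the claims, or you need to find out which IdP issued the token before deciding how to validate it, you can use parse instead. A minimal sketch, assuming parse takes the raw assertion and the same (err, profile) callback shape as validate (check the README for the exact signature):

var saml = require('saml20');

saml.parse(rawAssertion, function(err, profile) {
  // No signature, expiration or audience checks are performed here
  var claims = profile.claims;  // Array of user attributes
  var issuer = profile.issuer;  // Issuer name, useful for picking the right IdP configuration
});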

I’ve also published an example of how to secure a REST API using this module: https://github.com/kidozen/node-rest-saml.

Categories: Blogs

Making ajax calls with Hawk authentication to ASP.NET Web API

Pablo Blog - Fri, 2014-01-31 17:14

Hawk is an authentication protocol initially written by Eran Hammer in JavaScript for use in Node.js. Later on, Eran also added browser support through a browser.js library, which is also part of the hawk GitHub project.

As part of this post, I will show how you can use that browser.js library to make Ajax calls to an ASP.NET Web API that authenticates the calls with Hawk.

The first thing is to define the Web API to call from JavaScript. For the sake of simplicity, we will use a very simple “Hello World” controller that returns the name of the authenticated user.

public class HelloWorldController : ApiController
{
    public string Get()
    {
        return "hello " + this.User.Identity.Name;
    }
}

As a second step, we will configure the Hawk filter globally to authenticate the calls. This filter is part of the HawkNet integration project with Web API.

public static void Register(HttpConfiguration config)
{
    config.Filters.Add(new RequiresHawkAttribute((id) =>
    {
        return new HawkCredential
        {
            Id = "dh37fgj492je",
            Key = "werxhqb98rpaxn39848xrunpaw3489ruxnpa98w4rxn",
            Algorithm = "hmacsha256",
            User = "steve"
        };
    }));
}

Once we have the Web API configured, we will write the JavaScript to make the call, which uses browser.js to generate the Hawk header on the client.

<script src="~/Scripts/browser.js"></script>
<script>
    $(function () {
        var url = 'http://localhost:28290/api/HelloWorld';

        var credentials = {
            id: 'dh37fgj492je', // Required by hawk.client.header
            key: 'werxhqb98rpaxn39848xrunpaw3489ruxnpa98w4rxn',
            algorithm: 'sha256',
            user: 'Steve'
        };

        $("#helloworld").click(function () {
            var header = hawk.client.header(url, 'GET', { credentials: credentials });

            $.ajax({
                dataType: "json",
                headers: {
                    authorization: header.field
                },
                url: url,
                success: function (error, response, body) {
                    $("#response").html(body.responseText);
                }
            });
        });
    });
</script>

As you can see, the part that matters in the sample above is how you generate the Hawk header by calling the “header” method on the client object provided by the browser.js library. That method receives the URL, the HTTP method and the credentials. The result of that call is the Hawk authentication header, which is attached as the Authorization header in the Ajax call.

The complete sample can be found in the HawkNet GitHub project.

Categories: Blogs

Coordinating async work in Node.js

Pablo Blog - Mon, 2014-01-13 13:39

When you first move to Node.js, you need to get used to writing asynchronous, single-threaded code that does not block. For some scenarios, writing that kind of code is not trivial, especially when you have to coordinate the execution of multiple async calls in parallel and return a result after all those tasks have completed. One thing you don’t want to do in that scenario is block the main thread to wait for all the calls, which is how you would normally do it on other platforms. Let’s discuss a very trivial example that coordinates two calls for getting information from two different sources, which is subsequently returned as a response message in Express.

exports.index = function(req, res){
  var returnResponse = false;
  var data = {};

  getName(function(err, name) {
    if(err) return callback(err);

    data.name = name;

    if(returnResponse)
      res.render('index', { title: 'Express', data: data });

    returnResponse = true;
  });

  getPhone(function(err, phone) {
    if(err) return callback(err);

    data.phone = phone;

    if(returnResponse)
      res.render('index', { title: 'Express', data: data });

    returnResponse = true;
  });
};

The code above calls two functions to retrieve a name and a phone from different places asynchronously. The results of the calls are temporarily stored as members of the “data” variable. The ugly part is how the “returnResponse” variable is also used to signal that each function has done its part and a response can be returned. This contains duplicated code, it’s error-prone, and it can easily get more complicated as the number of async calls increases.

As an interesting fact, there isn’t any built-in or native functionality to handle a scenario like this in Node.js. That’s where a module like “async” comes into play. “Async” is a Swiss Army knife for asynchronous work in Node.js. It provides a ton of functions for coordinating asynchronous tasks, such as the ones shown in the next examples.
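Async is a regular npm module, so installing it in your project is just:

$ npm install async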

One of the functions you will find useful for a scenario like this is parallel. The parallel function receives an array of async functions to call, and invokes a final callback when all the functions have completed their part.

exports.index = function(req, res){
  async.parallel([
    function(callback){
      getName(function(err, name) {
        if(err) return callback(err);
        callback(null, name);
      });
    },
    function(callback){
      getPhone(function(err, phone) {
        if(err) return callback(err);
        callback(null, phone);
      });
    }
  ],
  function(err, results) {
    var data = {
      name : results[0],
      phone: results[1]
    };

    res.render('index', { title: 'Express', data: data });
  });
}

Every function in the Array passed to the parallel function receives a callback argument, which must be called after the work is completed. That callback receives two arguments, an error if something went wrong and a result. All the collected results and errors are later passed to the final callback.

Another useful function is “each”, which is similar to “parallel”, but it iterates over an array and invokes a function representing the async work for every element in that array.

exports.index = function(req, res) {
  var results = [];

  async.each([1, 2, 3], function(item, callback) {
    getDataForItem(item, function(err, data) {
      if(err) return callback(err);

      // async.each's final callback only receives an error,
      // so we collect the results ourselves
      results.push(data);
      callback();
    });
  }, function(err) {
    res.render('index2', { title: 'Express', data: results });
  });
}

In the code above, the function is executed three times with the values “1”, “2” and “3”. A callback is also executed when the work is completed. The final callback is executed after all the functions invoked the callback.

This is a good way to coordinate async work in Node.js, but you also have promises, which are not yet widely adopted in the platform as a way to represent async tasks. They are implemented by a lot of modules out there, but they are not something standard yet. A promise is an object that represents an asynchronous task. Among other things, this object gives you a way to manipulate the task, determine when it is completed, or chain other tasks, for example. The following example illustrates what a promise looks like in the Mongoose module (example included in the Mongoose docs):

promise = Meetups.find({ tags: 'javascript' }).select('_id').exec();

promise.then(function (meetups) {
  var ids = meetups.map(function (m) {
    return m._id;
  });
  return People.find({ meetups: { $in: ids } }).exec();
}).then(function (people) {
  if (people.length < 10000) {
    throw new Error('Too few people!!!');
  } else {
    throw new Error('Still need more people!!!');
  }
}).then(null, function (err) {
  assert.ok(err instanceof Error);
});

Meetups is a Mongoose object for executing queries against a MongoDB collection. “exec” starts the call and returns a promise, and “then” is a method on that promise for chaining more work once the execution is completed. In this example, another promise to find people is executed after all the meetups have been found.

Categories: Blogs

OAuth Bridge for ADFS with ThinkTecture Authorization Server

Pablo Blog - Fri, 2013-12-27 13:23

ADFS 2.0 only supports SAML 2.0 and WS-Federation for providing single sign-on to web applications and services. However, some new development platforms, such as iOS, only support OAuth 2.0, which makes the use of ADFS a bit tricky. ADFS for Windows Server 2012 R2 already provides some limited support for JSON Web Tokens (JWT) with the OAuth 2.0 code flow (as described in this excellent post by Vittorio).

For some other scenarios in which your applications or services rely on JWTs for doing authentication/authorization, the Thinktecture Authorization Server represents a very nice alternative, which integrates really well with ADFS. The Authorization Server can act as a bridge or broker between client apps that use OAuth 2.0 to get a JWT, and ADFS, which is used for authenticating the users.

The configuration of the Authorization Server is very simple. Once you get Authorization Server deployed in IIS, it uses Entity Framework Code First to automatically generate the backend database for you on first use. After that, you only need to change the configuration files to trust ADFS, as shown below:

[Authorization Server Folder]\Configuration\identityModel.config

<issuerNameRegistry type="System.IdentityModel.Tokens.ValidatingIssuerNameRegistry, System.IdentityModel.Tokens.ValidatingIssuerNameRegistry">
  <authority name="http://[your ADFS server]/adfs/services/trust">
    <keys>
      <add thumbprint="[signing cert thumbprint]"/>
    </keys>
    <validIssuers>
      <add name="http://[your ADFS server]/adfs/services/trust" />
    </validIssuers>
  </authority>
</issuerNameRegistry>

[Authorization Server Folder]\Configuration\identityModel.services.config

<system.identityModel.services>
  <federationConfiguration>
    <wsFederation passiveRedirectEnabled="true"
                  issuer="https://[your ADFS server]/adfs/ls/"
                  realm="urn:authorizationserver" />
  </federationConfiguration>
</system.identityModel.services>

urn:authorizationserver is the relying party identifier I used for configuring Authorization Server in ADFS. In ADFS, you only need to configure a new relying party with a WS-Federation endpoint and set the URL of the Authorization Server. Also make sure the relying party identifier matches the one you configured in Authorization Server.

The following code shows how to get a JWT from Authorization Server using the Resource Owner flow (the user is authenticated by Authorization Server in ADFS):

private static string GetToken()
{
    ServicePointManager.ServerCertificateValidationCallback = (o, cert, chain, ssl) => true;

    HttpClient client = new HttpClient();

    HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post,
        "https://[your auth server]/[App namespace in Authorization Server]/oauth/token");

    request.Headers.Add("Authorization", "Basic " +
        Convert.ToBase64String(System.Text.ASCIIEncoding.ASCII.GetBytes(
            string.Format("{0}:{1}", "[client id]", "[client secret]"))));

    request.Content = new FormUrlEncodedContent(new Dictionary<string, string>
    {
        {"grant_type", "password"},
        {"username", "[account]"}, // Windows account name without domain
        {"password", "[password]"},
        {"scope", "All"}
    });

    HttpResponseMessage response = client.SendAsync(request).Result;
    response.EnsureSuccessStatusCode();

    JObject json = JObject.Parse(response.Content.ReadAsStringAsync().Result);

    string accessToken = json["access_token"].ToString();
    string refreshToken = json["refresh_token"].ToString();

    return accessToken;
}

You can find more details on Dominick’s blog.

Categories: Blogs

Automatic Client Cert Detection in ADFS 2.0

Pablo Blog - Fri, 2013-11-22 15:29

ADFS 2.0 supports multiple authentication methods through authentication handlers that are mutually exclusive. If one of the handlers runs, the others don’t. There is no way to implement fallback logic if one of the handlers fails to run or the user was not able to provide the expected credentials, so supporting dual authentication, such as username/password and client certificates, is sometimes problematic. The following list shows the handlers included out of the box with ADFS 2.0:

<microsoft.identityServer.web>
  <localAuthenticationTypes>
    <add name="Forms" page="FormsSignIn.aspx" />
    <add name="Integrated" page="auth/integrated/" />
    <add name="TlsClient" page="auth/sslclient/" />
    <add name="Basic" page="auth/basic/" />
  </localAuthenticationTypes>
</microsoft.identityServer.web>

Those handlers are for Forms authentication (username/password in an HTML form), Integrated Windows authentication, client certificate authentication, and finally HTTP Basic authentication. The order in this list determines the priority by default, unless the priority has been changed in the execution context. The order in the execution context can be changed in multiple ways. For example, when you are doing WS-Federation, the relying party can pass an additional “WAuth” query string parameter with the expected authentication type, such as “urn:oasis:names:tc:SAML:1.0:am:password” for Forms authentication or “urn:ietf:rfc:2246” for client certificate authentication, which overrides the priority in the list set in the configuration. The same thing can be done for SAML 2.0 by changing the authentication context in the SAML request message or changing the URL.
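For example, a WS-Federation sign-in request that forces client certificate authentication could look like this (the ADFS host and realm here are hypothetical):

https://adfs.contoso.com/adfs/ls/?wa=wsignin1.0&wtrealm=urn%3Arelyingparty&wauth=urn%3Aietf%3Arfc%3A2246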

However, as I said before, once one of the handlers runs, you will not have a chance to run any of the other handlers unless you implement some workaround. For example, you might want to add a checkbox to the HTML form used for Forms authentication to allow the user to authenticate with a client certificate. When the checkbox is clicked, you do an HTTP redirect to ADFS, but this time changing the URL to use the new authentication method. If the RP is using WS-Federation, that means a new redirect that includes the WAuth query string variable with the value “urn:ietf:rfc:2246”, or a redirect to “auth/sslclient/” in the case of SAML 2.0.

If you want to avoid this manual step and detect the client certificate automatically, more work is involved, and it is what we are going to discuss next.

  1. Enable Forms Authentication as default in the configuration file. The “Forms” handler must be on top of the list.
  2. Create a new virtual directory in the ADFS IIS, and configure that virtual directory to require HTTPS and Accept Client Certs.

[Screenshot: IIS Settings]

  3. Configure an ASP.NET Generic handler in that virtual directory to check if the certificate is present or not.
using System;
using System.Web;

public class CertificateHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "application/javascript";

        if (context.Request.ClientCertificate.IsPresent)
        {
            context.Response.Write("if(typeof(certificateCallback) == 'function') { (certificateCallback) (true); }");
        }
        else
        {
            context.Response.Write("if(typeof(certificateCallback) == 'function') { (certificateCallback) (false); }");
        }
    }

    public bool IsReusable
    {
        get { return false; }
    }
}

This code checks if the client certificate is present and invokes a JavaScript callback with “true” or “false” based on that condition. This code does not check the authenticity of the cert or anything else; it only determines whether the certificate is present. The rest of the validations will be done by ADFS.

  4. Configure this handler in the web.config of that virtual directory.
<system.webServer>
  <handlers>
    <add name="certificate" type="CertificateHandler" path="Script" verb="GET"/>
  </handlers>
</system.webServer>
  5. Include a CertCallBack.js JavaScript file in the ADFS folder with the code required for the callback:
window.certificateCallback = function (cert) {
  var url;

  if (cert) {
    url = "auth/sslclient/" + window.location.search;
    document.location.href = url;
  }
};

This code is very simple. It just does a redirect when the cert is present (the callback was called with a “true” value). The redirect in this sample is only valid for SAML 2.0.

  6. Modify the FormsSignIn.aspx page to include the following code at the end of the Page_Load event:
string script = System.IO.File.ReadAllText(Server.MapPath("CertCallBack.js"));

this.Page.ClientScript.RegisterClientScriptBlock(this.GetType(), "callback", script, true);
this.Page.ClientScript.RegisterClientScriptInclude("certauth", "/CertDetection/Script");

This code injects our CertCallBack.js script into the page, and it also includes the script located in the virtual directory configured to accept client certs (/CertDetection/Script basically invokes the ASP.NET handler).

That should be all. If the browser detects a client certificate when the script in the CertDetection virtual directory is resolved, the callback will be executed with a true value, making the form redirect to authenticate the client with a client certificate instead. Otherwise, the client will see the form to enter the username and password. We are adding a new virtual directory so as not to interfere with the existing authentication methods in the ADFS virtual directory.

The code for the handler and the JavaScript callback is included in the attached zip file.

Categories: Blogs

Injecting dynamic content in Windows Azure packages

Pablo Blog - Thu, 2013-11-21 13:52

The Windows Azure Tools 1.7 introduced a new feature for adding content to Windows Azure projects called “Role Content Folders”. In some scenarios, you might want to add custom content such as static pages, documentation, configuration files or external binaries. This is useful, for example, if you want to deploy a solution not written in .NET, such as a Java implementation, and you don’t want to mix that with the .NET code in the role project. The following image shows how the content folders are added in the Azure project.

[Screenshot: Content Folders in the Azure project]

That content is included in the generated Azure package, and it is deployed to the AppRoot folder when the package is finally published to a VM in the cloud.

One of the problems with this feature is that you might want to include content whose structure changes often, or content with thousands of folders and files, which requires tedious manual work in Visual Studio to keep that content up to date in the project.

The good news is that you can use an MSBuild target to inject that custom content automatically when the package is being generated. You have to include a custom target, “BeforeAddRoleContent”, right after the import of “Microsoft.WindowsAzure.targets”, as shown below:

<Import Project="$(CloudExtensionsDir)Microsoft.WindowsAzure.targets" />

<Target Name="BeforeAddRoleContent">
  <ItemGroup>
    <AzureRoleContent Include="..\Solr\Solr">
      <RoleName>SolrMasterHostWorkerRole</RoleName>
      <Destination>Solr</Destination>
    </AzureRoleContent>
    <AzureRoleContent Include="..\Solr\jre6">
      <RoleName>SolrMasterHostWorkerRole</RoleName>
      <Destination>jre6</Destination>
    </AzureRoleContent>
    <AzureRoleContent Include="..\Solr\SolrFiles">
      <RoleName>SolrMasterHostWorkerRole</RoleName>
      <Destination>SolrFiles</Destination>
    </AzureRoleContent>
    <AzureRoleContent Include="..\Solr\Solr">
      <RoleName>SolrSlaveHostWorkerRole</RoleName>
      <Destination>Solr</Destination>
    </AzureRoleContent>
    <AzureRoleContent Include="..\Solr\jre6">
      <RoleName>SolrSlaveHostWorkerRole</RoleName>
      <Destination>jre6</Destination>
    </AzureRoleContent>
    <AzureRoleContent Include="..\Solr\SolrFiles">
      <RoleName>SolrSlaveHostWorkerRole</RoleName>
      <Destination>SolrFiles</Destination>
    </AzureRoleContent>
  </ItemGroup>
</Target>

The example above injects several content folders in two different roles, “SolrMasterHostWorkerRole” and “SolrSlaveHostWorkerRole”. The “Include” attribute specifies the source folder, and the Destination folder within the AppRoot is specified in the Destination element.

Categories: Blogs

Full-Text Searches in SQL Azure with Solr

Pablo Blog - Fri, 2013-11-01 18:41

Solr is a robust search platform created by the open source community on top of Apache Lucene. It’s completely written in Java and uses the Lucene Java implementation at its core for full-text indexing and search. In addition, it exposes an HTTP interface for running full-text searches and performing management tasks. On the other hand, we have SQL Azure, which currently does not support full-text search, so these two services complement each other very well.

As Solr is mainly a Java implementation, you only have a few alternatives for running it on Windows Azure. You can deploy it as a worker role together with the Java runtime, or you can deploy it in a VM. As with any solution in the cloud, the state persisted in the worker role or VM goes away when the VMs are replaced or go down. As Solr persists its indexes on disk, you need to make sure they are stored in permanent storage like Azure Drive or the storage service. If you decide to use a worker role, this requires some additional work to make Solr store the indexes in an Azure Drive, for example, which is a VHD stored in the storage service that can be mounted by the VM as a local disk. The good news is that MS Open Tech has already done this for us. They have created a template that deploys Solr with a failover configuration (master-slave) in two worker roles: one role for the master node and another role for the slave node. The slave node replicates from the master node, so if you lose one of them, you still have the other node available. In addition, it configures a web role with an MVC application that acts as an admin dashboard for doing basic management tasks. This solution is hosted on GitHub as part of the Windows-Azure-Solr project. The GitHub site also provides instructions for deploying the solution to Windows Azure.

The template that you download from GitHub imports data into Solr by crawling some URLs. That’s part of the data-config.xml file that you can find in the configuration folder of the master and slave nodes (SolrMasterWorkerRole\SolrFiles\data-config.xml and SolrSlaveWorkerRole\SolrFiles\data-config.xml). Solr supports the idea of data importers, which can be used to import data from different sources such as existing web sites, files on disk or even a database.

In this case, we will modify that data-config.xml file to use a data importer that pulls data from an existing SQL Azure instance. The following example shows how this data importer configuration looks:

<dataConfig>
  <dataSource type="JdbcDataSource"
              name="ds1"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://[your server];database=[your database];user=[your user];password=[your password];encrypt=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
              readOnly="true" />
  <document name="articles">
    <entity name="article"
            dataSource="ds1"
            pk="id"
            query="SELECT id, title, description, tags, author, lastupdated from Articles"
            deltaQuery="select id FROM Articles WHERE LastUpdated &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, title, description, tags, author, lastupdated from Articles where id = '${dataimporter.delta.id}'">
      <field column="id" name="id" />
      <field column="title" name="title" />
      <field column="description" name="description" />
      <field column="tags" name="tags" />
      <field column="author" name="author" />
      <field column="lastupdated" name="lastupdated" />
    </entity>
  </document>
</dataConfig>

First of all, we have defined a dataSource element that points to our SQL Azure instance. This is a JDBC data source that uses the SQL Server driver and sets the connection string in the url attribute. Secondly, we have defined a document, which specifies one or more entities that are mapped from the SQL Azure database with a select statement. The fields section maps the different fields in the document to the fields returned by the select statements. This document is what Solr stores in the index using Lucene. As you can see, three queries have been defined. The first one, with the “query” attribute, is used by the data importer when a full import of the complete database is done into Solr. The other two queries are used to support a delta or partial import scenario. These two are optional and only useful in scenarios where you have frequent updates and a lot of data to import, which would make the full import considerably slow.

Since this data importer uses the SQL Server driver for the JDBC data source, you will have to download that package from the Microsoft website and copy it into the folder where Solr looks for external libraries (SolrMasterWorkerRole\Solr\dist and SolrSlaveWorkerRole\Solr\dist).

So far we have defined the mapping of a document against one or more tables in the database, but Solr still requires the definition of those fields as part of the schema. The schema definition can be found in the schema.xml file (SolrMasterWorkerRole\SolrFiles\v44\schema.xml and SolrSlaveWorkerRole\SolrFiles\v44\schema.xml). The following example shows how the schema is modified to include the fields used by the data importer.

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="title" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="description" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="tags" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="author" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="lastupdated" type="date" indexed="true" stored="true" required="true" multiValued="false" />

This is enough to get Solr configured to pull data from SQL Azure, so we are ready to deploy Solr into a Windows Azure subscription, again using the tool provided by MS Open Tech. If you have installed all the prerequisites and followed the instructions on the wiki page of the GitHub site, the following command should be enough to perform the deployment:

Inst4WA.exe -XmlConfigPath "SolrInstWR_V4.4.xml" -subscription "[subscription name]" -location "[location]" -DomainName "[cloud service name]"

If you are lucky enough to get the command to work on the first try, it will open a new browser window when the deployment is complete and redirect you to the MVC dashboard running in the web role.

You are now ready to execute a few commands to import the data into Solr. The home page of the dashboard shows the location and port where the Solr master and slave nodes are running. To do a full import into the master role, you copy that location and port and append “/dataimport?command=full-import”. For example, “http://mysamplesolr.cloudapp.net:21000/solr/dataimport?command=full-import”. That starts a full import that runs asynchronously, so you can use this other command to check the status of the import: “/dataimport?command=status”. To do a partial import, you only change the command to “/dataimport?command=delta-import”. Once the import is complete, you can run a search to verify that everything looks OK. That can be done with “/select?q=[query]”. For example, “http://mysamplesolr.cloudapp.net:21000/solr/select?q=azure”.
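Putting those together against the sample host used above, the import and query URLs look like this:

http://mysamplesolr.cloudapp.net:21000/solr/dataimport?command=full-import
http://mysamplesolr.cloudapp.net:21000/solr/dataimport?command=status
http://mysamplesolr.cloudapp.net:21000/solr/dataimport?command=delta-import
http://mysamplesolr.cloudapp.net:21000/solr/select?q=azure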

So you have Solr indexing all your data now, but what about security? This thing is open to the world. Anyone can do anything with your Solr instances, as everything is public. There are some ways to secure the Solr pages by changing settings in the web server, which is Jetty by default. However, the Solr documentation recommends putting Solr behind a reverse proxy that filters the requests. There are several reverse proxy implementations for Solr on GitHub, but I will use a different approach for Windows Azure here. Given that the MS Open Tech template includes a web role for the MVC dashboard and two worker roles for running the Solr master and slave instances, we can expose the Solr instances on internal endpoints only and use the MVC application as a facade or reverse proxy that forwards all the requests to those instances. The only public and visible face will be the MVC application in the web role. All the requests for Solr must go through this MVC application first, which can filter any request that looks malicious or that could damage the existing indexes. A simple way to do this is to allow only GET operations and filter the rest. This is the approach I’ve taken and implemented as part of a fork of the MS Open Tech project on GitHub. The fork is available here.

Categories: Blogs

Don't Inject Markup in A Web Page using Document.Write

Professional ASP.NET Blog - Tue, 2013-06-04 15:33
Look around: just about every consumer-facing site you visit these days has a third-party script reference. Just about everyone uses Google Analytics, and if you are like a former client of mine you have it and 2 other traffic analysis service scripts injected...(read more)
Categories: Blogs

Sending a Photo via SMS on Windows Phone

Professional ASP.NET Blog - Thu, 2013-05-30 03:01
Smartphones are awesome. They are the modern Swiss Army Knife because they do so much. One of the most important features in my opinion is taking photos. My Nokia Lumia has one of the best cameras available in a Smartphone and I like to use it all the...(read more)
Categories: Blogs

You Don't Need Windows To Test Your Web Site in Internet Explorer

Professional ASP.NET Blog - Wed, 2013-05-29 17:25
I know the majority of developers reading my Blogs are typically ASP.NET, enterprise developers. This means they develop on a Windows machine using Visual Studio most of the time. However in the broad market most modern web developers work on a MAC or...(read more)
Categories: Blogs

Using The New Git Support in WebMatrix 3

Professional ASP.NET Blog - Sun, 2013-05-26 15:19
WebMatrix is probably my favorite web development IDE because it is so simple and easy to use. Sure I use Visual Studio 2012 everyday and it has probably the best web development features available on the market. I also really dig Sublime. WebMatrix is...(read more)
Categories: Blogs

Publish to Directly To Azure Web Sites With WebMatrix

Professional ASP.NET Blog - Wed, 2013-05-01 20:39
WebMatrix is one of my favorite development tools because it really allows me to focus on what I love to do most, build modern web clients. It is a free Web IDE available from Microsoft and today they released version 3 for general availability . There...(read more)
Categories: Blogs

17000 Tweets in 365 Days - Not Too Many To Be Annoying

Professional ASP.NET Blog - Tue, 2013-04-30 14:29
What the heck was I thinking? Why did I do it? What did I learn? How did I do it? These are all things I have asked myself and others have asked me over the past year. It sounds like an odd labor to undertake and such an odd number. But yes I did 17,000...(read more)
Categories: Blogs

Introducing ToolbarJS - A HTML5 JavaScript Library to Implement the Windows Phone AppBar Functionality

Professional ASP.NET Blog - Sun, 2013-04-28 12:03
Back in February I released deeptissuejs, a HTML5, JavaScript touch gesture library. In January I released panoramajs, a HTML5, JavaScript library to implement the basic Windows Phone panorama control experience. This month I am excited to release another...(read more)
Categories: Blogs

HTML5 and CSS3 Zebra Striping - Look Ma No JavaScript

Professional ASP.NET Blog - Mon, 2013-04-22 11:36
It was 5 maybe 6 years ago when I first started learning jQuery. One of the first things I did was order the jQuery In Action book . If you have read that book you should remember one of the first examples given, zebra striping a table. To me this example...(read more)
Categories: Blogs

Listen to Me Talk to Carl & Richard about the Surface Pro, Mobile Development and More

Professional ASP.NET Blog - Thu, 2013-04-18 11:53
A few weeks ago I got to sit down and chat with the DotNetRocks guys about a variety of topics. The initial premise for the interview was to talk about the Surface and why I love it so much. I think we got into some great tangents right from the start!...(read more)
Categories: Blogs

Why Its Time to Sunset jQuery

Professional ASP.NET Blog - Sun, 2013-04-14 14:15
I owe so much to John Resig and the jQuery team for creating such a wonderful framework. I have staked most of my recent career on jQuery the way I staked my career on ASP.NET back in 2001. I have built many applications using jQuery over the past five...(read more)
Categories: Blogs

The Good and Bad For MtGox.com - Helping it Scale With Web Performance Optimization

Professional ASP.NET Blog - Fri, 2013-04-12 13:30
BitCoin seems to be the latest rage with wild value fluctuations. The past few days have seen a very wild roller coaster for the online currency. Most of the world's BitCoins are exchanged at MtGox.com, which has had some issues either with a denial of service...(read more)
Categories: Blogs