Neo Blockchain On Azure: Introduction to NEO-CLI

In my previous post, we started with a small NEO private net. Today, we will take a quick look at NEO-CLI and what it offers. Although it is named NEO-CLI, in practice it is a full-blown NEO blockchain node rather than just a CLI tool for talking to one. NEO offers two node distributions, NEO-GUI and NEO-CLI; the suffix comes from that, and I wanted to mention it explicitly since it is a tad confusing.

At first, we will try to connect to our newly created private net. To do that, we will install a standalone copy of NEO-CLI. Installing NEO-CLI is pretty straightforward. You will need .NET Core installed on your machine; if you don't have it, follow the instructions here.

Installation

I'm currently using Ubuntu 16.04 as a reference OS. After installing the .NET Core framework, you will need to install NEO-CLI's native dependencies. And since I'm on a Debian-based distro, it was quite easy to do so the following way:

sudo apt-get install libleveldb-dev sqlite3 libsqlite3-dev
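That command only takes care of the LevelDB and SQLite dependencies; the NEO-CLI binaries themselves come from the neo-project/neo-cli GitHub repository. As a rough sketch (release archive names, versions and folder layout change between releases, so check the releases page for the current one), you can either download a published release or build from source:

git clone https://github.com/neo-project/neo-cli
# project folder layout may differ between versions; adjust the path accordingly
cd neo-cli/neo-cli
dotnet restore
dotnet publish -c Release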

Configuration

We are testing locally, and my machine has no frame of reference for the private net we just created on Azure. To give this node that frame of reference, we need to configure its SeedList to point to our own private net. What is a seed list? Simply put, it is nothing more than a list of URLs, as described in the official NEO documentation. It is the first set of nodes NEO-CLI will try to connect to when it boots up.

To configure the aforementioned SeedList, we will modify the protocol.json file, under the neo-cli directory.

We need to update the SeedList section of the configuration the following way:

"SeedList": [
    "IP_or_FQDN_of_Azure_Private_Net_Host:20333"
],

If you opt to use the public test net instead, rename the protocol.testnet.json file to protocol.json and you should be good to go.

Booting up the node

Now it is time to start the node. We are going to invoke:

dotnet neo-cli.dll --log --nopeers

The --log option logs smart contract information, and --nopeers makes the node connect only to the seed nodes from the configuration file. This is something we want, since this is a private network.
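Once the neo> prompt comes up, a couple of built-in commands help verify that the node is alive and talking to our private net (help lists everything the version you installed supports):

neo> show state
neo> show node

show state prints the current block height and the number of connected nodes; if the height keeps climbing, the node is syncing against the seed we configured.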

Creating a new wallet

Let’s create a new wallet then.

neo> create wallet mywallet.db3

NEO-CLI will ask for the wallet password twice; pick your desired password. Copy the address and public key and keep them in a safe place. If you forget the public key, you can use the list key command to see it.
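The next time you boot the node you will have to reopen the wallet before you can query it; the flow looks roughly like this (exact prompts may vary slightly between NEO-CLI versions):

neo> open wallet mywallet.db3
password: ********
neo> list key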

More on protocol.json

Before we end this one, we will have one last look at the protocol.json configuration file for our node.

{
  "ProtocolConfiguration": {
    "Magic": 56753,
    "AddressVersion": 23,
    "StandbyValidators": [
        "02b3622bf4017bdfe317c58aed5f4c753f206b7db896046fa7d774bbc4bf7f8dc2",
        "02103a7f7dd016558597f7960d27c516a4394fd968b9e65155eb4b013e4040406e",
        "03d90c07df63e690ce77912e10ab51acc944b66860237b608c4f8f8309e71ee699",
        "02a7bc55fe8684e0119768d104ba30795bdcc86619e864add26156723ed185cd62"
    ],
    "SeedList": [
        "127.0.0.1:20333",
        "127.0.0.1:20334",
        "127.0.0.1:20335",
        "127.0.0.1:20336"
    ],
    "RPCList":[
      "http://127.0.0.1:30333"
    ],
    "SystemFee": {
        "EnrollmentTransaction": 1000,
        "IssueTransaction": 500,
        "PublishTransaction": 500,
        "RegisterTransaction": 10000
    }
  },

  "ApplicationConfiguration": {
    "DataDirectoryPath": "Chains/privnet",
    "NotificationDataPath": "Chains/privnet_notif",
    "RPCPort": 20332,
    "NodePort": 20333,
    "WsPort": 20334,
    "UriPrefix": [ "http://*:20332" ],
    "SslCert": "",
    "SslCertPassword": "",
    "BootstrapFile":"",
    "NotificationBootstrapFile":"",
    "DebugStorage":1
  }
}
  • The Magic field contains a uint value that denotes the source network of the message.
  • The StandbyValidators field lists the validating nodes of the private net, as the public keys of those nodes. We created 4 wallets in this specific example, and thus we have 4 entries here; 4 is the minimum number of validators required to reach consensus.
  • SeedList is configured to localhost in this example configuration, since NEO-CLI is booting up against the localhost node.
  • The SystemFee section defines the system fees. As the configuration states, the registration fee for assets is 10,000 GAS, set by the RegisterTransaction field. EnrollmentTransaction defines the registration fee for bookkeepers, IssueTransaction is the fee for distributing assets, and PublishTransaction is the fee for publishing smart contracts.


That sums it up for this time. Next, we are going to have a look at how consensus works in NEO. And finally we will write a smart contract on NEO in C#. 🙂

Neo Blockchain On Azure: Starter Guide

If you are a blockchain enthusiast these days, there is a good chance you have heard about NEO. NEO, arguably dubbed the “Ethereum Killer”, promises to be a tailor-made dApp platform for digitizing assets. This blog is not scoped to give an introduction to NEO itself, so I highly suggest reading the white paper here.

What interests me most, apart from NEO's approach of using GAS and NEO to drive an economy for digitizing and managing assets, is how it is built. Since it has its “ties” with Ethereum, my initial guess was that it would probably be another Ethereum fork. To my surprise, NEO is actually built from scratch. It has been on GitHub since 2015 and is written entirely in C#, specifically .NET Core. A quick look at the .csproj file tells me it is running on ASP.NET Core 2.0.2 now. Apart from whatever promise it brings, this alone justifies my interest in how it is built.

Another thing that interests me here is how NEO approaches developers. Instead of a tailor-made language for smart contracts, NEO lets developers write smart contracts in already popular programming languages. The list includes C#, Java, Golang and JavaScript. NeoVM, the small virtual machine that makes this happen, is .NET Core driven and hosted here.

The goal today is to deploy a NEO private net on Azure. Don't get confused: this time I won't be using Azure Blockchain as a Service. Although Azure Blockchain Workbench is indeed intriguing, that is a story for another day, since Azure BaaS doesn't support NEO out of the box yet. So we will go ahead with an Ubuntu Server 17.10.

1. Creating A Virtual Machine

According to NEO docs, we need to have access to a certain number of RPC ports in our newly created VM. The port description from the doc is here.

We will use the test net ports here, not the main net ones. We are going to add ports 20332-20336 and 30332-30336 to the inbound rules.
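If you prefer the Azure CLI over the portal for these inbound rules, something along the following lines should do it (the resource group and NSG names are placeholders for your own):

$ az network nsg rule create \
    --resource-group my-neo-rg \
    --nsg-name my-neo-nsg \
    --name allow-neo-ports \
    --priority 1000 \
    --direction Inbound --access Allow --protocol Tcp \
    --destination-port-ranges 20332-20336 30332-30336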

3. Allow Incoming Ports

When that is taken care of, we will move to the final validation stage and create the VM. Now, from the information found on the newly created resource page, we will go ahead and ssh into the new VM.

4. Resource Properties Page

If you are on OSX, Linux or basically any *nix, you can use your native terminal to ssh into the new VM. If you are on Windows, my choice is usually WSL.
The next thing we will need is Docker. From the Docker CE installation guide for Ubuntu 16.04 and newer, I devised the following steps.
$ sudo apt-get update
$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo apt-key fingerprint 0EBFCD88
$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
$ sudo apt-get update
$ sudo apt-get install docker-ce

To verify docker-ce is running, try the official hello world or ask systemctl.

$ sudo docker run hello-world
$ sudo systemctl status docker

When all is said and done, we will pull the docker image cityofzion/neo-privatenet. This is a 4-node NEO private net with 100M NEO and a lot of GAS. It also comes with a pre-loaded wallet. To pull and run the docker image, do the following:

$ sudo docker pull cityofzion/neo-privatenet
$ sudo docker run -d --rm --name neo-privnet -p 20333-20336:20333-20336/tcp -p 30333-30336:30333-30336/tcp cityofzion/neo-privatenet

Now, we are going to check out that preloaded wallet inside that docker container. To do that, execute the following:

$ sudo docker exec -it neo-privnet /bin/bash
* Consensus nodes are running in screen sessions, check 'screen -ls'
* neo-python is installed in /neo-python, with a neo-privnet.wallet file in place
* You can use the alias 'neopy' in the shell to start neo-python's prompt.py with privnet settings
* Please report issues to https://github.com/CityOfZion/neo-privatenet-docker
root@b6477e009639:/neo-python#

The banner tells you that the consensus nodes are running in screen sessions and that neo-python is already installed.

Execute ls and you will see that neo-privnet.wallet is present. We will run the pre-installed neopy and open the wallet:

neopy
neo> open wallet neo-privnet.wallet

The password for this is coz. To check the balance, execute:

wallet

5. Wallet Balance

We have around 100M NEO and 16K GAS, as promised!
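If you want to play with those funds right away, neo-python's prompt can also transfer assets; the syntax is roughly the following, where the destination address is a placeholder for any address of your own:

neo> wallet send neo {destination_address} 1000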

There you go! You have your own NEO private net running on Azure. To connect to the hosted private net from a remote client, we need to modify our NEO client configuration, but that is a story for another day.

Making a simple Recommender System: Azure Cosmos Db and Apache Tinkerpop

Graph systems are one of the most ubiquitous models found in almost any natural or man-made structure we see every day. Since computer science evolved around what we see and devise or deduce, graphs play a huge role in day-to-day computation and programming techniques. Their birth dates back to even before they were widely adopted in mathematics, since mechanical computing was implicitly graph driven. If only we could get Alan Turing to talk over this.

Before I start, a big inspiration behind this write-up definitely goes to the work of Marko A. Rodriguez. I was lucky to stumble upon his work while frolicking over the web, and his work on graph computing systems and TinkerPop/Gremlin is seriously inspiring.

What we are going to do today:

Let's go right to what we want to do today. Our component of interest is the graph api of Azure Cosmos Db, along with Apache Tinkerpop. Azure Cosmos Db has a graph api that allows us to store data as a graph, or a network. In simple words, instead of storing data in a tabular format of rows and columns, you can store data items as they describe each other in terms of relations. If the text here is not really doing it for you, let's jump into some examples.

Let's have a look at this sample graph I made from HIMYM. The big ellipses and boxes here are called vertices, and the lines that connect them are called edges. Since we have little arrowheads telling us their directions, this is actually a directed graph. If we look at the data behind the graph, we can represent it in two ways: in tables, you can list the vertices in one table and the edges in another. The edges table might look like:

From    To        Type
Ted     Barney    friend
Ted     Umbrella  found
Barney  Robin     wife
Robin   Barney    husband
Tracy   Umbrella  lost
Ted     Tracy     lost
…

I didn't add all the rows, of course, and if you look at the design you can see it is not properly normalized. In understandable terms, that means we would need a Type table holding all the edge types so we don't write a type twice. But that is not in this article's scope, so let's ignore it. The thing to see here is that graph nodes/vertices can be of different types, and the edges can mean different relationships. Here, for example, there are vertices that don't represent a character from HIMYM, like the Umbrella and the MacLaren's Pub.

Storing data like this in a regular database is possible and has been done many times. This is called the ‘implicit' way to store graph-like data. Now one might think that, by this rule, all data in the world can be represented by a graph, and that is indeed true. But we use a graph database only when the data focuses more on the relations than on the content itself: the friend graph in a social network website, or a geographical model of all the offices of a big corporation. The vertices do carry data, but the relations are what really matter there. In these cases, a graph database comes in very handy. Azure Cosmos Db recently came up with a graph api, and it can store graph data natively, where the cost of traversing an edge is constant. In a regular database we would have to join multiple tables to formulate these relations properly, and in a lot of cases those computational costs are not constant.
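To make this concrete, here is how a few of the HIMYM rows above could be declared natively in gremlin. The labels and ids are mine, picked for illustration; we will meet addV, addE and to properly later in this article:

g.addV('character').property('id', 'ted')
g.addV('character').property('id', 'barney')
g.addV('item').property('id', 'umbrella')
g.V().has('id', 'ted').addE('friend').to(g.V().has('id', 'barney'))
g.V().has('id', 'ted').addE('found').to(g.V().has('id', 'umbrella'))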

Apache Tinkerpop joins the party:

I hope by now we understand why we need a graph database, and now is the right time to talk about Apache Tinkerpop. It is essentially a fantastic graph computing framework, and Gremlin is its graph traversal language, which lets you talk to multiple graph databases using a single domain-specific language construct. It has language support for most of the popular languages, and it is pretty native to Groovy and Java. Fret not, we have our way to use it over here with C# too.

Before we start, we need to create a simple graph database on Azure. The quickstart is here. You can also opt for Java and Node.js. I personally used an Azure CosmosDb emulator; the quickstart to install that instead of an actual Azure CosmosDb is here. You can opt for any of these, but I have to remind you: at the time this article is being written, the Azure CosmosDb emulator does not support creating graphs or browsing graph data through its local web portal. You have to use the Gremlin Console to connect and talk to the local emulator. Gremlin Console is a command line REPL that you can use to traverse a local gremlin/tinkerpop graph or a remote one. It can connect to any gremlin server anywhere, as long as you have the credentials, so you can use it to talk to the actual Azure CosmosDb graph database too. I suggest using Windows Subsystem for Linux (WSL), better known as “Bash on Ubuntu on Windows”, for gremlin-console. The gremlin console quickstart is here.
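For reference, wiring the Gremlin Console to a remote server looks roughly like this; the yaml file holds the host and credentials, and the quickstarts linked above walk through the exact values for Azure CosmosDb:

gremlin> :remote connect tinkerpop.server conf/remote-secure.yaml
gremlin> :remote console
gremlin> g.V().count()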

Our sample data set today:

We are going to use the movie review dataset from grouplens, from here. We are using the small dataset: about 100,000 ratings applied to 9,000 movies by 700 users. If you download and unzip the data, you will see multiple csv files, of which I only used movies.csv and ratings.csv. I made a separate users.csv file for the user list. Don't worry, all of these are attached with the sample code. The movies csv comes with a movie id, movie name and genres column. The ratings.csv comes with a user id, a movie id and a rating column, where the user has rated the movie from 0 to 5. The sample code includes a simple command line uploader tool that will let you upload the data to your desired gremlin server and the graph database connected to it. So, to understand how we talk to gremlin, let's have a peek at the code, shall we?

Let’s talk code:

I'm not going to focus on the full source all at once. Let's have a look at how we can connect to an existing Azure CosmosDb graph database. First, we create a DocumentClient.


// DocumentClient lives in Microsoft.Azure.Documents.Client
DocumentClient client = new DocumentClient(
    new Uri(endpoint),
    authKey,
    new ConnectionPolicy { ConnectionMode = ConnectionMode.Direct, ConnectionProtocol = Protocol.Tcp });

The thing you will definitely notice missing is the authKey variable. That is actually the primary key of your graph database, which you will find in your Azure portal. If you use the emulator, it uses a fixed key, which you will find in the quickstart link I shared above.

Now that we have a DocumentClient, let's upload the set of movies that we will use for reviews. I'm using only the movie name and the movie id for the sake of simplicity. In our graph, the movie vertices/nodes will have the label ‘movie'. The label sets the type of the vertex, and we will set the name and id as properties of the vertex. The id uniquely distinguishes any vertex, so remember: no matter what the label of a vertex is, the id has to be unique, or Azure CosmosDb will let you know that it cannot add a vertex because a vertex with the same key already exists. If you don't provide an id property, Azure CosmosDb will generate one and put it on the vertex.

At the beginning of the upload, we make sure to drop the existing graph collection if there is one.

        private async Task NukeCollection(DocumentClient client)
        {
            try
            {
                Console.WriteLine("Nuking...");
                var response = await client.DeleteDocumentCollectionAsync(UriFactory.CreateDocumentCollectionUri("graphdb", "Movies"));
                Console.WriteLine(response.StatusCode);
            }
            catch (DocumentClientException ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

The database name I used for this sample is graphdb and the collection name is Movies. Pardon my indecency in hard-coding these, but it's sample code, so I put effectiveness in front of decency.

Adding vertices to our ‘Movies’ graph:

Now, we upload the movies. Of course, we make sure to create the database if it's not already there.

        private async Task UploadMovies(DocumentClient client)
        {
            try
            {
                Console.WriteLine("Uploading movies");
                Database database = await client.CreateDatabaseIfNotExistsAsync(new Database { Id = "graphdb" });

                DocumentCollection graph = await client.CreateDocumentCollectionIfNotExistsAsync(
                    UriFactory.CreateDatabaseUri("graphdb"),
                    new DocumentCollection { Id = "Movies" },
                    new RequestOptions { OfferThroughput = 1000 });

                Console.WriteLine("Connected to graph Movies collection");

                Console.WriteLine("Reading movie list");
                using (TextReader reader = new StreamReader("movies2.csv"))
                using (CsvReader csv = new CsvReader(reader))
                {
                    while (csv.Read())
                    {
                        string idField = csv.GetField<string>(0);
                        string titleField = csv.GetField<string>(1);
                        titleField = JsonConvert.ToString(titleField, '\"', StringEscapeHandling.EscapeHtml);

                        Console.WriteLine("Uploading " + titleField);

                        IDocumentQuery<dynamic> query = client.CreateGremlinQuery<dynamic>(graph, $"g.addV('movie').property('id', '{idField}').property('title', {titleField})");
                        while (query.HasMoreResults)
                        {
                            await query.ExecuteNextAsync();
                        }
                    }
                }
            }
            catch (DocumentClientException ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

For the collection, the same strategy is followed: we get a DocumentCollection instance, creating the Movies collection if it doesn't exist or fetching the existing one if it does. Then we start reading the csv file. We use the same DocumentClient instance to issue a gremlin construct that adds a vertex/node in the collection for each of the movies. We add the movie label along with the title and id properties, as promised. I want to focus a little bit on the gremlin/tinkerpop construct we used here.

The full construct in a more readable format is:

g.addV('movie')
    .property('id', '{idField}')
    .property('title', {titleField})

Let's ignore idField and titleField, as we already know what goes there. The first construct, g, stands for the graph in the collection. addV('label') is the method construct to add a vertex; you can see the whole construct is fluent by design. The next two property(propertyName, propertyValue) constructs add two properties to the newly added movie node. Nice builder interface, isn't it? Pretty expressive. We are using the standard Groovy constructs for gremlin; there are other language constructs too, including JavaScript. Following the same approach, I uploaded the users in the sample code too.
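Once a few movies are in, a quick probe like the following reads one back to sanity-check the upload; this is not from the sample code, and the id value is whatever your csv used for that movie:

g.V().hasLabel('movie')
    .has('id', '1')
    .values('title')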

At the time of writing this article, the whole connector library from NuGet is a preview version, and there isn't one for .NET Core, so we need to sit that one out. Sorry for that.

Connecting the vertices with edges:

The next thing in line is, of course, to add the edges that represent the relationship between a user and a movie. We label the relationships ‘rates', and we also add a property named weight whose value is the actual rating given by that user.

        private async Task UploadReviews(DocumentClient client)
        {
            try
            {
                Console.WriteLine("Uploading movie reviews");

                DocumentCollection graph = await client.CreateDocumentCollectionIfNotExistsAsync(
                    UriFactory.CreateDatabaseUri("graphdb"),
                    new DocumentCollection { Id = "Movies" },
                    new RequestOptions { OfferThroughput = 1000 });

                Console.WriteLine("Connected to graph Movies collection");

                Console.WriteLine("Reading review list");
                using (TextReader reader = new StreamReader("ratings2.csv"))
                using (CsvReader csv = new CsvReader(reader))
                {
                    while (csv.Read())
                    {
                        string userId = "user" + csv.GetField<string>(0);
                        string movieId = csv.GetField<string>(1);
                        float rating = csv.GetField<float>(2);

                        Console.WriteLine("Uploading review for user " + userId + " to " + movieId + " with rating "+ rating);
                        IDocumentQuery<dynamic> query = client.CreateGremlinQuery<dynamic>(graph, $"g.V().hasLabel('user').has('id', '{userId}').addE('rates').property('weight', {rating}).to(g.V().has('id', '{movieId}'))");
                        while (query.HasMoreResults)
                        {
                            var result = await query.ExecuteNextAsync();
                            foreach (var item in result)
                            {
                                Console.WriteLine(item);
                            }
                        }
                    }
                }
            }
            catch (DocumentClientException ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

The only noticeable change in this snippet is that we create edges between a movie and a user vertex. Like before, let's zoom in on the gremlin construct we used this time to create an edge between two nodes.

g.V()
    .hasLabel('user')
    .has('id', '{userId}')
    .addE('rates')
    .property('weight', {rating})
    .to(g.V()
    .has('id', '{movieId}'))

The first enumeration, g.V(), enumerates all the vertices in the graph. We need to filter the user vertex to create the edge from: hasLabel('user') filters all the user vertices, and .has('id', '{userId}') subsequently filters down to the vertex of the user bearing that id. Then we use addE('rates') to add an edge labeled ‘rates' from it. The following property('weight', {rating}) adds the weight property, with the rating as its value, on the edge we just created. The last thing we do is tell where this edge points to: with to(g.V().has('id', '{movieId}')) we filter out the movie node we want and use it to define which vertex the edge points to. Decent, huh? You can find the full tinkerpop reference here.
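One quick sanity check after the upload is counting the edges we just created; if the count matches the number of rows in ratings2.csv, everything went through. g.E() enumerates edges the same way g.V() enumerates vertices:

g.E().hasLabel('rates').count()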

By now a single user rating a movie should look something like:

Finally, we have all our data ready. Time to traverse this graph and make a simple movie recommendation for any user. 🙂

The simplest movie recommender system in this world:

Let's devise a simple movie recommender, following the dumbest real-life approach we see around. Usually when I pick a movie to watch, I pick based on the movies I have already seen and liked. If I liked Deadpool, there's a very good chance I will like Deadpool 2 and other superhero movies like Logan. Remember, we didn't add or use any genre data in our graph. All we have here are the users, the movies and the ratings made by the users on these movies.

So, let's find the other users who like the same movies as our reference user does. Our reference user is the user who has asked us for a recommendation of movies he should see. He has only given us the list of movies he has seen and rated. Our current traversed graph should look like this:

We want to traverse to the users who like the same movies our reference user likes, using gremlin. To construct the query, we first need to get out of our user vertex and find the movies our user likes. The gremlin construct for that is:

g.V()
    .hasLabel('user')
    .has('id', 'user7')
    .outE('rates')
    .has('weight', gte(4.5))

Let's assume our reference user id is user7. At first, we filter the vertices with label user and id user7. After that, we use outE('rates') to traverse the edges labeled rates, keeping those with weight greater than or equal to 4.5. That's how we land on the movies we think the user likes. But right now we are standing on the edges; to land on the movie nodes, we have to use:

g.V()
    .hasLabel('user')
    .has('id', 'user7')
    .outE('rates')
    .has('weight', gte(4.5))
    .inV()
    .as('exclude')

The inV() construct enumerates the movie nodes attached to those rates edges. The as('exclude') construct is there for a specific purpose: for now, all I can say is that I'm marking all the movies the user already saw as ‘exclude'. We will see why soon.

So, now we want to find all the other users who like the movies our reference user likes.

So, this is where we want to end up. We found that user2 and user3 like the same movies almost as much as our reference user does. To progress through gremlin, our new query is:

g.V()
    .hasLabel('user')
    .has('id', 'user7')
    .outE('rates')
    .has('weight', gte(4.5))
    .inV()
    .as('exclude')
    .inE('rates')
    .has('weight', gte(4.5))
    .outV()

We added an inE('rates') construct since we now want the inward edges labeled ‘rates' pointing at the movies our reference user likes. We again filter the edges to weight greater than or equal to 4.5, since we only want users who like the same movies. At last, we added outV() to find the users attached to these edges. Now we are standing on the users who like the same movies our reference user does.

Next, we want to know what other movies these users like that our reference user has not rated or seen yet. I'm assuming our reference user has rated all the movies he has seen.

From the graph above we can clearly see that user3 and user2 like movie4 and movie5, which our current user has not rated yet. These are viable candidates for our user's movie recommendations. It definitely seems naive, but it's a start. If you remember the ‘exclude' marker from before, we are going to use it now to make sure our recommender doesn't recommend movies our user has already seen. Our desired gremlin query is:

g.V()
    .hasLabel('user')
    .has('id', 'user7')
    .outE('rates')
    .has('weight', gte(4.5))
    .inV()
    .as('exclude')
    .inE('rates')
    .has('weight', gte(4.5))
    .outV()
    .outE('rates')
    .has('weight', gte(4.5))
    .inV()
    .where(neq('exclude'))

We traversed the movies all these other users like and made sure we exclude the ones our user has already seen, using where(neq('exclude')). To fix our naivety a little bit, let's take the distinct movies using dedup() and order them by the number of ratings they have received.

g.V()
    .hasLabel('user')
    .has('id', 'user7')
    .outE('rates')
    .has('weight', gte(4.5))
    .inV()
    .as('exclude')
    .inE('rates')
    .has('weight', gte(4.5))
    .outV()
    .outE('rates')
    .has('weight', gte(4))
    .inV()
    .where(neq('exclude'))
    .dedup()
    .order().by(inE('rates').count(), decr)
    .limit(10)
    .values('title')

Of course, this is far too naive to survive any production requirement. But it is indeed an eye-opener for what Apache Tinkerpop and the Azure CosmosDb graph database can do.

If you look at the final query, you will see we ordered the final movie nodes by the count of incoming rates edges, in descending order, using order().by(inE('rates').count(), decr). We limited the result to the first 10 nodes and took only the titles of the movies.
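To fire this final traversal from C#, the same CreateGremlinQuery plumbing the uploader uses works fine. This is a sketch under the assumption that client and graph are the DocumentClient and DocumentCollection from earlier, run inside an async method like the upload helpers, with the traversal flattened into a single string:

string recommendQuery =
    "g.V().hasLabel('user').has('id', 'user7')" +
    ".outE('rates').has('weight', gte(4.5)).inV().as('exclude')" +
    ".inE('rates').has('weight', gte(4.5)).outV()" +
    ".outE('rates').has('weight', gte(4)).inV()" +
    ".where(neq('exclude')).dedup()" +
    ".order().by(inE('rates').count(), decr)" +
    ".limit(10).values('title')";

// executes like any other gremlin query in this sample
IDocumentQuery<dynamic> query = client.CreateGremlinQuery<dynamic>(graph, recommendQuery);
while (query.HasMoreResults)
{
    foreach (var title in await query.ExecuteNextAsync())
    {
        Console.WriteLine(title); // one recommended title per result
    }
}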

Putting the system to test:

I wrote a simple REPL to try out various gremlin commands against our Azure CosmosDb. Our reference user id was ‘user7'. Here is the list of movies our reference user has seen:

  • “Braveheart (1995)”
  • “Star Wars: Episode IV – A New Hope (1977)”
  • “Shawshank Redemption, The (1994)”
  • “Wallace & Gromit: The Best of Aardman Animation (1996)”
  • “Wallace & Gromit: A Close Shave (1995)”
  • “Wallace & Gromit: The Wrong Trousers (1993)”
  • “Star Wars: Episode V – The Empire Strikes Back (1980)”
  • “Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)”
  • “Star Wars: Episode VI – Return of the Jedi (1983)”
  • “Grand Day Out with Wallace and Gromit, A (1989)”
  • “Amadeus (1984)”
  • “Glory (1989)”
  • “Beavis and Butt-Head Do America (1996)”

The movies our simple recommender suggested are:

  • “Forrest Gump (1994)”,
  • “Pulp Fiction (1994)”,
  • “Fargo (1996)”,
  • “Silence of the Lambs, The (1991)”,
  • “Star Trek: Generations (1994) “,
  • “Jurassic Park (1993)”,
  • “Matrix, The (1999)”,
  • “Toy Story (1995)”,
  • “Schindler’s List (1993)”,
  • “Terminator 2: Judgment Day (1991)”

Clearly this is not the best recommender engine, but it is definitely one of the simplest. We could always use the genome and genre data that come with the dataset and do proper collaborative filtering. But the scope of this article was just to demonstrate the capabilities of simple graph traversals.

Hope it was fun to read. Try out the Azure Cosmos Db graph api and Apache Tinkerpop if you can. They are really fun to use together, and there's so much you can do with simple graph traversals.

The sample code is hosted here on GitHub. Until next time!


Porting your WebApi 2.2 app to Azure Service Fabric

Now, to start talking about this, you've got to know why this post spawned off from real porting work instead of a “hello world” on Service Fabric. I'm back to writing after a long time, and it's only fitting that I share what I have been doing in the meantime.

Usually in a production scenario you end up with your applications lying around in several tiers. To be honest, I'm referring to a production environment that is microservices driven. If you're reading this and wondering what that is about, you probably want to head over here first to get yourself started with the basic concepts of Azure Service Fabric.

Now, I hope you have familiarized yourself with Service Fabric by now, so I can start talking about how you can port your existing IIS-hosted ASP.NET Web API 2.2 application to a self-hosted stateless Web API service in Azure Service Fabric.

To get yourself started, please install the Azure Service Fabric local development environment on your machine from here and go through the set of instructions. When you're done with these, you'll see Visual Studio come up with a new set of project types for Service Fabric.

And it would look pretty much like the following:

(screenshot: the new Service Fabric project types in Visual Studio)

Now, before we click that elusive OK button, we need to understand the tasks at hand. We have to do two things here.

1. Make sure our Web API 2.2 app is as compatible with self-hosting as it was with IIS.

2. Make sure we can salvage the same Visual Studio project we used for our web project.

Making sure the existing Web API 2.2 app is capable of self-hosting:

  1. For number one, one might think it should be fairly easy to port an IIS-driven web api app to a self-hosted one; the reality might not be that simple. First, make sure you have an OWIN startup class initiated. Usually you would already have one if you are using OWIN. You will also need a Program.cs file, the standard one for a console application, since your web app will now be a self-hosted application. So go ahead and add these two files to your web project; if you need a reference, both of them are shown here.
  2. Now, although you have Startup.cs and Program.cs initiated, you still haven't converted your web app project to a console application. To do so, go to your project properties and, in the Application section, select Console Application as the Output Type and set the Program class as the startup object. You can only set that after you have created the file with a static void Main(string[] args) in it; and even if you forget to set the startup object, as long as the file is named Program.cs the project will invoke it automatically.
  3. If you are using anything from the Microsoft.Owin.Host.SystemWeb library, be aware that those bits won't work for you anymore. For example, if you are using code segments like the following to resolve file paths to your traditional web app deployment and subsidiary folders like App_Data, it won't work anymore:
     string path = System.Web.Hosting.HostingEnvironment.MapPath(@"~/App_Data/EmailTemplates/");
    

    For things like these, you will have to resort to solutions that resolve the deployment path in a console application; see the sketch right after this list.

  4. Now, there are some other pranks too. If you are familiar with Asp.net Identity and use it as your default identity provider, you're in for a treat. Usually, when you develop an expirable token generation paradigm for emails and passwords, you need an IDataProtectionProvider, which is usually the MachineKeyDataProtectionProvider under the Microsoft.Owin.Host.SystemWeb.DataProtection namespace; you'd probably ditch that library since you will be using the OWIN self-host/Katana now. So, expect a null where you try something like app.GetDataProtectionProvider(), where app is your IAppBuilder. I ran into this myself and, thanks to Katana being open source, you can just pick up the MachineKeyDataProtectionProvider class from here. Just make sure you use it as an IDataProtectionProvider, like the following:
    using System;
    using System.Web.Security;
    using Microsoft.Owin.Security.DataProtection;
    
    namespace TaskCat.Lib.DataProtection
    {
        using DataProtectionProviderDelegate = Func<string[], Tuple<Func<byte[], byte[]>, Func<byte[], byte[]>>>;
        using DataProtectionTuple = Tuple<Func<byte[], byte[]>, Func<byte[], byte[]>>;
    
        /// <summary>
        /// Used to provide the data protection services that are derived from the MachineKey API. It is the best choice of
        /// data protection when your application is hosted by ASP.NET and all servers in the farm are running with the same Machine Key values.
        /// </summary>
    
        internal class MachineKeyDataProtectionProvider: IDataProtectionProvider
        {
            /// <summary>
            /// Returns a new instance of IDataProtection for the provider.
            /// </summary>
    
            /// <param name="purposes">Additional entropy used to ensure protected data may only be unprotected for the correct purposes.</param>
            /// <returns>An instance of a data protection service</returns>
            public virtual MachineKeyDataProtector Create(params string[] purposes)
            {
                return new MachineKeyDataProtector(purposes);
            }
    
            public virtual DataProtectionProviderDelegate ToOwinFunction()
            {
                return purposes =>
                {
                    MachineKeyDataProtector dataProtecter = Create(purposes);
                    return new DataProtectionTuple(dataProtecter.Protect, dataProtecter.Unprotect);
                };
            }
    
            IDataProtector IDataProtectionProvider.Create(params string[] purposes)
            {
                return this.Create(purposes);
            }
        }
    }
    

    And you'd need to do the same with the MachineKeyDataProtector class, using it as an IDataProtector, like the following:

    using System.Web.Security;
    using Microsoft.Owin.Security.DataProtection;
    
    namespace TaskCat.Lib.DataProtection
    {
        internal class MachineKeyDataProtector: IDataProtector
        {
            private readonly string[] _purposes;
    
            public MachineKeyDataProtector(params string[] purposes)
            {
                _purposes = purposes;
            }
    
            public virtual byte[] Protect(byte[] userData)
            {
                return MachineKey.Protect(userData, _purposes);
            }
    
            public virtual byte[] Unprotect(byte[] protectedData)
            {
                return MachineKey.Unprotect(protectedData, _purposes);
            }
        }
    }
    
    

    Then you can use it as the replacement for the old one from Microsoft.Owin.Host.SystemWeb.DataProtection. This is not really needed if you don't use Asp.net Identity, but it is of good use if you land on the same problem. Now, hopefully, your current Web API app will self-host just fine if you try something like the following:

    // Start OWIN host; baseAddress is something like "http://localhost:9000/"
    using (WebApp.Start<Startup>(url: baseAddress))
    {
        Console.WriteLine("Press ENTER to stop the server and close app...");
        Console.ReadLine();
    }
    

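As promised in point 3, here is a minimal sketch of a MapPath replacement for the self-hosted world. It leans on AppDomain.CurrentDomain.BaseDirectory, which points at the directory the console executable runs from; PathResolver is my own hypothetical helper name, and the App_Data/EmailTemplates layout is just the example from above:

using System;
using System.IO;

namespace TaskCat.Lib.Hosting
{
    internal static class PathResolver
    {
        // Stand-in for System.Web.Hosting.HostingEnvironment.MapPath(...)
        // in a self-hosted (console) application.
        public static string MapPath(string relativePath)
        {
            return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, relativePath);
        }
    }
}

// usage:
// string path = PathResolver.MapPath(Path.Combine("App_Data", "EmailTemplates"));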
Now, let's try salvaging that Visual Studio project as much as we can.

Project changes when it comes to Visual Studio

  1. Now, you can use the converted project, but from my personal experience it is easier to just click that new Service Fabric project button and create a new stateless Web API project. You can port/reuse most of your code, and since the boilerplate is already written, you don't really have to rewrite it.
  2. If you want to keep the old project intact, you can of course reference the old project from the new stateless Web API service project and use your old startup class to hook up the OwinCommunicationListener instead of the one it comes with. But beware: NuGet might make it a nightmare for you if dependencies are mismatched.

Hope you guys have fun with Azure Service Fabric, and if you really want to see how you can build your own stateless Web API from scratch, take a look here. If I find time, I'll write one too.