Graphs are data structures that are highly useful for understanding and representing many real-world problems in all kinds of areas, such as business, government, and science.
To take advantage of graph databases we don’t need a master’s degree in graph theory. Instead, we need to understand what a graph is and be able to sketch one on paper.
So, what is a graph?
Mathematically speaking, a graph is just a collection of vertices and edges. Or, if you don’t like math, a set of nodes and the relationships that connect them. Graphs represent entities as nodes (vertices), and the ways those entities relate to each other as relationships (edges).
If you stop and think about it, this structure allows us to model countless scenarios, from commercial systems to more complex problems such as optimization algorithms.
This graph model is formally known as the property graph. A property graph has the following characteristics:
- It contains nodes and relationships.
- Nodes contain properties (key-value pairs).
- Relationships are named and directed, and always have a start and end node.
- Relationships can also contain properties (key-value pairs).
Despite being intuitive and easy to understand, the property graph model can be used to describe almost all graph use cases.
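To make the four characteristics above concrete, here is a minimal in-memory sketch in plain Ruby. It is only an illustration of the model, not how Neo4j stores data; the names Node, Relationship, PropertyGraph, and the PERFORMS relationship type are made up for this example.

```ruby
# Minimal in-memory property graph. Node, Relationship and PropertyGraph are
# hypothetical names for this sketch, not part of Neo4j or of any gem.
Node = Struct.new(:id, :properties)
Relationship = Struct.new(:type, :start_node, :end_node, :properties)

class PropertyGraph
  attr_reader :nodes, :relationships

  def initialize
    @nodes = []
    @relationships = []
  end

  # Nodes carry key-value properties.
  def add_node(properties = {})
    node = Node.new(@nodes.size + 1, properties)
    @nodes << node
    node
  end

  # Relationships are named (type), directed (start -> end), and may carry
  # their own key-value properties.
  def relate(start_node, type, end_node, properties = {})
    relationship = Relationship.new(type, start_node, end_node, properties)
    @relationships << relationship
    relationship
  end

  # Outgoing relationships of a node, optionally filtered by type.
  def outgoing(node, type = nil)
    @relationships.select do |relationship|
      relationship.start_node.equal?(node) &&
        (type.nil? || relationship.type == type)
    end
  end
end

graph  = PropertyGraph.new
artist = graph.add_node({ name: "Some Artist" })
song   = graph.add_node({ title: "Some Song" })
graph.relate(artist, :PERFORMS, song, { since: 2014 })
```

Calling graph.outgoing(artist, :PERFORMS) returns the single relationship created above, and its end_node is the song node with its own properties.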
As you are probably thinking, graph databases use this model to store data as a graph, with a structure consisting of vertices and edges, the two entities used to model any graph. In addition, you can apply algorithms from the long history of graph theory to solve graph problems, often in less time than the equivalent relational database queries would take. I’ll be covering some of them in my next posts.
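As a taste of what a classic graph algorithm looks like, here is a small breadth-first search that finds a shortest path in a directed graph. It is a plain Ruby sketch over an adjacency list, not Neo4j code; the shortest_path method and the sample "follows" data are invented for this example.

```ruby
# Breadth-first search over a directed graph stored as an adjacency list
# (a Hash mapping each node to the nodes it points to).
# Returns the shortest path as an array of nodes, or nil if unreachable.
def shortest_path(adjacency, source, target)
  previous = { source => nil } # also works as the visited set
  queue = [source]

  until queue.empty?
    current = queue.shift
    if current == target
      # Walk the "previous" links back to the source to rebuild the path.
      path = []
      while current
        path.unshift(current)
        current = previous[current]
      end
      return path
    end

    adjacency.fetch(current, []).each do |neighbor|
      next if previous.key?(neighbor)
      previous[neighbor] = current
      queue << neighbor
    end
  end

  nil
end

# Hypothetical "who follows whom" graph.
follows = {
  "ana"   => ["bruno", "carla"],
  "bruno" => ["diego"],
  "carla" => ["diego"],
  "diego" => ["elisa"]
}

p shortest_path(follows, "ana", "elisa")
```

With this sample data the path from "ana" to "elisa" goes through "bruno" and "diego", while asking for a path from "elisa" back to "ana" returns nil because the edges are directed.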
Now, talking specifically about Neo4j: it is an open-source graph database supported by Neo Technology that stores data using the property graph model. It is reliable, with full ACID transactions; expressive, with a powerful, human-readable graph query language called Cypher; and simple, accessible through a convenient REST interface or an object-oriented Java API.
Enough theory and talking for now. Let’s prepare our environment to play a little with Neo4j, and build a simple Rails application.
Installing Neo4j on a development machine is very easy. If you are on OS X and using brew, go ahead and run brew install neo4j in a terminal window.
Or, if you prefer, follow these five steps:
- Download the Neo4j Community package.
- Unzip it into your installations folder (the example in the next step uses ~/Applications).
- Create a symbolic link named neo4j to the unzipped folder. For instance: ln -s ~/Applications/neo4j-community-2.1.2 ~/Applications/neo4j
- Create an environment variable named NEO4J_HOME, pointing to this symbolic link.
- Change the PATH environment variable, adding the folder that contains the neo4j command.
This way, when you want to update Neo4j on your machine in the future, you can just download the new version, unpack it, and point the symbolic link to the new folder.
When you have it installed, open a terminal window and type neo4j start. This command will start the Neo4j server on your machine. Now go check it in your browser at http://localhost:7474/. You’ll be presented with a super nice administration panel, where you can visualize the data stored in your Neo4j instance, manipulate data using the Cypher Query Language, check all instance configuration, and more.
Using Neo4j from Rails
Neo4j is built on top of Java and the rock-solid JVM. As we want to use (MRI) Ruby on Rails here, let’s connect our app using Neo4j’s REST API.
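To give an idea of what talking to Neo4j over REST looks like without any gem, here is a rough sketch using only Ruby’s standard library. It assumes the transactional Cypher endpoint of Neo4j 2.x (/db/data/transaction/commit) on the default local port; other Neo4j versions may expose a different path. The cypher_request and run_cypher helpers are hypothetical names for this example, and this is not necessarily how the neo4j gem does it internally.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: Neo4j 2.x transactional Cypher endpoint on the default local port.
NEO4J_URL = URI("http://localhost:7474/db/data/transaction/commit")

# Builds (but does not send) a POST request carrying one Cypher statement.
def cypher_request(statement, parameters = {})
  request = Net::HTTP::Post.new(NEO4J_URL.path)
  request["Content-Type"] = "application/json"
  request["Accept"] = "application/json"
  request.body = JSON.generate(
    { statements: [{ statement: statement, parameters: parameters }] }
  )
  request
end

# Sends the request and parses the JSON reply; needs a running Neo4j server.
def run_cypher(statement, parameters = {})
  response = Net::HTTP.start(NEO4J_URL.host, NEO4J_URL.port) do |http|
    http.request(cypher_request(statement, parameters))
  end
  JSON.parse(response.body)
end

# Only builds the request, so this line works even without a server.
request = cypher_request("MATCH (n) RETURN count(n) AS nodes")
```

With the server started, run_cypher("MATCH (n) RETURN count(n) AS nodes") would send that statement and return the parsed response.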
To make things simpler, we’ll use the awesome gem (surprisingly) called neo4j, from @andreasronge. Version 2.x is the stable one, but here we will use the gem directly from the master branch, where version 3 is under active development and which enables us to use MRI Ruby, connecting to Neo4j via its REST interface. If you are into JRuby, you can even use the stable version and connect using the embedded database (through the filesystem), which means a Neo4j instance running on the same JVM as your app.
But here we will use the first option. Go ahead, add the gem reference to your app and run bundle install.
Let’s start with a dead simple app with two models, one for artists and one named Music. One artist can perform many musics, and a music belongs to an artist.
I will not paste the application code in this blog post, since we are using the alpha version of the neo4j gem and much of the code could become outdated quickly. Instead of replicating code here, you can check the live demo, which is running on Heroku, and the updated source code on my GitHub account.
Before you dive into the demo application code, let me highlight some key points about using the neo4j gem in a Rails app. I bootstrapped the app with the Pah gem and started learning (the hard way) how to make things work. So here are the main points that need your attention:
- Delete the db folder of your project. We aren’t going to use migrations or a seeds file.
- Pick the frameworks you want from Rails, removing Active Record and adding Neo4j. Don’t forget to remove any reference to Active Record from your app’s configuration.
- Configure where your Neo4j instance is. During development you can connect to localhost:7474. On Heroku we are going to use the great GrapheneDB, which provides the Neo4j graph database as a service. Use the add-on for this.
- Add Neo4j::ActiveNode to the models (there are generators to create them). Here is where the fun begins. Each model class will represent a node, an entity in a graph. And as you should remember, a node contains properties and relationships. The neo4j gem gives us a nice API to compose our graph, supporting the well-known Active Model validations API.
The intent of this blog post was to introduce you to the world of graph databases, giving some theory about graphs and a practical hands-on with Neo4j and Rails. Although the graph model of the demo application looks very simple, there is already a lot to learn from it.
In future posts, expect to read more about Neo4j, the Cypher Query Language, and traversal algorithms.
So, how about learning by doing? I invite you to clone the sample app and start hacking on it right away! Add a feature or improve the graph model in some way. Pull requests are welcome!
And remember: graphs are everywhere!