GrapheneDB Blog

Updates from the GrapheneDB team

3 Major Graph Database Technology Trends to Watch Out for in 2016

GraphConnect 2015

In a recent article on Forbes, Neo Technology CEO Emil Eifrem shared his predictions for the graph database space in 2016. If you haven’t read them already, the article is definitely worth a read.

We were inspired by Emil’s predictions and gathered our thoughts on the trends we expect to rise in 2016.

Trend #1: The Rise of Data-Driven Decisions

Startups are already known for being extremely data-driven in their decision-making process, so this may not be something new. However, we are seeing more and more that disciplines that were not data-driven in the past are adopting this way of thinking.

Take digital marketing, for example. Even two years ago, the success of a digital marketing campaign was still hard to measure. Further back than that, things become even cloudier. Do you measure the success of a digital campaign through just Facebook Likes or Retweets on Twitter? That seems simplistic and not truly reflective of user activity (for example, how many retweets result in sales?). Graph technology can help marketers make sense of data relationships and connections throughout the course of a campaign in order to get a more holistic picture of a user’s journey.

While some practitioners in the digital marketing space were reluctant in the past to adopt data-driven decision-making (for fear that it would expose their marketing efforts as ineffective), using data to guide decisions about marketing campaigns has become the only way to stay competitive. If you’re making decisions based on assumptions, you’re not going to be very successful.

It’s not just marketing that is adopting data-driven decision-making at a rapid pace. Everyone is simply becoming more data-driven. Companies can use data to shape upcoming products, see how well a feature is working, or even streamline internal processes.

Trend #2: Data becoming increasingly interconnected

This one again is nothing new. Data has always been interconnected, but we have always been storing it in a “flat” way. We have been oversimplifying data models because the technology to look at data in its natural “interconnected” way was simply not there. Thanks to graph database technologies, we are now discovering new opportunities and new ways to look at how data is interrelated.

User behavior is very complex, and looking at things in isolation, as we have typically been doing so far, is really an oversimplification. We have been segmenting data for far too long, simply because we didn’t have the technology. Graph databases came along because data has always been interconnected.

For example, take a user’s purchasing decision. If you only look at when a user made a purchase and what they purchased, you’re missing a lot of the important factors that led to the purchase. That’s the data that will help you make better decisions about how to engage your users! More and more companies, such as Adidas and Walmart, are starting to adopt graphs because they are a superior option in understanding how users make purchasing decisions. This enables companies to target actions and campaigns that work. Being smarter about the user is where the market is headed, and graph technology helps with that.

Perhaps you had suspicions that you could make sense of data in this highly interconnected way, but you never really had the tools. Now, with graph technology, we’re seeing a new way of thinking about data. It’s a paradigm shift and a whole new world of opportunities!

Trend #3: Polyglot persistence

Companies are now managing increasing complexity in their systems. For some time, there was a trend to implement systems in one technology stack. Maybe you did everything in Java because it was company policy. Highly complex apps, like Uber or Airbnb, cannot be run on just one tech stack. You have to combine different technologies. There are now many different tools for any problem you need to solve. Everything is distributed, so companies are developing in an increasingly polyglot way.

Polyglot persistence means storing data in different databases, depending on what you need. You may have MongoDB, Redis, and Neo4j databases for different requirements, as they all excel at different things. This setup is becoming increasingly normal. You can no longer just pick one database or stack and stick with it; you need to pick the best tool for the job.

For example, if you wanted to build a video streaming and recommendations service, you could store videos in one central database, but run the recommendations engine on a separate database, such as a graph database, that is better suited for making sense of connected data.
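To make the split concrete, here is a minimal sketch in Python. It assumes two in-memory stand-ins: a hypothetical VideoCatalog playing the role of the central video database and a hypothetical WatchGraph playing the role of the graph database. Neither class is a real database client. The point is only that each store answers the question it is best suited for.

```python
from collections import Counter


class VideoCatalog:
    """Stand-in for the central store that holds video metadata by id."""

    def __init__(self):
        self._videos = {}

    def add(self, video_id, title):
        self._videos[video_id] = {"id": video_id, "title": title}

    def get(self, video_id):
        return self._videos[video_id]


class WatchGraph:
    """Stand-in for a graph store holding (user)-[:WATCHED]->(video) links."""

    def __init__(self):
        self._watched = {}  # user -> set of video ids

    def record_watch(self, user, video_id):
        self._watched.setdefault(user, set()).add(video_id)

    def recommend(self, user, limit=3):
        """Rank unseen videos watched by users who share a video with `user`."""
        mine = self._watched.get(user, set())
        scores = Counter()
        for other, theirs in self._watched.items():
            if other == user or not (mine & theirs):
                continue
            for video_id in theirs - mine:
                scores[video_id] += 1
        # Sort by score (descending), then by id, so results are deterministic.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [video_id for video_id, _ in ranked[:limit]]


def recommended_titles(catalog, graph, user):
    """Ask the graph store for ids, then look the titles up in the catalog."""
    return [catalog.get(video_id)["title"] for video_id in graph.recommend(user)]
```

In a real system the two classes would wrap actual database drivers, and only the small recommended_titles function would need to know that two stores are involved.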

This polyglot way of developing systems does not require you to know every system or stack out there, but it does require deep collaboration between team members with different expertise in order to create complex applications. While polyglot development is not new, it is fairly new at the database level.

A couple of big trends we see in this space are the use of Apache Kafka to keep databases in sync and the arrival of more and more mature tools that connect popular databases to each other, lowering the bar for polyglot persistence.


Image credit: NeoTechnology.

Interview With Jean Villedieu of Linkurious

What is your name and what do you do?

My name is Jean Villedieu (@jvilledieu) and I am a co-founder of Linkurious, where I am in charge of sales and marketing.

Linkurious provides companies with data insights through graph visualizations powered by Neo4j, making it easy for end users, either data scientists or business analysts, to understand graph data.

We are a 5 person team based in Paris, but we have customers all over the world — mostly in the US, but also in Europe, China, Australia and South America. Some of our customers include companies that use our technology for fraud detection and medical research. One of our most notable clients is NASA.

What did you do before joining the Neo4j community?

I met Linkurious co-founder Sebastian 3 years ago. Sebastian had created Gephi, a very successful open source graph visualization platform. At that time, Sebastian already had the idea for Linkurious. I thought it was a cool idea, so I decided to join him in starting the company.

Did you find it risky to start a new company?

I found it exciting! I understood very quickly that there was an immense possibility for what we could do with the company. The world is already structured as networks: social links, transactions, the way ideas spread. These are all networks. Graphs are a new way to present and think about information, which can empower you to make smarter decisions. I just saw a huge potential for this technology.

Working in the data visualization community in Neo4j, do you see any trends we should be aware of?

As companies store more and more data, and that data gets increasingly connected and sophisticated, graph technologies will be key in making sense of the data. Smart big data solutions will continue to have a high impact in the industry.

What is your favorite community project?

Linkurious.js is an open source project we support, which is free to use. Anyone can download it from Github. It’s even used for commercial projects. I’m always excited to hear about how people use it.

Just the other day, someone reached out to us. They are developing an application on GrapheneDB with Linkurious.js and they were psyched about it. That’s the beauty of having an open source project — anyone can use it and start creating something meaningful very quickly.

What is your favorite Neo4j use case you’ve seen?

NASA uses Linkurious to explore and manage data. They have a database of lessons learned. They explore data visually, making it easy to understand what went wrong, what went well, and not repeat mistakes. So, sending stuff to space is really cool!

The International Consortium of Investigative Journalists (ICIJ) used Linkurious to analyze data from HSBC Bank, and a wide range of fascinating stories came out of their research. They were able to make connections and see how some shady, corrupt businesses operate, which sparked a debate on offshore banking. There was a segment on 60 Minutes about it, as well as articles in The Guardian and Le Monde. You can read more about this here. It’s a fascinating use case of making sense of data with our product.

Any parting words or tips you’d like to share?

Well, Linkurious is compatible with GrapheneDB! So if you want to try out our service and need an instance of Neo4j, GrapheneDB is definitely an option some of our customers use. Or you can also use Linkurious.js and GrapheneDB as mentioned earlier.

Interested in Linkurious? Sign up for an online demo.

Meet Cycli: The Best CLI Client for Neo4j

cycli - Query and update your Neo4j database from the command line

GrapheneDB operates the largest fleet of Neo4j databases in the cloud. As a result, we talk to a wide variety of customers every day, all with very different needs. One of the most common questions we receive across the board is how to query Neo4j from the command line.

You may already know that Neo4j ships with a CLI tool called neo4j-shell. While neo4j-shell might work fine locally, it can’t be used to connect to public-facing Neo4j instances that have been secured.

Luckily, there is a great tool called cycli that allows you to connect securely to remote servers using the Neo4j REST endpoint.

cycli output

Besides being able to connect to remote servers securely using authentication credentials and SSL, we’re big fans of cycli due to the following killer features:

  • Syntax highlighting colors that emulate the Neo4j browser, making it easy for Neo4j users to read queries and catch errors.
  • Smart auto-completion that not only suggests Cypher keywords, but also node labels, relationship types and properties based on your current dataset.
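For readers curious what talking to a secured remote Neo4j server over HTTP looks like, here is a minimal sketch using only the Python standard library. It is not how cycli is implemented (cycli relies on Py2neo). It assumes a Neo4j 2.x server exposing the transactional Cypher endpoint at /db/data/transaction/commit with HTTP basic authentication. The host, port, and credentials in the usage note are placeholders.

```python
import base64
import json
import urllib.request


def build_cypher_request(base_url, username, password, statement, parameters=None):
    """Build (but do not send) a POST to the transactional Cypher endpoint."""
    payload = {
        "statements": [{"statement": statement, "parameters": parameters or {}}]
    }
    # HTTP basic auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/db/data/transaction/commit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
            "Accept": "application/json; charset=UTF-8",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the query, for example build_cypher_request("https://db.example.invalid:24780", "user", "secret", "MATCH (n) RETURN count(n)"). Using an https URL is what puts the credentials and the query behind SSL.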

When customers ask us for recommendations on how to best query Neo4j on the command line, we always recommend cycli — it only made sense that we incorporate cycli into our product somehow to make things easier for our customers.

We’re excited to announce that we have now included a direct snippet for cycli in the GrapheneDB Connection UI, so you can easily leverage the power of cycli with GrapheneDB.

dashboard

More about cycli

cycli is a CLI tool built by Nicole White, a data scientist at Neo Technology, who is also the maintainer of the R driver for Neo4j. cycli is implemented in Python and uses Nigel Small’s Py2neo to connect to Neo4j.

cycli can be installed using pip package manager:

$ pip install cycli

View cycli on Github.

If you’re interested in knowing more, Nicole published a great blog post explaining how she implemented the smart autocompletion feature using Markov chains. You can read it here. Nicole also recently made an update to cycli; you can read more about it here.
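As a rough illustration of the idea (not Nicole’s actual implementation, which is described in her post), a first-order Markov chain for keyword suggestions can be sketched as follows. It assumes the training data is a list of keyword sequences taken from past queries. The function names are made up for this example.

```python
from collections import Counter, defaultdict


def train_transitions(keyword_sequences):
    """Count how often each keyword directly follows another."""
    transitions = defaultdict(Counter)
    for sequence in keyword_sequences:
        for current, following in zip(sequence, sequence[1:]):
            transitions[current][following] += 1
    return transitions


def suggest_next(transitions, last_keyword, limit=3):
    """Return the most frequent followers of `last_keyword`, best first."""
    counts = transitions.get(last_keyword, Counter())
    # Ties are broken alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:limit]]
```

Trained on a handful of sequences such as MATCH, WHERE, RETURN, the model would suggest WHERE or RETURN after MATCH, ordered by how often each one followed MATCH in the training data.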

Find Us at GraphConnect 2015

GraphConnect 2015 is this week and we couldn’t be more excited to attend!

We are sponsoring the event and will be available at our booth all day to talk to anyone interested in GrapheneDB. If you’re an existing client, we’d love to touch base and see how you’re enjoying our service. If you’re considering using GrapheneDB, we’d love to talk to you as well to see how we can help you build something great with Neo4j!

What to expect at the GrapheneDB booth

There will be lots of goodies at the GrapheneDB booth. There will be plenty of swag to bring back home, plus we’re giving free credits towards a standard or production plan to those who come visit us at the booth.

Preview new features

We’ll be launching a new metrics dashboard feature soon, but if you’d like to get a sneak peek, please come find us. We’re looking for new or existing customers who may be interested in participating in the beta for this feature. Come get a demo of our new metrics dashboard and sign up for the beta.

Talks we’re looking forward to

In addition to sponsoring, we’re also looking forward to the following talks, so this is where you’ll find us during the conference.

We are, of course, looking forward to Emil Eifrem’s keynote at 9:00am. We can’t wait to hear what news he has to share with the community. In addition, we’re especially interested in:

  • “Real-Time Recommendations with Graphs and the Future of Search” by Michal Bachman, at 2:40pm.

  • “Advanced Neo4j at FiftyThree” by Aseem Kishore at 4:20pm.

  • “Polyglot Persistence for Microservices using Spring Cloud and Neo4j” by Kenny Bastani and Josh Long at 5:05pm.

Now you know where to find us and what to look forward to. Be sure to follow us on Twitter (if you aren’t already) for conference updates. We hope to see you there!

Announcing New Features and Partnerships

Just in time for GraphConnect 2015, we couldn’t be happier to announce two new features coming soon to the GrapheneDB service, as well as two new partnerships.

High availability clustering

Our clients have come to expect the best from GrapheneDB and we want to continue to deliver the best solutions for our customers. We’re proud to announce we are adding clustered deployments on Neo4j Enterprise Edition to our offering.

If you want improved uptime and reliability, or if you’re looking to scale read traffic, our high availability clustered deployments are the right choice for you. Perhaps you’re looking to run reporting on a separate instance without hurting your production environment? Our clustering option can help.

Our high availability offering is customized for each client. If you’re interested in discussing this option, please fill out this form and we will be in touch to discuss it with you further.

New metrics dashboard beta

We’re also very excited to announce our new metrics dashboard beta.

GrapheneDB Neo4j server metrics

Our metrics dashboard will allow you to track server errors, as well as see when errors are happening. You will also be able to track median and 95th percentile response times, as well as see incoming HTTP requests. Access to this information will allow you to see what is happening with your Neo4j server in real-time, which can help you diagnose performance issues in order to fine tune your application.
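To show why both numbers are worth tracking, here is a small sketch of how a median and a 95th percentile could be computed from a list of response times. It uses the nearest-rank definition of a percentile, which is an assumption made for this example and not necessarily the method the dashboard uses.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n) when sorted."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(response_times_ms):
    """Return the two figures discussed above for a batch of response times."""
    return {
        "median_ms": statistics.median(response_times_ms),
        "p95_ms": percentile(response_times_ms, 95),
    }
```

With ten samples where nine sit between 11 ms and 16 ms and one takes 240 ms, the median stays at 13.5 ms while the 95th percentile jumps to 240 ms, which is exactly the kind of slow outlier a median alone would hide.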

This level of detail in an easy-to-use dashboard is something not offered by other Neo4j database-as-a-service providers, and it can give you greater visibility into what is happening with your server. If you are interested in trying the metrics beta, please contact us at support@graphenedb.com.

Neo4j and Amazon Web Services partnerships

Lastly, we’d like to announce new partnerships with Neo4j and Amazon Web Services.

As a Neo4j Solution Partner, we can offer you advanced deployment options with Neo4j Enterprise Edition. We also have access to expert support straight from Neo Technology, which in turn benefits you, our customer.

As an Amazon Web Services partner, we can offer you all of the benefits of GrapheneDB, coupled with the service from the largest cloud hosting provider in the world. This truly makes GrapheneDB the best, fastest option to run Neo4j on Amazon Web Services.

These new partnerships solidify our commitment to you, our client, that we seek to partner only with the very best to provide you with the world-class service you need.

GraphConnect Europe 2015, Introducing Azure Beta & Neo4j Server Logs

GrapheneDB is proud to sponsor GraphConnect Europe 2015 (London, May 7th), the only conference that focuses on the rapidly growing world of graph databases and applications that make sense of connected data.

If you’re at the conference, please find us and say hello. We’d love to chat about graphs, and we’ll have swag for you!

We’ve also got some great news to announce!

GrapheneDB on Azure now available in beta

Neo4j Databases Now in Beta on Microsoft Azure

Are you hosting your applications on Microsoft Azure? Now you can take advantage of GrapheneDB’s fully managed Neo4j hosting service, too.

Create your free Neo4j on Azure now

Microsoft Azure has been one of the most requested providers on GrapheneDB for a long time. We’re happy to announce beta availability of GrapheneDB hosted Neo4j databases on Microsoft Azure.

The service on Azure is currently in public beta with the Sandbox plans generally available in the regions East US 2 and North Europe.

Sandbox databases are a fantastic way to discover the value of GrapheneDB’s fully-managed Neo4j hosting service on Azure or get acquainted with Neo4j. In addition to having your database up and running in a few seconds you can:

  • Access the Neo4j browser interface
  • Export and restore databases on demand
  • Access the server log files

Want to take the new Azure deployments for a spin?

If you’re already a GrapheneDB user, you can create an Azure deployment from the new database form by selecting “Microsoft Azure” as the provider.

GrapheneDB provisioning on Azure Beta

New to GrapheneDB? This link will take you through the signup process and then to the next step where you’ll be ready to provision your Azure-hosted Neo4j instance. It takes less than a minute to be up and running!

Need production-grade Neo4j hosting on Azure?

Besides the free Sandbox plans, we’re also offering our Performance plans to a select set of beta customers. We plan to make the rest of our plans generally available as soon as possible. Get in touch.

New Feature: Neo4j Log Files

We’d also like to introduce you to our newest feature: Neo4j Server Logs.

This feature enables GrapheneDB users to diagnose and debug issues by providing access to the Neo4j server log files from the user interface.

Streaming of Neo4j Log files on GrapheneDB

We want to make it easy for our users to access the log files when necessary, for instance when debugging a custom plugin or server extension, or when trying to determine if there are any errors.

How it Works

There are two ways to access the log files:

  • In-browser streaming (similar to the UNIX tail command): Streaming can be paused to scroll up/down, examine or copy certain sections of the file.
  • File download: Enables you to download the files to your computer for further inspection.

We provide access to the following log files:

  • messages.log
  • neo4j-0.0.log
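Once a log file has been downloaded, a few lines of Python are enough to inspect its end, much like the tail command. The sketch below assumes a UTF-8 text file on local disk. The optional substring filter is an addition for this example, and no particular log line format is assumed.

```python
from collections import deque


def tail(path, lines=10, contains=None):
    """Return the last `lines` lines of a text file, optionally filtered."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        matching = (
            line.rstrip("\n")
            for line in handle
            if contains is None or contains in line
        )
        # A bounded deque keeps only the most recent matches in memory.
        return list(deque(matching, maxlen=lines))
```

For example, tail("messages.log", lines=20, contains="ERROR") would return up to the 20 most recent lines that mention ERROR, assuming the file sits in the current directory.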

Server logs are available on all our plans.

As always, we look forward to receiving your feedback.

Introducing Team Collaboration

Team collaboration

Most of our customers are small- to medium-sized development teams that work together on applications. In such work environments it is desirable that multiple developers have access to all production databases, and that they can deploy new instances when testing out new features or setting up new environments.

GrapheneDB has added a new team collaboration feature, making it easy for teams to work together on their Neo4j databases and manage their account easily.

There are three different user roles:

  • owner: the account used to sign up, full access
  • admin: can manage billing details, collaborators and databases
  • collaborator: can manage databases

Managing users is very straightforward. If you’re an account owner or admin, you can add/edit/remove team members from the Users tab in the account area:

Creating a new user

Collaborating on databases

All users within the same account can manage existing databases and provision new instances, without having to add any billing details to their own accounts.

Collaborating on billing

If you have multiple co-founders, or if you need someone from your team to update the billing details and make sure payments get done in time, you can add admin users to your account.

Admins have full access to the billing area, in addition to unrestricted permissions to manage databases. They can also add or remove team members or change their roles.
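The three roles can be pictured as a simple mapping from role to permissions. The sketch below is only an illustration of the model described in this post, not GrapheneDB’s implementation. The permission names are invented for the example, and "full access" for the owner is approximated as the union of the permissions listed here.

```python
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    ADMIN = "admin"
    COLLABORATOR = "collaborator"


# Hypothetical permission names chosen for this sketch.
MANAGE_DATABASES = "manage_databases"
MANAGE_BILLING = "manage_billing"
MANAGE_USERS = "manage_users"

PERMISSIONS = {
    Role.OWNER: {MANAGE_DATABASES, MANAGE_BILLING, MANAGE_USERS},
    Role.ADMIN: {MANAGE_DATABASES, MANAGE_BILLING, MANAGE_USERS},
    Role.COLLABORATOR: {MANAGE_DATABASES},
}


def can(role, permission):
    """Return True if the given role includes the given permission."""
    return permission in PERMISSIONS[role]
```

Under this model a collaborator can provision or manage databases but cannot touch billing or team membership, while admins and the owner can do all three.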

Collaborating on support

When opening support tickets through our UI, you can also include other team members in CC. They will receive email notifications of any updates to the support ticket, so everyone can stay on the same page and participate in the conversation.

CC team members in support requests

If you have any questions or comments, or want to provide feedback about this feature, please get in touch.

Neo4j 2.2 on GrapheneDB

First up, congratulations to Neo Technology, especially the engineering team, for releasing Neo4j 2.2!

Neo4j 2.2 Browser UI

A lot of you have been asking when it’s going to be possible to try out the recently released Neo4j 2.2 on GrapheneDB. That’s why we’re happy to announce that you can now upgrade your GrapheneDB deployments to Neo4j 2.2.

Since this is a major new version with plenty of new features, we don’t suggest you go straight into production with it.

What we do recommend at this point is that you begin testing with a new Neo4j 2.2 deployment to ensure that you don’t run into any unexpected issues. As with other releases that include store-migrations, the version upgrade will be non-reversible, so proceed with caution.

We will be tracking and iterating updates to Neo4j 2.2.x as they are released.

Also, we recommend upgrading to Neo4j 2.1 before embarking on the 2.2 update.

Try Neo4j 2.2 on GrapheneDB for free!

Noteworthy changes

Users are reporting performance improvements of 20% and higher without any changes in code or queries.

Here’s a short summary of what to expect when upgrading:

  • A major overhaul of the Neo4j Browser, including the ability to visualize query plans and terminate running queries!
  • Fast-write buffering which dramatically increases the throughput of many write workloads.
  • A new pagecache, which is faster and easier to configure. If you’ve struggled to configure your Neo4j memory caches before, you’ll love this one!
  • Full support for profiling and explaining Cypher query plans, including a query plan visualizer in the Neo4j Browser.
  • A new cost-based query planner for Cypher, which is smarter at planning queries, in addition to the existing rule-based planner. Neo4j automatically figures out which one to use for each query.

For the full list of changes, please look at the release notes for Neo4j 2.2.

Try Neo4j 2.2 on GrapheneDB now!

Want to Host Your Neo4j Databases in Europe? We’ve Got You Covered! ;)

Announcing General Availability of all GrapheneDB plans in AWS EU and Heroku EU regions

At GrapheneDB, we’ve been offering dedicated instances in the EU region since the beginning of our beta phase, but the Hobby and Standard tier plans have mostly been available only in the N. Virginia AWS region. The reason is that there was not enough demand for shared instances in Europe to justify setting up and operating the necessary infrastructure.

Europe map

Demand has been increasing steadily over the last couple of months and our European users have been frequently asking us for more affordable plans in the EU region.

We’ve heard you loud and clear and this is why we’re happy to announce that all our AWS plans are now available in the AWS eu-west-1 region (Ireland).

Create your free EU hosted database now

Click on the button above and follow these steps to provision your database in Europe:

  1. Choose a plan (our free Sandbox plan is available in Europe, too!)
  2. Enter a name and select EU (Ireland) region in the AWS region selectbox
  3. You’re set!

Provisioning a Neo4j instance in the AWS EU region on GrapheneDB

GrapheneDB Heroku add-on now available for Heroku EU apps

If you’re using our Heroku add-on, we’re also happy to let you know that it’s now also compatible with the Heroku EU region, making it easy for you to deploy your apps in Europe.

All databases provisioned through the add-on (even the free Chalk plan) for apps in the Heroku EU region will automatically be hosted in the AWS eu-west-1 region (Ireland).

If you haven’t yet, check out our Neo4j add-on for Heroku now. We’ve got free plans to get you started there, too.

Populate Your GrapheneDB Neo4j Instance With a Custom Generated Graph

At GrapheneDB, we are frequently approached by developers working on Neo4j-powered applications, with questions about graph modeling. In this post, we will walk through the process of generating a custom graph with sample data and then populating your GrapheneDB Neo4j instance with it.

If you have data, we usually recommend importing it into Neo4j as outlined in a previous blog post. At that point, you can model the problem, run sample queries and iterate until you get the results you want.

But what if you don’t have any real-world data available? Then generating a sample dataset is probably the best option.

To get started, we recommend Graphgen, a component of the Neoxygen set of tools developed by Christophe Willemsen. It’s a free, open-source, web-based graph generator that you can use to populate Neo4j graphs with sample data. It has the following features:

  • A wide range of sample data providers: people’s names, addresses, phone numbers, company names, email addresses, domain names, etc.
  • It relies on a sort of pseudo-Cypher syntax, with the ability to create complex structures
  • and… it’s also capable of populating GrapheneDB instances!

Let’s walk through a detailed example, focusing on people working at companies and the friendship relationship between them.

Step 1: Provision your Neo4j database on GrapheneDB

If you haven’t already, sign up for a free GrapheneDB account at www.graphenedb.com. Once logged in, create a new, sandbox database. It’s free!

Creating a Neo4j instance on GrapheneDB

Step 2: Generate the dataset

Sample graph model

Let’s model a simple graph with people and companies:

  • People will be stored as nodes with label Person and a property name, filled in with a generated full name.
  • Companies will be stored as nodes with label Company and a property name.
  • A Person node can have a KNOWS relationship to other Person nodes
  • Person nodes have a WORKS_AT relationship to a Company

In order to generate this in Graphgen, we will need to describe the pattern in pseudo-Cypher syntax:

(person:Person {name: fullName} *35)-[:WORKS_AT *n..1]->(comp:Company {name: company} *10)
(person)-[:KNOWS *n..n]->(person)

This short snippet will generate:

  • 35 nodes with the Person label and a randomized full name property
  • 10 companies with random company names
  • A relationship WORKS_AT connecting every person to a single random company (n..1)
  • And relationships KNOWS between the Person nodes (many to many, n..n)
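To make the two cardinalities concrete, here is a small Python sketch that builds a graph of the same shape and renders it as a Cypher CREATE statement. It is not how Graphgen works internally. The placeholder names (person0, company0, and so on), the knows_probability parameter, and the fixed seed are assumptions made for this example in place of Graphgen’s sample data providers.

```python
import random


def generate_graph(people=35, companies=10, knows_probability=0.1, seed=42):
    """Build nodes and relationships shaped like the pattern above."""
    rng = random.Random(seed)
    person_ids = [f"person{i}" for i in range(people)]
    company_ids = [f"company{i}" for i in range(companies)]

    # n..1: every person gets exactly one WORKS_AT relationship.
    works_at = [(p, rng.choice(company_ids)) for p in person_ids]

    # n..n: any person may know any number of other people.
    knows = [
        (a, b)
        for a in person_ids
        for b in person_ids
        if a != b and rng.random() < knows_probability
    ]
    return person_ids, company_ids, works_at, knows


def to_cypher(person_ids, company_ids, works_at, knows):
    """Render the generated graph as a single Cypher CREATE statement."""
    parts = [f"({p}:Person {{name: '{p}'}})" for p in person_ids]
    parts += [f"({c}:Company {{name: '{c}'}})" for c in company_ids]
    parts += [f"({p})-[:WORKS_AT]->({c})" for p, c in works_at]
    parts += [f"({a})-[:KNOWS]->({b})" for a, b in knows]
    return "CREATE\n  " + ",\n  ".join(parts)
```

Calling print(to_cypher(*generate_graph())) produces a statement that could be pasted into the Neo4j browser. The n..1 rule shows up as exactly one WORKS_AT line per person, while the number of KNOWS lines varies with knows_probability.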

Notice the syntax is almost standard Cypher, but with some simple additions to specify properties using formatters, indicate the number of nodes to create, and define the relationships and their cardinality. For further details, please read the project documentation.

Next, open graphgen.neoxygen.io, put your code snippet in the text area and hit the Generate button. The resulting dataset will be rendered as an interactive graph which you can navigate.

Generating the graph with Graphgen

Step 3: Populate your GrapheneDB Neo4j instance with the generated dataset

Click on the Populate your database button. A modal dialog will pop up, with fields for the URL, username and password.

Fill out these fields with the connection details located in the Connection tab for your Neo4j instance in the GrapheneDB user interface: URL, username and password.

Populating the database with Graphgen

Step 4: Browse and query your generated dataset on GrapheneDB

Loading the generated graph into your GrapheneDB should take just a few seconds. After the process is completed, go back to GrapheneDB’s interface.

In the Overview tab, use the Launch button to fire up Neo4j’s web-based browser UI.

Browsing the generated graph in Neo4j

If you, like us, also find Graphgen a useful tool, make sure to say thanks to Christophe for putting it together and stay tuned to his Twitter account for some nice upcoming features.

Graphgen is open-source and you can also have a look at other great Neo4j related tools Christophe put together at neoxygen.io, like neoclient, a PHP client for Neo4j which is also supported by GrapheneDB.