Minecraft Crafting Tree in Neo4j – Part 4

In the last post we found that while there are different ways to represent the Minecraft crafting list in a graph database, just generating a list of materials to be gathered for a given recipe is not straightforward.

In this post we’ll look at what we could do in pure Cypher if Minecraft were a slightly different game. We’ll then rough out an algorithm to walk a requirements tree for a given Recipe and figure out what materials we need to gather that does work with Minecraft.

Note: This post will continue with our ‘Recipe’-based graph from Part 3, where Resource and Recipe nodes are separate. You can download the state of our database so far via this Gist, which is a Cypher script to be run into a clean graph.

What if Recipes produced only one output unit?

If Minecraft were different, and a Recipe only ever produced one unit of output then we could actually create an ingredients list for any item just in Cypher.

Remember earlier we had two ways to produce the ‘path’ from a given Resource to its constituent ingredients, listing all the intermediate Recipes. Let’s look again at the clumsier, non-APOC version for a Carrot on a Stick:

MATCH (r: Resource { name: 'Carrot on a Stick' })<-[:PRODUCES]-(recipe1: Recipe)-[req1:REQUIRES]->(dep1: Resource)
OPTIONAL MATCH (dep1)-[:PRODUCES]-(recipe2: Recipe)-[req2:REQUIRES]->(dep2: Resource)
OPTIONAL MATCH (dep2)-[:PRODUCES]-(recipe3: Recipe)-[req3:REQUIRES]->(dep3: Resource)
OPTIONAL MATCH (dep3)-[:PRODUCES]-(recipe4: Recipe)-[req4:REQUIRES]->(dep4: Resource)
RETURN r, recipe1, dep1, recipe2, dep2, recipe3, dep3, recipe4, dep4

One thing to note is that this gives you three rows of results – one for each full subtree from the Carrot on a Stick node to a terminal raw-ingredient node.

Here’s another, this time for Wood Pickaxe:

This example’s arguably more interesting – the Pickaxe is made of Sticks and Planks. Sticks are themselves made of Planks – that means there’s two sub-trees to terminal nodes:

  • The tree that creates the Sticks for our Pickaxe (Sticks -> Planks -> Wood)
  • The tree that creates the Planks for our Pickaxe (Planks ->Wood)

Both trees end in the same resource, unlike the Carrot on a Stick case.

What we want here is to group the information by the name of the resource in the terminal nodes, and multiply together the input quantities down the tree. Why multiply? Consider the Pickaxe case in our simplified model of Minecraft (where each recipe outputs just one resource), and just look at the ‘Sticks’ component of our tree.

  • A Pickaxe requires 2x Sticks and 3x Planks
  • Each Stick requires 2x Plank
  • Each Plank requires 1x Wood

The total Wood needed in this case is:

  • (Wood per Plank) * (Planks per Stick) * (Sticks per Pickaxe)
  • = 1 * 2 * 2
  • = 4

How does grouping work in Neo4j and Cypher?

If you use any aggregate function (like sum, count, etc) then Neo4j will group by the previous non-aggregate values to compute the value you’re asking for. Let’s see an example by amending the RETURN statement of our previous query to just return the number of times a given Resource appeared in our result set (which will be the number of subtrees that end with that resource):

MATCH (r: Resource { name: 'Wood Pickaxe' })<-[:PRODUCES]-(recipe1: Recipe)-[req1:REQUIRES]->(dep1: Resource)
 OPTIONAL MATCH (dep1)-[:PRODUCES]-(recipe2: Recipe)-[req2:REQUIRES]->(dep2: Resource)
 OPTIONAL MATCH (dep2)-[:PRODUCES]-(recipe3: Recipe)-[req3:REQUIRES]->(dep3: Resource)
 OPTIONAL MATCH (dep3)-[:PRODUCES]-(recipe4: Recipe)-[req4:REQUIRES]->(dep4: Resource)
 RETURN COALESCE(dep4.name, dep3.name, dep2.name, dep1.name), count(*)

That matches our previous analysis. So how do we get the total number of Wood we need? Well, multiplying the quantities together directly won’t work because sometimes the values will be null. Examine the first row of the query before last:

The variables recipe3, dep3, recipe4 and dep4 are all NULL because that sub-tree doesn’t go deep enough that they’re needed – after two hops we’ve hit Wood and there’s no further to go. And Neo4j, just like every other database, will return NULL on mathematical operations involving NULL as an operand. For instance:

  • 4 * NULL = NULL
  • 2 + NULL = NULL

We need to turn all those NULLs into 1s so that we can safely, blindly multiply them without changing the result of the calcuation. We can use the COALESCE operator for this, giving us this final query:

MATCH (r: Resource { name: 'Wood Pickaxe' })<-[:PRODUCES]-(recipe1: Recipe)-[req1:REQUIRES]->(dep1: Resource)
OPTIONAL MATCH (dep1)-[:PRODUCES]-(recipe2: Recipe)-[req2:REQUIRES]->(dep2: Resource)
OPTIONAL MATCH (dep2)-[:PRODUCES]-(recipe3: Recipe)-[req3:REQUIRES]->(dep3: Resource)
OPTIONAL MATCH (dep3)-[:PRODUCES]-(recipe4: Recipe)-[req4:REQUIRES]->(dep4: Resource)
RETURN COALESCE(dep4.name, dep3.name, dep2.name, dep1.name), sum(coalesce(req4.qty, 1) * coalesce(req3.qty, 1) * coalesce(req2.qty, 1) * coalesce(req1.qty))

Does it work for Carrot on a Stick?

Good!

Can we make it work for the real rules of the game?

Our toy above works because we cheated. But could we make it work if we allow multiple outputs per recipe?

The actual answer for Carrot on a Stick, if we work it out by hand, is:

  • 1x Carrot
  • 2x String
  • 1x Wood

The Carrot and the String are straightforward raw materials needed directly by the Carrot on a Stick recipe itself. The Wood comes in to it because we need to make:

  • 3x Sticks
  • Sticks require 2x Wooden Planks – but we produce 4 Sticks for each run of the recipe so 2x Wooden Planks is enough to cover all the Sticks we need
  • Wooden Planks require 1x Wood – but we produce 4 planks for each run of the recipe so 1x Wood is enough to cover all the Wooden Planks we need

Since our graph has the output quantity of a Recipe encoded on the :PRODUCES relationship qty property, could we divide the input requirement by the output requirement to get to what we want?

Let’s try. We’ll need to add variables for the :PRODUCES relationships now, as we need that qty property for our maths.

Here’s a first pass:

MATCH (r: Resource { name: 'Wood Sword' })<-[out1:PRODUCES]-(recipe1: Recipe)-[req1:REQUIRES]->(dep1: Resource)
OPTIONAL MATCH (dep1)-[out2:PRODUCES]-(recipe2: Recipe)-[req2:REQUIRES]->(dep2: Resource)
OPTIONAL MATCH (dep2)-[out3:PRODUCES]-(recipe3: Recipe)-[req3:REQUIRES]->(dep3: Resource)
OPTIONAL MATCH (dep3)-[out4:PRODUCES]-(recipe4: Recipe)-[req4:REQUIRES]->(dep4: Resource)
RETURN COALESCE(dep4.name, dep3.name, dep2.name, dep1.name), 
	   toInteger(ceil(sum(
       		(coalesce(toFloat(req4.qty), 1.0) / coalesce(toFloat(out4.qty), 1.0)) 
      	  * (coalesce(toFloat(req3.qty), 1.0) / coalesce(toFloat(out3.qty), 1.0)) 
          * (coalesce(toFloat(req2.qty), 1.0) / coalesce(toFloat(out2.qty), 1.0)) 
          * (coalesce(toFloat(req1.qty), 1.0) / coalesce(toFloat(out1.qty), 1.0))
       )))

Some notes:

  • We need to case the quantities to floating point integers, as otherwise Neo4j will do integral division, meaning we can’t do maths on the fractions of resources we might need to build a given item
  • We divide the requirement for a given recipe by the output quantity to get in essence the number of times that recipe needs to run to yield the correct amount of output
  • We ceil the result, as a requirement for (for example) 1.5x Wood is really a requirement for 2x Wood since we can’t collect Wood in any smaller unit

This almost certainly doesn’t actually work, because we can’t just directly do maths like this.

Close but not quite

Let’s look at the example of a Wood Axe. To make a Wood Axe we need:

  • 2x Sticks
  • 3x Wood Planks

Let’s do the same analysis as we did before:

  • To make 2x Sticks we need 2x Wood Planks
  • To make 2x Wood Planks we need 1x Wood
  • To make the remaining 3x Wood Plank we need a further Wood for a total of 2 Wood

So why does our query tell is that we only need 1x Wood?

Our query doesn’t discern between the different reasons our recipe tree needs Wood. In its ‘mind’:

  • To make 2x Sticks is 0.5 units of output of the Stick recipe (since it outputs 4x Sticks at a cost of 2x Wood Planks)
  • To make 2x Wood Planks is 0.5 units of output of the Wood Plank recipe (since it outputs 4x Wood Planks at a cost of 1x Wood) – so we need 0.5 * 0.5 = 0.25 units of Wood to produce 2x Sticks
  • To make the remaining 3x Wood Planks is 0.75 units of output of the Wood Plank recipe by the logic above – so we need 0.75 units of Wood to produce the remaining 3x Wood Planks
  • We therefore need 0.25 + 0.75 = 1x Wood in total

The query’s fallen down because it’s assuming we can run Recipes a fractional number of times and not handling the input quantities correctly as a result:

  • To produce Sticks in bundles of 4 requires 2x Wooden Planks – even though we only need 2x Stick we still need to burn 2x Wooden Planks to make them
    • It’s on this step that the earlier query really falls down
  • We need 3x Wooden Planks directly in the Wood Axe recipe for a total of 5x Wood Planks
  • To make 5x Wood Planks requires 2x runs of the Wood Plank recipe, each run of which consumes 1x Wood

An algorithm for application code

Let’s stop thinking about this in Cypher terms for a minute and design a way to walk the Recipe tree that’ll give us correct answers.

Suppose we maintain a table with three columns:

  • The name of a Resource
  • The number of that Resource that we’ve created running Recipes
  • The number of that Resource that we require to run the Recipes so far

Each time we want to obtain some resource, we’ll figure out if it’s produced by Recipes or a raw material.

If it’s a raw material, we just increase the ‘Required’ entry in the table for that Resource by the amount we need – we can’t make more of it via a Recipe.

If it’s produced by a recipe, we need to do a bit more work because there are three cases:

The ‘already in stock’ case

Say we’ve run the Sticks recipe once already for some other step, and consumed 1x Stick. The Sticks recipe outputs 4x Sticks, so our resource table might look like:

ResourceCreatedRequired
Stick41

We are now trying to make something else requiring a further 2x Stick. While there’s a Recipe for this, we don’t need to run it because we have a surplus of 3x Stick already.

In this case, we just increase the ‘Required’ column by the amount we need and don’t bother running the Recipe again.

The ‘entirely out of stock’ case

Say we’ve not run the Sticks recipe at all yet, but have a need of 6x Sticks for a Recipe upstream. We don’t have any Sticks so we’ll have to run the Sticks recipe. We need to run it twice, because the Recipe outputs 4x Sticks per run and we need 6.

We run the Recipe twice, each time adding its output to the table in the ‘Created’ column. We then add 6 to our ‘Required’ column for Stick to earmark some of that output for the upstream recipe:

ResourceCreatedRequired
Stick86

The ‘partly in stock’ case

Say we’ve run the Sticks recipe once, used 1x Stick and now need a further 4x Sticks. At the start, our table looks like this:

ResourceCreatedRequired
Stick41

The table says we have 3x Stick going spare just now, so we have a shortfall of 1x Stick. We need to run the Stick recipe once, and add its output to the table before allocating what we need. We run the recipe:

ResourceCreatedRequired
Stick81

Now there’s no shortfall, so we take the 4x Stick by increasing the Required column:

ResourceCreatedRequired
Stick85

We recursively do this until we hit raw-material nodes. Because raw materials cannot be produced but must be collected from the world, we just add them to the Required column. For example, to produce a Wooden Plank, we need 1x Wood. After running the Wooden Plank recipe, our table might look like this:

ResourceCreatedRequired
Wooden Plank41
Wood01

Once we’ve hit a raw material node we can’t recurse further – it’s the base-case for our recursion.

Then, getting a shopping list is as simple as finding the cases where the required number of a Resource is non-zero but the created number is zero – that is, we need some of that Resource, but we couldn’t create any.

Next steps

Next time we’ll take that algorithm and bake it into a quick Node command-line application where we can query for a given Resource and get back the list of ingredients we need to go and find. We’ll also see how the approach above lets us answer more interesting questions.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.