In a League of their Own:
Neo4j and Premiership Football
Mark Needham
@markhneedham
• Intro to graphs
• When do we need a graph?
• Property graph model
• Neo4j’s query language
• The football graph
• Using Neo4j from .NET

Outline
Let’s talk graphs
Dancing With
Michael Jackson

Eating Brains

You mean these?
Dancing With
Michael Jackson

Eating Brains

Nope!
Node

Relationship

Ok so what’s a graph then?
The tube
The social network (graph)
Complexity
What are graphs good for?
complexity = f(size, semi-structure, connectedness)

Data Complexity
Size
complexity = f(size, semi-structure, connectedness)

The Real Complexity
Semi-Structure
USER_ID | FIRST_NAME | LAST_NAME | EMAIL_1 | EMAIL_2 | FACEBOOK | TWITTER | SKYPE
315 | Mark | Needham | mark.needham@neotechnology.com | m.h.needham@gmail.com | NULL | @markhneedham | mk_jnr1984

USER, CONTACT, CONTACT_TYPE
Email: mark.needham@neotechnology.com
Email: m.h.needham@gmail.com
Twitter: @markhneedham
Skype: mk_jnr1984

Semi-Structure
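
The semi-structure problem can be sketched in a few lines of Python. This is an illustrative sketch, not something from the talk: the dictionary layout, the `CONTACT_COLUMNS` mapping and the `contacts_of` helper are assumptions. A fixed-column row has to carry a NULL for every contact type the user lacks, whereas a list of (type, value) contacts stores only what actually exists.

```python
# Fixed-column row: every user carries every column,
# so a missing contact type has to be stored as None (NULL).
user_row = {
    "USER_ID": 315,
    "FIRST_NAME": "Mark",
    "LAST_NAME": "Needham",
    "EMAIL_1": "mark.needham@neotechnology.com",
    "EMAIL_2": "m.h.needham@gmail.com",
    "FACEBOOK": None,
    "TWITTER": "@markhneedham",
    "SKYPE": "mk_jnr1984",
}

# Hypothetical mapping from table column to contact type.
CONTACT_COLUMNS = {
    "EMAIL_1": "Email",
    "EMAIL_2": "Email",
    "FACEBOOK": "Facebook",
    "TWITTER": "Twitter",
    "SKYPE": "Skype",
}


def contacts_of(row):
    """Return (contact_type, value) pairs, skipping columns that hold None."""
    return [
        (CONTACT_COLUMNS[column], value)
        for column, value in row.items()
        if column in CONTACT_COLUMNS and value is not None
    ]


contacts = contacts_of(user_row)
```

The resulting list has no Facebook entry at all, which is the same idea as a node simply not having a property it does not need.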
complexity = f(size, semi-structure, connectedness)

The Real Complexity
Connectedness
Densely Connected

Semi Structured

When do we need a graph?
Lots of join tables
Densely connected?
Lots of sparse tables
Semi-Structured?
• Millions of ‘joins’ per second
• Consistent query times as dataset grows
• Join Complexity and Performance
• Easy to evolve data model
• Easy to ‘layer’ different types of data together

Properties of graph databases
Property Graph Data Model
Nodes
• Used to represent entity attributes and/or metadata (e.g. timestamps, version)
• Key-value pairs
• Java primitives
• Arrays
• null is not a valid value
• Every node can have different properties

Nodes can have properties
What’s a node?
Relationships
• Relationships are first class citizens
• Every relationship has a name and a direction
– Add structure to the graph
– Provide semantic context for nodes

• Properties used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node

Relationships
Nodes can be connected by more than one relationship

Nodes can have more than one relationship
Self relationships are allowed

Relationships
Labels
Think Gmail labels
• Nodes
– Entities

• Relationships
– Connect entities and structure domain

• Properties
– Entity attributes, relationship qualities, and metadata

• Labels
– Group nodes by role

Four Building Blocks
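
The four building blocks can be sketched as plain Python data classes. This is a minimal sketch for illustration only, not Neo4j's implementation: the class names, the field layout and the game name "Chelsea vs Arsenal" are assumptions. It encodes the rules from the slides: properties are key-value pairs, null is not a valid value, and every relationship has a name, a start node and an end node.

```python
from dataclasses import dataclass, field


def _check_properties(properties):
    # The slides state that null is not a valid property value:
    # a property that does not apply is simply left out.
    for key, value in properties.items():
        if value is None:
            raise ValueError(f"property {key!r} must not be null; omit it instead")


@dataclass
class Node:
    labels: set = field(default_factory=set)        # labels group nodes by role
    properties: dict = field(default_factory=dict)  # key-value pairs

    def __post_init__(self):
        _check_properties(self.properties)


@dataclass
class Relationship:
    start: Node  # every relationship must have a start node,
    type: str    # a name,
    end: Node    # and an end node; the direction is start -> end
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        _check_properties(self.properties)


arsenal = Node({"Team"}, {"name": "Arsenal"})
game = Node({"Game"}, {"name": "Chelsea vs Arsenal"})  # hypothetical game
away = Relationship(game, "away_team", arsenal)
```

Because `start` and `end` are ordinary references, a self relationship such as `Relationship(arsenal, "rival_of", arsenal)` is allowed, matching the slide above.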
Purposeful abstraction of a domain designed to satisfy particular application/end-user goals

Models
Model
Query

Design for Queryability
• Declarative Pattern-Matching language
• SQL-like syntax
• Designed for graphs

Introducing Cypher
A

B

C

Patterns, patterns, everywhere
a

b

(a) --> (b)
It’s all about the ASCII art!
MATCH (a)-->(b)
RETURN a, b

a

b

The most basic query
a

ACTED IN

m

(a)-[:ACTED_IN]->(m)
Adding in a relationship type
MATCH (a)-[:ACTED_IN]->(m)
RETURN a.name, m.name

a

ACTED IN

m

Adding in a relationship type
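
What these two patterns ask for can be sketched in Python. This is a toy illustration rather than how Cypher is executed: the triple layout, the sample data and the `match` helper are assumptions. With no type given, every directed relationship matches, as in `(a)-->(b)`. With a type, only relationships of that type match, as in `(a)-[:ACTED_IN]->(m)`.

```python
# Each relationship is a (start, type, end) triple; the data is illustrative.
relationships = [
    ("Keanu Reeves", "ACTED_IN", "The Matrix"),
    ("Lana Wachowski", "DIRECTED", "The Matrix"),
    ("Keanu Reeves", "ACTED_IN", "Speed"),
]


def match(relationships, rel_type=None):
    """Yield (a, b) for every a-->b, restricted to one relationship type if given."""
    for start, kind, end in relationships:
        if rel_type is None or kind == rel_type:
            yield start, end


# Comparable to: MATCH (a)-->(b) RETURN a, b
all_pairs = list(match(relationships))

# Comparable to: MATCH (a)-[:ACTED_IN]->(m) RETURN a.name, m.name
acted_in = list(match(relationships, "ACTED_IN"))
```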
The football graph
Find Arsenal’s away matches
MATCH (team:Team)<-[:away_team]-(game)
WHERE team.name = "Arsenal"
RETURN game

Find Arsenal’s away matches
MATCH (team:Team)<-[:away_team]-(game)   // Graph pattern
WHERE team.name = "Arsenal"              // Anchor pattern in graph
RETURN game.name                         // Create projection of results

Find Arsenal’s away matches
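
The three parts of this query (anchor, pattern, projection) can be walked through by hand in Python. This is a rough sketch under stated assumptions: the node ids, the game names, the `home_team` relationship and the `away_matches` helper are all hypothetical, and a real database would find the anchor through an index rather than a scan.

```python
# Hypothetical sample graph: node id -> labels and name.
nodes = {
    1: {"labels": {"Team"}, "name": "Arsenal"},
    2: {"labels": {"Team"}, "name": "Chelsea"},
    3: {"labels": {"Game"}, "name": "Chelsea vs Arsenal"},
    4: {"labels": {"Game"}, "name": "Arsenal vs Chelsea"},
}

# (start, type, end): each game points at its home and away team.
relationships = [
    (3, "home_team", 2), (3, "away_team", 1),
    (4, "home_team", 1), (4, "away_team", 2),
]


def away_matches(team_name):
    # Anchor: find the Team nodes with the requested name (WHERE team.name = ...).
    teams = {
        node_id
        for node_id, node in nodes.items()
        if "Team" in node["labels"] and node["name"] == team_name
    }
    # Pattern: follow away_team relationships that point from a game to the team.
    games = [
        start
        for start, kind, end in relationships
        if kind == "away_team" and end in teams
    ]
    # Projection: return only the game names (RETURN game.name).
    return [nodes[game_id]["name"] for game_id in games]


arsenal_away = away_matches("Arsenal")
```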
Evolving the football graph
Find the top away goal scorers
MATCH (team)<-[:away_team]-(game:Game),             // Multiple graph patterns
      (game)<-[:contains_match]-(season:Season),
      (team)<-[:for]-(stats)<-[:played]-(player),
      (stats)-[:in]->(game)
WHERE season.name = "2012-2013"                     // Anchor pattern in the graph
RETURN player.name,                                 // Group by player
       COLLECT(DISTINCT team.name),
       SUM(stats.goals) AS goals
ORDER BY goals DESC
LIMIT 10

Find the top away goal scorers
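
The grouping in this query can be sketched in Python. The data and the flattened layout are hypothetical: each `stats` entry stands for one player's appearance for a team in a game, and the team, player and game names are placeholders. The non-aggregated column (`player.name`) acts as the grouping key, while `COLLECT(DISTINCT ...)` and `SUM(...)` aggregate within each group.

```python
from collections import defaultdict

# Hypothetical data: each game records its season and its away team.
games = {
    "g1": {"season": "2012-2013", "away_team": "Team A"},
    "g2": {"season": "2012-2013", "away_team": "Team B"},
    "g3": {"season": "2011-2012", "away_team": "Team A"},
}
stats = [
    {"player": "Player X", "team": "Team A", "game": "g1", "goals": 2},
    {"player": "Player Y", "team": "Team B", "game": "g1", "goals": 3},  # home side
    {"player": "Player X", "team": "Team A", "game": "g3", "goals": 1},  # other season
    {"player": "Player Y", "team": "Team B", "game": "g2", "goals": 1},
]


def top_away_scorers(season, limit=10):
    teams = defaultdict(set)  # COLLECT(DISTINCT team.name)
    goals = defaultdict(int)  # SUM(stats.goals)
    for entry in stats:
        game = games[entry["game"]]
        # Keep only appearances for the away team in the requested season.
        if game["season"] == season and game["away_team"] == entry["team"]:
            teams[entry["player"]].add(entry["team"])
            goals[entry["player"]] += entry["goals"]
    ranked = sorted(goals, key=goals.get, reverse=True)  # ORDER BY goals DESC
    return [(p, sorted(teams[p]), goals[p]) for p in ranked[:limit]]  # LIMIT


result = top_away_scorers("2012-2013")
```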
• Goals scored in each month by Michu
• Tottenham results when Gareth Bale scores
• What did Wayne Rooney do in April?
• Which players only score when a game is televised?

Other football queries
Graph Query Design
The relational version
Relational
Tables
- assume records all have the same structure
Foreign keys between tables
- joins calculated at run time
- the more tables you join to a query the slower the query gets

Graphs
Nodes
- no need to set a property if it doesn’t exist
Relationships
- stored as a ‘pre-computed index’ at write time
- very easy to do lots of ‘hops’ between relationships

Graph vs Relational
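
The difference between a join computed at run time and a relationship stored at write time can be caricatured in Python. This is a deliberately simplified sketch with made-up names and data. Real relational engines use indexes rather than full scans, so it exaggerates the cost of a join, but it shows where the work happens in each model.

```python
# Relational style: a join table of (person, friend) rows.
friend_rows = [
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Carol", "Dave"),
    ("Alice", "Erin"),
]


def hop_by_join(people):
    # Each hop is a join: the current people are compared against every row.
    return {friend for person, friend in friend_rows if person in people}


# Graph style: adjacency is built once, at write time...
adjacency = {}
for person, friend in friend_rows:
    adjacency.setdefault(person, []).append(friend)


def hop_by_traversal(people):
    # ...so each hop is a direct lookup from a node to its neighbours.
    return {friend for person in people for friend in adjacency.get(person, [])}


def friends_at_depth(hop, start, depth):
    """Apply the given hop function `depth` times, starting from one person."""
    people = {start}
    for _ in range(depth):
        people = hop(people)
    return people


by_join = friends_at_depth(hop_by_join, "Alice", 2)
by_traversal = friends_at_depth(hop_by_traversal, "Alice", 2)
```

Both approaches return the same answer. The join version touches the whole table on every hop, while the traversal only touches the neighbours of the nodes it has already reached.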
[Diagram: Application -> REST Client -> HTTP -> Neo4j Server]

.NET and Neo4j
[Diagram: Application -> Neo4jClient (REST Client) -> HTTP -> Neo4j Server]

.NET and Neo4j
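
To make the HTTP layer in the diagram concrete, here is a hedged Python sketch of the kind of request a REST client could send. It is not Neo4jClient's code, and the talk itself uses .NET. The endpoint path is the transactional Cypher endpoint of Neo4j 2.x and 3.x, which is an assumption about the server version (other versions expose different paths). The host, the port and the `build_cypher_request` helper are also illustrative.

```python
import json
from urllib.request import Request

QUERY = (
    "MATCH (team:Team)<-[:away_team]-(game) "
    'WHERE team.name = "Arsenal" '
    "RETURN game.name"
)


def build_cypher_request(query, base_url="http://localhost:7474"):
    # Assumed endpoint: the transactional Cypher endpoint (Neo4j 2.x/3.x).
    body = json.dumps({"statements": [{"statement": query}]}).encode("utf-8")
    return Request(
        base_url + "/db/data/transaction/commit",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_cypher_request(QUERY)
# Sending it needs a running server, e.g. urllib.request.urlopen(request).
```

A client library such as Neo4jClient hides this plumbing, so the application works with queries and results rather than raw HTTP requests.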
Thinking in graphs
Graphs should be fun!
Last Wednesday of the month

Ask for help if you get stuck
www.graphdatabases.com

Come take a copy, it’s free!
Mark Needham
@markhneedham
mark.needham@neotechnology.com

Questions?

Football graph - Neo4j and the Premier League

Editor's Notes

  • #2: In this talk, we’ll look at how graph data and Neo4j can be used to model the English Premier League. We’ll see how the graph model and the Cypher query language make it natural and fun to query multidimensional, semi-structured data. We’ll also see how graphs encourage discoverability, so that we can spot interesting correlations and become king of the arcane football facts at the local pub quiz (e.g. how many goals have been scored at grounds in the North West of England by players originating from South America). Finally, we’ll see what the model would look like if built in a relational way, where that approach reaches its limits, and how the graph resolves those challenges.
  • #4: Let’s get started and talk about graphs. Now in this context we’re thinking more of what are sometimes known as networks and…
  • #5: …many people when they hear the word graph think of this.
  • #6: Which isn’t what we’re going to be talking about today!
  • #7: It’s not a new thing, you’ll already be familiar with lots of things that are graphs but perhaps you don’t know it yet. The London tube is perhaps the most famous example that Londoners at least use every day
  • #9: Or if not then you’ve certainly heard of the social network (graph)
  • #17: An organisational hierarchy is a common model
  • #19: Or of course as we mentioned earlier, a social network of friends of friends and so on is a popular graph
  • #22: Null values all over the place
  • #24: Now, as I say, graph databases allow you to store, manage and query your data as a graph. Neo4j adopts a very particular graph model, which we call the property graph model. So I’m going to spend the next few minutes talking about the important aspects of this model in more detail. In fact, I’m going to talk about the enhanced property graph model, which will be available in Neo4j 2.0 sometime later this year.
  • #29: Pointer in memory and ultimately on disk
  • #31: Analogy: Gmail labels. Every mail can have zero or more labels attached. Allow you to associate filters with groups of emails.
  • #34: Always motivated by needs, problems, goals: not a transparent window onto reality. C18: Seven Bridges of Königsberg. Goal: find a path through the city that crosses each bridge once and once only.
  • #38: Which leads us perfectly into Neo4j’s query language
  • #45: Football is quite a nice domain for modelling in graphs because the data has a lot of dimensions to it
  • #61: -> SQL: define your tables and relationships and generally don’t change them. Might denormalise or add indexes to speed up queries. -> Graphs: define your initial nodes and relationships. May then add ‘layers’ to the graph to make implicit relationships explicit.
  • #63: How is this different to a relational database? We have tables (nodes) and foreign keys between tables (relationships). In a relational database the joins are calculated at run time; in a graph a relationship is a first-class citizen, effectively a pre-computed index. You can also traverse lots of ‘hops’, which becomes quite expensive when you do
  • #72: If it’s not fun and it seems cumbersome, then perhaps it’s the wrong tool for that particular data problem, or it’s modeled in the wrong way. Might be worth asking
  • #73: Might be worth asking for help if that isn’t happening or you’re stuck. We have a good community on Stack Overflow and a mailing list as well. You’ll get answers to any questions you have pretty quickly.
  • #74: Please take a copy of t