This post will explain graph theory in as simple way as possible. This is not a complete tutorial and lacks a lot of informations. Please provide comments on how I can improve it.
A graph is a structure comprising of nodes and edges. Nodes are basically objects and edges represent the relations between them.
Consider the following problem. Suppose there are n islands and there are some roads connecting certain pairs between them. We have to find whether we can use the roads (possibly more than one different roads) to reach from one island to another.
Let us denote each island as a circle with a number on it (The number is to distinguish each island). The lines joining the circles are the roads. Take a look at the picture for clarification.
Now the above problem can be solved easily as we can visualize the entire thing. We can go to island 4 from island 5 (5 -> 1 -> 4), but not to island 7. We can go to island 7 from 6 but can not reach there any other way. We can not go to island 3 from any other island.
In graph theory, the islands are represented by nodes and the roads by edges. Node and edge are graph theory terminologies. We can represent any object as nodes and the relation between them as edge. In programming, obviously we cannot use visual representations. We need some other technique.Two techniques commonly in use are adjacency matrix and adjacency list form. Lets discuss them one by one.
adjacency matrix:
Consider there are n nodes (i.e. n different objects). In this method, we use a 2D array of size nXn. For each cell (i,j), we put 1 if there is a link between node i+1 and j+1 (Plus 1 because array uses zero based indexing and we have nodes starting from 1).
The adjacency matrix for the above island problem is
0 1 0 1 1 0 0
1 0 0 1 0 0 0
0 0 0 0 0 0 0
1 1 0 0 0 0 0
1 0 0 0 0 0 0
0 0 0 0 0 0 1
0 0 0 0 0 1 0
Here, we have considered a node is not related to itself. Had it been the case, we would have put 1 in (i,j) cell where i==j.
Another thing to notice is that, here if node i is related to node j, then node j is also related to node i (i.e. the roads are bi-directional). So the adjacency matrix is symmetrical about the diagonal. This type of graph is called undirected graph. If it had not been the case, and if node i being related to node j didn't imply node j is related to node i, such a graph would be called directed graph. In visual representation, instead of a line between two nodes, an arrow is used that points from node i to node j, if i is related to node j. Adjacency matrix of such a graph is certainly not symmetrical.
As can be seen from the above adjacency matrix, such a representation takes O(nXn) i.e. O(n^2) size. This method is applicable when number of nodes is small (About 2000 for most programming contests (Give or take a few nodes)). As number of nodes increases, we need to use adjacency list representation.
adjacency list:
In this method, we create an array of linked list. Each node of the array represents each node of the graph and the linked list in it represents all the nodes that are related to the present node.
In our example, node 1 is related to node 2, 4 and 5. Node 2 is related to node 1 and 4 while node 3 is not related to any other nodes.
1) -> 2 -> 4 -> 5
2) -> 1 -> 4
3)
4) -> 1 -> 2
5) -> 1
6) -> 7
7) -> 6
2) -> 1 -> 4
3)
4) -> 1 -> 2
5) -> 1
6) -> 7
7) -> 6
In programming language like c++, an array of vector or list is used for this representations.
Also, as can be seen above, there is no need to distinguish between directed and undirected graph. Also, a lot of space is saved. Maximum space required is O(n*e) where e represents maximum number of edges.
This is all the theory needed in introduction. There are lots of more terminologies used in graph theory. We will see them in subsequent posts when we need them. Also, check the next post on graph traversal for information on how to implement adjacency matrix and list.