I must say, this came as a shock to me, until I looked at the results and thought about it for a minute. Before jumping in, let me explain how and where I encountered this. I was recently working on a project called Deja Scene – The actor to actor movie database. The purpose of this site is to find two actors who have starred in multiple movies together.
Because the nature of the algorithm an actor can be either on the left side of the connection or the right side.
Let’s look at an example, if I search for George Clooney, I will find that he has 372 Deja Scene connections. Here was the original query I used to retrieve the list of his connections:
[code]
SELECT *
FROM `actor2_actors`
WHERE actor1_id =1
OR actor2_id =1
[/code]
Seems pretty straight forward; however, I was seeing really slow performance and I couldn’t figure out why. I created an index on both the actor1_id and the actor2_id, so I assumed Mysql was properly indexing this. Boy was I wrong. Look at the results when I use the EXPLAIN function: