Knowledgebase

PHP + mySQL - Querying database for "similar" results.

Posted by crEA-tEch, 07-01-2009, 06:29 AM
Hi all, I'm not sure how to go about doing this but here goes: Lets say I have a standard mySQL database like so: Now, I want to take a random entry - say this one: ... and I want the database to query the table and output all the other entries in order of SIMILARITY to the above. Do you get what I mean? - Is this possible to do? - if so, is there such thing as a "SIMILAR" function/syntax? I would appreciate any help, no matter how small Thanks, Nick

Posted by Dark Light, 07-01-2009, 09:16 AM
How do you define similar?

Posted by jstanden, 07-01-2009, 02:15 PM
If you're talking about sounding similar, MySQL supports basic soundex functionality: http://dev.mysql.com/doc/refman/5.0/...or_sounds-like You could also do something like the Levenshtein distance at the application level if you have a small dataset: http://en.wikipedia.org/wiki/Levenshtein_distance As mentioned above, it would be helpful to know how you would like to define similarity.

Posted by tim2718281, 07-01-2009, 04:00 PM
1) You need to define a "distance" function that takes two rows and computes a distance between them. For example, the "distance" may be the number of characters in row 2 would change to change to make it identical to row 1. Or it may be any other function that makes sense to you. 2) Then you need to select a row, and calculate the distance of each other row from it 3) Then you need to display the rows in order of the distance result.

Posted by Nich, 07-01-2009, 07:11 PM
You have to define what "similar" is. For instance, take your example. Are you saying...similar as in names that start with "J"? OR similar as in names that are 4 letters long? OR similar as in preferred payment type? OR similar as in whether they like cake? Are you getting my drift? Using common sense, I am assuming that you mean those similar to John with preferred payment types of Cheque and also likes cake. In which case, there's no way to have a universal "similar" function since you're the one who is defining it. Which one weighs more to you, preferred payment type or cake? Based on that, you can look for all those who match payment types AND cake, then sort on those that have matching payment types with no match on cake, and then matching cakes with no match on payment type. Does that make sense?

Posted by dexxtreme, 07-02-2009, 04:53 PM
You have to remember, computers are both really smart and really dumb at the same time. They will do exactly what you tell it do do (whether or not you told it the right thing is a different story), but trying to make it do anything else on its own (such as determining "similarity") is completely futile. In this case, you want it to order the results in a non-standard way. MySQL doesn't do that. You can order the results by one (or more) of the columns in either ascending or descending order, and maybe use a built-in function for sorting data, but that's about it. What you really need to do is select all of the data, load it into your PHP script, and have the PHP script try to make sense of it according to some terms that you have to define in your code.

Posted by tim2718281, 07-02-2009, 06:38 PM
It can be better to do some work within the database server rather than have it transmit all the data to another program. For example, it would be possible to define a new table copied from the old with a "distance" column added, where "distance" is defined by some SQL function; then a query can return selected rows from the new table. This approach reduces the amount of data transmitted between the SQL server and the application program; if they are running on separate hardware, that can make a very large difference to execution time.

Posted by dexxtreme, 07-03-2009, 12:16 AM
True, assuming that an appropriate "distance" function can be defined in SQL to run faster than it would in PHP.



Was this answer helpful?

Add to Favourites Add to Favourites

Print this Article Print this Article

Also Read
vbulletin site down (Views: 604)
PowerVPS Nodes Down (Views: 629)


Language:

Client Login

Email

Password

Remember Me

Search