Categories:

String & Binary Functions

SOUNDEX

Returns a string that contains a phonetic representation of the input string.

This function is typically used to help determine whether two strings, such as the family names Levine and Lavine, or the words to and too, have similar English-language pronunciation.

Syntax

SOUNDEX( <varchar_expr> )

Arguments

varchar_expr

The string for which a representation of the pronunciation is returned. The string should use the Latin or Unicode character set.

Returns

The returned value is a VARCHAR that contains the phonetic representation of the input string. In other words, the return value is a string (not a sound) that represents the pronunciation, rather than the spelling, of the input string.

The returned value starts with a letter that represents the first letter in the string followed by 3 digits (e.g. s400, c130).

For more information about how the return value is calculated, see the Soundex phonetic algorithm (in Wikipedia).

Usage Notes

  • Because the function returns only four characters (one letter and three digits), the output is primarily determined by the first few syllables of the input, rather than the entire string.

    For example, the following statement compares three strings and returns the same SOUNDEX value for each string because, even though they have completely different spellings and meanings, they start with phonetically similar syllables:

    SELECT SOUNDEX('I love rock and roll music.'),
           SOUNDEX('I love rocks and gemstones.'),
           SOUNDEX('I leave a rock wherever I go.');
    +----------------------------------------+--------------------------+------------------------------------------+
    | SOUNDEX('I LOVE ROCK AND ROLL MUSIC.') | SOUNDEX('I LOVE ROCKS.') | SOUNDEX('I LEAVE A ROCK WHEREVER I GO.') |
    |----------------------------------------+--------------------------+------------------------------------------|
    | I416                                   | I416                     | I416                                     |
    +----------------------------------------+--------------------------+------------------------------------------+
    

Examples

The following query returns SOUNDEX values for two names that are spelled differently, but are typically pronounced similarly:

SELECT SOUNDEX('Marks'), SOUNDEX('Marx');
+------------------+-----------------+
| SOUNDEX('MARKS') | SOUNDEX('MARX') |
|------------------+-----------------|
| M620             | M620            |
+------------------+-----------------+

The following query demonstrates how to use SOUNDEX to find potentially related rows in different tables:

Create and load the tables:

CREATE TABLE sounding_board (v VARCHAR);
CREATE TABLE sounding_bored (v VARCHAR);
INSERT INTO sounding_board (v) VALUES ('Marsha');
INSERT INTO sounding_bored (v) VALUES ('Marcia');

Look for related records without SOUNDEX:

SELECT * 
    FROM sounding_board AS board, sounding_bored AS bored 
    WHERE bored.v = board.v;
+---+---+
| V | V |
|---+---|
+---+---+

Look for related records using SOUNDEX:

SELECT * 
    FROM sounding_board AS board, sounding_bored AS bored 
    WHERE SOUNDEX(bored.v) = SOUNDEX(board.v);
+--------+--------+
| V      | V      |
|--------+--------|
| Marsha | Marcia |
+--------+--------+