BEIDER MORSE PHONETIC MATCHING (BMPM) RELEASE NOTES /* * * Copyright Alexander Beider and Stephen P. Morse, 2008 * * This file is part of the Beider-Morse Phonetic Matching (BMPM) System. * BMPM is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * BMPM is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with BMPM. If not, see . * */ For questions, contact either of the following: Alexander Beider (albeider@yahoo.fr) Stephen Morse (steve@stevemorse.org) 1. Files needed // for BM phonetic matching phoneticutils.php phoneticengine.php gen subfolder (generic for all names) languagenames.php lang.php approxcommon.php exactcommon.php exactapproxcommon.php hebrewcommon.php rulesXXX.php approxXXX.php exactXXX.php where XXX = any, cyrillic, czech, dutch, english, french, german, greek, greeklatin, hebrew, hungarian, italian, polish, portuguese, romanian, russian, spanish, turkish ash subfolder (optimized for ashkenazi names) languagenames.php lang.php approxcommon.php exactcommon.php exactapproxcommon.php hebrewcommon.php rulesXXX.php approxXXX.php exactXXX.php where XXX = any, cyrillic, english, french, german, hebrew, hungarian, polish, romanian, russian, spanish sep subfolder (optimized for sephardic names) languagenames.php lang.php approxcommon.php exactcommon.php exactapproxcommon.php hebrewcommon.php rulesXXX.php approxXXX.php exactXXX.php where XXX = any, french, hebrew, italian, portuguese, spanish // for DM soundex matching dmsoundex.php dmlat.php 2. Batch encoding of all names in database Note: If you have generated your PHP search engine using the one-step search application generator, you encode the names in your database using the intructions in question 410 on the FAQ page. The instructions in this section do not apply to you. Use a modified version of batchSample.php. That script has very specific input/output file format, and you will probably want to modify the script to accomodate your particular format. The invocaton of this script is as follows: batch.php?inputFileName=XXX&outputFileName=yyy Each line of input File XXX is of the format name name ... name Each line of output file YYY is of the format name\tBMCODE\tDMCODE 3. Code to be added to your PHP search engine The easiest approach is to place the phonetic routines (phoneticengine.php and phoneticutils.php) in the same folder as your search engine. Alternatively, you can place the phonetic routines in some other folder of your choice. In either case, the gen, ash, and sep directories are to be subfolders of the folder in which you place the phonetic routines. If you have generated your PHP search engine using the one-step search-application generator, you don't need to add any special code to your search engine. However, if you've chosen not to place the phonetic routines in the same folder as your search engine, you will need to tell the search engine where they are. You do that by modifying the $folder variable in your search engine accordingly. If you have written your PHP search engine from scratch, then you will need to include the code below into it at an appropriate place. This code assumes the you have placed the phonetic routines in the same folder as your search engine. If you've placed them somewhere else, you will have to modify the paths in the include statements accordingly. // encode name using BM phonetic code // assume name to be encoded is in $name, can be in utf-8 or in html-ampersand notation require_once "phoneticutils.php"; require_once "phoneticengine.php"; require_once "gen/lang.php"; require_once "gen/approxcommon.php"; for ($i=0; $i