Creating a keyword list from text in PHP
In PHP, there are useful functions for handling array data. In particular, the function in_array() lets you check
whether a specific value exists in an array.
In the PHP manual, the function in_array() is defined as:
in_array(mixed $needle, array $haystack [, bool $strict = FALSE ]): bool
The function searches for $needle in the array $haystack and returns TRUE if it is found.
If $strict is set to TRUE, the type of $needle must also match
the type of the element in $haystack (comparison with === instead of ==).
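The difference between loose and strict matching can be sketched as follows. The array $ids is made-up sample data, used only for illustration:

```php
<?php
// Hypothetical sample data: integer IDs.
$ids = [1, 2, 3];

// Loose comparison (default): the string '1' matches the integer 1.
var_dump(in_array('1', $ids));        // bool(true)

// Strict comparison: string vs. int, so the types differ and nothing matches.
var_dump(in_array('1', $ids, true));  // bool(false)

// Strict comparison with the matching type succeeds.
var_dump(in_array(1, $ids, true));    // bool(true)
```

Since the keyword list below contains only strings, strict mode mainly guards against surprising matches between numeric-looking words.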
By using this function, we can create a keyword list from a string.
As shown in the code snippet below, the function getKeywords() extracts the keywords (each distinct word, in order
of first appearance) from the input string $str into an array.
function getKeywords($str) {
    // convert to lowercase
    $str = strtolower($str);
    // remove punctuation characters
    $str = preg_replace('/[.,\/#!$%\^&\*;:{}=\-_`~()\[\]]/', '', $str);
    // split the string on spaces into an array of words
    $data = explode(' ', $str);
    // define an empty array for the keywords
    $result = array();
    for ($i = 0; $i < count($data); $i++) {
        // add the word only if it is not already in the keyword list
        if (!in_array($data[$i], $result, true)) {
            array_push($result, $data[$i]);
        }
    }
    return $result;
}
First, the function strtolower() converts all characters to lowercase.
Next, punctuation characters are removed by using the regular expression search and replace function
preg_replace(). After that, explode() splits the string $str on spaces,
and the words are stored in an array $data. The return array $result
is then created for storing the keywords.
We use a loop to iterate over the array $data.
Inside the loop, the function in_array() checks whether each element of the array
$data is already in the keyword array $result. If it is not,
it is pushed onto the end of the array $result.
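To make the steps above concrete, here is a small trace of the intermediate values. The input string and the variable names $lower and $clean are chosen only for this illustration:

```php
<?php
// Illustrative input, chosen only for this trace.
$str = 'A cat, a dog!';

// Step 1: lowercase.
$lower = strtolower($str);
// $lower is 'a cat, a dog!'

// Step 2: strip punctuation with the same pattern as getKeywords().
$clean = preg_replace('/[.,\/#!$%\^&\*;:{}=\-_`~()\[\]]/', '', $lower);
// $clean is 'a cat a dog'

// Step 3: split on single spaces.
$data = explode(' ', $clean);
// $data is ['a', 'cat', 'a', 'dog']

// Step 4: keep each word only the first time it appears.
$result = array();
foreach ($data as $word) {
    if (!in_array($word, $result, true)) {
        $result[] = $word;
    }
}
// $result is ['a', 'cat', 'dog']
print_r($result);
```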
As shown in the code snippet below,
the function getKeywords() is called to extract the keywords
from the string $str.
$str = 'There is a cat, a dog, and a zebra.';
$keywords = getKeywords($str);
print_r($keywords);
The output will be:
Array
(
    [0] => there
    [1] => is
    [2] => a
    [3] => cat
    [4] => dog
    [5] => and
    [6] => zebra
)
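As a possible variation, the same kind of list can be built with PHP's built-in array functions instead of an explicit in_array() loop. This is only a sketch: the function name getKeywordsCompact() is made up here, and splitting on runs of whitespace is an assumption that goes slightly beyond the original explode(' ', ...).

```php
<?php
// Hypothetical variant of getKeywords(); the name is made up for this sketch.
function getKeywordsCompact($str) {
    // Same first two steps as getKeywords(): lowercase, then strip punctuation.
    $str = strtolower($str);
    $str = preg_replace('/[.,\/#!$%\^&\*;:{}=\-_`~()\[\]]/', '', $str);

    // Split on runs of whitespace and skip empty pieces.
    $words = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

    // array_unique() keeps the first occurrence of each word;
    // array_values() renumbers the keys from 0.
    return array_values(array_unique($words));
}

// Note the two consecutive spaces after "cat,".
print_r(getKeywordsCompact('There is a cat,  a dog, and a zebra.'));
```

For this double-spaced input the sketch returns the same seven keywords as above, whereas explode(' ', ...) would also produce an empty string between the two consecutive spaces.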