Creating a keyword list from text in PHP
In PHP, there are useful functions for handling array data. In particular, the function in_array() lets you check whether a specific value exists in an array.
In the PHP manual, the function in_array() is defined as:
in_array(mixed $needle, array $haystack [, bool $strict = FALSE ]): bool
The function searches for $needle in the array $haystack. If $strict is set to TRUE, the type of $needle must also match the type of the element found in $haystack.
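To make the effect of $strict concrete, here is a minimal sketch. The array and the search value are illustrative assumptions, not taken from the text above.

```php
<?php
// Illustrative values only (assumed for this example).
$haystack = [1, 2, 3];

// Loose comparison (the default): the string '1' matches the integer 1.
$loose = in_array('1', $haystack);

// Strict comparison: the string '1' and the integer 1 differ in type.
$strict = in_array('1', $haystack, true);

var_dump($loose);  // bool(true)
var_dump($strict); // bool(false)
```

For a keyword list made up only of strings, both modes should give the same result, so passing true mainly guards against accidental type juggling.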
By using this function, we can create a keyword list from a string.
As shown in the code snippet below, the function getKeywords() extracts the keywords from the input string $str into an array.
function getKeywords($str) {

    // convert the string to lowercase
    $str = strtolower($str);

    // remove punctuation characters
    $str = preg_replace('/[.,\/#!$%\^&\*;:{}=\-_`~()\[\]]/', '', $str);

    // split the string on spaces into an array of words
    $data = explode(' ', $str);

    // define an empty array for the keywords
    $result = array();
    for ($i = 0; $i < count($data); $i++) {
        if (!in_array($data[$i], $result, true)) {
            array_push($result, $data[$i]);
        }
    }
    return $result;
}
First, the function strtolower() converts all characters to lowercase. Next, punctuation characters are removed with the regular-expression search-and-replace function preg_replace(). After that, explode() splits the string $str on spaces into the array $data, and the empty array $result is created to store the keywords.
We then use a loop to iterate over the array $data. For each element, in_array() checks whether the word is already in the keyword array $result. If it is not, the word is pushed onto the end of $result.
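The same steps can also be written without an explicit loop. The sketch below is one possible variant, not the method described above: the name getKeywordsCompact() is hypothetical, and it swaps explode() for preg_split() and the in_array() loop for array_unique(). One practical difference is that explode(' ', ...) produces empty strings when the input contains consecutive spaces, whereas preg_split() with PREG_SPLIT_NO_EMPTY discards them.

```php
<?php
// Hypothetical variant of getKeywords(); the name and the approach are
// assumptions for illustration only.
function getKeywordsCompact(string $str): array
{
    // Lowercase and strip the same punctuation set as getKeywords().
    $str = strtolower($str);
    $str = preg_replace('/[.,\/#!$%\^&\*;:{}=\-_`~()\[\]]/', '', $str);

    // Split on runs of whitespace and drop empty tokens.
    $words = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

    // array_unique() keeps the first occurrence of each word;
    // array_values() reindexes the result starting from 0.
    return array_values(array_unique($words));
}

// Note the double space after "is" in the sample input.
$compact = getKeywordsCompact('There is  a cat, a dog, and a zebra.');
print_r($compact);
```

For this input the sketch returns the seven words there, is, a, cat, dog, and, zebra, with no empty entry for the doubled space.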
As shown in the code snippet below, the function getKeywords() is called to extract the keywords from the string $str.
$str = 'There is a cat, a dog, and a zebra.';
$keywords = getKeywords($str);
print_r($keywords);
The output will be:
Array
(
    [0] => there
    [1] => is
    [2] => a
    [3] => cat
    [4] => dog
    [5] => and
    [6] => zebra
)