Here is a working script with a better set of sample strings to show what I am trying to do:
$strings[] = 'seventy five yards out';
$strings[] = 'sixty yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$inputWords = str_word_count($inputString, 1);
$foundWords = [];
foreach ($strings as $key => $string) {
    $stringWords = str_word_count($string, 1);
    $wordsCount = array_count_values($stringWords);
    $commonWords = array_intersect($inputWords, array_keys($wordsCount));
    if (count($commonWords) > 0) {
        foreach ($commonWords as $commonWord) {
            $foundWords[$key][$commonWord] = $wordsCount[$commonWord];
        }
    }
}
print_r($foundWords);
How do I get it to print 'seventy five yards out', since that is the string closest to the input? I had thought about dividing by the word count to get a percentage, and I now think that might work.
The key is to call str_word_count() on each of the supplied strings separately. That turns the strings into arrays, and arrays are much easier to work with for what you want.
array_count_values() counts the values of an array, which gives you the number of occurrences of each word.
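To make the two calls concrete, here is a minimal sketch of what they return. The sample string 'yards and yards out' is made up for illustration (it is not one of the strings from the question) so that one word occurs twice.

```php
<?php
// Format 1 makes str_word_count() return the words themselves as an array.
$words = str_word_count('yards and yards out', 1);
print_r($words);

// array_count_values() maps each distinct word to its number of occurrences.
$counts = array_count_values($words);
print_r($counts);
```

The keys of the second array are the distinct words, which is why the script below passes array_keys($wordsCount) to array_intersect().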
$strings[] = 'seventy five yards out';
$strings[] = 'sixty yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$inputWords = str_word_count($inputString, 1);
$probabilities = [];
foreach ($strings as $key => $string) {
    $stringWords = str_word_count($string, 1);
    $wordsCount = array_count_values($stringWords);
    $commonWords = array_intersect($inputWords, array_keys($wordsCount));
    if (count($commonWords) > 0) {
        foreach ($commonWords as $commonWord) {
            if (!isset($probabilities[$key])) $probabilities[$key] = 0;
            $probabilities[$key] += $wordsCount[$commonWord];
        }
        // Divide by the candidate's word count to get the share of matching words
        $probabilities[$key] /= count($stringWords);
    }
}
arsort($probabilities);
echo $strings[key($probabilities)];
Output:
seventy five yards out
The probabilities, as printed by print_r($probabilities):
Array
(
[0] => 0.75
[1] => 0.66666666666667
[2] => 0.4
)
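If the scoring is needed in more than one place, it could be wrapped in a function. The following is only a sketch: the name closestByWordOverlap() is hypothetical, and the array_unique() call is my own assumption (it keeps a word repeated in the input from being counted twice), not part of the script above.

```php
<?php
/**
 * Hypothetical helper: returns the candidate that shares the largest
 * fraction of its words with $input, or null if no candidate shares a word.
 */
function closestByWordOverlap(string $input, array $candidates): ?string
{
    // Assumption: de-duplicate the input words so repeats are not counted twice.
    $inputWords = array_unique(str_word_count($input, 1));
    $best = null;
    $bestScore = 0.0;

    foreach ($candidates as $candidate) {
        $candidateWords = str_word_count($candidate, 1);
        if ($candidateWords === []) {
            continue; // nothing to compare, and avoids dividing by zero
        }
        $wordsCount = array_count_values($candidateWords);
        $common = array_intersect($inputWords, array_keys($wordsCount));

        // Sum the occurrences of the shared words, then normalize by length.
        $score = 0;
        foreach ($common as $word) {
            $score += $wordsCount[$word];
        }
        $score /= count($candidateWords);

        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $candidate;
        }
    }

    return $best;
}

echo closestByWordOverlap('seventy two yards out', [
    'seventy five yards out',
    'sixty yards out',
    'one hundred fifty yards out',
]);
```

With the strings from the question this prints seventy five yards out, matching the scores 0.75, 0.67 and 0.4 listed above.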
Something like this should work:
<?php
$g = 'the weather is nice'; // strings to loop through
$n = 'the water is blue';
$b = 'that was a bad movie';
$t = 'hows the weather'; // example input
$test = (str_word_count($t, 1)); // breaks out each word into array
// Comparisons
$comps = array();
// Array sums
$sums = array();
// Search each variable that's been set, as long as it's less than 't'
// A "for" loop will accept letters in addition to numbers, so we'll start with the
// letter "a" and loop through each letter up to "s" (which is one less than "t")
for ($inc = 'a'; $inc < 't'; $inc++) {
    // Now, a variable assigned as $$inc will translate into $a, $b, $c ... $s
    // and if $a, $b, $c, etc, are set...
    if (isset($$inc)) {
        // ... assign them to the $comps array with a key of $$inc
        $comps[$$inc] = str_word_count($$inc, 1);
        // For example, when the "for" loop reaches "f", nothing will be added to the
        // $comps array because $f is not set above.
        // But when it gets to "g" it'll find that $g HAS been set, and that it has a
        // value of "the weather is nice". At this point the $comps array will now look
        // like this:
        // $comps['the weather is nice'] = array('the', 'weather', 'is', 'nice');
        // If you'd like to see this in action (since it might sound a little confusing),
        // remove the # from the beginning of each of the following lines that start with #
        // (there should be 10 total):
        #print "<pre>The loop has reached the letter <b>{$inc}</b> for the value of ";
        #print "<b>\$inc</b> and has found that <b>\${$inc}</b> HAS been set in the code.\n";
        #print "Adding another dollar sign to <b>\$inc</b> has had the following effects:\n";
        #print "- <b>\$inc</b> now looks like <b>\$\$inc</b> (from within the written part of the code)\n";
        #print "- <b>\$\$inc</b> translates into <b>\${$inc}</b> (the variable that is actually being evaluated)\n";
        #print "- <b>\${$inc}</b> evaluates to <b>{$$inc}</b>\n</pre>";
    }
    #else {
    #    print "<pre>The loop has reached the letter <b>{$inc}</b> for the value of <b>\$inc</b>";
    #    print " and has found that <b>\${$inc}</b> has NOT been set in the code, so it's being skipped.\n";
    #}
}
// Avoid errors by checking if empty or not
if (!empty($comps)) {
    foreach ($comps as $key => $comp) {
        // Find intersections, if any
        $candidates[$key] = array_intersect($test, $comp);
        // Count the intersections
        $counts[$key] = array_count_values($candidates[$key]);
        // Add up the intersections
        $sums[$key] = array_sum($counts[$key]);
    }
}
$winner = '';
if (!empty($sums)) {
    // Reverse sort $sums, putting the highest value first
    arsort($sums);
    // Flip $sums so we can extract the key
    $flipped = array_flip($sums);
    // Extract the first key off of $sums
    $winner = array_shift($flipped);
}
print $winner;
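One caveat about the last step: array_flip() keeps only one key per distinct value, so candidates with equal sums collapse into a single entry. A possible alternative, sketched here on the assumption that PHP 7.3 or later is available, is to read the first key directly with array_key_first(). The $sums array below is written out by hand to mirror what the loop produces for the example input.

```php
<?php
// Totals keyed by candidate string, as $sums looks after the loop above
// for the input 'hows the weather'.
$sums = [
    'the weather is nice'  => 2,
    'the water is blue'    => 1,
    'that was a bad movie' => 0,
];

// Sort descending by value while keeping the string keys.
arsort($sums);

// Take the key of the first (highest) entry without flipping the array.
$winner = array_key_first($sums);

print $winner;
```

This prints the weather is nice, the same result as the flip-and-shift version.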
At first, your question was also about counting occurrences. But since you have clearly moved beyond that, I felt I should offer a different solution:
the similar_text() function!
$strings[] = 'sixty yards out';
$strings[] = 'seventy five yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$p = 0;
$k = null;
foreach ($strings as $key => $string) {
    similar_text($inputString, $string, $percent);
    if ($percent > $p) {
        $p = $percent;
        $k = $key;
    }
}
echo !is_null($k) ? $strings[$k] : "";
Output:
seventy five yards out
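For reference, here is a small sketch of what similar_text() hands back: the return value is the number of matching characters, and the optional third argument receives the similarity as a percentage by reference. The two words are made up for illustration and are not from the question.

```php
<?php
$percent = 0.0;

// Returns the count of matching characters; $percent is filled in by reference.
$matched = similar_text('World', 'Word', $percent);

echo $matched, "\n";            // number of matching characters
echo round($percent, 2), "\n";  // similarity in percent
```

Here 4 characters match, and the percentage is 4 * 2 / (5 + 4) * 100, roughly 88.89. Unlike the word-count approach, this compares the strings character by character, so partially matching words still contribute to the score.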