Bing Maps geodata decompression algorithm port

Question

I'm trying to port Microsoft's decompression algorithm to PHP from Java (or possibly C++ or C#, since it's Microsoft). The algorithm takes the compressed shape data returned by the Bing Maps Geodata API and expands it into latitude/longitude coordinates. Microsoft published the algorithm at https://msdn.microsoft.com/en-us/library/dn306801.aspx

I have a list of coordinates stored in my database, and I'm trying to get an array of coordinates that defines the polygon for the shape. My results differ from the expected ones. Can anyone point out the discrepancies between the two implementations?

EDIT: I believe my problem is that PHP doesn't handle LONG integers, so precision is lost during the bitwise operations. I may need to convert some of the operations to use BCMath. Any help here?
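One quick way to test that suspicion is to check the integer width of the PHP build. The sketch below relies only on the built-in constants PHP_INT_SIZE and PHP_INT_MAX; the helper name fitsInNativeInt is hypothetical and exists only for illustration.

```php
<?php
// Hypothetical helper: true when a value needing $bits bits fits in a native PHP int.
function fitsInNativeInt(int $bits): bool
{
    // PHP ints are signed, so one bit is reserved for the sign.
    return $bits <= PHP_INT_SIZE * 8 - 1;
}

// The accumulator shifts 5-bit groups left by k = 0, 5, 10, ...
// The first number in the sample input spans ten characters, so up to 50 bits.
printf("int size: %d bytes, max: %d\n", PHP_INT_SIZE, PHP_INT_MAX);
var_dump(fitsInNativeInt(50)); // true on 64-bit builds, false on 32-bit builds
```

If this reports a 64-bit build, the shift itself should not lose bits for the sample input, and the cause is more likely elsewhere in the port.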

Decompression algorithm (Microsoft)

public const string safeCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-";

private static bool TryParseEncodedValue(string value, out List<Coordinate> parsedValue)
{
    parsedValue = null;
    var list = new List<Coordinate>();
    int index = 0;
    int xsum = 0, ysum = 0;

    while (index < value.Length)        // While we have more data,
    {
        long n = 0;                     // initialize the accumulator
        int k = 0;                      // initialize the count of bits

        while (true)
        {
            if (index >= value.Length)  // If we ran out of data mid-number
                return false;           // indicate failure.

            int b = safeCharacters.IndexOf(value[index++]);

            if (b == -1)                // If the character wasn't on the valid list,
                return false;           // indicate failure.

            n |= ((long)b & 31) << k;   // mask off the top bit and append the rest to the accumulator
            k += 5;                     // move to the next position
            if (b < 32) break;          // If the top bit was not set, we're done with this number.
        }

        // The resulting number encodes an x, y pair in the following way:
        //
        //  ^ Y
        //  |
        //  14
        //  9 13
        //  5 8 12
        //  2 4 7 11
        //  0 1 3 6 10 ---> X

        // determine which diagonal it's on
        int diagonal = (int)((Math.Sqrt(8 * n + 5) - 1) / 2);

        // subtract the total number of points from lower diagonals
        n -= diagonal * (diagonal + 1L) / 2;

        // get the X and Y from what's left over
        int ny = (int)n;
        int nx = diagonal - ny;

        // undo the sign encoding
        nx = (nx >> 1) ^ -(nx & 1);
        ny = (ny >> 1) ^ -(ny & 1);

        // undo the delta encoding
        xsum += nx;
        ysum += ny;

        // position the decimal point
        list.Add(new Coordinate { Latitude = ysum * 0.00001, Longitude = xsum * 0.00001 });
    }

    parsedValue = list;
    return true;
}
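The diagonal diagram in the comments may be easier to follow with the pairing written out in both directions. The following PHP sketch is an illustration only, assuming non-negative inputs and PHP 7 or later; pairIndex and unpairIndex are hypothetical names that do not appear in Microsoft's code, and the two correction loops are an added guard against floating-point rounding in sqrt().

```php
<?php
// Hypothetical helpers; $x and $y are non-negative (already sign-encoded) integers.

// Map (x, y) to a single index by walking the diagonals x + y = 0, 1, 2, ...
function pairIndex(int $x, int $y): int
{
    $diagonal = $x + $y;
    // Points on all lower diagonals, plus y steps along the current one.
    return intdiv($diagonal * ($diagonal + 1), 2) + $y;
}

// Inverse of pairIndex: recover [x, y] from the index.
function unpairIndex(int $n): array
{
    $diagonal = (int) ((sqrt(8 * $n + 5) - 1) / 2);
    // Correct the estimate if sqrt() rounded across a diagonal boundary.
    while (intdiv($diagonal * ($diagonal + 1), 2) > $n) {
        $diagonal--;
    }
    while (intdiv(($diagonal + 1) * ($diagonal + 2), 2) <= $n) {
        $diagonal++;
    }
    $y = $n - intdiv($diagonal * ($diagonal + 1), 2);
    return [$diagonal - $y, $y];
}

// In the diagram, index 11 sits at x = 3, y = 1.
print_r(unpairIndex(11));
```

Because the index grows roughly with the square of x + y, two 25-bit inputs already produce an index near 50 bits, which is why the C# code accumulates into a long.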

My decompression algorithm (PHP)

function tryParseEncodedValue($value) {
    $value = 'vx1vilihnM6hR7mEl2Q';
    var_error_log($value);
    $safeCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-";
    $list = array();
    $index = 0;
    (int)$xsum = 0;
    (int)$ysum = 0;

    while ($index < strlen($value))   // While we have more data,
    {
        $n = 0;                       // initialize the accumulator
        $k = 0;                       // initialize the count of bits

        while (true)
        {
            if ($index >= strlen($value)) // If we ran out of data mid-number
            {
                var_error_log('failed: inxed >= strlen($value)');
                return false;             // indicate failure.
            }
            (int)$b = strpos($safeCharacters, $value[$index++]);

            if (!$b) {                    // If the character wasn't on the valid list,
                var_error_log('failed: character not in valid list');
                return false;             // indicate failure.
            }
            $n |= ($b & 31) << $k;        // mask off the top bit and append the rest to the accumulator
            $k = $k+5;                    // move to the next position
            if ($b < 32) break;           // If the top bit was not set, we're done with this number.
        }

        // The resulting number encodes an x, y pair in the following way:
        //
        //  ^ Y
        //  |
        //  14
        //  9 13
        //  5 8 12
        //  2 4 7 11
        //  0 1 3 6 10 ---> X

        // determine which diagonal it's on
        $diagonal = (int)((sqrt(8 * $n + 5) - 1) / 2);

        // subtract the total number of points from lower diagonals
        $n -= $diagonal * ($diagonal + (int)1) / 2;

        // get the X and Y from what's left over
        $ny = (int)$n;
        $nx = $diagonal - $ny;

        // undo the sign encoding
        $nx = pow(($nx >> 1), (-($nx & 1)) );
        $ny = pow(($ny >> 1), (-($ny & 1)) );

        // undo the delta encoding
        $xsum += $nx;
        $ysum += $ny;

        // position the decimal point
        $coordinates = array($ysum * 0.00001, $xsum * 0.00001);
        array_push($list, $coordinates);
    }

    $parsedValue = $list;
    var_error_log($parsedValue);
    return $parsedValue;
}

Known input

Microsoft provides a sample input and output for testing your algorithm: https://msdn.microsoft.com/en-us/library/jj158958.aspx#TestingYourAlg

compressed shape = 'vx1vilihnM6hR7mEl2Q'

Expected output

an array of coordinates
35.894309002906084, -110.72522000409663
35.893930979073048, -110.72577999904752
35.893744984641671, -110.72606003843248
35.893366960808635, -110.72661500424147

My output

array(4) {
[0]=>
array(2) {
[0]=>
float(1.0E-5)
[1]=>
float(1.0E-5)
}
[1]=>
array(2) {
[0]=>
float(1.027027027027E-5)
[1]=>
float(1.0181818181818E-5)
}
[2]=>
array(2) {
[0]=>
float(1.0825825825826E-5)
[1]=>
float(1.0552188552189E-5)
}
[3]=>
array(2) {
[0]=>
float(1.1103603603604E-5)
[1]=>
float(1.0734006734007E-5)
}
}

So the PHP output is clearly not being calculated correctly. I have a feeling it is related to differences between casting to long integers in Java and performing bitwise operations on integers in PHP. PHP is supposed to handle integers whether they are long, float, or int, but I suspect I'm missing something.

I'd bet the problem is related to this line. Can anyone point out the discrepancy?

n |= ((long)b & 31) << k;   // mask off the top bit and append the rest to the accumulator

bing-api bing-maps decompression java php

Other Solutions

I converted the C# code to PHP. The problem was loss of precision with large numbers in PHP. Some values exceed the 32-bit integer range and are stored as 64-bit integers in C#, so I had to convert those values to use PHP's GMP extension, which supports bitwise operations on arbitrarily large integers.

/*
 *   Microsoft's decompression algorithm - PHP version.
 *   Returns an array of coordinates (pairs of doubles), or false on malformed input.
 *   Requires the GMP extension.
 */
function tryParseEncodedValue($value) {

    $safeCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-";
    $list = array();
    $index = 0;
    $xsum = 0;
    $ysum = 0;

    while ($index < strlen($value))        // While we have more data,
    {
        $n = 0;                            // initialize the accumulator
        $k = 0;                            // initialize the count of bits

        while (true)
        {
            if ($index >= strlen($value))  // If we ran out of data mid-number
            {
                error_log('failed: index >= strlen($value)');
                return false;              // indicate failure.
            }
            $b = strpos($safeCharacters, $value[$index++]);

            if ($b === false) {            // If the character wasn't on the valid list,
                error_log('failed: character not in valid list');
                return false;              // indicate failure.
            }

            // mask off the top bit and append the rest to the accumulator
            // C#: n |= ((long)b & 31) << k;
            $bgmp = gmp_init($b);                           // This one statement is broken out
            $bitwiseand = gmp_and($bgmp, 31);               // over several lines because it
            $shifted = gmp_shiftl($bitwiseand, $k);         // involves so many steps.
            $n = gmp_or($n, $shifted);
            $k += 5;
            if (gmp_cmp($bgmp, gmp_init(32)) < 0) break;    // gmp compare: b < 32
        }

        // The resulting number encodes an x, y pair in the following way:
        //
        //  ^ Y
        //  |
        //  14
        //  9 13
        //  5 8 12
        //  2 4 7 11
        //  0 1 3 6 10 ---> X

        // determine which diagonal it's on
        // C#: int diagonal = (int)((Math.Sqrt(8 * n + 5) - 1) / 2);
        $diagonal = gmp_intval(gmp_div_q(gmp_sub(gmp_sqrt(gmp_add(gmp_mul($n, 8), 5)), 1), 2));

        // subtract the total number of points from lower diagonals
        // C#: n -= diagonal * (diagonal + 1L) / 2;
        $n = gmp_sub($n, gmp_div_q(gmp_mul($diagonal, gmp_add($diagonal, 1)), 2));

        // get the X and Y from what's left over
        $ny = gmp_intval($n);
        $nx = $diagonal - $ny;

        // undo the sign encoding (^ is bitwise XOR)
        $nx = ($nx >> 1) ^ (-($nx & 1));
        $ny = ($ny >> 1) ^ (-($ny & 1));

        // undo the delta encoding
        $xsum += $nx;
        $ysum += $ny;

        // position the decimal point
        $coordinate = array($ysum * 0.00001, $xsum * 0.00001);
        array_push($list, $coordinate);
    }

    return $list;
}

// Shift left: $x is the number to shift, $n is the number of bit positions.
function gmp_shiftl($x, $n) {
    return gmp_mul($x, gmp_pow(2, $n));
}
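On a PHP build with 64-bit integers, the same steps can probably be expressed with native integers instead of GMP. The sketch below is written under that assumption (PHP_INT_SIZE of 8, PHP 7 or later) and is not the answer author's code; decodeCompressedShape is a hypothetical name, and the bit-count guard and the sqrt() correction loops are additions for safety.

```php
<?php
// Sketch only: assumes a 64-bit PHP build and PHP 7+.
// decodeCompressedShape is a hypothetical name, not part of any Bing Maps SDK.
function decodeCompressedShape(string $value)
{
    $safeCharacters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-';
    $points = [];
    $index = 0;
    $xsum = 0;
    $ysum = 0;
    $length = strlen($value);

    while ($index < $length) {
        $n = 0; // accumulator
        $k = 0; // bit offset of the next 5-bit group

        while (true) {
            if ($index >= $length || $k > 55) {
                return false; // ran out of data, or too many bits for a signed 64-bit int
            }
            $b = strpos($safeCharacters, $value[$index++]);
            if ($b === false) {
                return false; // 'A' maps to position 0, so compare with === false
            }
            $n |= ($b & 31) << $k;
            $k += 5;
            if ($b < 32) {
                break; // continuation bit not set
            }
        }

        // Estimate the diagonal, then correct any floating-point rounding from sqrt().
        $diagonal = (int) ((sqrt(8 * $n + 5) - 1) / 2);
        while (intdiv($diagonal * ($diagonal + 1), 2) > $n) {
            $diagonal--;
        }
        while (intdiv(($diagonal + 1) * ($diagonal + 2), 2) <= $n) {
            $diagonal++;
        }

        $ny = $n - intdiv($diagonal * ($diagonal + 1), 2);
        $nx = $diagonal - $ny;

        // Undo the sign encoding; ^ is bitwise XOR in PHP.
        $nx = ($nx >> 1) ^ -($nx & 1);
        $ny = ($ny >> 1) ^ -($ny & 1);

        // Undo the delta encoding.
        $xsum += $nx;
        $ysum += $ny;

        $points[] = [$ysum * 0.00001, $xsum * 0.00001]; // [latitude, longitude]
    }

    return $points;
}

$points = decodeCompressedShape('vx1vilihnM6hR7mEl2Q');
printf("%.5f, %.5f\n", $points[0][0], $points[0][1]); // 35.89431, -110.72522
```

The decoded values are multiples of 0.00001, so they agree with Microsoft's sample output only to five decimal places. On a 32-bit build the accumulator would overflow, and the GMP version above remains the safer choice.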

Accepted Answer

I suspect your problem is in how you converted the following C# code:

nx = (nx >> 1) ^ -(nx & 1);
ny = (ny >> 1) ^ -(ny & 1);

In your PHP code, you converted this to:

$nx = pow(($nx >> 1), (-($nx & 1)) );
$ny = pow(($ny >> 1), (-($ny & 1)) );

In C#, ^ is the bitwise XOR operator, not exponentiation. PHP uses the same symbol for bitwise XOR, so try changing your code to:

$nx = ($nx >> 1) ^ (-($nx & 1));
$ny = ($ny >> 1) ^ (-($ny & 1));
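To make the difference concrete, here is a small sketch that runs both forms on a few sign-encoded values. The helper name undoSignEncoding is hypothetical. The value 75 is what the second number of the sample input appears to leave for ny, which would be consistent with the 1.027027...E-5 in the question's output, since 1 + 1/37 ≈ 1.027027.

```php
<?php
// Hypothetical helper: undo the sign (zigzag) encoding with bitwise XOR.
function undoSignEncoding(int $v): int
{
    // Even inputs map to v / 2; odd inputs map to -(v + 1) / 2.
    return ($v >> 1) ^ -($v & 1);
}

foreach ([0, 2, 3, 4, 75] as $encoded) {
    $viaXor = undoSignEncoding($encoded);
    // pow() raises to a power instead: exponent 0 yields 1, exponent -1 yields a reciprocal.
    $viaPow = pow($encoded >> 1, -($encoded & 1));
    printf("%d: xor=%d pow=%s\n", $encoded, $viaXor, var_export($viaPow, true));
}
```

With XOR, 75 decodes to -38; with pow() it becomes 1/37, so every delta ends up as 1 or a small fraction and the running sums never leave the neighborhood of 1.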