I am trying to extract a complete div that I have already managed to isolate from the rest of the source code. From this div I want all of the HTML content as is, but without certain child div elements inside it.
The HTML in question:
<div class="content">
<div class="article-title">
<h2>Title of the test</h2>
<a href="http://www.helloworld.com" title="post by world" rel="author" class="article-icon"><span class="text-icon">👤</span>world</a>
<span class="article-icon">
<span class="text-icon">📁</span>
<a href="http://www.helloworld.com/world">world</a>,
</span>
<span class="article-icon"><span class="text-icon">🕔</span>20.August 2014
</span>
</div>
<p class="p1">
<span class="s1"><b>a test</b></span>
</p>
<p class="p2">
<span class="s1">text2</span>
</p>
<p class="p1">
<span class="s1"><b><a href="http://www.helloworld.com/hello.jpg">
<img class="alignright size-medium wp-image-19472" src="http://www.helloworld.com/hello.jpg" alt="hello" width="300" height="218"></a>Hello</b>
</span>
</p>
<p class="p1">
<span class="s1"><b>text text text</b></span>
</p>
<p class="p1">
<span class="s1"><b><a href="http://www.helloworld.com/hello2.jpg">
<img class="alignleft size-medium wp-image-19474" src="http://www.helloworld.com/hello2.jpg" alt="hello2" width="300" height="200"></a>Hello2</b>
</span>
</p>
<p class="p1">
<span class="s1">text1</span>
</p>
<p class="p1">
<span class="s1">text2</span>
</p>
<p class="p1">
<span class="s1"><b>Final thoughts</b></span>
</p>
<p class="p1">
<span class="s1">testing (<a href="http://www.helloworld.com/test">
<span class="s2">test</span></a>,
<a href="http://www.helloworld.com/test2">
<span class="s2">test2</span></a>
</span>
</p>
<p class="p1">
<span class="s1">***</span>
</p>
<p class="p5"><em>
<span class="s1">xyz <a href="http://www.helloworld.com/xyz">
<span class="s2">123</span></a> (at <a href="http://www.helloworld.com">
<span class="s2">http://www.helloworld.com</span></a>.  
</span></em>
</p>
<div class="panel-breaking-line"></div>
<div class="article-tags"> <b>Tags added to this article</b>
<div class="tagcloud"> <a href="http://www.helloworld.com/world">world</a><a href="http://www.helloworld.com/xyz">zyx</a> </div>
</div>
<div class="panel-breaking-line"></div>
<div class="article-socials"> <b>Share this article with friends</b>
<div class="social-likes">
<div class="soc-button soc-button-facebook"> <a href="http://www.facebook.com/sharer/sharer.php?u=http://www.helloworld.com/world" data-url="http://www.helloworld.com/world" class="soc-click ot-share">
<span class="text-icon"></span>FACEBOOK</a>
<span class="likes-count">
<span class="count">0</span>
<span class="bullet"> </span>
</span>
</div>
<div class="soc-button soc-button-twitter"> <a href="#" class="soc-click ot-tweet" data-hashtags="" data-url="http://www.helloworld.com/world" data-via="" data-text="World">
<span class="text-icon"></span>TWITTER</a>
<span class="likes-count">
<span class="count">0</span>
<span class="bullet"> </span>
</span>
</div>
<div class="soc-button soc-button-pinterest"> <a href="http://pinterest.com/pin/create/button/?url=http://www.helloworld.com/world" data-url="http://www.helloworld.com/world" class="ot-pin soc-click">
<span class="text-icon"></span>PINTEREST</a>
<span class="likes-count">
<span class="count">0</span>
<span class="bullet"> </span>
</span>
</div>
<div class="soc-button soc-button-google"> <a href="https://plus.google.com/share?url=http://www.helloworld.com/world" class="ot-pluss soc-click">
<span class="text-icon"></span>GOOGLE+</a>
<span class="likes-count">
<span class="count">0</span>
<span class="bullet"> </span>
</span>
</div>
</div>
</div>
</div>
So basically, I want all of the HTML content of the div, but without the elements that have class="article-title", class="article-socials" and class="article-tags",
so that it would be trimmed down to:
<div class="content">
<p class="p1">
<span class="s1"><b>a test</b></span>
</p>
<p class="p2">
<span class="s1">text2</span>
</p>
<p class="p1">
<span class="s1"><b><a href="http://www.helloworld.com/hello.jpg">
<img class="alignright size-medium wp-image-19472" src="http://www.helloworld.com/hello.jpg" alt="hello" width="300" height="218"></a>Hello</b>
</span>
</p>
<p class="p1">
<span class="s1"><b>text text text</b></span>
</p>
<p class="p1">
<span class="s1"><b><a href="http://www.helloworld.com/hello2.jpg">
<img class="alignleft size-medium wp-image-19474" src="http://www.helloworld.com/hello2.jpg" alt="hello2" width="300" height="200"></a>Hello2</b>
</span>
</p>
<p class="p1">
<span class="s1">text1</span>
</p>
<p class="p1">
<span class="s1">text2</span>
</p>
<p class="p1">
<span class="s1"><b>Final thoughts</b></span>
</p>
<p class="p1">
<span class="s1">testing (<a href="http://www.helloworld.com/test">
<span class="s2">test</span></a>,
<a href="http://www.helloworld.com/test2">
<span class="s2">test2</span></a>
</span>
</p>
<p class="p1">
<span class="s1">***</span>
</p>
<p class="p5"><em>
<span class="s1">xyz <a href="http://www.helloworld.com/xyz">
<span class="s2">123</span></a> (at <a href="http://www.helloworld.com">
<span class="s2">http://www.helloworld.com</span></a>.  
</span></em>
</p>
<div class="panel-breaking-line"></div>
<div class="panel-breaking-line"></div>
</div>
With or without the wrapping content div itself ...
I have tried many expressions, and this is how far I got:
// This works, but it returns the entire content of the div
$xpath = new DOMXPath($doc);
$elements = $xpath->query(".");
$results = '';
foreach ($elements as $element) {
    $results .= $element->ownerDocument->saveHTML($element);
}
Then I tried this expression in place of the single dot:
div[@class='content']/*[not(contains(concat(' ', @class, ' '), 'article-title')) and not(contains(concat(' ', @class, ' '), 'article-social')) and not(contains(concat(' ', @class, ' '), 'article-tags'))]
This returns nothing. Any idea how I can get this to work?
You can simply list them explicitly with not(contains()):
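$dom = new DOMDocument();
$dom->formatOutput = true;
$dom->loadHTML($markup); // $markup holds the HTML from the question
$xpath = new DOMXPath($dom);
// The leading // lets the div be found anywhere in the document.
// Only direct children of div.content without the three classes are selected.
$elements = $xpath->query('
    //div[@class="content"]/*[
        not(contains(@class, "article-title")) and
        not(contains(@class, "article-socials")) and
        not(contains(@class, "article-tags"))
    ]
');
// Serialize each remaining child and concatenate the results.
$html = '';
foreach ($elements as $child) {
    $html .= $dom->saveXML($child);
}
echo htmlentities($html);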
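If you also want to keep the wrapping div, as in the desired output of the question, an alternative is to delete the unwanted children from the DOM and then serialize the div itself. The following is only a minimal sketch under some assumptions: the `$markup` string is a shortened, hypothetical stand-in for the real page, and the class test pads both the attribute value and the name with spaces so that only whole class names match. The original expression in the question most likely returned nothing because it starts with a relative `div[...]` step, which is evaluated against the document node, whereas `loadHTML()` places the div further down under html and body.

```php
<?php
// Hypothetical, shortened stand-in for the markup in the question.
$markup = '<div class="content">'
    . '<div class="article-title"><h2>Title of the test</h2></div>'
    . '<p class="p1"><span class="s1"><b>a test</b></span></p>'
    . '<div class="panel-breaking-line"></div>'
    . '<div class="article-tags"><b>Tags</b><div class="tagcloud"></div></div>'
    . '<div class="panel-breaking-line"></div>'
    . '<div class="article-socials"><b>Share</b></div>'
    . '</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select the direct children to drop, matching whole class names only.
$unwanted = $xpath->query('
    //div[@class="content"]/*[
        contains(concat(" ", normalize-space(@class), " "), " article-title ") or
        contains(concat(" ", normalize-space(@class), " "), " article-socials ") or
        contains(concat(" ", normalize-space(@class), " "), " article-tags ")
    ]
');

// Copy the node list to an array first, then detach each node from its parent.
foreach (iterator_to_array($unwanted) as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize the wrapper div together with whatever is left inside it.
$content = $xpath->query('//div[@class="content"]')->item(0);
$html = $dom->saveHTML($content);

echo $html;
```

Here `saveHTML()` is used instead of `saveXML()` so that the result is written as HTML. If the wrapper is not wanted after all, you could loop over `$content->childNodes` and serialize each child instead.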