okay, i'm new to parsing/evaluating XML using PHP and this is serving as one of my tests for myself...basically, it's a small site map, outlined in an XML file called sitemap.xml:
<?xml version="1.0" encoding="utf-8"?>
<toc>
<links>
 <link section="about products clients"><a href="/about/">About</a></link>
 <link section="about contact products clients">Products
  <subsection>
   <link section="about contact products clients"><a href="/prod1/">Product 1</a></link>
   <link section="about contact products clients"><a href="/prod2/">Product 2</a></link>
   <link section="about contact products clients"><a href="/prod3/">Product 3</a></link>
  </subsection>
 </link>
 <link section="products clients"><a href="/clients/">Client Login</a></link>
</links>
</toc>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
 <title>Site Map Test</title>
</head>
<body>
<!-- begin navigation -->
<ul>
 <li><a href="index.php?p=about">About Us</a></li>
 <li><a href="index.php?p=clients">Information for Clients</a></li>
 <li><a href="index.php?p=contact">Contact Information</a></li>
 <li><a href="index.php?p=products">Product Information</a></li>
 <li><a href="index.php">Home</a></li>
</ul>
<!-- end navigation --->
<!-- begin content -->
<?php
// initializes new XML DOM document
$xmldoc = new DOMDocument();

// if the XML fails to load, displays an error
if(!$xmldoc->load("sitemap.xml")) {
 die("Failed to load XML file.");
}

// sets different section display criteria
if (isset($_GET['p'])) $PAGE = $_GET['p'];
switch ($PAGE) {
 case "about":
  $title = "About Us";
  $section = "about";
  break;
 case "clients":
  $title = "Information for Clients";
  $section = "clients";
  break;
 case "contact":
  $title = "Contact Information";
  $section = "contact";
  break;
 case "products":
  $title = "Product Information";
  $section = "products";
  break;
 default:
  $title = "Full Site Content";
  $section = "";
  break;
}

// if the section isn't blank (all but full contents)
// searches XML document for all occurrences of section attribute
if ($section != "") {
 foreach($xmldoc->getElementsByTagName('link') as $element) {
  if(!$element->hasAttribute('section') || strpos($element->getAttribute('section'),$section) == false) {
   $element->parentNode->removeChild($element);
  }
 }
}

// converts the filtered XML into a string
$filtered = $xmldoc->saveXML();

// replaces XML tags with HTML tags for display purposes
$filtered = str_replace("links","ul class=\"toc\"",$filtered);
$filtered = str_replace("link","li",$filtered);
$filtered = str_replace("subsection","ul",$filtered);
$filtered = str_replace("/link","/li",$filtered);
$filtered = str_replace("/subsection","/ul",$filtered);
$filtered = preg_replace("/ section=\"[^\"]*\"/","",$filtered);

// displays the title of and the filtered results
echo "<h1>".$title."</h1>".$filtered;
?>
<!-- end content -->
</body>
</html>
1/28/2009 9:46:35 AM
first of all, instead of using strpos i would explode the section attribute by whitespace, then use in_array() in your logic gate.
i'm looking at the rest now.
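A minimal sketch of the explode() plus in_array() idea, for illustration only. The helper name linkMatchesSection is hypothetical (it is not in the original code), and it assumes the section names in the attribute are separated by single spaces, as they are in sitemap.xml.

```php
<?php
// Hypothetical helper (not from the thread): returns true when $section is
// one of the space-separated names in the section attribute.
function linkMatchesSection($sectionAttr, $section)
{
    // Assumes single spaces between names, as in the posted sitemap.xml
    $sections = explode(' ', trim($sectionAttr));

    // Strict comparison: both sides are strings, so no type juggling
    return in_array($section, $sections, true);
}

var_dump(linkMatchesSection('about contact products clients', 'about')); // bool(true)
var_dump(linkMatchesSection('products clients', 'about'));               // bool(false)
```

Matching whole names this way also avoids substring surprises, since a section such as "client" would not match "clients".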
1/28/2009 10:22:23 AM
^ that's a good point...i probably should explode it, instead
but no, it's still not working...where you posted it, if you click on "about us", you get:
About Us - Products - Product 2 - Client Login
About Us - About - Products - Product 1 - Product 2 - Product 3
1/28/2009 10:27:47 AM
yeah
i think i see it now. the DOM parser parses EVERY tag it sees, not just your XML. it's picking up your <a> tags inside the link tags and removing them (via the removeChild in the if/then because they don't pass the !$element->hasAttribute('section') test).
also, sanitize your superglobals before you use them, son. exec() can do fun things.
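One way to read the "sanitize your superglobals" advice is to accept only values from a fixed list. The sketch below assumes that approach. The helper name resolveSection and the $allowed list are hypothetical, though the four section names come from the switch statement in the original page.

```php
<?php
// Hypothetical helper: map a raw request value onto a known section name.
function resolveSection($raw, array $allowed)
{
    // Accept only strings that exactly match a whitelisted section
    if (is_string($raw) && in_array($raw, $allowed, true)) {
        return $raw;
    }
    // Empty string means "full site content", like the switch default
    return '';
}

// Section names taken from the switch statement in the original page
$allowed = array('about', 'clients', 'contact', 'products');

// Never read $_GET['p'] without checking that it is set
$page = resolveSection(isset($_GET['p']) ? $_GET['p'] : null, $allowed);
```

Anything outside the list, including arrays passed as p[]=..., falls back to the empty section, so unknown input never reaches the rest of the script.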
1/28/2009 10:38:48 AM
1/28/2009 10:53:40 AM
you have markup inside of markup and that's just a no-no. If you use a CDATA block you can tell the parser not to treat what's inside as parsed data.
Instead of:
<link section="about contact products clients"><a href="/prod1/">Product 1</a></link>
Consider:
<link section="about contact products clients"><![CDATA[<a href="/prod1/">Product 1</a>]]></link>
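A small sketch of how a CDATA section looks to DOMDocument, assuming a one-link sample document built inline for illustration. Note that the opening marker is written `<![CDATA[`.

```php
<?php
// Sample document (illustrative): the HTML anchor is wrapped in CDATA
$xml = '<links>'
     . '<link section="about contact products clients">'
     . '<![CDATA[<a href="/prod1/">Product 1</a>]]>'
     . '</link>'
     . '</links>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// The anchor is character data now, so no <a> element exists in the tree
$anchorCount = $doc->getElementsByTagName('a')->length;

// textContent returns the CDATA text verbatim
$html = $doc->getElementsByTagName('link')->item(0)->textContent;

echo $anchorCount, "\n"; // 0
echo $html, "\n";        // <a href="/prod1/">Product 1</a>
```

One thing to keep in mind: saveXML() writes the CDATA markers back out, so the str_replace() display step in the original page would need adjusting if the sitemap were changed this way.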
1/28/2009 12:04:04 PM
for
strpos($element->getAttribute('section'),$section) == false
// if the section isn't blank (all but full contents)
// searches XML document for all occurrences of section attribute
if ($section != "") {
 foreach($xmldoc->getElementsByTagName('link') as $element) {
  if(strpos($element->getAttribute('section'),$section)===false) {
   $element->parentNode->removeChild($element);
  }
 }
}
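A related point that may be worth checking: getElementsByTagName() returns a live list, so removing nodes inside the foreach can shift the remaining items and cause the node right after a removed one to be skipped. The sketch below keeps the === false test from the post above but copies the list into an array first. The inline three-link document is a cut-down sample, not the real sitemap.xml.

```php
<?php
// Cut-down sample document (illustrative, not the full sitemap.xml)
$xml = '<links>'
     . '<link section="about products clients">About</link>'
     . '<link section="about contact products clients">Products</link>'
     . '<link section="products clients">Client Login</link>'
     . '</links>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$section = 'contact';

// Copy the live node list into a plain array first, so removing a node
// does not shift the items still waiting to be visited.
$links = iterator_to_array($doc->getElementsByTagName('link'));

foreach ($links as $element) {
    if (strpos($element->getAttribute('section'), $section) === false) {
        $element->parentNode->removeChild($element);
    }
}

// Collect the labels of the links that survived the filter
$remaining = array();
foreach ($doc->getElementsByTagName('link') as $element) {
    $remaining[] = $element->textContent;
}
echo implode(', ', $remaining), "\n"; // Products
```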
1/28/2009 12:19:41 PM
^^ doing that gives me errors:
1/28/2009 12:25:22 PM
leftover from various edits
1/28/2009 12:36:17 PM
ah, in that case...many thanks to everyone's help...it's working splendidly, now...i don't think i'd ever have caught the necessity of the third =
1/28/2009 12:37:18 PM
oh, hah, yeah, that's why i hate strpos.
=== is strict equal (identical), so 0 !== false. it matches on both value and type, so bools can't equal ints.
== is just plain equal, so 0 == false. it doesn't give a crap about types.
strpos returns false if the string wasn't found anywhere, but 0 if the match starts at the first character, which is the case for your "about" stuff.
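The difference is easy to see in isolation. A short sketch using one of the section attribute values from sitemap.xml (the variable names are just for illustration):

```php
<?php
$attr = 'about products clients';

$pos     = strpos($attr, 'about');   // int(0): found at the first character
$missing = strpos($attr, 'contact'); // bool(false): not found at all

var_dump($pos == false);      // bool(true):  loose comparison treats 0 as false
var_dump($pos === false);     // bool(false): strict comparison, int is not bool
var_dump($missing === false); // bool(true)
```

So with == the "about" links look like misses and get removed, while === only removes links where the section really is absent.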
1/28/2009 12:58:54 PM
I know you are just learning this, but this is an incredibly bad way to be parsing XML.
I would very highly recommend learning to use the XML parser built into PHP5+: http://www.php.net/xml
It's a royal pain in the ass to learn and set up for small parsing activities like you are doing here, but if you plan on doing anything real with XML it will save you a ton of time and headaches in the long run.
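For a feel of what the event-based XML Parser extension looks like, here is a minimal sketch that collects the section attribute of every link element. It runs on a cut-down inline sample rather than the real sitemap.xml, and the $sections array and the closures are illustrative choices, not something from the thread.

```php
<?php
// Cut-down sample document (illustrative)
$xml = '<links>'
     . '<link section="about products clients">About</link>'
     . '<link section="products clients">Client Login</link>'
     . '</links>';

$sections = array(); // filled by the start-element handler

$parser = xml_parser_create();

// Keep element and attribute names as written instead of upper-casing them
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every opening tag, with its attributes as an array
    function ($parser, $name, $attrs) use (&$sections) {
        if ($name === 'link' && isset($attrs['section'])) {
            $sections[] = $attrs['section'];
        }
    },
    // Called for every closing tag; nothing to do in this sketch
    function ($parser, $name) {
    }
);

// Third argument true marks this as the final (only) chunk of data
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

print_r($sections);
```

The parser never builds a tree. It just fires callbacks as it reads, which is why it takes more setup for small jobs.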
1/28/2009 2:00:03 PM
^ i don't mind suggestions as to better ways to do things...i'm just curious as to the reasons behind the suggestion...what's bad about the way i'm parsing it? a lot of overhead? messy?
is simplexml just a subset of the xml parser, or are they separate?
i don't plan on doing much with xml (at least, i don't have much cause to, right now)...really, i was bored at work and thought that a flat xml file would serve the purpose of a basic sitemap pretty easily and so i figured i'd screw around
1/28/2009 3:46:36 PM
traversing the DOM tree gets ugly in a hurry when you have even a moderately complex document. it's not fun at all.
simplexml is just another extension, like libxml or the xml parser or any of the other XML extensions.
http://us3.php.net/manual/en/refs.xml.php
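For comparison, a minimal SimpleXML sketch that picks out the links belonging to one section. It runs on a cut-down inline sample of the sitemap, and the $labels array and its output format are made up for illustration.

```php
<?php
// Cut-down sample of the sitemap (illustrative)
$xml = '<toc><links>'
     . '<link section="about products clients"><a href="/about/">About</a></link>'
     . '<link section="products clients"><a href="/clients/">Client Login</a></link>'
     . '</links></toc>';

$toc = simplexml_load_string($xml);

$labels = array();
foreach ($toc->links->link as $link) {
    // Attributes are read with array syntax, child elements with property syntax
    $sections = explode(' ', (string) $link['section']);
    if (in_array('about', $sections, true)) {
        $labels[] = (string) $link->a . ' -> ' . (string) $link->a['href'];
    }
}

print_r($labels);
```

The casts to string matter: without them the values stay SimpleXMLElement objects rather than plain text.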
1/28/2009 4:17:33 PM
^hit the nail on the head. Handling simple structures is pretty easy to code-your-own, but it gets very unpleasant quickly when you start flexing your xml muscles.
And like so many things in PHP, it's worth learning how the parser works to understand the basics, and then go find an extension library to obfuscate the calls and make life easy on you. I learned this lesson the hard way back when php5 first hit, trying to write my own full parser implementation. The deeper I got into it, the more I kept having to refactor the code to get its functionality expanded.
I ended up using libxml + a few modifications and it made life a lot more fun
1/28/2009 5:52:31 PM
so, in the collective opinion of those who know more than me...simplexml or xml parser?
also, the suggested code for this:
1/29/2009 8:59:31 AM
he was mainly talking about how you have the 'a' tags within 'link' tags without explicitly stating that those are not part of your xml markup but are content.
and also how you have text data and xml markup within the same tag (the 'link' tag with Products at the end of it, immediately followed by the 'subsection' tag)
neither of those is correct xml. basically, only put one type of data within a tag. if it's more xml markup, fine. if it's html, stick it in a cdata block.
and personally i like the xml parser, it's more flexible
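One possible way to follow the "one type of data per tag" advice is sketched below. The layout is purely hypothetical: the title and href child elements are invented for this example and are not part of the original sitemap.xml. The renderLinks helper is likewise made up, and it uses SimpleXML only to keep the example short.

```php
<?php
// Hypothetical layout: each <link> holds only child elements. The label and
// URL live in <title> and <href>; nested links go in <subsection>.
$xml = '<links>'
     . '<link section="about contact products clients">'
     .   '<title>Products</title>'
     .   '<subsection>'
     .     '<link section="about contact products clients">'
     .       '<title>Product 1</title><href>/prod1/</href>'
     .     '</link>'
     .   '</subsection>'
     . '</link>'
     . '</links>';

// Hypothetical helper: turns a <links> or <subsection> element into a nested <ul>
function renderLinks(SimpleXMLElement $links)
{
    $html = '<ul>';
    foreach ($links->link as $link) {
        $label = htmlspecialchars((string) $link->title);
        if (isset($link->href)) {
            // Only links that carry an <href> become anchors
            $label = '<a href="' . htmlspecialchars((string) $link->href) . '">' . $label . '</a>';
        }
        $html .= '<li>' . $label;
        if (isset($link->subsection)) {
            $html .= renderLinks($link->subsection); // recurse into nested links
        }
        $html .= '</li>';
    }
    return $html . '</ul>';
}

$output = renderLinks(simplexml_load_string($xml));
echo $output, "\n";
```

Because the HTML is generated from the data instead of stored inside the XML, no CDATA block or tag-name string replacement is needed with this layout.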
1/29/2009 9:22:56 AM
1/29/2009 9:56:10 AM
When you guys (who do this professionally) are outlining an XML document, what do you take into consideration when deciding if information should be included as an attribute or as element content?
1/31/2009 1:11:19 PM