XML for php: tutorial draft — OZONE Asylum, home of the Mad Scientists





Hi there.

[i]Introduction[/i]

I needed a place to put together a public draft of a quick XML-for-PHP tutorial, so here I am. Starting right off with the link that got me started:
[url=http://www.phpbuilder.com/columns/justin20000428.php3?page=2]http://www.phpbuilder.com/columns/justin20000428.php3?page=2[/url]
"Using expat functions", expat being the library that gives PHP access to a comprehensive set of XML parsing functions. And guess what? In recent builds of Apache, expat is built into the server software itself:

[quote]
This PHP extension implements support for James Clark's expat in PHP. This toolkit lets you parse, but not validate, XML documents.

If you are using Apache-1.3.7 or later, you already have the required expat library. Simply configure PHP using --with-xml (without any additional path) and it will automatically use the expat library built into Apache.
[/quote]
[url=http://www.php.net/manual/en/ref.xml.php]http://www.php.net/manual/en/ref.xml.php[/url]

So, unless you have a really old server or an old build of PHP, you should be able to get started with expat quite easily.

[i]XML in full effect[/i]

Basically, XML is only a data description language: it stores "contents" and nothing else. Depending on the application, you can invent any document structure you like, with tags much like HTML's, and as long as every tag is properly closed and nested, it will "work" as an XML document. Example:

<category name="books">
  <title> Brave New World </title>
  <author> Aldous Huxley </author>
  <summary> </summary>
</category>

Of course, a lot of specifications, custom languages, etc. have been developed around this technology. But this is the core of XML: just a "framework" for creating interchangeable file formats easily, instead of writing custom converters.

Of course, XML = contents only. So what is the real benefit?
[i]Imagine web pages were data fountains, instead of interface widgets[/i]

Imagine you could easily plug an interface hosted at "http://www.ying.com" into a remote server far away, at "http://www.yang.com". If [url=http://www.yang.com]http://www.yang.com[/url] were just a normal site, that would mean parsing yang.com's pages, stripping the tags to get at the required info, etc.: heavy stuff. With XML, [url=http://www.yang.com]www.yang.com[/url] could provide something like [url=http://www.yang.com/feed.xml]http://www.yang.com/feed.xml[/url], and at your end you could just plug in an XML parser to get the info you need.

In the past months, XML has grown into a key tool for coherent web development: it allows data storage, formatting and manipulation, and it is the technique behind the "instant news feeds" that are starting to spread. It is the key to "the Amazon XML feed", a powerful search engine. It is the key to a variety of new file formats, like SMIL and MathML, which are all promising.

[i]So you wanna parse XML with PHP?[/i]

Well, you have the XML functions to do so, as described in [url=http://www.php.net/manual/en/ref.xml.php]http://www.php.net/manual/en/ref.xml.php[/url]:

[quote]
xml_set_element_handler()
Element events are issued whenever the XML parser encounters start or end tags. There are separate handlers for start tags and end tags.

xml_set_character_data_handler()
Character data is roughly all the non-markup contents of XML documents, including whitespace between tags. Note that the XML parser does not add or remove any whitespace, it is up to the application (you) to decide whether whitespace is significant.

xml_set_processing_instruction_handler()
PHP programmers should be familiar with processing instructions (PIs) already. <?php ?> is a processing instruction, where php is called the "PI target". The handling of these are application-specific, except that all PI targets starting with "XML" are reserved.

xml_set_default_handler()
What goes not to another handler goes to the default handler. You will get things like the XML and document type declarations in the default handler.

xml_set_unparsed_entity_decl_handler()
This handler will be called for declaration of an unparsed (NDATA) entity.

xml_set_notation_decl_handler()
This handler is called for declaration of a notation.

xml_set_external_entity_ref_handler
[/quote]

And from the same page, a good code sample showing how to map some XML to the equivalent HTML.

[code]
$file = "data.xml";
$map_array = array(
    "BOLD"     => "B",
    "EMPHASIS" => "I",
    "LITERAL"  => "TT"
);

// Called for each opening tag; prints the mapped HTML tag, if there is one.
function startElement($parser, $name, $attrs)
{
    global $map_array;
    // isset() avoids an "undefined index" notice for unmapped tags
    if (isset($map_array[$name])) {
        $htmltag = $map_array[$name];
        print "<$htmltag>";
    }
}

// Called for each closing tag.
function endElement($parser, $name)
{
    global $map_array;
    if (isset($map_array[$name])) {
        $htmltag = $map_array[$name];
        print "</$htmltag>";
    }
}

// Called for the text between tags; passed through unchanged.
function characterData($parser, $data)
{
    print $data;
}

$xml_parser = xml_parser_create();
// use case-folding so we are sure to find the tag in $map_array
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
xml_parser_free($xml_parser);
[/code]

Pay attention to this:

function startElement($parser, $name, $attrs)

This function is called each time the parser finds an opening tag. The tag name is passed in, and the attributes as well. In the same vein, endElement is called when a closing tag is found. And of course, characterData processes the data between the opening and closing tags.
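
To watch the three handlers fire in order, here is a minimal sketch that records every event instead of printing HTML. The sample document, the $events array and the use of anonymous functions (PHP 5.3 or later) are my own assumptions for illustration, not part of the manual's sample.

```php
<?php
// Hypothetical sample document, made up for this sketch.
$xml = '<note priority="high"><to>Ying</to><body>Hello</body></note>';

// Every handler call appends one line here.
$events = array();

$parser = xml_parser_create();
// Same option as in the manual's sample: names are reported in uppercase.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, true);

xml_set_element_handler(
    $parser,
    // Opening tag: record the tag name followed by its attributes.
    function ($parser, $name, $attrs) use (&$events) {
        $line = "start " . $name;
        foreach ($attrs as $key => $value) {
            $line .= " " . $key . "=" . $value;
        }
        $events[] = $line;
    },
    // Closing tag: record the tag name.
    function ($parser, $name) use (&$events) {
        $events[] = "end " . $name;
    }
);

// Text between tags: record it as-is.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = "data " . $data;
    }
);

// The whole document is in one string, so this is also the final chunk.
xml_parse($parser, $xml, true);

print implode("\n", $events) . "\n";
```

The printed list should show each "start" line paired with a later "end" line, with the "data" lines in between. Note that with case-folding on, attribute names come back in uppercase too, while attribute values are left alone.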
The handlers get called because we built our parser with the following instructions:

[code]
$xml_parser = xml_parser_create();
// use case-folding so we are sure to find the tag in $map_array
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
[/code]

The first line creates an empty XML parser and assigns it to the $xml_parser variable. The second line turns on case-folding, which means that the tag names the parser reports will be converted to uppercase. xml_set_element_handler() and xml_set_character_data_handler() define which functions are called for opening tags, closing tags and character data. Note that you could also use xml_set_default_handler() to define a function that handles everything that is neither a tag nor character data, typically the doctype or comments.

That's it for the engine initialisation. This part:

[code]
if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
xml_parser_free($xml_parser);
[/code]

attempts to open a file and then reads from it in chunks of 4096 bytes (fixed-size chunks, not lines). Each chunk is handed to the parser, which processes it according to the handlers defined above; the third argument, feof($fp), tells the parser whether this chunk is the last one. If parsing fails, xml_error_string() and xml_get_current_line_number() are there to let us locate the error and find out its nature.

Pardon this simple explanation; the XML parser setup and principles had to be summarized.

Now you may wonder: "How do I choose which elements to translate, and how to translate them? Can I use custom regexps for each tag? Can I perform different actions for each tag?" And the answer is yes, of course. The details? They await you in the next chapter.
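
Before the next chapter, here is a quick way to see the error reporting described above in action. This is only a sketch: the broken document and the $message variable are made up for illustration, and the exact wording of the error depends on the XML library your PHP build uses.

```php
<?php
// Deliberately malformed sample: <b> is closed by </i> on the second line.
$broken = "<doc>\n<b>oops</i>\n</doc>";

$parser = xml_parser_create();

$message = "";
// xml_parse() returns 0 when the document is not well-formed.
if (!xml_parse($parser, $broken, true)) {
    $message = sprintf(
        "XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
}

print $message . "\n";
```

Unlike the manual's sample, this version stores the message instead of calling die(), so the script can carry on and decide for itself what to do with a bad document.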
[This message has been edited by InI (edited 10-14-2002).]