
Parsing an HTML file in PHP

A simple DOM program to extract Google result links

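<?php
# $html is assumed to already hold the fetched page markup as a string.
$dom = new DOMDocument();

# The @ suppresses the warnings loadHTML() raises on malformed HTML
@$dom->loadHTML($html);

# Iterate over all the <a> tags
foreach ($dom->getElementsByTagName('a') as $link) {
    # Show the href attribute of the <a> tag
    echo $link->getAttribute('href');
    echo "<br />";
}
?>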
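
For experimenting without a network request, the same extraction can be sketched as a self-contained script. The sample markup and the function name extract_links are assumptions made for illustration, and libxml_use_internal_errors() is used here as one possible alternative to the @ operator for keeping parse warnings out of the output.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = <<<HTML
<html><body>
<a href="http://www.example.com/one">One</a>
<p>No link here</p>
<a href="/two">Two</a>
</body></html>
HTML;

/**
 * Return the href of every <a> tag in an HTML string.
 * (extract_links is an illustrative name, not part of the DOM extension.)
 */
function extract_links(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $link) {
        // Skip anchors that have no href attribute
        if ($link->hasAttribute('href')) {
            $links[] = $link->getAttribute('href');
        }
    }
    return $links;
}

foreach (extract_links($html) as $href) {
    echo $href, "\n";
}
```

Run as-is, this prints the two href values from the sample markup, one per line. Returning an array rather than echoing inside the loop keeps the parsing separate from the output format.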

The simple_html_dom module is an alternative to the built-in DOM module. Since it is a third-party module, you'll have to install it yourself.

Modifying links with simple_html_dom

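Say you have some links in your HTML file that look like this (the path is only an illustration):

<a href="/some/page.html" class="someclass">Some page</a>

and you want to convert them to:

<a href="http://www.example.com/some/page.html" class="someclass">Some page</a>

but only the ones with a class of "someclass". Here's a program to do that: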

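# Assumes simple_html_dom.php is on the include path and that
# $input holds the HTML source as a string.
require_once 'simple_html_dom.php';

$html = new simple_html_dom();
$html->load($input);

# Prepend the domain to the href of every <a> with class "someclass"
foreach ($html->find('a[class=someclass]') as $link) {
    $link->href = 'http://www.example.com' . $link->href;
}

# Serialize the modified DOM back to an HTML string
$result = $html->save();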

find() lets you easily query the DOM. Its parameter takes the form tagtype[attributeName=attributeValue], where the bracketed part is an optional filter. You then iterate over every link that find() returns and prepend your domain to its href attribute. href is a property rather than a function, and it works as both a getter and a setter: reading it returns the attribute's value, and assigning to it updates the attribute.
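
For comparison, here is a rough sketch of the same rewrite using only the built-in DOM extension, with an XPath query standing in for find(). The sample input and the function name prefix_someclass_links are assumptions made for illustration; this is not how simple_html_dom works internally.

```php
<?php
// Hypothetical input; the paths and link text are made up for illustration.
$input = '<p><a class="someclass" href="/a.html">A</a> '
       . '<a href="/b.html">B</a></p>';

/**
 * Prepend $domain to the href of every <a> whose class attribute
 * is exactly "someclass", and return the resulting HTML.
 */
function prefix_someclass_links(string $input, string $domain): string
{
    $dom = new DOMDocument();

    // Keep libxml parse warnings out of the output
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($input);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath plays the role of find(): a tag name plus an attribute filter
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a[@class="someclass"]') as $link) {
        $link->setAttribute('href', $domain . $link->getAttribute('href'));
    }

    // loadHTML() wraps a fragment in a full document, so the result is
    // a complete page rather than the bare fragment
    return $dom->saveHTML();
}

$result = prefix_someclass_links($input, 'http://www.example.com');
echo $result;
```

Only the first link is rewritten; the second has no class attribute, so the XPath filter skips it. Note that the XPath test matches the class attribute exactly, so an element with class="someclass other" would not be selected.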