fabpot/goutte's direct dependencies. Data on all dependencies, including transitive ones, is available via CSV download.

Name | Version | Size | License | Type | Vulnerabilities |
---|---|---|---|---|---|
symfony/browser-kit | v6.4.7 | - | MIT | prod | |
symfony/css-selector | v6.4.7 | - | MIT | prod dev | |
symfony/deprecation-contracts | v3.5.0 | - | MIT | prod dev | |
symfony/dom-crawler | v6.4.7 | - | MIT | prod | |
symfony/http-client | v6.4.7 | - | MIT | prod dev | |
symfony/mime | v6.4.7 | - | MIT | prod dev | |

Fabpot/Goutte is a screen-scraping and web-crawling library for PHP that provides a friendly API for crawling websites and extracting data from HTML/XML responses. Note that this library is deprecated: as of v4, Goutte is a simple proxy to the HttpBrowser class from the Symfony BrowserKit component.
The walkthrough below shows a typical use of Fabpot/Goutte.
First, install the library by adding fabpot/goutte as a required dependency in your composer.json file. The following command does this:
composer require fabpot/goutte
Once the library is installed, create a Goutte\Client instance:
use Goutte\Client;
$client = new Client();
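Since Goutte\Client is a proxy to HttpBrowser as of v4, its constructor should also accept a Symfony HttpClient instance, which is where transport options go. A minimal sketch, assuming Composer's autoloader is reachable at vendor/autoload.php and using a 60-second timeout chosen purely for illustration:

```php
<?php
// Assumption: the script runs from the project root, next to Composer's vendor/ directory.
require 'vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

// Pass a preconfigured HTTP client; 'timeout' is a standard HttpClient option (seconds).
// The value 60 is illustrative, not a recommendation.
$client = new Client(HttpClient::create(['timeout' => 60]));
```

No request is sent at this point; the options only take effect once request() is called on the client.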
Make a request with the request() method:
$crawler = $client->request('GET', 'https://www.symfony.com/blog/');
This returns a Symfony\Component\DomCrawler\Crawler object, which you can use to extract data from the page. For instance, the following prints the text of every link that is a direct child of an h2 element (the CSS selector 'h2 > a'):
$crawler->filter('h2 > a')->each(function ($node) {
    print $node->text()."\n";
});
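The same filtering can be tried without any network access by handing markup straight to a Crawler. The sketch below collects results into an array instead of printing them; the HTML fragment and the $posts variable are hypothetical stand-ins for a fetched page, and the autoloader path is an assumption about project layout.

```php
<?php
// Assumption: the script runs from the project root, next to Composer's vendor/ directory.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup standing in for a downloaded page.
$html = <<<'HTML'
<html><body>
  <h2><a href="/blog/one">First post</a></h2>
  <h2><a href="/blog/two">Second post</a></h2>
  <p><a href="/about">About</a></p>
</body></html>
HTML;

$crawler = new Crawler($html);

// each() returns an array holding whatever the closure returns for each matched node.
$posts = $crawler->filter('h2 > a')->each(
    fn (Crawler $node): array => [
        'title' => $node->text(),
        'href'  => $node->attr('href'),
    ]
);

print_r($posts);
```

Only the two links inside h2 elements are matched; the link inside the paragraph is skipped because 'h2 > a' requires the anchor to be a direct child of an h2.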
Goutte is a thin wrapper around the BrowserKit, DomCrawler, and HttpClient Symfony components, so its documentation is effectively the documentation of those components. More information can be found on the respective pages: BrowserKit, DomCrawler, HttpClient.
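Because Goutte only proxies to HttpBrowser, the same crawl can be written against the Symfony components directly. The following is a minimal sketch under a few assumptions: symfony/browser-kit, symfony/http-client, symfony/dom-crawler and symfony/css-selector are installed through Composer, the autoloader is at vendor/autoload.php, and linkTexts() is a hypothetical helper name introduced here for illustration.

```php
<?php
// Assumption: the script runs from the project root, next to Composer's vendor/ directory.
require 'vendor/autoload.php';

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: fetch $url with the given browser and return the text
 * of every node matching $selector.
 */
function linkTexts(HttpBrowser $browser, string $url, string $selector = 'h2 > a'): array
{
    // request() returns a Crawler for the fetched page.
    $crawler = $browser->request('GET', $url);

    // each() collects the closure's return values into an array.
    return $crawler->filter($selector)->each(
        fn (Crawler $node): string => $node->text()
    );
}
```

Calling linkTexts(new HttpBrowser(Symfony\Component\HttpClient\HttpClient::create()), 'https://www.symfony.com/blog/') would perform a real HTTP request and return the link texts as an array of strings. Taking the browser as a parameter keeps the helper easy to exercise with a mock HTTP client.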