 For this, we use `query`, or its simplified version, `Q`. As its parameter we can pass a string with the CSS query we want, for example: `$doc->Q("div.box > span#tooltip")`.
@@ -118,95 +85,306 @@ Note that `query` by itself does not return anything, it needs a complement, for
 When we are parsing documents, we may need to select texts within `p` tags, or manipulate or confirm the attributes of a specific tag, or even delete all HTML comments in a document, such as `<!-- this is an example comment -->`.

-That's why we have the triad: `::text`, `::attributes` and `::comment` - but how can we use them?
+That's why we have the triad: `::text`, `::attributes` and `::comment`.
+
+<!-- TABLE OF CONTENTS -->
+<details open="open">
+<summary>Table of Contents</summary>
+<ol>
+<li>
+<a href="#introduction">About The Project</a>
+</li>
+<li>
+<a href="#getting-started">Getting Started</a>
+</li>
+<li>
+<a href="#usage">Usage</a>
+</li>
+<li>
+<a href="#roadmap">Roadmap</a>
+</li>
+<li>
+<a href="#contributing">Contributing</a>
+</li>
+<li>
+<a href="#license">License</a>
+</li>
+<li>
+<a href="#contact">Contact</a>
+</li>
+<li>
+<a href="#acknowledgements">Acknowledgements</a>
+</li>
+</ol>
+</details>
+
+## `wrap` and `unwrap`

-### `::text` selector
+Wrap or unwrap node elements with other node elements.

-We can select all the text nodes of a given document like this:
When we want to make a list of attributes or perhaps delete all attributes of a specific tag, we can use the selector `::attributes` to access them.
+## `html` and `text`

-**Deleting attributes**
+`html` and `text` differ only slightly: both print or return the inner content of a tag, but `text` returns only the inner text, while `html` returns both the markup and the inner text.
+
+### `$html`
+
+```html
+<nav>
+<ul>
+<li><a href="#">Home</a></li>
+<li><a href="#">About</a></li>
+<li><a href="#">Clients</a></li>
+<li><a href="#">Contact Us</a></li>
+</ul>
+</nav>
+```
+### PHP

 ```php
 include "path/webscraper.php";
 $doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");

-$doc->Q("h1::attributes")->delete();
+echo "Result from html(): \n";
+echo $doc->Q("nav")->html();

-// or
+echo "\n\n";
+
+echo "Result from text(): \n";
+echo $doc->Q("nav")->text();
+```
+### Output
+
+```html
+Result from html():
+
+<ul>
+<li><a href="#">Home</a></li>
+<li><a href="#">About</a></li>
+<li><a href="#">Clients</a></li>
+<li><a href="#">Contact Us</a></li>
+</ul>
+
+
+Result from text():

-$doc->Q("::attributes")->delete();
-// deletes all attributes of all tags
+
+Home
+About
+Clients
+Contact Us
+

 ```
+`echo $doc->Q("body")->text()` is a convenient way to get a plain-text version of the whole document.

-And here's how you can print the attributes of an element:
+`text` and `html` can also replace the content of the selected element when given an argument:

 ```php
 include "path/webscraper.php";
 $doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");

-$doc->Q("h1::attributes")->echo();
+$doc->Q("nav")->html('
+<ul>
+<li><a href="home.html">Home</a></li>
+<li><a href="about.html">About</a></li>
+<li><a href="clients.html">Clients</a></li>
+<li><a href="gallery.html"></a></li>
+<li><a href="plans.html"></a></li>
+<li><a href="contact.html">Contact Us</a></li>
+</ul>
+');
+$doc->Q("nav ul li a[href='gallery.html']")->text("Gallery");
+$doc->Q("nav ul li a[href='plans.html']")->text("Plans of Service");

+$doc->echo();
+```
+### Output
+```html
+<nav>
+<ul>
+<li><a href="home.html">Home</a></li>
+<li><a href="about.html">About</a></li>
+<li><a href="clients.html">Clients</a></li>
+<li><a href="gallery.html">Gallery</a></li>
+<li><a href="plans.html">Plans of Service</a></li>
+<li><a href="contact.html">Contact Us</a></li>
+</ul>
+</nav>
 ```
-#### Output
+
+## `appendHtml` and `prependHtml`
+
+`appendHtml` inserts HTML at the **end** of a DOM element, while `prependHtml` inserts it at the **start**.
+
+### `$html`

 ```html
+<div id="append"></div>
+<br/>
+<div id="prepend"></div>
+```

-h1[attribute1] => "value1"
-h1[attribute2] => "value2"
+### PHP

+```php
+include "path/webscraper.php";
+$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
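The README text in this diff describes two behaviours: `html` returns markup plus text while `text` returns text only, and `appendHtml`/`prependHtml` insert markup at the end or start of an element. As a rough illustration of those behaviours, the sketch below reproduces them with PHP's bundled DOM extension (`DOMDocument`) rather than with WebScraper itself. The helper names `innerHtml` and `insertHtml` are hypothetical and are not part of the library; how WebScraper implements these methods internally is not shown in the diff.

```php
<?php
// Sketch only: built on PHP's DOM extension, NOT on the WebScraper class.
// innerHtml() and insertHtml() are hypothetical helper names.

/** Return the serialized children of a node (markup and text). */
function innerHtml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

/** Insert a well-formed markup string at the end or the start of $target. */
function insertHtml(DOMElement $target, string $markup, bool $prepend = false): void
{
    $fragment = $target->ownerDocument->createDocumentFragment();
    $fragment->appendXML($markup); // appendXML expects well-formed XML
    // insertBefore() with a null reference node appends at the end.
    $target->insertBefore($fragment, $prepend ? $target->firstChild : null);
}

libxml_use_internal_errors(true); // silence warnings about HTML5 tags such as <nav>

$doc = new DOMDocument();
$doc->loadHTML(
    '<!DOCTYPE html><html><body>'
    . '<nav><ul><li><a href="#">Home</a></li></ul></nav>'
    . '<div id="box"><p>middle</p></div>'
    . '</body></html>'
);

$xpath = new DOMXPath($doc);
$nav = $xpath->query('//nav')->item(0);
$box = $xpath->query('//div[@id="box"]')->item(0);

echo innerHtml($nav), "\n";    // markup and text of the children
echo $nav->textContent, "\n";  // text only

insertHtml($box, '<span>last</span>');         // append
insertHtml($box, '<span>first</span>', true);  // prepend
echo $doc->saveHTML($box), "\n";
```

The fragment-based insert keeps the existing children untouched, which is the main difference from replacing the element's whole content as the `html('...')` setter example in the diff does.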