Commit ec9f06d

Update README.md
1 parent c7105c8 commit ec9f06d

1 file changed

Lines changed: 244 additions & 66 deletions

@@ -27,39 +27,6 @@ $doc->echo();
2727

2828
Simple as that!
2929

30-
## Start
31-
32-
<!-- TABLE OF CONTENTS -->
33-
<details open="open">
34-
<summary>Table of Contents</summary>
35-
<ol>
36-
<li>
37-
<a href="#introduction">About The Project</a>
38-
</li>
39-
<li>
40-
<a href="#getting-started">Getting Started</a>
41-
</li>
42-
<li>
43-
<a href="#usage">Usage</a>
44-
</li>
45-
<li>
46-
<a href="#roadmap">Roadmap</a>
47-
</li>
48-
<li>
49-
<a href="#contributing">Contributing</a>
50-
</li>
51-
<li>
52-
<a href="#license">License</a>
53-
</li>
54-
<li>
55-
<a href="#contact">Contact</a>
56-
</li>
57-
<li>
58-
<a href="#acknowledgements">Acknowledgements</a>
59-
</li>
60-
</ol>
61-
</details>
62-
6330
## How do I select nodes within my HTML document?
6431
To select nodes, use `query` or its shorthand `Q`, passing a CSS selector string as the parameter, for example: `$doc->Q("div.box > span#tooltip")`.
6532

@@ -118,95 +85,306 @@ Note that `query` by itself does not return anything, it needs a complement, for
11885

11986
When parsing documents, we may need to select the text inside `p` tags, inspect or change the attributes of a specific tag, or even delete every HTML comment in a document, such as `<!-- this is an example comment -->`.
12087

121-
That's why we have the triad: `::text`, `::attributes` and `::comment` - but how can we use them?
88+
That's why we have the triad: `::text`, `::attributes` and `::comment`.
89+
90+
<!-- TABLE OF CONTENTS -->
91+
<details open="open">
92+
<summary>Table of Contents</summary>
93+
<ol>
94+
<li>
95+
<a href="#introduction">About The Project</a>
96+
</li>
97+
<li>
98+
<a href="#getting-started">Getting Started</a>
99+
</li>
100+
<li>
101+
<a href="#usage">Usage</a>
102+
</li>
103+
<li>
104+
<a href="#roadmap">Roadmap</a>
105+
</li>
106+
<li>
107+
<a href="#contributing">Contributing</a>
108+
</li>
109+
<li>
110+
<a href="#license">License</a>
111+
</li>
112+
<li>
113+
<a href="#contact">Contact</a>
114+
</li>
115+
<li>
116+
<a href="#acknowledgements">Acknowledgements</a>
117+
</li>
118+
</ol>
119+
</details>
120+
121+
## `wrap` and `unwrap`
122122

123-
### `::text` selector
123+
Wrap node elements in other node elements, or unwrap them again.
124124

125-
We can select all the text nodes of a given document like this:
125+
### `$html`
126+
127+
```html
128+
<img src="image.jpg" alt="JumpyDoggy" width="104" height="142">
129+
```
130+
### Php
126131

127132
```php
133+
include "path/webscraper.php";
134+
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
128135

129-
$doc->Q("::text");
136+
$doc->Q("img[src='image.jpg']")->wrap("<figure></figure>");
137+
// also possible: $doc->Q("img[src='image.jpg']")->wrap("figure");
130138

139+
$doc->echo();
131140
```
141+
### Output
132142

133-
Or we can select the text inside a specific tag - like `h1`:
143+
```html
144+
<figure>
145+
<img src="image.jpg" alt="JumpyDoggy" width="104" height="142">
146+
</figure>
147+
```
148+
You can also give the wrapper element attributes such as `style`, `class`, or `id`:
149+
150+
### Php
134151

135152
```php
153+
include "path/webscraper.php";
154+
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
136155

137-
$doc->Q("h1::text");
156+
$doc->Q("img[src='image.jpg']")->wrap("<figure class='img-wrapper' style='width: 100px; height: 100px;'></figure>");
138157

158+
$doc->echo();
139159
```
160+
### Output
140161

141-
To print the detected information we can also use the `echo` function, check the following example:
162+
```html
163+
<figure class="img-wrapper" style="width: 100px; height: 100px;">
164+
<img src="image.jpg" alt="JumpyDoggy" width="104" height="142">
165+
</figure>
166+
```
167+
If you no longer want the element wrapped, call `unwrap`:
142168

143-
### Html
169+
```php
170+
include "path/webscraper.php";
171+
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
144172

145-
```html
173+
$doc->Q("img[src='image.jpg']")->unwrap();
146174

147-
<h1>1st h1</h1>
148-
<h1>2nd h1</h1>
149-
<h2>Heading 2</h2>
150-
<h3>Heading 3</h3>
151-
<h4>Heading 4</h4>
152-
<h5>Heading 5</h5>
175+
$doc->echo();
176+
```
177+
### Output
153178

179+
```html
180+
<img src="image.jpg" alt="JumpyDoggy" width="104" height="142">
154181
```
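
To make the effect of `wrap` and `unwrap` concrete, here is a minimal sketch of the same two operations written against PHP's built-in DOM extension (`DOMDocument`). It is not WebScraper's own code, and the variable names are only illustrative; it just shows the node moves that wrapping and unwrapping amount to.

```php
<?php
// Sketch with PHP's built-in DOM extension; this is NOT WebScraper's implementation.

$html = '<img src="image.jpg" alt="JumpyDoggy" width="104" height="142">';

$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE html><html><body>" . $html . "</body></html>");

$img = $dom->getElementsByTagName("img")->item(0);

// wrap: put a new <figure> where the image was, then move the image into it.
$figure = $dom->createElement("figure");
$img->parentNode->replaceChild($figure, $img);
$figure->appendChild($img);
$wrappedParent = $img->parentNode->nodeName;   // parent of the image after wrapping

// unwrap: move every child out in front of the wrapper, then drop the wrapper.
while ($figure->firstChild !== null) {
    $figure->parentNode->insertBefore($figure->firstChild, $figure);
}
$figure->parentNode->removeChild($figure);
$unwrappedParent = $img->parentNode->nodeName; // parent of the image after unwrapping

echo $wrappedParent, "\n", $unwrappedParent, "\n";
```

After the wrap step the image's parent is the `figure` element, and after the unwrap step it is `body` again.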
182+
## `addClass` and `removeClass`
155183

184+
Add classes to, and remove classes from, DOM elements.
185+
186+
### `$html`
187+
188+
```html
189+
<h1>Give me a title class</h1>
190+
<h2 class="title">Hey! My class should be "subtitle"!</h2>
191+
```
156192
### Php
157193

158194
```php
159195
include "path/webscraper.php";
160196
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
161197

162-
$doc->Q("h1[2]::text")->echo();
198+
$doc->Q("h1")->addClass("title");
199+
$doc->Q("h2")->removeClass("title");
200+
$doc->Q("h2")->addClass("subtitle");
163201

202+
$doc->echo();
164203
```
204+
### Output
165205

166-
### Output
206+
```html
207+
<h1 class="title">Give me a title class</h1>
208+
<h2 class="subtitle">Hey! My class should be "subtitle"!</h2>
209+
```
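
Since the `class` attribute is a space-separated list of names, adding or removing a class comes down to editing that list. The sketch below illustrates this with PHP's built-in DOM extension. The helpers `addClassTo` and `removeClassFrom` are hypothetical names made up for this example and are not part of WebScraper.

```php
<?php
// Hypothetical helpers on top of PHP's DOM extension; NOT WebScraper's implementation.

function addClassTo(DOMElement $el, string $class): void
{
    // Split the current class list on whitespace, ignoring empty pieces.
    $classes = preg_split('/\s+/', $el->getAttribute("class"), -1, PREG_SPLIT_NO_EMPTY);
    if (!in_array($class, $classes, true)) {
        $classes[] = $class;
    }
    $el->setAttribute("class", implode(" ", $classes));
}

function removeClassFrom(DOMElement $el, string $class): void
{
    $classes = preg_split('/\s+/', $el->getAttribute("class"), -1, PREG_SPLIT_NO_EMPTY);
    $classes = array_values(array_diff($classes, [$class]));
    if ($classes === []) {
        $el->removeAttribute("class"); // avoid leaving class="" behind
    } else {
        $el->setAttribute("class", implode(" ", $classes));
    }
}

$html = '<h1>Give me a title class</h1>'
      . '<h2 class="title">Hey! My class should be "subtitle"!</h2>';

$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE html><html><body>" . $html . "</body></html>");

$h1 = $dom->getElementsByTagName("h1")->item(0);
$h2 = $dom->getElementsByTagName("h2")->item(0);

addClassTo($h1, "title");
removeClassFrom($h2, "title");
addClassTo($h2, "subtitle");

echo $h1->getAttribute("class"), "\n", $h2->getAttribute("class"), "\n";
```

Treating the attribute as a list rather than a plain string keeps other classes on the element intact and prevents the same class from being added twice.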
167210

168-
```html
211+
## `setAttribute` and `removeAttribute`
212+
213+
If `addClass` and `removeClass` are not enough (and they probably won't be), you can use `setAttribute` and `removeAttribute`.
214+
215+
### `$html`
216+
217+
```html
218+
<form action="#" method="post">
219+
<div>
220+
<label for="name">Text Input:</label>
221+
<input type="text" name="name" id="age" value="" tabindex="1" />
222+
</div>
223+
</form>
224+
<button>submit</button>
225+
```
226+
### Php
227+
228+
```php
229+
include "path/webscraper.php";
230+
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
169231

170-
2nd h1
232+
$doc->Q("form input[name='name']")->setAttribute("id", "name");
233+
$doc->Q("form input[name='name']")->removeAttribute("tabindex");
234+
$doc->Q("button")->setAttribute("onclick", "document.querySelector('form').submit()");
171235

236+
$doc->echo();
172237
```
173-
As you can see, the `echo` function is multifunctional, it can be used to echo a document or nodes.
238+
### Output
174239

175-
### `::attributes` selector
240+
```html
241+
<form action="#" method="post">
242+
<div>
243+
<label for="name">Text Input:</label>
244+
<input type="text" name="name" id="name" value=""/>
245+
</div>
246+
</form>
247+
<button onclick="document.querySelector('form').submit()">submit</button>
248+
```
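
For comparison, the same attribute changes can be sketched with PHP's built-in `DOMDocument` and `DOMXPath`. This is only an illustration, not WebScraper's own code. The XPath expressions are hand-written equivalents of the CSS selectors used above, and how WebScraper resolves selectors internally is not shown here.

```php
<?php
// Sketch with PHP's built-in DOM extension; NOT WebScraper's implementation.

$html = '<form action="#" method="post">
  <div>
    <label for="name">Text Input:</label>
    <input type="text" name="name" id="age" value="" tabindex="1" />
  </div>
</form>
<button>submit</button>';

$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE html><html><body>" . $html . "</body></html>");
$xpath = new DOMXPath($dom);

// Hand-written XPath equivalent of the CSS selector form input[name='name'].
$input = $xpath->query("//form//input[@name='name']")->item(0);
$input->setAttribute("id", "name");   // overwrites the existing id="age"
$input->removeAttribute("tabindex");  // removing a missing attribute is harmless

// Hand-written XPath equivalent of the CSS selector button.
$button = $xpath->query("//button")->item(0);
$button->setAttribute("onclick", "document.querySelector('form').submit()");

echo $dom->saveHTML($input), "\n", $dom->saveHTML($button), "\n";
```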
176249

177-
When we want to make a list of attributes or perhaps delete all attributes of a specific tag, we can use the selector `::attributes` to access them.
250+
## `html` and `text`
178251

179-
**Deleting attributes**
252+
`html` and `text` are similar: both print or return the contents of a tag. The difference is that `text` returns only the inner text, while `html` returns the inner markup together with its text.
253+
254+
### `$html`
255+
256+
```html
257+
<nav>
258+
<ul>
259+
<li><a href="#">Home</a></li>
260+
<li><a href="#">About</a></li>
261+
<li><a href="#">Clients</a></li>
262+
<li><a href="#">Contact Us</a></li>
263+
</ul>
264+
</nav>
265+
```
266+
### Php
180267

181268
```php
182269
include "path/webscraper.php";
183270
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
184271

185-
$doc->Q("h1::attributes")->delete();
272+
echo "Result from html(): \n";
273+
echo $doc->Q("nav")->html();
186274

187-
// or
275+
echo "\n\n";
276+
277+
echo "Result from text(): \n";
278+
echo $doc->Q("nav")->text();
279+
```
280+
### Output
281+
282+
```html
283+
Result from html():
284+
285+
<ul>
286+
<li><a href="#">Home</a></li>
287+
<li><a href="#">About</a></li>
288+
<li><a href="#">Clients</a></li>
289+
<li><a href="#">Contact Us</a></li>
290+
</ul>
291+
292+
293+
Result from text():
188294

189-
$doc->Q("::attributes")->delete();
190-
// deletes all attributes of all tags
295+
296+
Home
297+
About
298+
Clients
299+
Contact Us
300+
191301

192302
```
303+
`echo $doc->Q("body")->text()` is a handy way to get a plain-text version of the whole document.
193304

194-
And here's how you can print the attributes of an element:
305+
`text` and `html` can also replace an element's content when you pass them the new content as an argument:
195306

196307
```php
197308
include "path/webscraper.php";
198309
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
199310

200-
$doc->Q("h1::attributes")->echo();
311+
$doc->Q("nav")->html('
312+
<ul>
313+
<li><a href="home.html">Home</a></li>
314+
<li><a href="about.html">About</a></li>
315+
<li><a href="clients.html">Clients</a></li>
316+
<li><a href="gallery.html"></a></li>
317+
<li><a href="plans.html"></a></li>
318+
<li><a href="contact.html">Contact Us</a></li>
319+
</ul>
320+
');
321+
$doc->Q("nav ul li a[href='gallery.html']")->text("Gallery");
322+
$doc->Q("nav ul li a[href='plans.html']")->text("Plans of Service");
201323

324+
$doc->echo();
325+
```
326+
### Output
327+
```html
328+
<nav>
329+
<ul>
330+
<li><a href="home.html">Home</a></li>
331+
<li><a href="about.html">About</a></li>
332+
<li><a href="clients.html">Clients</a></li>
333+
<li><a href="gallery.html">Gallery</a></li>
334+
<li><a href="plans.html">Plans of Service</a></li>
335+
<li><a href="contact.html">Contact Us</a></li>
336+
</ul>
337+
</nav>
202338
```
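
The difference between inner HTML and inner text can also be seen with PHP's built-in DOM extension. The sketch below is not WebScraper's own code: it builds the inner HTML by serializing each child node and reads the inner text from the standard `textContent` property.

```php
<?php
// Sketch with PHP's built-in DOM extension; NOT WebScraper's implementation.

$html = '<nav>
  <ul>
    <li><a href="#">Home</a></li>
    <li><a href="#">About</a></li>
    <li><a href="#">Clients</a></li>
    <li><a href="#">Contact Us</a></li>
  </ul>
</nav>';

// <nav> is an HTML5 element that libxml's HTML parser reports as unknown,
// so collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE html><html><body>" . $html . "</body></html>");
libxml_clear_errors();

$nav = $dom->getElementsByTagName("nav")->item(0);

// Inner HTML: serialize every child of <nav>, markup included.
$innerHtml = "";
foreach ($nav->childNodes as $child) {
    $innerHtml .= $dom->saveHTML($child);
}

// Inner text: the concatenated text of all descendants, without any tags.
$innerText = $nav->textContent;

echo "Inner HTML:\n", $innerHtml, "\n\nInner text:\n", $innerText, "\n";
```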
203-
#### Output
339+
340+
## `appendHtml` and `prependHtml`
341+
342+
`appendHtml` inserts HTML at the **end** of a DOM element, while `prependHtml` inserts HTML at the **start**.
343+
344+
### `$html`
204345

205346
```html
347+
<div id="append"></div>
348+
<br/>
349+
<div id="prepend"></div>
350+
```
206351

207-
h1[attribute1] => "value1"
208-
h1[attribute2] => "value2"
352+
### Php
209353

354+
```php
355+
include "path/webscraper.php";
356+
$doc = new WebScraper("<!DOCTYPE html><html><body>".$html."</body></html>");
357+
358+
$i = 0;
359+
while ($i < 5) {
360+
$i++;
361+
$doc->Q("#append")->appendHtml("<p id='".$i."'>".$i."</p>");
362+
}
363+
364+
$j = 0;
365+
while ($j < 5) {
366+
$j++;
367+
$doc->Q("#prepend")->prependHtml("<p id='".$j."'>".$j."</p>");
368+
}
369+
370+
$doc->echo();
210371
```
211-
### `::comment` selector
372+
### Output
212373

374+
```html
375+
<div id="append">
376+
<p id="1">1</p>
377+
<p id="2">2</p>
378+
<p id="3">3</p>
379+
<p id="4">4</p>
380+
<p id="5">5</p>
381+
</div>
382+
<br/>
383+
<div id="prepend">
384+
<p id="5">5</p>
385+
<p id="4">4</p>
386+
<p id="3">3</p>
387+
<p id="2">2</p>
388+
<p id="1">1</p>
389+
</div>
390+
```
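
The reversed order in the `#prepend` block follows from each new element being inserted before the current first child. A minimal sketch with PHP's built-in DOM extension shows the same effect. It is not WebScraper's own code, and it creates the `p` elements directly instead of parsing HTML strings.

```php
<?php
// Sketch with PHP's built-in DOM extension; NOT WebScraper's implementation.

$html = '<div id="append"></div><br/><div id="prepend"></div>';

$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE html><html><body>" . $html . "</body></html>");
$xpath = new DOMXPath($dom);

$append  = $xpath->query("//*[@id='append']")->item(0);
$prepend = $xpath->query("//*[@id='prepend']")->item(0);

for ($i = 1; $i <= 5; $i++) {
    // Append: the new node always becomes the last child.
    $append->appendChild($dom->createElement("p", (string) $i));

    // Prepend: insert before the current first child. When the element is
    // still empty, firstChild is null and insertBefore() simply appends.
    $prepend->insertBefore($dom->createElement("p", (string) $i), $prepend->firstChild);
}

$appendOrder  = $append->textContent;   // text of the children in document order
$prependOrder = $prepend->textContent;

echo $appendOrder, "\n", $prependOrder, "\n";
```

Appending keeps insertion order, while prepending reverses it, which matches the `1` to `5` and `5` to `1` sequences in the output above.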
