47 changes: 43 additions & 4 deletions docs/api.md
@@ -1,8 +1,8 @@
***

# File Management Endpoint (`/files`)
# File Management API

This endpoint provides file management capabilities, allowing clients to upload, retrieve, and manage files through various HTTP methods. `http_files_api` must be set to `True` in pyrobusta.env to enable this API.
This API provides file management capabilities, allowing clients to upload, retrieve, and manage files through various HTTP methods. `http_files_api` must be set to `True` in pyrobusta.env to enable this API.

## Summary

@@ -19,12 +19,22 @@ This endpoint provides file management capabilities, allowing clients to upload,

### 1. File Retrieval/Listing (`GET /files/{path}`)

This endpoint allows general file system interaction, enabling operations such as listing directory contents and retrieving metadata as well as downloading files.
This method allows general file system interaction, enabling operations such as listing directory contents and retrieving metadata as well as downloading files.

* **Method:** `GET`
* **Path:** `/files/{path}`
* **Success Response:** 200 OK.

#### Example request

```bash
$ curl 192.168.1.100/files/www
[
{"path": "/www/examples.html", "created": "90", "size": "4507"},
{"path": "/www/index.html", "created": "91", "size": "1198"}
]
```

### 2. File Upload / Overwrite (`PUT /files/{file path}`)

This method is used to upload a file or overwrite an existing file at a specific path.
@@ -36,19 +46,42 @@ The upload path is restricted to /www/user_data.
* **Success Response:** 201 Created.
* **Notes:** `transfer-encoding: chunked` is supported.

#### Example request

```bash
$ curl -X PUT --data 'This is a test.' http://192.168.1.100/files/www/user_data/test.txt
OK

$ curl 192.168.1.100/files/www/user_data/test.txt
This is a test.
```

### 3. File Upload (`POST /files`)

This method handles general file uploads, designed for uploading multiple files with per-file chunking supported. Only multipart/form-data is accepted as a content type.

The upload path is restricted to /www/user_data. Content-disposition headers only need to specify the file name; /www/user_data is prepended by default.

`http_multipart` must be set to `True` in the configuration to use this endpoint.
`http_multipart` must be set to `True` in the configuration to use this method.

* **Method:** `POST`
* **Path:** `/files`
* **Body:** File content encapsulated in multipart/form-data.
* **Success Response:** 201 Created.

#### Example request

```bash
$ echo "File 1 content" > /tmp/upload-1.txt
$ echo "File 2 content" > /tmp/upload-2.txt
$ curl -X POST --form file1='@/tmp/upload-1.txt' --form file2='@/tmp/upload-2.txt' http://192.168.1.100/files
$ curl 192.168.1.100/files/www/user_data
[
{"path": "/www/user_data/upload-1.txt", "created": "418", "size": "15"},
{"path": "/www/user_data/upload-2.txt", "created": "418", "size": "15"}
]
```

### 4. File Delete (`DELETE /files/{file path}`)

This method is used to delete a file at a specific path.
@@ -57,3 +90,9 @@ The path is restricted to /www/user_data.
* **Method:** `DELETE`
* **Path:** `/files/{file path}`
* **Success Response:** 204 No Content.

#### Example request

```bash
$ curl -X DELETE 192.168.1.100/files/www/user_data/test.txt
```
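
For scripted access, the curl calls in `docs/api.md` can be mirrored with Python's standard library. The following is a minimal sketch and not part of this change: the helper names `files_request` and `send` are hypothetical, and the device address is the one used in the curl examples.

```python
from urllib.request import Request, urlopen

BASE_URL = "http://192.168.1.100"  # device address taken from the curl examples


def files_request(method, path="", body=None):
    """Build (but do not send) a request against the /files API (hypothetical helper)."""
    url = f"{BASE_URL}/files/{path.lstrip('/')}" if path else f"{BASE_URL}/files"
    return Request(url, data=body, method=method)


def send(request, timeout=5):
    """Send a prepared request and return (status code, raw response body)."""
    with urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read()


# Requests equivalent to the GET, PUT and DELETE curl examples.
list_req = files_request("GET", "www")
put_req = files_request("PUT", "www/user_data/test.txt", b"This is a test.")
delete_req = files_request("DELETE", "www/user_data/test.txt")
```

Calling `send(list_req)` against a device with `http_files_api` enabled should yield the status code and the JSON listing documented above; the sketch only builds the requests, so it can be inspected without a device on the network.
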
91 changes: 91 additions & 0 deletions docs/architecture/state_machine.md
@@ -0,0 +1,91 @@
# HTTP state machine parser

[http.py](../../src/pyrobusta/protocol/http.py) implements a continuation-passing parser using a
finite state machine (FSM). Each state either consumes enough of the available data to make progress or
explicitly suspends until more data arrives.

In general, states are not required to transition to a terminal state if a request is incomplete.
Instead, states return control to the asyncio event loop, which drives subsequent invocations of the
state machine based on socket readiness. The state machine may be terminated by the surrounding coroutine in
the case of a session timeout or transport error. This is a deliberate architectural decision to separate HTTP
protocol semantics from transport-level I/O scheduling concerns.
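
The suspend-and-resume behaviour described above can be illustrated with a toy parser. This is a minimal sketch under stated assumptions, not the code in http.py: the class `TinyParser`, its `feed` driver and the use of a plain `bytearray` as the request buffer are all hypothetical, and error handling is omitted.

```python
class TinyParser:
    """Toy FSM: the current state is a bound method stored in self.state."""

    def __init__(self):
        self.state = self._request_line_st
        self.done = False
        self.method = None
        self.url = None
        self.headers = {}

    def feed(self, rx: bytearray):
        """Run states until one makes no progress (suspend) or the parser terminates."""
        while not self.done:
            before = (self.state, len(rx))
            self.state(rx)
            if (self.state, len(rx)) == before:
                return  # no progress: hand control back until more data arrives

    def _request_line_st(self, rx):
        end = rx.find(b"\r\n")
        if end < 0:
            return  # incomplete request line, stay in this state
        self.method, self.url, _version = bytes(rx[:end]).split(b" ", 2)
        del rx[: end + 2]
        self.state = self._headers_st

    def _headers_st(self, rx):
        end = rx.find(b"\r\n")
        if end < 0:
            return  # incomplete header line, stay in this state
        line = bytes(rx[:end])
        del rx[: end + 2]
        if not line:
            self.done = True  # blank line closes the header block
            return
        name, _, value = line.partition(b":")
        self.headers[name.strip().lower()] = value.strip()


# The caller (an event loop in the real design) appends data and re-invokes the parser.
parser = TinyParser()
buffer = bytearray(b"GET /files/www HT")
parser.feed(buffer)  # suspends: the request line is incomplete
buffer += b"TP/1.1\r\nHost: device\r\n\r\n"
parser.feed(buffer)  # resumes and runs to completion
```

The point of the design is that a state never blocks on I/O: it either advances or returns, so the transport layer alone decides when the parser runs again.
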

The state machine can be decomposed into four sub-FSMs, shown in the diagrams below. The state machine applies
to a single HTTP session with a dedicated request and response stream buffer.


## HTTP Request Line and Header Parsing
```mermaid
stateDiagram-v2

[*] --> start_parser

start_parser --> parse_request_line_st: rx.size() > 0
start_parser --> start_parser: empty buffer

parse_request_line_st --> parse_headers_st: valid request line parsed
parse_request_line_st --> parse_request_line_st: incomplete line
parse_request_line_st --> [*]: 405/505 terminate

parse_headers_st --> route_request_st: headers complete
parse_headers_st --> parse_headers_st: waiting for \r\n\r\n
parse_headers_st --> [*]: invalid headers (host missing etc.)
```

## Routing and Body Strategy Selection
```mermaid
stateDiagram-v2

route_request_st --> app_endpoint_st: endpoint + no payload

route_request_st --> recv_payload_st: content-length body
route_request_st --> recv_chunk_size_st: chunked encoding
route_request_st --> start_multipart_parser_st: multipart body

route_request_st --> fs_retrieve_st: GET/HEAD fallback file server

route_request_st --> [*]: 404 no route
route_request_st --> [*]: 405 method not allowed
route_request_st --> [*]: 204 OPTIONS

recv_payload_st --> app_endpoint_st: full body received
recv_payload_st --> recv_payload_st: waiting for content-length

recv_chunk_size_st --> recv_chunk_st: size parsed
recv_chunk_size_st --> recv_chunk_size_st: waiting for chunk size

recv_chunk_st --> app_endpoint_st: chunk complete
recv_chunk_st --> recv_chunk_st: waiting for full chunk
```
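
The alternation between reading a chunk size and reading a chunk can be sketched in a few lines. This is an illustrative, stateless variant and not the implementation in http.py: the function `decode_chunked` and the `on_chunk` callback are hypothetical names, chunk extensions are ignored, and trailers are not handled.

```python
def decode_chunked(rx: bytearray, on_chunk):
    """Consume complete chunks from rx and pass each body to on_chunk.

    Returns True once the final zero-size chunk has been consumed and
    False when more data is needed; unconsumed bytes stay in rx.
    """
    while True:
        line_end = rx.find(b"\r\n")
        if line_end < 0:
            return False  # waiting for the chunk size line
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(bytes(rx[:line_end]).split(b";")[0], 16)
        chunk_end = line_end + 2 + size
        if len(rx) < chunk_end + 2:
            return False  # waiting for the full chunk and its trailing CRLF
        on_chunk(bytes(rx[line_end + 2 : chunk_end]))
        del rx[: chunk_end + 2]
        if size == 0:
            return True  # the empty chunk signals the end of the body
```

As in the diagram, an incomplete size line or an incomplete chunk leaves the buffer untouched, so the function can simply be called again when more bytes arrive.
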

## Application Execution and Response Generation
```mermaid
stateDiagram-v2

app_endpoint_st --> app_endpoint_st: execute callback / process request

app_endpoint_st --> recv_chunk_size_st: more chunked data expected

app_endpoint_st --> generate_multipart_response_st: multipart response

app_endpoint_st --> [*]: 200 OK (default completion)

fs_retrieve_st --> [*]: 200 file served
fs_retrieve_st --> [*]: 403 forbidden
fs_retrieve_st --> [*]: 404 file missing

generate_multipart_response_st --> [*]: 200 headers set + stream ready
```

## Multipart Request Processing
```mermaid
stateDiagram-v2

start_multipart_parser_st --> parse_boundary_st: boundary validated

parse_boundary_st --> parse_complete_part_st: boundary detected
parse_boundary_st --> parse_boundary_st: waiting for boundary

parse_complete_part_st --> parse_boundary_st: more parts remain
parse_complete_part_st --> [*]: final part processed (200)
```
2 changes: 1 addition & 1 deletion docs/configuration.md
@@ -12,7 +12,7 @@ to upload it to the root directory of the target device.
| http_multipart | Enable multipart HTTP requests/responses. | False |
| http_mem_cap | Max memory cap (% × 0.01) of usable heap for HTTP request/response stream buffers. | 0.1 |
| http_served_paths | Space delimited list of filesystem paths allowed to be served through HTTP. | "/www /lib/pyrobusta" |
| http_files_api | Enables or disables the file management API endpoint (/files), allowing to upload, download, and list files. | False |
| http_files_api | Enables or disables the [file management API](./api.md#file-management-api) endpoint (/files), which allows clients to upload, download, and list files. | False |
| socket_max_con | Max number of socket connections of any enabled application server. | 2 |
| tls | Enables or disables TLS. When turned on, cert.der/key.der must be installed at the root. | False |
| log_level | Can be one of: warning, info, debug. | "info" |
5 changes: 4 additions & 1 deletion src/pyrobusta/bindings/http_connection.py
@@ -91,7 +91,10 @@ async def _run_state_machine(self):
if not self._engine.is_request_empty() and self._engine.is_terminated():
self._engine.write_response_head(self._send_buf)
await self._flush_response()
if self._engine.resp_handler is not None:
if (
self._engine.resp_handler is not None
and not self._engine.method == self._engine.HEAD
):
await self._response_handler(self._engine.resp_handler)

async def _response_handler(self, resp_handler):
28 changes: 23 additions & 5 deletions src/pyrobusta/protocol/http.py
@@ -488,7 +488,7 @@ def set_response_body(
object, stored by the resp_handler member. resp_handler
can be used for writing the body by the transport layer.
This method also updates the content-type and content-length
headers.
headers. In the case of a HEAD request, the body is omitted.
:param body: body to be sent in the response
:param content_type: content-type of the body
"""
@@ -616,6 +616,20 @@ def has_payload(self):
"content-length" in self.headers and self.headers["content-length"] > 0
) or self.is_chunked()

def _consume_payload(self, rx, size, last=False):
"""
Consume data from the request buffer and increment content length counter.
Raise an exception if the content length is exceeded. Allow strict checking
of content length when the last flag is set.
"""
if "content-length" in self.headers and (
(self.content_len_cnt + size > self.headers["content-length"])
or (last and self.headers["content-length"] != self.content_len_cnt + size)
):
raise InvalidContentLength()
self.content_len_cnt += size
rx.consume(size)

# ================================================================================
# Parser states
# - all states must handle rx buffer argument for reading request data
@@ -690,7 +704,7 @@ def _route_request_st(self, _):
self.terminate(204, True)
return
if self.has_payload():
if self.method == self.HEAD:
if self.method in (self.GET, self.HEAD):
raise MalformedRequest()
if mp_boundary := self._get_mp_boundary(self.headers):
# Request body is multipart
@@ -731,7 +745,7 @@ def _recv_chunk_size_st(self, rx):
self.recv_chunk_size = int(bytes(rx.peek(blank_idx)), 16)
if self.recv_chunk_size < 0:
raise InvalidContentLength()
rx.consume(blank_idx + 2)
self._consume_payload(rx, blank_idx + 2)
self.state = self._recv_chunk_st

def _recv_chunk_st(self, rx):
@@ -756,22 +770,26 @@ def _recv_payload_st(self, rx):
def _app_endpoint_st(self, rx):
"""
Process a request by registered callback functions.
HEAD requests are temporarily mapped to GET for routing and callback execution,
but the response body is not sent back.
"""
method = self.GET if self.method == self.HEAD else self.method
callback = self._get_callback(self.url, method)
if self.has_payload():
if self.is_chunked():
if self.recv_chunk_size:
callback(self, bytes(rx.peek(self.recv_chunk_size)))
rx.consume(self.recv_chunk_size + 2)
self._consume_payload(rx, self.recv_chunk_size + 2)
self.state = self._recv_chunk_size_st
return
# Last chunk, callback with empty body to signal end of request body
callback_response = callback(self, b"")
rx.consume(self.recv_chunk_size + 2)
self._consume_payload(rx, self.recv_chunk_size + 2, last=True)
else:
callback_response = callback(
self, bytes(rx.peek(self.headers["content-length"]))
)
self._consume_payload(rx, self.headers["content-length"], last=True)
else:
callback_response = callback(self, b"")
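
The length accounting introduced by `_consume_payload` can be exercised in isolation. The following is a standalone sketch that mirrors the check shown in the hunk above; the class `PayloadCounter`, its attribute names and the use of a `bytearray` in place of the request buffer are assumptions made for illustration, and the local `InvalidContentLength` is only a stand-in for the project's exception.

```python
class InvalidContentLength(Exception):
    """Stand-in for the exception of the same name raised in http.py."""


class PayloadCounter:
    """Tracks consumed payload bytes against an optional declared length."""

    def __init__(self, content_length=None):
        self.content_length = content_length  # None models a missing header
        self.consumed = 0

    def consume(self, rx: bytearray, size: int, last: bool = False):
        total = self.consumed + size
        if self.content_length is not None and (
            total > self.content_length  # more data than declared
            or (last and total != self.content_length)  # final read must match exactly
        ):
            raise InvalidContentLength()
        self.consumed = total
        del rx[:size]


counter = PayloadCounter(content_length=8)
data = bytearray(b"abcdefgh")
counter.consume(data, 3)             # partial read, still within the limit
counter.consume(data, 5, last=True)  # final read matches the declared length
```

Without the `last` flag only overruns are rejected; with it, a body shorter than the declared length is rejected as well.
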

6 changes: 2 additions & 4 deletions src/pyrobusta/protocol/http_file_server.py
@@ -126,9 +126,7 @@ def upload_file(http_ctx, payload: bytes):
if not file_name_idx:
http_ctx.terminate(400)
return "text/plain", "Bad request"
file_path = normalize_path(
_TMP_DIR + "/" + f"{url_path[file_name_idx:]}.{http_ctx.id}"
)
file_path = _TMP_DIR + "/" + f"{url_path[file_name_idx:]}.{http_ctx.id}"
else:
file_path = normalize_path(http_ctx.url.decode("ascii")[6:])

@@ -192,7 +190,7 @@ def bulk_upload_file(http_ctx, payload: tuple):
remove(_TMP_DIR + "/" + file)

# TODO: support X-Upload-Directory; pylint: disable=W0511
target_path = normalize_path(_TMP_DIR + "/" + f"{filename}.{http_ctx.id}")
target_path = _TMP_DIR + "/" + f"{filename}.{http_ctx.id}"
with open(target_path, "ab") as f:
f.write(part_body)
