Using the Azure SDK for Python in Pyodide and PyScript
Pyodide is a Python runtime in WebAssembly. It holds immense promise because it:
- Is an alternative to JavaScript for writing client-side code.
- Brings the power of Python’s scientific computing libraries (for example, NumPy, SciPy) to the browser without the hassle of a backend. No need to worry about scaling, cost overruns, DDoS attacks, and user data laws.
- Can provide a standardized Python environment for beginners, removing barriers to entry such as juggling Python versions and executables.
Finally, Pyodide is the engine for PyScript, a thin but convenient Pyodide wrapper that lets you embed Python code directly in HTML. Supporting the Azure SDK for Python in Pyodide/PyScript would let users interact easily with Azure in each of the aforementioned use cases.
The challenge with running the Azure SDK for Python in Pyodide is networking. The main job of the SDK is to communicate with Azure via the internet. Traditional implementations of Python, such as CPython, give developers near-full access to a computer’s networking functions. However, the browser’s built-in security features limit a program’s networking abilities. As such, we can’t use traditional networking libraries (think requests, aiohttp) because they violate the browser’s rules (for example, preflight requests, forbidden headers).
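Even pyfetch, the HTTP helper Pyodide provides, is still routed through the browser’s fetch machinery. The sketch below is illustrative only and not from the original post (the endpoint and header are placeholders): a request carrying a custom header triggers a CORS preflight that the target server must explicitly allow.
# Illustrative only: the endpoint and header below are placeholders.
from pyodide.http import pyfetch

async def cross_origin_call():
    # The custom header makes the browser send an OPTIONS preflight first;
    # if the server doesn't answer with matching CORS headers, the browser
    # blocks the request before any Python code sees a response.
    response = await pyfetch(
        "https://example.com/api/data",
        method="GET",
        headers={"x-example-header": "demo"},
    )
    return await response.string()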
Implementation
Given the browser’s networking limitations and Pyodide’s powerful JavaScript interfacing, the best tool for making network requests is JavaScript’s fetch API. The Azure SDK for Python is architected such that non-core libraries, like the Text Analytics library, rely on abstract networking pipelines to handle retries, authentication headers, and, most importantly, the actual networking calls. See this post for an in-depth explanation of the architecture. The part that performs the network calls is called a transport, and we can implement one on top of the fetch API:
from collections.abc import AsyncIterator
from io import BytesIO
from typing import Any, MutableMapping, Optional
import js
from azure.core.configuration import ConnectionConfiguration
from azure.core.exceptions import HttpResponseError, ResponseNotReadError
from azure.core.pipeline.transport import AsyncHttpTransport
from azure.core.rest import AsyncHttpResponse, HttpRequest
from requests.structures import CaseInsensitiveDict
from pyodide import JsException, JsProxy
from pyodide.http import FetchResponse, pyfetch
class PyodideTransport(AsyncHttpTransport):
    """Implements a basic HTTP sender using the Pyodide JavaScript fetch API."""

    def __init__(self, **kwargs):
        self.connection_config = ConnectionConfiguration(**kwargs)

    async def send(self, request: HttpRequest, **kwargs) -> "PyodideTransportResponse":
        """Send request object according to configuration."""
        stream_response = kwargs.pop("stream_response", False)
        endpoint = request.url
        init = {
            "method": request.method,
            "headers": dict(request.headers),
            "body": request.data,
            "files": request.files,
            "verify": kwargs.pop("connection_verify", self.connection_config.verify),
            "cert": kwargs.pop("connection_cert", self.connection_config.cert),
            "allow_redirects": False,
            **kwargs,
        }
        try:
            response = await pyfetch(endpoint, **init)
        except JsException as error:
            raise HttpResponseError(error, error=error) from error

        headers = CaseInsensitiveDict(response.js_response.headers)
        transport_response = PyodideTransportResponse(
            request=request,
            internal_response=response,
            block_size=self.connection_config.data_block_size,
            reason=response.status_text,
            headers=headers,
        )
        if not stream_response:
            await transport_response.read()

        return transport_response
The send method accepts a generic HttpRequest object and maps its attributes to a pyfetch call. pyfetch is Pyodide’s built-in fetch wrapper. The send method also raises pyfetch exceptions as azure.core exceptions. This way, other libraries only have to handle azure.core exceptions and not worry about the implementation details of the transport. Finally, the send method maps pyfetch response fields to a PyodideTransportResponse object.
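Because the transport normalizes errors this way, downstream code stays transport-agnostic. Here’s a small, hypothetical illustration (not from the original post) of what that buys a caller:
# Hypothetical caller-side code: it handles azure.core exceptions only, with
# no knowledge that pyfetch sits underneath the pipeline.
from azure.core.exceptions import HttpResponseError

async def detect_language_safely(client, documents):
    try:
        return await client.detect_language(documents=documents)
    except HttpResponseError as error:
        # A failed fetch inside PyodideTransport surfaces here like any other
        # service or network error raised by the SDK.
        print(f"Request failed: {error}")
        return None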
Next, we need to implement a PyodideTransportResponse class that acts as an interface between the data of the pyfetch response and the rest of the SDK. We also need to implement a download generator class to stream the response, for which we use JavaScript’s ReadableStreamDefaultReader API.
class PyodideTransportResponse(AsyncHttpResponse):
    """Async response object for the `PyodideTransport`."""

    def __init__(
        self,
        request: HttpRequest,
        internal_response: FetchResponse,
        headers: MutableMapping[str, str],
        block_size: int,
        **__
    ):
        self._block_size = block_size
        self._content = None
        self._encoding: str = "utf-8"
        self._headers = headers
        self._internal_response = internal_response
        self._is_closed = False
        self._request = request

    @property
    def _js_stream(self):
        """Use a fresh stream every time."""
        return self._internal_response.js_response.clone().body

    async def read(self) -> bytes:
        if self._content is None:
            parts = []
            async for part in self.iter_bytes():
                parts.append(part)
            self._content = b"".join(parts)
        return self._content

    async def iter_raw(self, **__) -> AsyncIterator[bytes]:
        """Asynchronously iterates over the response's bytes. Will not decompress in the process."""
        if self._content is not None:
            for i in range(0, len(self.content), self._block_size):
                yield self.content[i : i + self._block_size]
        else:
            async for part in PyodideStreamDownloadGenerator(
                response=self,
                decompress=False,
            ):
                yield part

    async def iter_bytes(self, **__) -> AsyncIterator[bytes]:
        """Asynchronously iterates over the response's bytes. Will decompress in the process."""
        if self._content is not None:
            for i in range(0, len(self.content), self._block_size):
                yield self.content[i : i + self._block_size]
        else:
            async for part in PyodideStreamDownloadGenerator(
                response=self,
                decompress=True,
            ):
                yield part
class PyodideStreamDownloadGenerator(AsyncIterator[bytes]):
    """Simple stream download generator using the JavaScript reader API."""

    def __init__(self, response: PyodideTransportResponse, **kwargs):
        self._decompress = kwargs.get("decompress", False)
        self._block_size = response._block_size
        self.response = response
        self._stream = BytesIO()
        self._closed = False
        self._buffer_left = 0
        self._done = False
        encoding = self.response.headers.get("Content-Encoding", None)
        if self._decompress and encoding in ("gzip", "deflate"):
            # Let the browser decompress the body as it streams in.
            self._reader = response._js_stream.pipeThrough(
                js.DecompressionStream.new(encoding)
            ).getReader()
        else:
            self._reader = response._js_stream.getReader()

    async def __anext__(self) -> bytes:
        if self._closed:
            raise StopAsyncIteration()
        start_pos = self._stream.tell()
        # Move to the end of the buffer so newly read chunks are appended.
        self._stream.read()
        while self._buffer_left < self._block_size:
            read = await self._reader.read()
            if read.done:
                self._closed = True
                break
            self._buffer_left += self._stream.write(bytes(read.value))
        self._stream.seek(start_pos)
        self._buffer_left -= self._block_size
        return self._stream.read(self._block_size)
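The generator above leans on the JavaScript reader protocol: each read() call resolves to an object with done and value fields. The standalone sketch below is my own simplification, not part of the gist, and assumes any URL you’re allowed to fetch; it shows the protocol without the block-size buffering:
# Simplified sketch of the reader protocol, without block-size buffering.
from pyodide.http import pyfetch

async def read_all(url: str) -> bytes:
    response = await pyfetch(url)
    reader = response.js_response.body.getReader()
    chunks = []
    while True:
        result = await reader.read()
        if result.done:  # no more chunks
            break
        chunks.append(bytes(result.value))  # value is a Uint8Array
    return b"".join(chunks)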
Safari and Internet Explorer don’t support ReadableStreamDefaultReader. For more information, see the MDN docs.
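If those browsers matter to you, one possible mitigation (an assumption on my part, not something the original code does) is to feature-detect the reader and fall back to a non-streaming read:
# Hypothetical feature check: fall back to reading the whole body at once when
# the streaming reader isn't available in the current browser.
import js

def supports_streaming() -> bool:
    return hasattr(js, "ReadableStreamDefaultReader")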
Some code was redacted for brevity. For the full implementation, see this gist.
Usage
Now, we can use our transport directly in the browser. First, add the Pyodide CDN link to your HTML file. Next, run the following JavaScript:
async function main() {
    const pyodide = await loadPyodide();
    await pyodide.loadPackage('micropip');
    // Wait for the install to finish before running code that imports the package.
    await pyodide.runPythonAsync(`
import micropip
await micropip.install("azure-ai-textanalytics")
`);
    pyodide.runPython(`<Code for PyodideTransport, PyodideTransportResponse, PyodideStreamDownloadGenerator>`);
    await pyodide.runPythonAsync(`
from azure.ai.textanalytics.aio import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
import js

client = TextAnalyticsClient(
    endpoint="https://my-endpoint.azure.com",
    # We don't recommend hardcoding keys into HTML pages. Consider some other way to input your key.
    credential=AzureKeyCredential(MY_KEY),
    transport=PyodideTransport(),
)
documents = ["Bonjour mon ami."]
response = (await client.detect_language(documents=documents))[0]
js.alert(response.primary_language.name)  # French
`);
}

main();
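As the comment in the snippet says, MY_KEY shouldn’t be hardcoded into the page. One hedged alternative (my suggestion, not from the post) is to ask for the key at runtime from the Python side, before constructing the client:
# Hypothetical key input: prompt the user instead of embedding the key in HTML.
import js

MY_KEY = js.window.prompt("Enter your Text Analytics key")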
Or with PyScript and a little less boilerplate:
<head>
<!-- See https://pyscript.net/ for instructions to add PyScript to your page -->
</head>
<body>
<py-env>
- azure-ai-textanalytics
</py-env>
<py-script>
# Code for PyodideTransport, PyodideTransportResponse, PyodideStreamDownloadGenerator
</py-script>
<py-script>
# async
# Need the above comment to have top-level await.
from azure.ai.textanalytics.aio import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
client = TextAnalyticsClient(
endpoint="https://my-endpoint.azure.com",
# We don't recommend hardcoding keys into HTML pages. Consider some other way to input your key.
credential=AzureKeyCredential(MY_KEY),
transport=PyodideTransport(),
)
documents = ["Bonjour"]
response = (await client.detect_language(documents=documents, country_hint="us"))[0]
print(response.primary_language.name) # French
</py-script>
</body>
And you’ll see “French” displayed on your page. One last note is that we have only made an asynchronous client because there’s no synchronous version of pyfetch.
What now?
There’s a pull request in the works that adds a Pyodide-compatible transport to the SDK for out-of-the-box support. If you have any questions or comments, or want the pull request to be merged sooner, open an issue in the Azure SDK for Python GitHub repository.
Conclusion
The Azure SDK for Python is architected to be modular and extensible. That architecture lets us extend the SDK into Pyodide and PyScript, a browser-based Python runtime and framework, opening new doors for Azure.
Very cool!