August 11th, 2022

Using the Azure SDK for Python in Pyodide and PyScript

Steven Jin Xuan
Software Engineer Intern

Pyodide is a Python runtime in WebAssembly. It holds immense promise because it:

  • Is a JavaScript alternative.
  • Brings the power of Python’s scientific computing libraries (for example, NumPy and SciPy) to the browser without the hassle of a backend. There’s no need to worry about scaling, cost overruns, DDoS attacks, or user data laws.
  • Can provide a standardized Python environment for beginners, removing barriers to entry such as juggling Python versions and executables.

Finally, Pyodide is the engine for PyScript, a thin but convenient Pyodide wrapper that embeds Python code in HTML. Supporting the Azure SDK for Python in Pyodide and PyScript would let users easily interact with Azure in each of these use cases.

The challenge with running the Azure SDK for Python in Pyodide is networking. The main job of the SDK is to communicate with Azure over the internet. Traditional implementations of Python, such as CPython, give developers nearly full access to a computer’s networking functions. However, the browser’s built-in security features limit a program’s networking abilities. As a result, we can’t use traditional networking libraries such as requests and aiohttp, because they conflict with the browser’s rules (for example, preflight requests and forbidden headers).
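To make the "forbidden headers" rule concrete, here is a minimal sketch that separates headers a page may set from headers the browser reserves for itself. The helper name `split_headers` is hypothetical, and the header list is a partial, illustrative subset of the names reserved by the Fetch standard, not a complete copy of it.

```python
# Header names that the Fetch standard reserves for the browser.
# Assumption: this is a partial, illustrative list, not the full specification.
FORBIDDEN_HEADERS = {
    "accept-charset", "accept-encoding", "connection", "content-length",
    "cookie", "date", "expect", "host", "keep-alive", "origin", "referer",
    "te", "trailer", "transfer-encoding", "upgrade", "via",
}
FORBIDDEN_PREFIXES = ("proxy-", "sec-")


def split_headers(headers):
    """Return (allowed, dropped) header dicts for a browser-side request."""
    allowed, dropped = {}, {}
    for name, value in headers.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered in FORBIDDEN_HEADERS or lowered.startswith(FORBIDDEN_PREFIXES):
            dropped[name] = value
        else:
            allowed[name] = value
    return allowed, dropped
```

A library written for CPython expects to control headers such as Content-Length or Host itself. In the browser, those values are managed by the browser, so a transport has to leave them alone.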

Implementation

Given the browser’s networking limitations and Pyodide’s powerful JavaScript interfacing, the best tool for making network requests is JavaScript’s fetch API. The Azure SDK for Python is architected such that non-core libraries, like the Text Analytics library, rely on abstract networking pipelines to handle retries, authentication headers, and most importantly, the actual networking calls. See this post for an in-depth explanation of the architecture. The part that performs the network calls is called a transport, and we can implement one on top of the fetch API:

from collections.abc import AsyncIterator
from io import BytesIO
from typing import Any, MutableMapping, Optional

import js
from azure.core.configuration import ConnectionConfiguration
from azure.core.exceptions import HttpResponseError, ResponseNotReadError
from azure.core.pipeline.transport import AsyncHttpTransport
from azure.core.rest import AsyncHttpResponse, HttpRequest
from requests.structures import CaseInsensitiveDict

from pyodide import JsException, JsProxy
from pyodide.http import FetchResponse, pyfetch

class PyodideTransport(AsyncHttpTransport):
    """Implements a basic HTTP sender using the Pyodide JavaScript fetch API."""

    def __init__(self, **kwargs):
        self.connection_config = ConnectionConfiguration(**kwargs)

    async def send(self, request: HttpRequest, **kwargs) -> "PyodideTransportResponse":
        """Send request object according to configuration."""
        stream_response = kwargs.pop("stream_response", False)
        endpoint = request.url
        init = {
            "method": request.method,
            "headers": dict(request.headers),
            "body": request.data,
            "files": request.files,
            "verify": kwargs.pop("connection_verify", self.connection_config.verify),
            "cert": kwargs.pop("connection_cert", self.connection_config.cert),
            "allow_redirects": False,
            **kwargs,
        }

        try:
            response = await pyfetch(endpoint, **init)
        except JsException as error:
            raise HttpResponseError(error, error=error) from error

        headers = CaseInsensitiveDict(response.js_response.headers)
        transport_response = PyodideTransportResponse(
            request=request,
            internal_response=response,
            block_size=self.connection_config.data_block_size,
            reason=response.status_text,
            headers=headers,
        )
        if not stream_response:
            await transport_response.read()

        return transport_response

The send method accepts a generic HttpRequest object and maps its attributes to a call to pyfetch, Pyodide’s built-in fetch wrapper. It also catches the exceptions that pyfetch raises and re-raises them as azure.core exceptions. This way, other libraries only have to handle azure.core exceptions and don’t need to worry about the implementation details of the transport. Finally, the send method maps the pyfetch response fields to a PyodideTransportResponse object.
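The split between policies and the transport, and the error translation described above, can be sketched with a toy model. Everything in this block is hypothetical and for illustration only: `ToyTransport`, `HeaderPolicy`, `TransportError`, `FakeFetchError`, and `run_pipeline` are not azure.core classes, and the fetch functions are stand-ins that never touch the network.

```python
import asyncio


class TransportError(Exception):
    """Stand-in for a shared SDK exception such as HttpResponseError."""


class FakeFetchError(Exception):
    """Stand-in for the backend-specific error a failed fetch would raise."""


class HeaderPolicy:
    """Toy policy: adds a header, then hands the request to the next stage."""

    def __init__(self, name, value):
        self.name, self.value = name, value

    async def send(self, request, next_stage):
        request["headers"][self.name] = self.value
        return await next_stage(request)


class ToyTransport:
    """Toy transport: the only stage that talks to the 'network'."""

    def __init__(self, fetch):
        self._fetch = fetch

    async def send(self, request):
        try:
            return await self._fetch(request)
        except FakeFetchError as error:
            # Translate the backend-specific error into the shared type.
            raise TransportError(str(error)) from error


async def run_pipeline(policies, transport, request):
    """Run the policies in order, ending at the transport."""
    async def call(index, req):
        if index == len(policies):
            return await transport.send(req)
        return await policies[index].send(req, lambda r: call(index + 1, r))
    return await call(0, request)


async def echo_fetch(request):
    """Fake fetch that reports which headers it received."""
    return {"status": 200, "seen_headers": dict(request["headers"])}


async def failing_fetch(request):
    """Fake fetch that always fails."""
    raise FakeFetchError("network unreachable")
```

Because the policies only ever call the next stage, replacing `echo_fetch` with a different backend changes nothing above the transport. That is the property that makes a fetch-based transport sufficient.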

Next, we need to implement a PyodideTransportResponse class that acts as an interface between the data of the pyfetch response and the rest of the SDK. We also need to implement a download generator class to stream the response, for which we use JavaScript’s ReadableStreamDefaultReader API.

class PyodideTransportResponse(AsyncHttpResponse):
    """Async response object for the `PyodideTransport`."""

    def __init__(
        self,
        request: HttpRequest,
        internal_response: FetchResponse,
        headers: MutableMapping[str, str],
        block_size: int,
        **__
    ):
        self._block_size = block_size
        self._content = None
        self._encoding: str = "utf-8"
        self._headers = headers
        self._internal_response = internal_response
        self._is_closed = False
        self._request = request

    @property
    def _js_stream(self):
        """Use a fresh stream every time."""
        return self._internal_response.js_response.clone().body

    async def read(self) -> bytes:
        if self._content is None:
            parts = []
            async for part in self.iter_bytes():
                parts.append(part)
            self._content = b"".join(parts)
        return self._content

    async def iter_raw(self, **__) -> AsyncIterator[bytes]:
        """Asynchronously iterates over the response's bytes. Will not decompress in the process."""
        if self._content is not None:
            for i in range(0, len(self.content), self._block_size):
                yield self.content[i : i + self._block_size]
        else:
            async for part in PyodideStreamDownloadGenerator(
                response=self,
                decompress=False,
            ):
                yield part

    async def iter_bytes(self, **__) -> AsyncIterator[bytes]:
        """Asynchronously iterates over the response's bytes. Will decompress in the process."""
        if self._content is not None:
            for i in range(0, len(self.content), self._block_size):
                yield self.content[i : i + self._block_size]
        else:
            async for part in PyodideStreamDownloadGenerator(
                response=self,
                decompress=True,
            ):
                yield part

class PyodideStreamDownloadGenerator(AsyncIterator[bytes]):
    """Simple stream download generator using the JavaScript reader API."""

    def __init__(self, response: PyodideTransportResponse, **kwargs):
        self._decompress = kwargs.get("decompress", False)
        # The response class above stores the block size as `_block_size`.
        self._block_size = response._block_size
        self.response = response
        self._stream = BytesIO()
        self._closed = False
        self._buffer_left = 0
        self._done = False
        if self._decompress and self.response.headers.get("enc", None) in ("gzip", "deflate"):
            self._reader = response._js_stream.pipeThrough(js.DecompressionStream.new("gzip")).getReader()
        else:
            self._reader = response._js_stream.getReader()

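    async def __anext__(self) -> bytes:
        if self._closed:
            raise StopAsyncIteration()
        # Remember where the unread data starts, then move to the end of the
        # buffer so that new chunks are appended.
        start_pos = self._stream.tell()
        self._stream.read()
        # reader.read() returns chunks of arbitrary size, so keep reading
        # until at least one full block is buffered or the stream ends.
        while self._buffer_left < self._block_size:
            read = await self._reader.read()
            if read.done:
                self._closed = True
                break
            self._buffer_left += self._stream.write(bytes(read.value))
        # Rewind to the unread data and return the next block.
        self._stream.seek(start_pos)
        self._buffer_left -= self._block_size
        return self._stream.read(self._block_size)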

Safari and Internet Explorer don’t support ReadableStreamDefaultReader. For more information, see the MDN docs.

Some code was omitted for brevity. For the full implementation, see this gist.
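The core job of the download generator is to turn chunks of arbitrary size into blocks of a fixed size. Here is a minimal sketch of that idea in plain Python, runnable outside the browser. It is not the code from the gist: `rechunk`, `fake_reader`, and `collect` are hypothetical names, and `fake_reader` stands in for the JavaScript stream reader.

```python
import asyncio


async def rechunk(chunks, block_size):
    """Yield block_size-byte blocks from an async iterator of byte chunks.

    The final block may be shorter than block_size.
    """
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= block_size:
            yield bytes(buffer[:block_size])
            del buffer[:block_size]
    if buffer:
        # Flush whatever is left once the source is exhausted.
        yield bytes(buffer)


async def fake_reader():
    """Stand-in for a stream reader that returns unevenly sized chunks."""
    for chunk in (b"ab", b"cdefg", b"h", b"ij"):
        yield chunk


async def collect(block_size):
    """Gather all blocks into a list so they are easy to inspect."""
    return [block async for block in rechunk(fake_reader(), block_size)]


print(asyncio.run(collect(4)))  # [b'abcd', b'efgh', b'ij']
```

The sketch uses a bytearray rather than BytesIO to keep the bookkeeping short. The buffering idea is the same: accumulate until a full block is available, then hand it out.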

Usage

Now, we can use our transport directly in the browser. First, add the Pyodide CDN link to your HTML file. Next, run the following JavaScript:

async function main() {
    const pyodide = await loadPyodide();
    await pyodide.loadPackage('micropip');
    // Wait for the install to finish before running code that imports the package.
    await pyodide.runPythonAsync(`
        import micropip
        await micropip.install("azure-ai-textanalytics")   
    `);
    pyodide.runPython(`<Code for PyodideTransport, PyodideTransportResponse, PyodideStreamDownloadGenerator>`);
    await pyodide.runPythonAsync(`
        from azure.ai.textanalytics.aio import TextAnalyticsClient
        from azure.core.credentials import AzureKeyCredential
        import js
        client = TextAnalyticsClient(
            endpoint="https://my-endpoint.azure.com", 
            # We don't recommend hardcoding keys into HTML pages. Consider some other way to input your key.
            credential=AzureKeyCredential(MY_KEY),
            transport=PyodideTransport(),
        )
        documents = ["Bonjour mon ami."]
        response = (await client.detect_language(documents=documents))[0]
        js.alert(response.primary_language.name)  # French`);
}
main();

Or with PyScript and a little less boilerplate:

<head>
<!-- See https://pyscript.net/ for instructions to add PyScript to your page -->
</head>
<body>
    <py-env>
        - azure-ai-textanalytics
    </py-env>
    <py-script>
        <!-- code for PyodideTransport, PyodideTransportResponse, PyodideStreamDownloadGenerator -->
    </py-script>
    <py-script>
        # async
        # Need the above comment to have top-level await.
        from azure.ai.textanalytics.aio import TextAnalyticsClient
        from azure.core.credentials import AzureKeyCredential
        client = TextAnalyticsClient(
            endpoint="https://my-endpoint.azure.com", 
            # We don't recommend hardcoding keys into HTML pages. Consider some other way to input your key.
            credential=AzureKeyCredential(MY_KEY),
            transport=PyodideTransport(),
        )
        documents = ["Bonjour"]
        response = (await client.detect_language(documents=documents, country_hint="us"))[0]
        print(response.primary_language.name)  # French
    </py-script>
</body>

And you’ll see “French” displayed on your page. Note that we only built an asynchronous client, because pyfetch has no synchronous version.
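Since only the asynchronous client is available, every client call has to be awaited. The sketch below shows that calling pattern with a stub in place of the real client, so it runs without a browser or an Azure resource. `StubLanguageClient` and `first_language` are hypothetical names, and the stub always answers "French".

```python
import asyncio
from types import SimpleNamespace


class StubLanguageClient:
    """Hypothetical stand-in for an async client; it makes no network calls."""

    async def detect_language(self, documents):
        # Always "detects" French so the example stays deterministic.
        language = SimpleNamespace(name="French")
        return [SimpleNamespace(primary_language=language) for _ in documents]


async def first_language(client, documents):
    """Await the async call and return the first document's language name."""
    results = await client.detect_language(documents=documents)
    return results[0].primary_language.name


# Outside the browser, an event loop has to be started explicitly.
print(asyncio.run(first_language(StubLanguageClient(), ["Bonjour mon ami."])))
```

In Pyodide’s runPythonAsync or in a PyScript block with top-level await, the equivalent would be `await first_language(client, documents)` with no call to asyncio.run.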

What now?

There’s a pull request with a Pyodide-compatible transport in the works for out-of-the-box compatibility. If you have any questions or comments, or want the pull request to be merged sooner, open an issue in the Azure SDK for Python GitHub repository.

Conclusion

The Azure SDK for Python is architected to be modular and extensible. That architecture lets us extend the SDK to Pyodide, a browser-based Python runtime, and to PyScript, opening new doors for Azure.

Author

Steven Jin Xuan
Software Engineer Intern

Software engineering intern on the Python Azure SDK team.
